microsoft · romanlutz · May 21, 2026 · May 18, 2026
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -69,16 +69,6 @@ repos:
       name: Ruff (Jupyter Notebooks)
       args: [--fix]
 
-  - repo: local
-    hooks:
-    - id: check-links
-      name: Check Links in Python and md Files
-      entry: python ./build_scripts/check_links.py
-      language: python
-      files: ^doc.*\.(py|md)$
-      additional_dependencies: ['requests']
-      exclude: (release_process.md|git.md|^doc/deployment/|tests|pyrit/prompt_converter/morse_converter.py|.github|pyrit/prompt_converter/emoji_converter.py|pyrit/score/markdown_injection.py|^pyrit/datasets/|^pyrit/auxiliary_attacks/gcg/)
-
   - repo: https://github.com/allganize/ty-pre-commit
     rev: v0.0.32
     hooks:

diff --git a/Makefile b/Makefile
@@ -25,7 +25,8 @@ ty:
 docs-build:
 	uv run python build_scripts/pydoc2json.py pyrit --submodules -o doc/_api/pyrit_all.json
 	uv run python build_scripts/gen_api_md.py
-	cd doc && uv run jupyter-book build --all --html
+	# --strict validates URLs and cross-refs; skips are configured in doc/myst.yml under error_rules
+	cd doc && uv run jupyter-book build --all --html --strict
 	uv run ./build_scripts/generate_rss.py
 
 # Build the full documentation site including the PDF export.
@@ -36,7 +37,8 @@ docs-build:
 docs-build-all:
 	uv run python build_scripts/pydoc2json.py pyrit --submodules -o doc/_api/pyrit_all.json
 	uv run python build_scripts/gen_api_md.py
-	cd doc && uv run jupyter-book build --all --html --pdf
+	# --strict validates URLs and cross-refs; skips are configured in doc/myst.yml under error_rules
+	cd doc && uv run jupyter-book build --all --html --pdf --strict
 	uv run ./build_scripts/generate_rss.py
 
 # Regenerate only the API reference pages (without building the full site)

diff --git a/build_scripts/check_links.py b/build_scripts/check_links.py
diff --git a/doc/blog/2025_01_27.md b/doc/blog/2025_01_27.md
@@ -78,11 +78,11 @@ Finally, when PyRIT gets a response from the Target LLM, it switches to another
 
 When examining this request, you may discover that occasionally the Adversarial LLM struggles with generating the right JSON format, leading to an error in PyRIT, regardless of whether the objective was achieved or not. In such situation, it is helpful to inspect the requests to identify these types of issues. Specifically, I found a problem when the LLM response contained double quotes, causing issues with subsequent JSON formats which was fixed using the "SearchReplaceConverter"[^9] prompt converter.
 
-[^7]: "Multi-Turn Attack - RedTeamingAttack Example", https://microsoft.github.io/PyRIT/code/executor/attack/2_red_teaming_attack.html
+[^7]: "Multi-Turn Attack - RedTeamingAttack Example", ../code/executor/attack/2_red_teaming_attack.ipynb
 
-[^8]: "PyRIT - SearchReplaceConverter", https://microsoft.github.io/PyRIT/_autosummary/pyrit.prompt_converter.SearchReplaceConverter.html
+[^8]: "PyRIT - SearchReplaceConverter", ../api/pyrit_prompt_converter.md#searchreplaceconverter
 
-[^9]: "PyRIT - True False Scoring", https://microsoft.github.io/PyRIT/code/scoring/2_true_false_scorers.html#true-false-scoring
+[^9]: "PyRIT - True False Scoring", ../code/scoring/2_true_false_scorers.ipynb#true-false-scoring
 
 ### Final Thoughts
 

diff --git a/doc/blog/2025_02_11.md b/doc/blog/2025_02_11.md
@@ -32,6 +32,6 @@ See the updated documentation [here](../code/datasets/1_loading_datasets.ipynb).
 
 ## What else can we do with this?
 
-Now that we've loaded our dataset into PyRIT as a `SeedPromptDataset` the really exciting red teaming can begin. A great example of this is in our [Baseline-Only Execution](../code/scenarios/9_baseline_only.ipynb) notebook! We can use the prompts to evaluate the target by sending all the previously loaded prompts, modifying which attacks to use, and storing the scores for further analysis.
+Now that we've loaded our dataset into PyRIT as a `SeedPromptDataset` the really exciting red teaming can begin. A great example of this is the [Baseline Execution](../code/scenarios/0_scenarios.ipynb#baseline-execution) section of our scenarios overview! We can use the prompts to evaluate the target by sending all the previously loaded prompts, modifying which attacks to use, and storing the scores for further analysis.
 
 In this blog post, we've walked through how we use structured datasets through our `SeedPrompt` and `SeedPromptDataset` classes. PyRIT's architecture allows for customization at every stage - whether through converters or configuring different scorers - and seed prompts set us up to effectively probe for risks in AI systems. Send over any contributions to add more datasets, refine seed prompts, or any open issues on Github! Now that you understand a core component of PyRIT, go ahead and try it out!
diff --git a/doc/blog/2025_06_06.md b/doc/blog/2025_06_06.md
@@ -12,7 +12,7 @@ The [AI Recruiter](https://github.com/KutalVolkan/ai_recruiter) is designed to m
 
 - Résumé Processing & Semantic Matching: Résumés are extracted from PDFs, with embeddings generated using models like text-embedding-ada-002. These embeddings enable semantic matching, while GPT-4o is later used to assign a match score based on relevance and extracted content.
 
-- Automated RAG Vulnerability Testing: Attackers can manipulate résumé content by injecting hidden text (via a [PDF converter](https://github.com/microsoft/PyRIT/blob/main/doc/code/converters/pdf_converter.ipynb)) that optimizes scoring, influencing the AI Recruiter’s ranking system.
+- Automated RAG Vulnerability Testing: Attackers can manipulate résumé content by injecting hidden text (via a [PDF converter](../code/converters/5_file_converters.ipynb#pdfconverter)) that optimizes scoring, influencing the AI Recruiter’s ranking system.
 
 - [XPIA Attack](https://github.com/microsoft/PyRIT/blob/main/doc/code/executor/workflow/2_xpia_ai_recruiter.ipynb) Integration: PyRIT enables full automation of prompt injections, making AI vulnerability research efficient and reproducible.
 ---

diff --git a/doc/blog/2026_04_14_scoring_scorers.md b/doc/blog/2026_04_14_scoring_scorers.md
@@ -108,7 +108,7 @@ flowchart TB
 
 There are a few different ways to view metrics for specific scoring configurations.
 
-**Directly on a scorer instance:** Call `get_scorer_metrics()` on any scorer object to look up its saved metrics (if they exist), as described at the bottom of the [Scorer Evaluation Identifier](#scorer-evaluation-identifier) section above. See the [scorer metrics notebook](../code/scoring/8_scorer_metrics.ipynb) to try it yourself!
+**Directly on a scorer instance:** Call `get_scorer_metrics()` on any scorer object to look up its saved metrics (if they exist), as described at the bottom of the [Scorer Evaluation Identifier](#scorer-evaluation-identifier) section above. See the [scorer metrics notebook](../code/scoring/7_scorer_metrics.ipynb) to try it yourself!
 
 **Automatically in scenario output:** When running scenarios and printing results (i.e., in [pyrit_scan](../scanner/1_pyrit_scan.ipynb) or [pyrit_shell](../scanner/2_pyrit_shell.md)), metrics are automatically fetched and displayed alongside the attack results (as long as the scoring configuration has been evaluated before):
 
@@ -132,7 +132,7 @@ The framework checks the JSONL registry for an existing entry matching the score
 
 ![alt text](2026_04_14_running_evaluation.png)
 
-For the full walkthrough — including running objective and harm evaluations, configuring custom datasets, and comparing results — give the [scorer metrics notebook](../code/scoring/8_scorer_metrics.ipynb) a try!
+For the full walkthrough — including running objective and harm evaluations, configuring custom datasets, and comparing results — give the [scorer metrics notebook](../code/scoring/7_scorer_metrics.ipynb) a try!
 
 ## Closing Thoughts
 

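Note on the blog link fixes above: the `8_scorer_metrics.ipynb` to `7_scorer_metrics.ipynb` change is a relative link that went stale after a rename. The sketch below shows one way such a link could be caught locally. It is an illustration only, not how `jupyter-book build --strict` works internally; the helper name `missing_relative_links` and the temporary directory layout are assumptions made for the example, and it only covers inline `[text](target)` links.

```python
import re
import tempfile
from pathlib import Path

# Inline markdown links of the form [text](target).
# Footnote-style bare paths and reference links are not covered by this sketch.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def missing_relative_links(md_file: Path) -> list[str]:
    """Return relative link targets in md_file whose file part does not exist on disk."""
    missing = []
    for target in LINK.findall(md_file.read_text(encoding="utf-8")):
        path_part = target.split("#", 1)[0]
        if "://" in target or not path_part:
            continue  # external URL or same-page anchor: out of scope here
        if not (md_file.parent / path_part).exists():
            missing.append(target)
    return missing


# Hypothetical mini doc tree that mirrors the renamed notebook from the diff.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "code" / "scoring").mkdir(parents=True)
    (root / "blog").mkdir()
    (root / "code" / "scoring" / "7_scorer_metrics.ipynb").write_text("{}")
    post = root / "blog" / "post.md"
    post.write_text(
        "[new](../code/scoring/7_scorer_metrics.ipynb) "
        "[old](../code/scoring/8_scorer_metrics.ipynb) "
        "[anchor](#scorer-evaluation-identifier) "
        "[site](https://github.com/microsoft/PyRIT)"
    )
    result = missing_relative_links(post)

print(result)
```

Only the file part of each target is checked; validating the `#fragment` against real headings or labels would need a parser for the target document.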
diff --git a/doc/code/converters/1_text_to_text_converters.ipynb b/doc/code/converters/1_text_to_text_converters.ipynb
@@ -22,7 +22,7 @@
    "id": "1",
    "metadata": {},
    "source": [
-    "<a id=\"non-llm-converters\"></a>\n",
+    "(non-llm-converters)=\n",
     "## Non-LLM Converters\n",
     "\n",
     "Non-LLM converters use deterministic algorithms to transform text. These include:\n",
@@ -454,7 +454,7 @@
    "id": "10",
    "metadata": {},
    "source": [
-    "<a id=\"llm-based-converters\"></a>\n",
+    "(llm-based-converters)=\n",
     "## LLM-Based Converters\n",
     "\n",
     "LLM-based converters use language models to transform prompts. These converters are more flexible and can produce more natural variations, but they are slower and require an LLM target.\n",

diff --git a/doc/code/converters/1_text_to_text_converters.py b/doc/code/converters/1_text_to_text_converters.py
@@ -22,7 +22,7 @@
 # - **[LLM-Based Converters](#llm-based-converters)**: AI-powered transformations including translation, variation, and semantic modifications
 
 # %% [markdown]
-# <a id="non-llm-converters"></a>
+# (non-llm-converters)=
 # ## Non-LLM Converters
 #
 # Non-LLM converters use deterministic algorithms to transform text. These include:
@@ -225,7 +225,7 @@
 print("Variation Selector:", await var_selector.convert_async(prompt=prompt))  # type: ignore
 
 # %% [markdown]
-# <a id="llm-based-converters"></a>
+# (llm-based-converters)=
 # ## LLM-Based Converters
 #
 # LLM-based converters use language models to transform prompts. These converters are more flexible and can produce more natural variations, but they are slower and require an LLM target.

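Note on the anchor changes above: each `<a id="..."></a>` line in a markdown cell is replaced by a MyST target of the form `(label)=`. A minimal sketch of that rewrite for the jupytext percent-format `.py` files follows. The names `ANCHOR_LINE`, `to_myst_target`, and `convert_source` are hypothetical and not part of this PR; the changes here may well have been made by hand.

```python
import re

# Matches a jupytext markdown-cell line that holds only an HTML anchor,
# e.g. '# <a id="text-to-audio"></a>'.
ANCHOR_LINE = re.compile(r'^(?P<prefix>#\s*)<a id="(?P<label>[^"]+)"></a>\s*$')


def to_myst_target(line: str) -> str:
    """Rewrite one HTML-anchor line as a MyST target; return other lines unchanged."""
    match = ANCHOR_LINE.match(line)
    if match is None:
        return line
    return f"{match.group('prefix')}({match.group('label')})="


def convert_source(text: str) -> str:
    """Apply to_myst_target to every line of a percent-format source string."""
    return "\n".join(to_myst_target(line) for line in text.split("\n"))


sample = '# %% [markdown]\n# <a id="text-to-audio"></a>\n# ## Text to Audio'
converted = convert_source(sample)
print(converted)
```

The pattern is anchored to the whole line, so anchors embedded inside a sentence are left alone and would need manual review.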
diff --git a/doc/code/converters/2_audio_converters.ipynb b/doc/code/converters/2_audio_converters.ipynb
@@ -23,7 +23,7 @@
    "id": "1",
    "metadata": {},
    "source": [
-    "<a id=\"text-to-audio\"></a>\n",
+    "(text-to-audio)=\n",
     "## Text to Audio\n",
     "\n",
     "The `AzureSpeechTextToAudioConverter` converts text input into audio output, generating spoken audio files."
@@ -72,7 +72,7 @@
    "id": "3",
    "metadata": {},
    "source": [
-    "<a id=\"audio-to-text\"></a>\n",
+    "(audio-to-text)=\n",
     "## Audio to Text\n",
     "\n",
     "The `AzureSpeechAudioToTextConverter` transcribes audio files into text. Below we use the audio file created in the previous section."
@@ -117,7 +117,7 @@
    "id": "5",
    "metadata": {},
    "source": [
-    "<a id=\"audio-to-audio\"></a>\n",
+    "(audio-to-audio)=\n",
     "## Audio to Audio\n",
     "\n",
     "Audio-to-audio converters modify existing audio files. All of these converters accept `audio_path` input\n",
@@ -240,7 +240,8 @@
  ],
  "metadata": {
   "jupytext": {
-   "cell_metadata_filter": "-all"
+   "cell_metadata_filter": "-all",
+   "main_language": "python"
   },
   "language_info": {
    "codemirror_mode": {

diff --git a/doc/code/converters/2_audio_converters.py b/doc/code/converters/2_audio_converters.py
@@ -23,7 +23,7 @@
 # - **[Audio to Audio](#audio-to-audio)**: Modify audio files (speed, volume, echo, frequency, noise)
 
 # %% [markdown]
-# <a id="text-to-audio"></a>
+# (text-to-audio)=
 # ## Text to Audio
 #
 # The `AzureSpeechTextToAudioConverter` converts text input into audio output, generating spoken audio files.
@@ -45,7 +45,7 @@
 assert os.path.exists(audio_convert_result.output_text)
 
 # %% [markdown]
-# <a id="audio-to-text"></a>
+# (audio-to-text)=
 # ## Audio to Text
 #
 # The `AzureSpeechAudioToTextConverter` transcribes audio files into text. Below we use the audio file created in the previous section.
@@ -70,7 +70,7 @@
 print(transcript)
 
 # %% [markdown]
-# <a id="audio-to-audio"></a>
+# (audio-to-audio)=
 # ## Audio to Audio
 #
 # Audio-to-audio converters modify existing audio files. All of these converters accept `audio_path` input