diff --git a/.gitignore b/.gitignore
index bc177ec..213227a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -5,6 +5,8 @@ __pycache__/
 .venv/
 venv/
 env/
+.conda/
+.mplconfig/
 .ipynb_checkpoints/
 
 # OS / IDE
@@ -18,7 +20,9 @@ datasets/
 Cityscapes/
 cityscapes/
 Anomaly_Validation_Datasets/
+Validation_Dataset
 *.zip
+Miniconda3-*.sh
 
 # Models / checkpoints
 checkpoints/
diff --git a/README.md b/README.md
index d6919ee..7463e88 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-# OutlierDrive: Open-World Road Anomaly Segmentation
+# OutlierDrive: Road Anomaly Segmentation
 
-OutlierDrive is a research-oriented computer vision project focused on anomaly segmentation for autonomous driving scenes.  
-The project compares pixel-based and mask-based segmentation models for detecting unknown or out-of-distribution objects in road environments.
+OutlierDrive is a research-oriented computer vision project focused on anomaly segmentation for autonomous driving scenes.  
+The project compares pixel-based and mask-based segmentation models for detecting unknown objects in road environments.
 
 ## Goals
 
@@ -44,6 +44,3 @@ Main branches:
 - `feature/eomt-mask-baselines`
 - `feature/finetuning-report`
 
-## Repository Status
-
-This repository is under active development as part of a graduate-level computer vision project at Politecnico di Torino.
\ No newline at end of file
diff --git a/REPORT_DRAFT.tex b/REPORT_DRAFT.tex
new file mode 100644
index 0000000..7af6335
--- /dev/null
+++ b/REPORT_DRAFT.tex
@@ -0,0 +1,137 @@
+\title{Comprehensive Road Scene Understanding for Autonomous Driving}
+
+\author{%
+Group XX \\
+Name Surname, Name Surname, Name Surname, Name Surname \\
+Politecnico di Torino
+}
+
+\maketitle
+
+\begin{abstract}
+This project studies road-scene understanding for autonomous driving, moving from closed-set semantic segmentation to open-world anomaly segmentation. We compare EoMT checkpoints trained on COCO, Cityscapes, and a fine-tuned Cityscapes setup, and evaluate post-hoc anomaly scoring methods on the SegmentMeIfYouCan, Fishyscapes, and Road Anomaly validation datasets. For semantic segmentation, the Cityscapes-trained EoMT reaches 81.68\% mIoU on all 19 Cityscapes classes, while the COCO-trained model reaches 62.86\% mIoU on the mapped Cityscapes overlap classes. For anomaly segmentation, the best EoMT result is obtained with the Cityscapes checkpoint and entropy scoring on RoadObstacle21, reaching 94.28 AuPRC and 0.35 FPR95. Temperature scaling was evaluated for MSP with $T \in \{0.5,0.75,1.0,1.1\}$; it produced only small changes, with $T=1.1$ giving the best average MSP performance for all three checkpoints. Code and full result CSVs are available at \url{https://github.com/OutlierDrive-Lab/outlierdrive}.
+\end{abstract}
+
+\section{Introduction}
+Autonomous driving perception systems must understand road scenes at pixel level. Semantic segmentation assigns a class label to each pixel, while instance and panoptic segmentation additionally distinguish object instances. These tasks work well when test images follow the training distribution, but real driving scenes may contain rare or unknown objects that are not present during training. This motivates anomaly segmentation, where the objective is to detect out-of-distribution objects in road scenes.
+
+The project follows this progression. We first studied standard semantic and panoptic segmentation models, then compared two EoMT checkpoints trained on different label spaces, and finally evaluated post-hoc anomaly scoring methods. The final focus of our implementation is mask-based anomaly segmentation with EoMT, evaluated on the same anomaly validation datasets used for the pixel-based ERFNet baselines.
+
+\section{Methodology}
+\subsection{Semantic Segmentation Evaluation}
+We evaluated EoMT on Cityscapes validation data. For the Cityscapes-trained checkpoint, predictions are already expressed in the 19 Cityscapes trainId classes. For the COCO-trained checkpoint, the output class space is different, so predictions were mapped to the Cityscapes classes that overlap with COCO. This makes the COCO comparison meaningful, but it is not a full 19-class Cityscapes evaluation because classes such as pole, terrain, and rider are not covered by the mapping.
+
+The semantic metric is mean Intersection over Union (mIoU). The confusion matrix is accumulated over validation pixels, ignoring label 255. For class $c$, IoU is
+\[
+\mathrm{IoU}_c = \frac{\mathrm{TP}_c}{\mathrm{TP}_c + \mathrm{FP}_c + \mathrm{FN}_c},
+\]
+and mIoU is the average across the evaluated classes.
+
+\subsection{EoMT Mask-Based Anomaly Pipeline}
+For anomaly segmentation, we use the EoMT semantic inference path: each image is processed with sliding-window inference, the model returns mask logits and class logits, and these are converted to per-pixel semantic logits using EoMT's semantic aggregation helper. The anomaly datasets are read directly from their image and binary mask folders rather than through the Cityscapes datamodule.
+
+We evaluated three checkpoints:
+\begin{itemize}
+    \item COCO-trained EoMT,
+    \item Cityscapes-trained EoMT,
+    \item fine-tuned EoMT.
+\end{itemize}
+Each checkpoint was evaluated on five anomaly datasets: RoadAnomaly21, RoadObstacle21, RoadAnomaly, Fishyscapes Static, and FS Lost \& Found.
+
+\subsection{Anomaly Scores}
+We compared four post-hoc anomaly scores. MSP uses the confidence of the most likely class:
+\[
+s_{\mathrm{MSP}}(x) = 1 - \max_c p(c \mid x).
+\]
+MaxLogit uses the negative maximum logit, so lower classification evidence gives a higher anomaly score. Entropy measures predictive uncertainty:
+\[
+s_{\mathrm{Ent}}(x) = -\sum_c p(c \mid x)\log p(c \mid x).
+\]
+Finally, the RbA-style score uses the mask-query outputs before final semantic aggregation. It measures how strongly a pixel is rejected by all known query/class predictions, which is more specific to a mask-based architecture than MSP or MaxLogit.
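+
+For completeness, the MaxLogit score described above can be written in the same notation. Here we assume that $z_c(x)$ denotes the aggregated per-pixel logit of class $c$ at pixel $x$, i.e. the same quantity that enters the softmax:
+\[
+s_{\mathrm{MaxLogit}}(x) = -\max_c z_c(x).
+\]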
+
+\subsection{Temperature Scaling}
+The project specification also asks for temperature scaling. We applied temperature scaling to MSP by replacing the softmax with
+\[
+p_T(c \mid x) = \mathrm{softmax}(z_c/T).
+\]
+We evaluated $T=0.5, 0.75, 1.0,$ and $1.1$ for all three EoMT checkpoints and all five anomaly datasets. The implementation computes all temperatures from the same logits, so the model forward pass is not repeated unnecessarily.
+
+\section{Experimental Results}
+\subsection{Semantic Segmentation}
+Table~\ref{tab:semantic} reports the semantic segmentation numbers used to contextualize the EoMT checkpoints. The all-19-class result is only reported for the Cityscapes class space. The COCO result is reported on the mapped overlap classes, so it should not be interpreted as a full 19-class Cityscapes score.
+
+\begin{table}[t]
+\centering
+\small
+\begin{tabular}{lcc}
+\hline
+Evaluation & mIoU (\%) & Pixel Acc. (\%) \\
+\hline
+Cityscapes, all 19 classes & 81.68 & 96.72 \\
+Cityscapes checkpoint, overlap classes & 84.78 & 97.13 \\
+COCO checkpoint, mapped overlap classes & 62.86 & 90.68 \\
+\hline
+\end{tabular}
+\caption{Semantic segmentation evaluation on Cityscapes. The COCO number uses only mapped overlap classes.}
+\label{tab:semantic}
+\end{table}
+
+\subsection{Anomaly Segmentation}
+The anomaly benchmark contains 60 rows: 3 checkpoints, 5 datasets, and 4 scoring methods. Table~\ref{tab:anomaly-average} summarizes the average behavior over the five datasets. The Cityscapes checkpoint is the most stable overall. Its entropy score gives the best mean AuPRC, while all Cityscapes-based scores are close to each other. The COCO checkpoint performs poorly on most anomaly datasets, which is expected because its class space and training data are less aligned with road scenes.
+
+\begin{table}[t]
+\centering
+\small
+\begin{tabular}{llcc}
+\hline
+Checkpoint & Score & Mean AuPRC & Mean FPR95 \\
+\hline
+COCO & MSP & 13.75 & 93.21 \\
+COCO & Entropy & 16.25 & 89.53 \\
+Cityscapes & MSP & 61.64 & 20.54 \\
+Cityscapes & Entropy & 62.55 & 20.46 \\
+Fine-tuned & MSP & 50.55 & 27.93 \\
+Fine-tuned & Entropy & 50.20 & 29.41 \\
+\hline
+\end{tabular}
+\caption{Unweighted mean anomaly performance over the five validation datasets. Higher AuPRC is better; lower FPR95 is better. Full per-dataset results are included in the repository CSV files.}
+\label{tab:anomaly-average}
+\end{table}
+
+The strongest individual anomaly result is on RoadObstacle21, where the Cityscapes checkpoint with entropy reaches 94.28 AuPRC and 0.35 FPR95. The same checkpoint is also strong on RoadAnomaly, where entropy reaches 74.19 AuPRC and 14.69 FPR95. For RoadAnomaly21 and Fishyscapes Static, the fine-tuned checkpoint performs best, reaching 70.77 AuPRC with MSP on RoadAnomaly21 and 71.60 AuPRC with entropy on Fishyscapes Static.
+
+\subsection{Temperature Scaling}
+Table~\ref{tab:temperature} reports the average MSP behavior with temperature scaling. The effect is small. For the Cityscapes checkpoint, $T=1.1$ gives the best average AuPRC, but the difference from $T=1.0$ is minor. This suggests that the ranking induced by MSP is already quite stable for these logits, and temperature scaling alone is not enough to substantially change anomaly segmentation performance.
+
+\begin{table}[t]
+\centering
+\small
+\begin{tabular}{lccc}
+\hline
+Checkpoint & Best $T$ & Mean AuPRC & Mean FPR95 \\
+\hline
+COCO & 1.1 & 13.76 & 93.20 \\
+Cityscapes & 1.1 & 61.66 & 20.54 \\
+Fine-tuned & 1.1 & 50.55 & 27.93 \\
+\hline
+\end{tabular}
+\caption{Best average MSP temperature per checkpoint over the five anomaly datasets.}
+\label{tab:temperature}
+\end{table}
+
+\section{Discussion}
+The results show that training domain and label space strongly affect anomaly segmentation. The COCO checkpoint is useful for broad visual recognition, but its panoptic class space does not align well with road-scene anomaly detection. The Cityscapes checkpoint gives the best overall anomaly results because it has learned a road-scene representation closer to the validation data.
+
+The fine-tuned checkpoint improves some datasets but is not uniformly better. This is consistent with the semantic segmentation discussion: fine-tuning with limited resources can improve domain adaptation, but it does not guarantee a stronger model on every metric or dataset. In our anomaly results, fine-tuning helps RoadAnomaly21 and Fishyscapes Static, while the original Cityscapes checkpoint remains stronger on RoadObstacle21 and RoadAnomaly.
+
+Among post-hoc methods, entropy is generally competitive because it captures uncertainty across all classes rather than only the top class. MSP and MaxLogit are simpler and often close to entropy, but they can fail when the model is confidently wrong. The RbA-style score is conceptually better matched to a mask-based model because it uses query-level mask and class outputs, although in our results it does not consistently dominate the simpler uncertainty scores.
+
+Temperature scaling was included as an additional baseline. The results show only small changes across $T \in \{0.5,0.75,1.0,1.1\}$, so we report it for completeness rather than as a source of improvement. A more substantial improvement would likely require training-time changes or a stronger anomaly-specific scoring method.
+
+\section{Conclusion}
+We implemented and evaluated a reproducible EoMT mask-based anomaly segmentation pipeline. The best anomaly result was obtained by the Cityscapes checkpoint with entropy scoring on RoadObstacle21, reaching 94.28 AuPRC and 0.35 FPR95. Across datasets, the Cityscapes checkpoint was the most reliable, while the fine-tuned checkpoint improved selected datasets but was not uniformly superior. The temperature scaling experiment completed the required baselines and showed that MSP performance is only weakly affected by the tested temperatures.
+
+{\small
+\bibliographystyle{ieeenat_fullname}
+\bibliography{references}
+}
diff --git a/REPORT_NOTES.md b/REPORT_NOTES.md
new file mode 100644
index 0000000..34cc921
--- /dev/null
+++ b/REPORT_NOTES.md
@@ -0,0 +1,110 @@
+# Report Notes and Presentation Points
+
+## Submission Constraints From The Project PDF
+
+- Use the CVPR LaTeX template.
+- Maximum 5 pages, excluding references.
+- Include the public GitHub repository link at the end of the abstract.
+- The report should be self-contained: introduce any referenced method before discussing it.
+- Required structure:
+  - Abstract
+  - Introduction
+  - Methodology
+  - Experimental Results
+  - Discussion
+  - Conclusion
+  - References
+
+## Main Numbers To Report
+
+Semantic segmentation:
+
+- Cityscapes-trained EoMT on all 19 Cityscapes classes:
+  - mIoU: 81.68%
+  - Pixel accuracy: 96.72%
+- COCO-trained EoMT mapped to Cityscapes overlap classes:
+  - mIoU: 62.86%
+  - Pixel accuracy: 90.68%
+- Cityscapes-trained EoMT on the same overlap classes:
+  - mIoU: 84.78%
+  - Pixel accuracy: 97.13%
+
+Important distinction:
+
+- Do not say the COCO model reaches 62.86% on all 19 classes. It is overlap classes only.
+- Do not say the fine-tuned model reaches 81.68% unless you have a separate CSV proving that exact fine-tuned checkpoint produced it.
+
+Step 8 anomaly segmentation:
+
+- `eomt_anomaly_results.csv`: 60 rows.
+- `eomt_temperature_results.csv`: 60 rows.
+- `eomt_all_results.csv`: 120 rows.
+
+Best individual anomaly result:
+
+- Checkpoint: `eomt_cityscapes`
+- Dataset: `RoadObsticle21` / RoadObstacle21
+- Method: Entropy
+- AuPRC: 94.28
+- FPR95: 0.35
+
+Temperature scaling:
+
+- Tested MSP with T = 0.5, 0.75, 1.0, 1.1.
+- Best average temperature was T = 1.1 for all three checkpoints, but the improvement over T = 1.0 is very small.
+- Explain this as a required baseline, not as a major improvement.
+
+## How To Explain The Code
+
+`run_eomt_anomaly.py`:
+
+1. Loads the EoMT config and checkpoint.
+2. Infers image size, number of classes, and query count from the checkpoint when possible.
+3. Reads anomaly dataset images directly from `Validation_Dataset/<dataset>/images`.
+4. Finds ground-truth masks in `labels_masks`.
+5. Runs EoMT sliding-window semantic inference.
+6. Converts mask logits and class logits into per-pixel logits.
+7. Computes MSP, MaxLogit, Entropy, and RbA-style anomaly scores.
+8. Collects all valid pixels, ignoring label 255.
+9. Computes AuPRC and FPR95.
+10. Writes one CSV row per checkpoint/dataset/method.
+
+Temperature scaling:
+
+- The script supports `--temperatures`.
+- It computes the model logits once per image.
+- It then recomputes MSP for each temperature from the same logits.
+- This avoids repeating the expensive model forward pass.
+
+`compute_cityscapes_miou.py`:
+
+- This is only for semantic mIoU when prediction PNG masks already exist.
+- It is separate from anomaly segmentation.
+- It computes the confusion matrix over Cityscapes trainIds and returns mIoU and pixel accuracy.
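+
+The confusion-matrix computation can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the code in `compute_cityscapes_miou.py`: the function name `miou_and_pixel_acc` and its arguments are hypothetical, predictions are assumed to lie in `[0, num_classes)`, and classes that never occur are skipped when averaging.
+
+```python
+import numpy as np
+
+def miou_and_pixel_acc(pred, gt, num_classes=19, ignore_index=255):
+    """Return (mIoU, pixel accuracy) in percent for integer label maps.
+
+    Hypothetical helper for illustration only.
+    """
+    pred = np.asarray(pred).ravel()
+    gt = np.asarray(gt).ravel()
+    valid = gt != ignore_index  # drop ignored pixels
+    # Rows index the ground truth class, columns index the predicted class.
+    idx = gt[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
+    conf = np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)
+    tp = np.diag(conf).astype(np.float64)
+    union = conf.sum(axis=0) + conf.sum(axis=1) - tp  # TP + FP + FN per class
+    present = union > 0  # average only over classes that appear
+    miou = float((tp[present] / union[present]).mean() * 100.0)
+    pixel_acc = float(tp.sum() / conf.sum() * 100.0)
+    return miou, pixel_acc
+```
+
+Accumulating `conf` across images before computing IoU gives the dataset-level metric; averaging per-image IoU values would give a different number.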
+
+## Suggested 5-Page Allocation
+
+- Abstract: 1 paragraph.
+- Introduction: half page.
+- Methodology: 1.25 pages.
+- Experimental Results: 1.5 pages.
+- Discussion: 1 page.
+- Conclusion: short paragraph.
+- References: excluded from 5-page limit.
+
+## What To Emphasize In Discussion
+
+- Cityscapes-trained EoMT is strongest overall for anomaly segmentation because its training domain matches road scenes.
+- COCO-trained EoMT is weaker because its class space and data distribution are not aligned with Cityscapes/anomaly road scenes.
+- Fine-tuning helps some datasets but not all, so it should be discussed as dataset-dependent rather than universally better.
+- Entropy works well because it captures uncertainty over the full class distribution.
+- Temperature scaling changes the confidence calibration but does not strongly change the ranking of anomaly pixels in these results.
+
+## Files To Cite In The Report
+
+- `step4_eomt_eval/iou_results.csv`
+- `step4_eomt_eval/coco_trained_overlap_iou.csv`
+- `step4_eomt_eval/cityscapes_trained_overlap_iou.csv`
+- `step8_eomt_mask_baselines/eomt_anomaly_results.csv`
+- `step8_eomt_mask_baselines/eomt_temperature_results.csv`
+- `step8_eomt_mask_baselines/eomt_all_results.csv`
diff --git a/STEP8_EOMT_MASK_BASELINES_PLAN.md b/STEP8_EOMT_MASK_BASELINES_PLAN.md
new file mode 100644
index 0000000..0869681
--- /dev/null
+++ b/STEP8_EOMT_MASK_BASELINES_PLAN.md
@@ -0,0 +1,666 @@
+# Step 8 EoMT Mask-Based Anomaly Baselines Plan
+
+This document describes the Step 8 evaluation plan for EoMT mask-based anomaly segmentation baselines on the same anomaly validation datasets used in Step 7.
+
+## Goal
+
+Complete the "Mask-based baselines" section from the project PDF.
+
+The final result should be a reproducible evaluation pipeline that:
+
+- Runs EoMT on the anomaly segmentation validation datasets.
+- Computes anomaly scores using MSP, MaxLogit, Max Entropy, and RbA.
+- Evaluates AuPRC and FPR@95 for each dataset and method.
+- Repeats the evaluation for three EoMT checkpoints:
+  - COCO-trained EoMT.
+  - Cityscapes-trained EoMT.
+  - Fine-tuned EoMT from Step 5.
+- Reports mIoU for each checkpoint.
+- Adds temperature scaling experiments for MSP if time allows.
+
+## Existing Repo Context
+
+Important files already in this repository:
+
+- `eval/evalAnomaly.py`: Original ERFNet anomaly evaluation script. Use its dataset loop, ground-truth path handling, mask normalization, and metric logic as a reference.
+- `eval/README.md`: Explains the anomaly datasets and the original eval command.
+- `eomt/README.md`: Explains how to install EoMT requirements, load checkpoints, and run validation.
+- `step4_eomt_eval/test_eval_pipeline/predictions_eomt_city.ipynb`: Shows Cityscapes EoMT inference and prediction export.
+- `step4_eomt_eval/test_eval_pipeline/predictions_eomt_coco.ipynb`: Shows COCO EoMT inference and prediction export.
+- `step4_eomt_eval/eomt_eval_iou.py`: Scripted Cityscapes mIoU evaluation on all 19 trainId classes.
+- `step4_eomt_eval/eomt_eval_overlap_iou.py`: Scripted COCO-to-Cityscapes overlap evaluation with class mapping.
+- `eomt/training/lightning_module.py`: Contains key EoMT inference helpers:
+  - `window_imgs_semantic`
+  - `revert_window_logits_semantic`
+  - `to_per_pixel_logits_semantic`
+- `eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml`: COCO panoptic EoMT config currently present on this branch.
+
+Important branch policy:
+
+- Do not merge, rebase, or cherry-pick colleague branches just to complete Step 8.
+- Step 8 should be self-contained on this branch.
+- Other branches may be inspected as read-only references only.
+- If a useful implementation exists on another branch, reimplement the small needed utility in the Step 8 folder or copy only a tiny, reviewed, self-contained snippet with attribution in comments if appropriate.
+- The old Step 7 implementation exists on `origin/feature/erfnet-baselines` under `step7_erfnet_pixel_baselines/`; treat it as a reference template only, not a dependency.
+- That Step 7 runner is useful because it shows MSP, MaxLogit, Max Entropy, CSV output, binary anomaly masks, AuPRC, and FPR@95.
+- The current branch may not contain the Cityscapes EoMT config. If missing, add the needed config file directly to this branch from official/project materials or reconstruct it from the known EoMT config pattern. Do not merge another branch just for the config:
+  - `eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml`
+  - It was present on the data-evaluation work.
+
+## Required Inputs
+
+These inputs are required locally to run Step 8, but they should not be committed to git.
+
+The repository currently does not include the anomaly dataset archive or EoMT checkpoint files. This is expected: `.gitignore` excludes dataset folders, zip files, model weights, checkpoints, `.bin` files, and cached logits. The implementation should accept paths to these local files through CLI arguments.
+
+## Local Execution Feasibility
+
+Step 8 should be implemented as normal Python scripts that can run from the terminal. JupyterLab is optional for visualization/debugging, but the final pipeline should not require notebooks.
+
+The repository does not track datasets or checkpoint files. A local environment must provide:
+
+- Python with the EoMT dependencies installed.
+- The anomaly validation datasets.
+- The EoMT checkpoint files.
+- The matching EoMT configuration files.
+
+Without those local inputs, the scripts can still be syntax-checked and reviewed, but full EoMT inference cannot be reproduced.
+
+To run Step 8 locally, create an environment first. Since `conda` is not available right now, either install Miniconda as described in `eomt/README.md`, or create another stable Python environment that can install PyTorch and EoMT dependencies.
+
+Recommended local setup:
+
+```bash
+# Option A: Conda/Miniconda, matching eomt/README.md
+conda create -n eomt python==3.13.2
+conda activate eomt
+python -m pip install -r eomt/requirements.txt
+
+# Option B: stable venv if a suitable Python is installed
+python3 -m venv .venv
+source .venv/bin/activate
+python -m pip install --upgrade pip
+python -m pip install -r eomt/requirements.txt
+```
+
+After setup, verify:
+
+```bash
+python - <<'PY'
+import torch
+import torchvision
+import lightning
+from PIL import Image
+import sklearn
+print("torch", torch.__version__)
+print("cuda available:", torch.cuda.is_available())
+print("mps available:", hasattr(torch.backends, "mps") and torch.backends.mps.is_available())
+PY
+```
+
+On this local macOS machine, CUDA should not be assumed. If PyTorch MPS is available, the runner can optionally support `--device mps`; otherwise use CPU for smoke tests. Full EoMT anomaly evaluation on CPU may be very slow, especially for all datasets and all checkpoints.
+
+Recommended execution strategy:
+
+1. Develop the pipeline as terminal scripts.
+2. Run small CPU/MPS smoke tests locally:
+   - one checkpoint,
+   - one dataset,
+   - one or two images,
+   - one method such as `maxlogit`.
+3. If local runtime is too slow, run the same script on JupyterLab, Colab, or another GPU machine. Do not rewrite the pipeline as notebook-only code.
+4. Keep notebooks only for visualization or exploratory debugging.
+
+The README should show that Step 8 can be launched from the terminal, for example:
+
+```bash
+python step8_eomt_mask_baselines/run_eomt_anomaly.py \
+  --config eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml \
+  --checkpoint /local/path/eomt_coco.bin \
+  --dataset-root /local/path/Anomaly_Validation_Datasets \
+  --dataset RoadAnomaly21 \
+  --method maxlogit \
+  --max-images 2 \
+  --device cpu
+```
+
+Use `--max-images` for local smoke tests, then remove it for full evaluation.
+
+### 1. Anomaly Validation Datasets
+
+Download and prepare `Anomaly_Validation_Datasets.zip` from the project Drive link referenced in `eval/README.md`.
+
+Do not add the zip file or extracted dataset to git. Keep it somewhere local, for example:
+
+```text
+/Users/sadaf/outlierdrive/Anomaly_Validation_Datasets/
+```
+
+or:
+
+```text
+/Users/sadaf/datasets/Anomaly_Validation_Datasets/
+```
+
+The runner should take this location as `--dataset-root`.
+
+Expected dataset folders:
+
+- `RoadAnomaly21`
+- `RoadObsticle21`
+- `RoadAnomaly`
+- `fs_static`
+- `FS_LostFound_full`
+
+Expected structure per dataset:
+
+```text
+<dataset_root>/<dataset_name>/images/*
+<dataset_root>/<dataset_name>/labels_masks/*
+```
+
+The exact image extensions differ by dataset:
+
+- `RoadAnomaly21`: usually `.png`
+- `RoadObsticle21`: usually `.webp`
+- `RoadAnomaly`: usually `.jpg`
+- `fs_static`: usually `.jpg`
+- `FS_LostFound_full`: check actual folder contents
+
+### 2. EoMT Checkpoints
+
+The Step 8 table requires all three:
+
+```text
+eomt_checkpoints/eomt_coco.bin
+eomt_checkpoints/eomt_cityscapes.bin
+eomt_checkpoints/eomt_finetuned.bin
+```
+
+Names can differ, but the runner should accept explicit paths through CLI args.
+
+Do not commit checkpoint files. They are large model artifacts and `.gitignore` excludes `*.bin`, `*.pth`, `*.pt`, and checkpoint directories.
+
+### 3. EoMT Configs
+
+Use the config that matches each checkpoint:
+
+- COCO checkpoint:
+  - `eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml`
+- Cityscapes checkpoint:
+  - `eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml`
+- Fine-tuned checkpoint:
+  - Usually the same Cityscapes semantic config unless fine-tuning used a different config.
+
+For inference, set:
+
+```text
+--model.network.masked_attn_enabled False
+```
+
+or instantiate the network with `masked_attn_enabled=False`, matching `eomt/README.md`.
+
+## Deliverables
+
+Create a dedicated Step 8 folder:
+
+```text
+step8_eomt_mask_baselines/
+```
+
+Recommended files:
+
+```text
+step8_eomt_mask_baselines/
+  README.md
+  run_eomt_anomaly.py
+  run_all_eomt_anomaly.sh
+  eomt_anomaly_results.csv
+  eomt_temperature_results.csv
+  saved_logits/                 # optional; keep out of git if large
+```
+
+Expected CSV format:
+
+```csv
+model,checkpoint,dataset,method,miou,auprc,fpr95
+EoMT,eomt_coco,RoadAnomaly21,msp,,,
+EoMT,eomt_coco,RoadAnomaly21,maxlogit,,,
+EoMT,eomt_coco,RoadAnomaly21,entropy,,,
+EoMT,eomt_coco,RoadAnomaly21,rba,,,
+```
+
+`miou` should be filled once per checkpoint and repeated across rows for convenience, or stored in a separate `eomt_miou_results.csv`.
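+
+One possible way to append rows in this format is sketched below. The helper name `append_result_row` is hypothetical and only the standard library is used; the column order is copied from the header above.
+
+```python
+import csv
+import os
+
+# Column order copied from the expected CSV header.
+FIELDNAMES = ["model", "checkpoint", "dataset", "method", "miou", "auprc", "fpr95"]
+
+def append_result_row(csv_path, row):
+    """Append one result row (hypothetical helper), writing the header only once."""
+    needs_header = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
+    with open(csv_path, "a", newline="") as handle:
+        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
+        if needs_header:
+            writer.writeheader()
+        # Missing keys (for example miou) are written as empty cells.
+        writer.writerow({name: row.get(name, "") for name in FIELDNAMES})
+```
+
+Appending instead of rewriting lets each checkpoint/dataset/method run add its own row without reading earlier results back in.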
+
+## Implementation Plan
+
+### Task 1: Recover the Step 7 Runner Pattern
+
+Inspect the Step 7 runner from `origin/feature/erfnet-baselines` as a read-only template. Do not merge or cherry-pick it.
+
+Reference command:
+
+```bash
+git show origin/feature/erfnet-baselines:step7_erfnet_pixel_baselines/run_erfnet_anomaly_cpu.py
+```
+
+Port these pieces into the Step 8 runner:
+
+- `fpr_at_95_tpr`
+- CSV appending
+- anomaly image globbing
+- ground-truth path inference
+- ground-truth mask normalization
+- `average_precision_score`
+- excluding ignored pixels
+- one row per dataset and method
+
+Do not modify the original `eval/evalAnomaly.py`. Keep Step 8 code separate.
+
+### Task 2: Implement Dataset Ground Truth Utilities
+
+Reuse the same logic as Step 7.
+
+The binary convention should be:
+
+```text
+1 = anomaly / OOD pixel
+0 = in-distribution pixel
+255 = ignore pixel
+```
+
+Ground-truth path inference:
+
+- Replace `images` with `labels_masks`.
+- For `RoadObsticle21`, replace `.webp` with `.png`.
+- For `fs_static`, replace `.jpg` with `.png`.
+- For `RoadAnomaly`, replace `.jpg` with `.png`.
+
+Ground-truth normalization:
+
+- `RoadAnomaly`: convert label `2` to anomaly `1`.
+- `LostAndFound`: convert `0` to ignore `255`, `1` to inlier `0`, and labels between `2` and `200` to anomaly `1`.
+- `Streethazard`: keep the existing handling from `eval/evalAnomaly.py` if this dataset appears.
+
+Skip images that do not contain anomaly pixels, matching Step 7 behavior.
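+
+The rules above can be sketched as two small helpers. This is an illustration under assumptions, not the Step 7 code: the names `infer_gt_path`, `normalize_gt`, and `has_anomaly` are hypothetical, paths are assumed to use `/` separators, and dataset names are matched as whole path components so that `RoadAnomaly21` is not treated as `RoadAnomaly`.
+
+```python
+import numpy as np
+
+# Extension swaps taken from the ground-truth path rules listed above.
+_GT_EXTENSION_SWAPS = {
+    "RoadObsticle21": (".webp", ".png"),
+    "fs_static": (".jpg", ".png"),
+    "RoadAnomaly": (".jpg", ".png"),
+}
+
+def infer_gt_path(image_path):
+    """Map an image path to its mask path (hypothetical helper)."""
+    # Replaces every occurrence of "images", so keep that word out of parent folders.
+    gt_path = image_path.replace("images", "labels_masks")
+    for dataset, (old_ext, new_ext) in _GT_EXTENSION_SWAPS.items():
+        if f"/{dataset}/" in gt_path and gt_path.endswith(old_ext):
+            gt_path = gt_path[: -len(old_ext)] + new_ext
+    return gt_path
+
+def normalize_gt(gt, dataset):
+    """Return a mask with 1 = anomaly, 0 = in-distribution, 255 = ignore."""
+    src = np.asarray(gt)
+    out = src.copy()
+    if dataset == "RoadAnomaly":
+        out[src == 2] = 1
+    elif dataset == "LostAndFound":
+        out[src == 0] = 255
+        out[src == 1] = 0
+        out[(src >= 2) & (src <= 200)] = 1
+    return out
+
+def has_anomaly(gt):
+    """Images without any anomaly pixel are skipped by the runner."""
+    return bool((np.asarray(gt) == 1).any())
+```
+
+Remapping from the untouched `src` array avoids chained rewrites, for example a pixel first mapped to `0` being picked up again by a later rule.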
+
+### Task 3: Implement EoMT Model Loading
+
+Use the EoMT loading and sliding-window inference logic from the Step 4 notebooks and scripts rather than ERFNet.
+
+The loader should:
+
+- Read a YAML config.
+- Instantiate the configured data/model classes as needed.
+- Instantiate the encoder and EoMT network.
+- Instantiate the Lightning module.
+- Load checkpoint weights.
+- Move model to `cuda` if available, else CPU.
+- Set `model.eval()`.
+- Disable masked attention during inference.
+
+Keep the CLI flexible:
+
+```bash
+python step8_eomt_mask_baselines/run_eomt_anomaly.py \
+  --config eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml \
+  --checkpoint eomt_checkpoints/eomt_coco.bin \
+  --checkpoint-name eomt_coco \
+  --dataset-root /path/to/Anomaly_Validation_Datasets \
+  --dataset RoadAnomaly21 \
+  --method maxlogit \
+  --output-csv step8_eomt_mask_baselines/eomt_anomaly_results.csv
+```
+
+The runner should also allow:
+
+```text
+--device cuda
+--device cpu
+--save-logits
+--logits-dir step8_eomt_mask_baselines/saved_logits
+--temperature 1.0
+```
+
+### Task 4: Implement EoMT Forward Pass
+
+ERFNet returns per-pixel outputs directly. EoMT returns mask logits and class logits.
+
+EoMT output shape:
+
+```text
+mask_logits:  [B, Q, H, W]
+class_logits: [B, Q, C + 1]
+```
+
+The last class is the no-object class.
+
+For semantic pixel logits, use EoMT's existing function:
+
+```python
+pixel_logits = model.to_per_pixel_logits_semantic(mask_logits, class_logits)
+```
+
+This is equivalent to:
+
+```python
+pixel_logits = torch.einsum(
+    "bqhw,bqc->bchw",
+    mask_logits.sigmoid(),
+    class_logits.softmax(dim=-1)[..., :-1],
+)
+```
+
+For full-size anomaly images, follow the inference notebook pattern:
+
+```python
+imgs = [image_tensor.to(device)]
+img_sizes = [image_tensor.shape[-2:]]
+crops, origins = model.window_imgs_semantic(imgs)
+mask_logits_per_layer, class_logits_per_layer = model(crops)
+mask_logits = F.interpolate(mask_logits_per_layer[-1], model.img_size, mode="bilinear")
+crop_logits = model.to_per_pixel_logits_semantic(mask_logits, class_logits_per_layer[-1])
+logits = model.revert_window_logits_semantic(crop_logits, origins, img_sizes)
+pixel_logits = logits[0]
+```
+
+Use `torch.no_grad()`. Use AMP/autocast on CUDA if stable:
+
+```python
+with torch.no_grad(), torch.amp.autocast("cuda", enabled=(device.type == "cuda")):
+    ...
+```
+
+### Task 5: Implement Anomaly Scoring Methods
+
+All scoring methods must return an `[H, W]` NumPy array where a higher score means more anomalous.
+
+#### MSP
+
+```python
+probs = torch.softmax(pixel_logits / temperature, dim=0)
+score = 1.0 - probs.max(dim=0).values
+```
+
+#### MaxLogit
+
+```python
+score = -pixel_logits.max(dim=0).values
+```
+
+If temperature is applied to MaxLogit, document it clearly. The required temperature experiment in the PDF is for MSP, so keep MaxLogit unscaled unless intentionally experimenting.
+
+#### Max Entropy
+
+```python
+probs = torch.softmax(pixel_logits / temperature, dim=0)
+score = -(probs * torch.log(probs + 1e-12)).sum(dim=0)
+```
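+
+The three logit-based scores above can be checked together on a toy input. The sketch below is a NumPy-only restatement of the PyTorch snippets so that it runs without a model; the function names are hypothetical and the toy logits are made up for illustration.
+
+```python
+import numpy as np
+
+def softmax(logits, temperature=1.0):
+    """Temperature-scaled softmax over the class axis (axis 0)."""
+    scaled = logits / temperature
+    scaled = scaled - scaled.max(axis=0, keepdims=True)  # numerical stability
+    exp = np.exp(scaled)
+    return exp / exp.sum(axis=0, keepdims=True)
+
+def msp_score(logits, temperature=1.0):
+    return 1.0 - softmax(logits, temperature).max(axis=0)
+
+def maxlogit_score(logits):
+    return -logits.max(axis=0)
+
+def entropy_score(logits, temperature=1.0):
+    probs = softmax(logits, temperature)
+    return -(probs * np.log(probs + 1e-12)).sum(axis=0)
+
+# Toy [C, H, W] logits: pixel (0, 0) is confident, pixel (0, 1) is ambiguous.
+logits = np.array([[[4.0, 1.0]], [[0.0, 1.0]], [[0.0, 1.0]]])
+scores = {
+    "msp": msp_score(logits),
+    "maxlogit": maxlogit_score(logits),
+    "entropy": entropy_score(logits),
+}
+for name, score in scores.items():
+    # Each score should rank the ambiguous pixel as more anomalous.
+    print(name, score.shape, bool(score[0, 1] > score[0, 0]))
+```
+
+All three functions return `[H, W]` arrays with the "higher is more anomalous" convention, so the same metric code can consume any of them.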
+
+#### RbA
+
+RbA is the mask-specific method. It should use query-level mask and class outputs, not only the final aggregated pixel logits.
+
+Concept:
+
+- Each object query acts like an independent one-vs-all classifier.
+- Known pixels receive confident votes from at least one known-class query.
+- Unknown pixels are "rejected by all" known classes.
+
+Implementation direction:
+
+```python
+mask_probs = mask_logits.sigmoid()                       # [B, Q, H, W]
+class_probs = class_logits.softmax(dim=-1)[..., :-1]     # [B, Q, C]
+known_scores = torch.einsum("bqhw,bqc->bchw", mask_probs, class_probs)
+known_scores = known_scores.clamp(0.0, 1.0)
+rba_score = torch.prod(1.0 - known_scores, dim=1)        # [B, H, W]
+```
+
+Use log-space for numerical stability if needed:
+
+```python
+rba_score = torch.exp(torch.log1p(-known_scores.clamp(max=1 - 1e-6)).sum(dim=1))
+```
+
+Important: verify this against the official RbA repository before final reporting:
+
+- https://github.com/NazirNayal8/RbA
+- Check their `evaluate_ood.py` and scoring utilities.
+
+If the exact RbA implementation differs, match the official implementation and document the formula in `step8_eomt_mask_baselines/README.md`.
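+
+As a sanity check of the product form above, the following NumPy sketch applies it to a single toy image. It follows the formula written in this plan, not the official RbA code, so it should be read as an assumption until it has been compared with that repository; the function name `rba_style_score` and the toy values are hypothetical.
+
+```python
+import numpy as np
+
+def rba_style_score(mask_logits, class_logits):
+    """Product-form rejection score from this plan (not the official RbA code).
+
+    mask_logits: [Q, H, W], class_logits: [Q, C + 1] with no-object last.
+    """
+    mask_probs = 1.0 / (1.0 + np.exp(-mask_logits))
+    shifted = class_logits - class_logits.max(axis=-1, keepdims=True)
+    class_probs = np.exp(shifted)
+    class_probs = class_probs / class_probs.sum(axis=-1, keepdims=True)
+    class_probs = class_probs[:, :-1]  # drop the no-object class
+    known = np.einsum("qhw,qc->chw", mask_probs, class_probs)
+    known = known.clip(0.0, 1.0 - 1e-6)
+    # Summing log(1 - p) over classes equals the log of the product of (1 - p).
+    return np.exp(np.log1p(-known).sum(axis=0))
+
+# Two queries, two known classes, a 1 x 2 image.
+# Query 0 claims pixel (0, 0) for class 0; query 1 predicts no-object everywhere.
+mask_logits = np.array([[[8.0, -8.0]], [[-8.0, -8.0]]])
+class_logits = np.array([[8.0, 0.0, 0.0], [0.0, 0.0, 8.0]])
+score = rba_style_score(mask_logits, class_logits)
+print(score.shape, bool(score[0, 1] > score[0, 0]))
+```
+
+The pixel that no query claims ends up with a score close to 1, while the claimed pixel ends up close to 0, which is the "rejected by all" behaviour described in the concept list.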
+
+### Task 6: Compute Metrics
+
+After collecting all images for one dataset/method/checkpoint:
+
+```python
+ood_mask = gt == 1
+ind_mask = gt == 0
+
+scores = np.concatenate([anomaly_scores[ind_mask], anomaly_scores[ood_mask]])
+labels = np.concatenate([
+    np.zeros(int(ind_mask.sum())),
+    np.ones(int(ood_mask.sum())),
+])
+
+auprc = average_precision_score(labels, scores) * 100.0
+fpr95 = fpr_at_95_tpr(scores, labels) * 100.0
+```
+
+Ignore pixels with GT value `255`.
+
+Use the same metric implementation for all methods and checkpoints.
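+
+The snippet above relies on `fpr_at_95_tpr` from Step 7. If that helper has to be rewritten, one possible NumPy version is sketched below. It is an assumption about the intended behaviour (the false positive rate at the strictest threshold that still recovers 95% of anomaly pixels), and the Step 7 implementation may handle ties or interpolation differently.
+
+```python
+import numpy as np
+
+def fpr_at_95_tpr(scores, labels, target_tpr=0.95):
+    """FPR at the highest threshold whose TPR reaches target_tpr.
+
+    Hypothetical NumPy version; expects at least one positive and one negative.
+    """
+    scores = np.asarray(scores, dtype=np.float64)
+    labels = np.asarray(labels).astype(bool)
+    order = np.argsort(-scores, kind="stable")  # most anomalous first
+    sorted_scores = scores[order]
+    sorted_labels = labels[order]
+    tp = np.cumsum(sorted_labels)
+    fp = np.cumsum(~sorted_labels)
+    # Evaluate only at the last index of each run of tied scores.
+    last_of_tie = np.append(np.diff(sorted_scores) != 0, True)
+    tpr = tp[last_of_tie] / labels.sum()
+    fpr = fp[last_of_tie] / (~labels).sum()
+    first_ok = np.searchsorted(tpr, target_tpr, side="left")
+    return float(fpr[first_ok])
+```
+
+The function returns a fraction in `[0, 1]`, which matches the `* 100.0` scaling applied in the snippet above.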
+
+### Task 7: Evaluate All Required Datasets
+
+Run each method on each dataset:
+
+```text
+RoadAnomaly21
+RoadObsticle21
+RoadAnomaly
+fs_static
+FS_LostFound_full
+```
+
+Methods:
+
+```text
+msp
+maxlogit
+entropy
+rba
+```
+
+Checkpoints:
+
+```text
+eomt_coco
+eomt_cityscapes
+eomt_finetuned
+```
+
+Total required anomaly runs:
+
+```text
+3 checkpoints x 5 datasets x 4 methods = 60 rows
+```
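+
+The run grid can be enumerated in a few lines, which also gives a cheap way to spot missing rows later. This is only a sketch: the names are the checkpoint, dataset, and method identifiers listed above, while `expected_runs` and `missing_runs` are hypothetical helper names, not part of the runner.
+
+```python
+from itertools import product
+
+CHECKPOINTS = ["eomt_coco", "eomt_cityscapes", "eomt_finetuned"]
+DATASETS = ["RoadAnomaly21", "RoadObsticle21", "RoadAnomaly", "fs_static", "FS_LostFound_full"]
+METHODS = ["msp", "maxlogit", "entropy", "rba"]
+
+
+def expected_runs() -> list[tuple[str, str, str]]:
+    """Return every (checkpoint, dataset, method) combination in a stable order."""
+    return list(product(CHECKPOINTS, DATASETS, METHODS))
+
+
+def missing_runs(completed: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
+    """Return the expected combinations that do not appear in `completed`."""
+    done = set(completed)
+    return [run for run in expected_runs() if run not in done]
+```
+
+Feeding `missing_runs` the `(checkpoint, dataset, method)` triples read back from the results CSV lists any combination that still needs to be run.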
+
+### Task 8: Compute mIoU Per Checkpoint
+
+The PDF table asks for mIoU for the EoMT checkpoints. This mIoU does not depend on the post-hoc method.
+
+Compute mIoU on Cityscapes validation for each checkpoint:
+
+- COCO-trained EoMT.
+- Cityscapes-trained EoMT.
+- Fine-tuned EoMT.
+
+Use the same semantic conversion pipeline:
+
+```python
+pixel_logits = model.to_per_pixel_logits_semantic(mask_logits, class_logits)
+prediction = pixel_logits.argmax(dim=0)
+```
+
+For COCO, be careful because the class space differs from Cityscapes. Use the same mapping/evaluation strategy already used in the earlier EoMT Cityscapes evaluation notebooks.
+
+Save:
+
+```text
+step8_eomt_mask_baselines/eomt_miou_results.csv
+```
+
+Recommended columns:
+
+```csv
+checkpoint,config,miou,notes
+```
+
+### Task 9: Add Temperature Scaling
+
+The PDF asks for a temperature scaling baseline.
+
+Recommended approach:
+
+1. Run EoMT once per image/checkpoint/dataset and save reusable logits.
+2. Recompute MSP with different temperatures without running the model forward pass again.
+3. Try at least:
+
+```text
+T = 0.5
+T = 0.75
+T = 1.0
+T = 1.1
+```
+
+Optional extended grid:
+
+```text
+T = 0.25, 0.5, 0.75, 1.0, 1.1, 1.25, 1.5, 2.0
+```
+
+MSP with temperature:
+
+```python
+probs = torch.softmax(pixel_logits / temperature, dim=0)
+score = 1.0 - probs.max(dim=0).values
+```
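+
+Reusing saved logits for the whole grid could look like the sketch below. It uses NumPy rather than PyTorch so that cached arrays can be processed without a GPU. It assumes the logits were stored as a `[C, H, W]` array, and `msp_anomaly_score` and `temperature_sweep` are hypothetical names.
+
+```python
+import numpy as np
+
+
+def msp_anomaly_score(pixel_logits: np.ndarray, temperature: float = 1.0) -> np.ndarray:
+    """Return 1 - max softmax probability per pixel for logits of shape [C, H, W]."""
+    scaled = pixel_logits / temperature
+    scaled = scaled - scaled.max(axis=0, keepdims=True)  # stabilise the exponent
+    probs = np.exp(scaled)
+    probs = probs / probs.sum(axis=0, keepdims=True)
+    return 1.0 - probs.max(axis=0)
+
+
+def temperature_sweep(pixel_logits: np.ndarray, temperatures=(0.5, 0.75, 1.0, 1.1)) -> dict:
+    """Recompute the MSP anomaly map once per temperature from the same cached logits."""
+    return {t: msp_anomaly_score(pixel_logits, t) for t in temperatures}
+```
+
+Raising the temperature flattens the softmax, so the per-pixel anomaly score can only stay equal or increase. The ranking of pixels, and therefore AuPRC and FPR95, changes only where pixels swap order.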
+
+Save:
+
+```text
+step8_eomt_mask_baselines/eomt_temperature_results.csv
+```
+
+Recommended columns:
+
+```csv
+checkpoint,dataset,method,temperature,auprc,fpr95
+```
+
+### Task 10: Write the Step 8 README
+
+Create `step8_eomt_mask_baselines/README.md` with:
+
+- What Step 8 evaluates.
+- How to install/run.
+- Required checkpoints and datasets.
+- Commands for one dataset and all datasets.
+- Explanation of MSP, MaxLogit, Max Entropy, and RbA.
+- Output CSV locations.
+- Known limitations.
+
+## Suggested Command Layout
+
+One run:
+
+```bash
+python step8_eomt_mask_baselines/run_eomt_anomaly.py \
+  --config eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml \
+  --checkpoint eomt_checkpoints/eomt_coco.bin \
+  --checkpoint-name eomt_coco \
+  --dataset-root /path/to/Anomaly_Validation_Datasets \
+  --dataset RoadAnomaly21 \
+  --method rba \
+  --output-csv step8_eomt_mask_baselines/eomt_anomaly_results.csv
+```
+
+All runs should be wrapped in:
+
+```bash
+bash step8_eomt_mask_baselines/run_all_eomt_anomaly.sh
+```
+
+The shell script should loop over:
+
+- 3 checkpoints/configs.
+- 5 datasets.
+- 4 methods.
+
+## Final Report Table Shape
+
+The report should include a table like:
+
+```text
+Model / checkpoint | mIoU | Method | RA-21 AuPRC | RA-21 FPR95 | RO-21 AuPRC | RO-21 FPR95 | FS L&F AuPRC | FS L&F FPR95 | FS Static AuPRC | FS Static FPR95 | Road Anomaly AuPRC | Road Anomaly FPR95
+```
+
+Rows:
+
+- `EoMT COCO - MSP`
+- `EoMT COCO - MaxLogit`
+- `EoMT COCO - Max Entropy`
+- `EoMT COCO - RbA`
+- `EoMT Cityscapes - MSP`
+- `EoMT Cityscapes - MaxLogit`
+- `EoMT Cityscapes - Max Entropy`
+- `EoMT Cityscapes - RbA`
+- `EoMT Fine-tuned - MSP`
+- `EoMT Fine-tuned - MaxLogit`
+- `EoMT Fine-tuned - Max Entropy`
+- `EoMT Fine-tuned - RbA`
+
+Add the ERFNet Step 7 rows separately if the final project table combines Step 7 and Step 8.
+
+## Validation Checklist
+
+Before considering Step 8 complete:
+
+- [ ] `run_eomt_anomaly.py` runs on one image without crashing.
+- [ ] The runner processes one full dataset with `maxlogit`.
+- [ ] MSP, MaxLogit, and Max Entropy produce non-constant anomaly maps.
+- [ ] RbA uses query-level mask/class outputs and has been checked against the official RbA implementation.
+- [ ] Ignored pixels (`255`) are excluded from metrics.
+- [ ] Ground-truth masks are correctly found for all five datasets.
+- [ ] All 60 anomaly result rows are present.
+- [ ] mIoU is computed once per checkpoint.
+- [ ] Temperature scaling results are saved, if included.
+- [ ] CSVs are deterministic enough to rerun and compare.
+- [ ] Large saved logits/checkpoints/datasets are not committed to git.
+
+## Main Risks
+
+- Cityscapes config may be missing on the current branch. Add the needed config directly to this branch from official/project materials before Cityscapes and fine-tuned checkpoint evaluation.
+- COCO and Cityscapes class spaces differ. Use the prior mapping/evaluation strategy for COCO mIoU.
+- EoMT image preprocessing must match the notebook/config. Wrong resizing or normalization will make results meaningless.
+- RbA should not be approximated from only final pixel logits unless explicitly documented as an approximation. It is intended for mask-query outputs.
+- Full-size anomaly images can be memory heavy. Use batch size 1, sliding-window inference, and AMP on CUDA.
+- Temperature scaling should reuse saved logits to avoid repeating expensive model inference.
+
+## Recommended Implementation Order
+
+1. Inspect the Step 7 runner from `origin/feature/erfnet-baselines` as read-only reference material.
+2. Create `step8_eomt_mask_baselines/run_eomt_anomaly.py`.
+3. Get EoMT loading and one-image inference working.
+4. Add per-pixel logits conversion and `maxlogit`.
+5. Add MSP and Max Entropy.
+6. Add GT loading and metrics.
+7. Run one full dataset for one checkpoint.
+8. Implement and verify RbA.
+9. Add all-dataset/all-checkpoint runner.
+10. Add mIoU computation.
+11. Add temperature scaling with saved logits.
+12. Write `step8_eomt_mask_baselines/README.md`.
+13. Generate final CSVs and report-ready tables.
diff --git a/step8_eomt_mask_baselines/README.md b/step8_eomt_mask_baselines/README.md
new file mode 100644
index 0000000..1227808
--- /dev/null
+++ b/step8_eomt_mask_baselines/README.md
@@ -0,0 +1,138 @@
+# Step 8 - EoMT Mask-Based Anomaly Baselines
+
+Implementation of Step 8: evaluate EoMT on the same
+anomaly validation datasets used for the ERFNet pixel-based baselines.
+
+The runner computes pixel-level anomaly segmentation metrics for:
+
+- MSP
+- MaxLogit
+- Max Entropy
+- RbA-style mask-query rejection
+
+It is designed to evaluate all required EoMT checkpoints:
+
+- COCO-trained EoMT
+- Cityscapes-trained EoMT
+- Fine-tuned EoMT from Step 5
+
+## Required Local Inputs
+
+
+- `Anomaly_Validation_Datasets/`
+- EoMT checkpoint `.bin` files
+- The EoMT config matching each checkpoint
+
+
+The model construction and sliding-window inference path follows the Step 4
+notebooks on this branch, especially:
+
+```text
+step4_eomt_eval/test_eval_pipeline/predictions_eomt_city.ipynb
+step4_eomt_eval/test_eval_pipeline/predictions_eomt_coco.ipynb
+```
+
+The runner does not instantiate EoMT dataset modules for anomaly evaluation.
+This is intentional because Step 8 reads images and anomaly masks directly from
+`Validation_Dataset/`. By default the runner infers image size, number of
+classes, and query count from the checkpoint tensors. Pass `--img-size` and
+`--num-classes` explicitly only if checkpoint-based inference is not possible.
+
+
+## Reporting Notes
+
+The final anomaly benchmark is `eomt_anomaly_results.csv`.
+
+This CSV contains 60 anomaly-evaluation rows: 3 checkpoints x 5 datasets x 4 methods.
+
+
+The `miou` column is intentionally empty in the anomaly CSV. If the report table
+requires mIoU, fill it from the Step 4/5 semantic segmentation evaluation on
+Cityscapes, not from the anomaly segmentation run.
+
+The final anomaly CSV only uses `temperature=1.0`. Temperature scaling is
+implemented and smoke-tested separately, but it is not part of
+`eomt_anomaly_results.csv` unless the temperature sweep is run and reported as an
+extra baseline.
+
+Additional result files:
+
+`eomt_temperature_results.csv` contains the 60-row MSP temperature-scaling sweep:
+3 checkpoints x 5 datasets x 4 temperatures. `eomt_all_results.csv` keeps both
+the 60-row anomaly baseline table and the 60-row temperature-scaling table in one
+file, with a `result_group` column to distinguish the original anomaly baselines
+from the temperature-scaling rows.
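+
+A small sketch of separating the two groups when reading the combined file, using only the standard library. The `result_group` column name comes from the CSV header; `split_by_result_group` is a hypothetical helper and is not shipped with this step.
+
+```python
+import csv
+import io
+from collections import defaultdict
+
+
+def split_by_result_group(csv_text: str) -> dict[str, list[dict[str, str]]]:
+    """Group the rows of the combined results CSV by their `result_group` value."""
+    groups: dict[str, list[dict[str, str]]] = defaultdict(list)
+    for row in csv.DictReader(io.StringIO(csv_text)):
+        groups[row["result_group"]].append(row)
+    return dict(groups)
+```
+
+Passing the text of `eomt_all_results.csv` should yield two keys, `anomaly_baselines` and `temperature_scaling`, each holding 60 rows if the file is complete.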
+
+The dataset folder is named `RoadObsticle21` in the provided validation archive.
+The raw CSV keeps that exact folder name, but the report-ready table uses the
+clean display name `RoadObstacle21`.
+
+Some results can look surprisingly strong or weak across datasets. Treat these
+as values to double-check in the report discussion, especially:
+
+- `eomt_cityscapes` on `RoadObstacle21` (`RoadObsticle21` folder)
+- `eomt_finetuned` on `fs_static`
+- `eomt_finetuned` FPR95 on `RoadAnomaly`
+
+## mIoU CSV
+
+If checkpoint predictions are already exported as Cityscapes `labelTrainIds`
+PNG masks, compute mIoU with:
+
+```bash
+python step8_eomt_mask_baselines/compute_cityscapes_miou.py \
+  --checkpoint-name eomt_cityscapes \
+  --config eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml \
+  --gt-glob "/local/path/gtFine_trainIds/val/*/*_gtFine_labelTrainIds.png" \
+  --pred-dir /local/path/eomt_cityscapes_predictions/png \
+  --output-csv step8_eomt_mask_baselines/eomt_miou_results.csv
+```
+
+Pass the resulting mIoU value back to anomaly runs with `--miou`.
+
+
+Datasets:
+
+- `RoadAnomaly21`
+- `RoadObsticle21`
+- `RoadAnomaly`
+- `fs_static`
+- `FS_LostFound_full`
+
+Methods:
+
+- `msp`
+- `maxlogit`
+- `entropy`
+- `rba`
+
+## Temperature Scaling
+
+Temperature scaling is supported through `--temperature`. For the PDF table, use
+MSP with values such as:
+
+```text
+0.5, 0.75, 1.0, 1.1
+```
+
+By default, the temperature sweep evaluates MSP temperature scaling for all
+three checkpoints and all five anomaly datasets at `0.5, 0.75, 1.0, 1.1`,
+producing 60 rows (3 checkpoints x 5 datasets x 4 temperatures).
+
+
+The final temperature-scaling rows should be reported from
+`eomt_temperature_results.csv` or another full sweep output. The small
+`eomt_temperature_smoke_results.csv` file is only a sanity check.
+
+## Method Notes
+
+MSP, MaxLogit, and Max Entropy use semantic per-pixel logits produced by EoMT's
+existing `to_per_pixel_logits_semantic` helper.
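+
+For reference, the three pixel-based scores can be sketched from a `[C, H, W]` logit array as follows. This is an illustrative NumPy version, not the runner's code. The sign convention (higher means more anomalous, so MaxLogit is negated) and the name `pixel_anomaly_scores` are assumptions.
+
+```python
+import numpy as np
+
+
+def softmax(logits: np.ndarray, axis: int = 0) -> np.ndarray:
+    """Numerically stable softmax along `axis`."""
+    shifted = logits - logits.max(axis=axis, keepdims=True)
+    exp = np.exp(shifted)
+    return exp / exp.sum(axis=axis, keepdims=True)
+
+
+def pixel_anomaly_scores(pixel_logits: np.ndarray) -> dict[str, np.ndarray]:
+    """Map [C, H, W] logits to [H, W] anomaly maps; higher means more anomalous."""
+    probs = softmax(pixel_logits, axis=0)
+    return {
+        "msp": 1.0 - probs.max(axis=0),                           # low confidence -> high score
+        "maxlogit": -pixel_logits.max(axis=0),                    # small max logit -> high score
+        "entropy": -(probs * np.log(probs + 1e-12)).sum(axis=0),  # flat distribution -> high score
+    }
+```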
+
+RbA uses query-level mask and class predictions before the final semantic
+aggregation. The implementation follows the mask-architecture idea of rejecting
+pixels that are not confidently accepted by any known query/class. Before final
+reporting, compare this formula against the official RbA implementation and
+document any difference.
+
diff --git a/step8_eomt_mask_baselines/compute_cityscapes_miou.py b/step8_eomt_mask_baselines/compute_cityscapes_miou.py
new file mode 100755
index 0000000..e3daf5f
--- /dev/null
+++ b/step8_eomt_mask_baselines/compute_cityscapes_miou.py
@@ -0,0 +1,197 @@
+#!/usr/bin/env python3
+"""Compute Cityscapes mIoU from saved prediction masks.
+
+Use this helper when EoMT predictions have already been exported as PNG masks.
+It mirrors the Step 5 notebook metric logic and writes a small CSV that can be
+joined with the Step 8 anomaly results table.
+"""
+
+from __future__ import annotations
+
+import argparse
+import csv
+import glob
+from pathlib import Path
+
+
+CITYSCAPES_CLASSES = [
+    "road",
+    "sidewalk",
+    "building",
+    "wall",
+    "fence",
+    "pole",
+    "traffic light",
+    "traffic sign",
+    "vegetation",
+    "terrain",
+    "sky",
+    "person",
+    "rider",
+    "car",
+    "truck",
+    "bus",
+    "train",
+    "motorcycle",
+    "bicycle",
+]
+
+
+def load_mask(path: Path) -> np.ndarray:
+    import numpy as np
+    from PIL import Image
+
+    mask = np.array(Image.open(path), dtype=np.int64)
+    if mask.ndim == 3:
+        mask = mask[..., 0]
+    return mask
+
+
+def confusion_matrix(gt: np.ndarray, pred: np.ndarray, num_classes: int, ignore_index: int) -> np.ndarray:
+    import numpy as np
+
+    valid = (
+        (gt != ignore_index)
+        & (gt >= 0)
+        & (gt < num_classes)
+        & (pred >= 0)
+        & (pred < num_classes)
+    )
+    return np.bincount(
+        num_classes * gt[valid] + pred[valid],
+        minlength=num_classes * num_classes,
+    ).reshape(num_classes, num_classes)
+
+
+def prediction_path_for_gt(pred_dir: Path, gt_path: Path) -> Path:
+    gt_filename = gt_path.name
+    candidates = [
+        gt_filename.replace("_gtFine_labelTrainIds.png", "_predTrainIds.png"),
+        gt_filename.replace("_gtFine_labelTrainIds.png", ".png"),
+        gt_filename,
+    ]
+    for filename in candidates:
+        candidate = pred_dir / filename
+        if candidate.exists():
+            return candidate
+    return pred_dir / candidates[0]
+
+
+def compute_miou(
+    *,
+    gt_glob: str,
+    pred_dir: Path,
+    num_classes: int,
+    ignore_index: int,
+) -> tuple[float, float, np.ndarray, list[Path]]:
+    import numpy as np
+
+    gt_paths = sorted(Path(path) for path in glob.glob(gt_glob) if Path(path).is_file())
+    if not gt_paths:
+        raise FileNotFoundError(f"No ground-truth masks found for glob: {gt_glob}")
+
+    total_hist = np.zeros((num_classes, num_classes), dtype=np.float64)
+    missing_predictions: list[Path] = []
+
+    for gt_path in gt_paths:
+        pred_path = prediction_path_for_gt(pred_dir, gt_path)
+        if not pred_path.exists():
+            missing_predictions.append(pred_path)
+            continue
+
+        gt = load_mask(gt_path)
+        pred = load_mask(pred_path)
+        if gt.shape != pred.shape:
+            raise ValueError(f"Shape mismatch for {pred_path}: pred {pred.shape}, gt {gt.shape}")
+        total_hist += confusion_matrix(gt, pred, num_classes, ignore_index)
+
+    intersection = np.diag(total_hist)
+    union = total_hist.sum(axis=1) + total_hist.sum(axis=0) - intersection
+    class_iou = intersection / np.maximum(union, 1)
+    mean_iou = float(np.mean(class_iou) * 100.0)
+    pixel_accuracy = float(intersection.sum() / np.maximum(total_hist.sum(), 1) * 100.0)
+    return mean_iou, pixel_accuracy, class_iou * 100.0, missing_predictions
+
+
+def write_summary(
+    *,
+    output_csv: Path,
+    checkpoint: str,
+    config: str,
+    miou: float,
+    pixel_accuracy: float,
+    missing_predictions: int,
+) -> None:
+    output_csv.parent.mkdir(parents=True, exist_ok=True)
+    write_header = not output_csv.exists()
+    with output_csv.open("a", newline="", encoding="utf-8") as file:
+        writer = csv.DictWriter(
+            file,
+            fieldnames=[
+                "checkpoint",
+                "config",
+                "miou",
+                "pixel_accuracy",
+                "missing_predictions",
+            ],
+        )
+        if write_header:
+            writer.writeheader()
+        writer.writerow(
+            {
+                "checkpoint": checkpoint,
+                "config": config,
+                "miou": miou,
+                "pixel_accuracy": pixel_accuracy,
+                "missing_predictions": missing_predictions,
+            }
+        )
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--checkpoint-name", required=True)
+    parser.add_argument("--config", default="")
+    parser.add_argument("--gt-glob", required=True, help="Glob for Cityscapes val labelTrainIds masks.")
+    parser.add_argument("--pred-dir", required=True, type=Path, help="Directory containing prediction PNG masks.")
+    parser.add_argument("--num-classes", type=int, default=19)
+    parser.add_argument("--ignore-index", type=int, default=255)
+    parser.add_argument("--output-csv", type=Path, default=Path("step8_eomt_mask_baselines/eomt_miou_results.csv"))
+    parser.add_argument("--per-class-csv", type=Path)
+    return parser.parse_args()
+
+
+def main() -> None:
+    args = parse_args()
+    miou, pixel_accuracy, class_iou, missing = compute_miou(
+        gt_glob=args.gt_glob,
+        pred_dir=args.pred_dir,
+        num_classes=args.num_classes,
+        ignore_index=args.ignore_index,
+    )
+    write_summary(
+        output_csv=args.output_csv,
+        checkpoint=args.checkpoint_name,
+        config=args.config,
+        miou=miou,
+        pixel_accuracy=pixel_accuracy,
+        missing_predictions=len(missing),
+    )
+
+    if args.per_class_csv:
+        args.per_class_csv.parent.mkdir(parents=True, exist_ok=True)
+        with args.per_class_csv.open("w", newline="", encoding="utf-8") as file:
+            writer = csv.writer(file)
+            writer.writerow(["class_id", "class_name", "iou"])
+            for class_id, iou in enumerate(class_iou):
+                class_name = CITYSCAPES_CLASSES[class_id] if class_id < len(CITYSCAPES_CLASSES) else str(class_id)
+                writer.writerow([class_id, class_name, iou])
+
+    print(
+        f"{args.checkpoint_name}: mIoU={miou:.4f}, "
+        f"pixel_accuracy={pixel_accuracy:.4f}, missing_predictions={len(missing)}"
+    )
+
+
+if __name__ == "__main__":
+    main()
diff --git a/step8_eomt_mask_baselines/eomt_all_results.csv b/step8_eomt_mask_baselines/eomt_all_results.csv
new file mode 100644
index 0000000..460861c
--- /dev/null
+++ b/step8_eomt_mask_baselines/eomt_all_results.csv
@@ -0,0 +1,121 @@
+result_group,model,checkpoint,dataset,method,temperature,miou,auprc,fpr95,num_images,num_ood_pixels,num_ind_pixels
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly21,msp,1.0,,38.71805117931566,77.52158542523586,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly21,maxlogit,1.0,,39.02551286843461,77.03345064614419,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly21,entropy,1.0,,43.08147659955639,63.545987545575834,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly21,rba,1.0,,41.488960376064135,71.68304955245624,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_coco,RoadObsticle21,msp,1.0,,2.7516464707493493,99.97984944130613,30,129782,19106170
+anomaly_baselines,EoMT,eomt_coco,RoadObsticle21,maxlogit,1.0,,2.7464072718046686,99.98043563937723,30,129782,19106170
+anomaly_baselines,EoMT,eomt_coco,RoadObsticle21,entropy,1.0,,8.288580376786761,99.98311016807659,30,129782,19106170
+anomaly_baselines,EoMT,eomt_coco,RoadObsticle21,rba,1.0,,2.7704540952022634,99.94519571426403,30,129782,19106170
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly,msp,1.0,,18.84913965664081,91.84777799663615,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly,maxlogit,1.0,,19.007785043618085,91.57654204838246,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly,entropy,1.0,,21.517097341625607,87.88909926174514,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_coco,RoadAnomaly,rba,1.0,,21.304803231323415,82.39053414825683,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_coco,fs_static,msp,1.0,,4.659001916442816,99.48092117751997,20,732524,34420206
+anomaly_baselines,EoMT,eomt_coco,fs_static,maxlogit,1.0,,4.666524546335602,99.4622199530125,20,732524,34420206
+anomaly_baselines,EoMT,eomt_coco,fs_static,entropy,1.0,,4.567454599694478,99.32666294908287,20,732524,34420206
+anomaly_baselines,EoMT,eomt_coco,fs_static,rba,1.0,,4.077273696354527,93.92522810583992,20,732524,34420206
+anomaly_baselines,EoMT,eomt_coco,FS_LostFound_full,msp,1.0,,3.7893197100075646,97.20518225498047,99,477664,168408155
+anomaly_baselines,EoMT,eomt_coco,FS_LostFound_full,maxlogit,1.0,,3.7790941803294524,97.2029567095489,99,477664,168408155
+anomaly_baselines,EoMT,eomt_coco,FS_LostFound_full,entropy,1.0,,3.7899036765901006,96.9060571918266,99,477664,168408155
+anomaly_baselines,EoMT,eomt_coco,FS_LostFound_full,rba,1.0,,2.823879131172171,95.17003852930995,99,477664,168408155
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly21,msp,1.0,,68.09563873063868,30.422128178431596,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly21,maxlogit,1.0,,67.49119536018513,31.590571692115233,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly21,entropy,1.0,,68.37170536621619,30.61906590499114,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly21,rba,1.0,,67.87210532148492,31.591587787327313,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_cityscapes,RoadObsticle21,msp,1.0,,94.18765038906982,0.36658838479925593,30,129782,19106170
+anomaly_baselines,EoMT,eomt_cityscapes,RoadObsticle21,maxlogit,1.0,,94.22573713395946,0.3625164017696901,30,129782,19106170
+anomaly_baselines,EoMT,eomt_cityscapes,RoadObsticle21,entropy,1.0,,94.28295549674934,0.34885589314865306,30,129782,19106170
+anomaly_baselines,EoMT,eomt_cityscapes,RoadObsticle21,rba,1.0,,94.17803789166507,0.338356666982446,30,129782,19106170
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly,msp,1.0,,71.35136317226491,15.447183143162249,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly,maxlogit,1.0,,70.76849199382829,15.062925195348159,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly,entropy,1.0,,74.19086462787737,14.68882380108786,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_cityscapes,RoadAnomaly,rba,1.0,,70.40796331367206,14.175485546149059,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_cityscapes,fs_static,msp,1.0,,58.1919895951857,43.46787756005877,20,732524,34420206
+anomaly_baselines,EoMT,eomt_cityscapes,fs_static,maxlogit,1.0,,58.37164607207064,46.91512014774112,20,732524,34420206
+anomaly_baselines,EoMT,eomt_cityscapes,fs_static,entropy,1.0,,56.9443329197621,43.803741325662024,20,732524,34420206
+anomaly_baselines,EoMT,eomt_cityscapes,fs_static,rba,1.0,,59.56913284052917,47.054889793512565,20,732524,34420206
+anomaly_baselines,EoMT,eomt_cityscapes,FS_LostFound_full,msp,1.0,,16.393849875477365,13.012709509227745,99,477664,168408155
+anomaly_baselines,EoMT,eomt_cityscapes,FS_LostFound_full,maxlogit,1.0,,16.377438397172863,12.776338533012252,99,477664,168408155
+anomaly_baselines,EoMT,eomt_cityscapes,FS_LostFound_full,entropy,1.0,,18.962785205018154,12.8271359543129,99,477664,168408155
+anomaly_baselines,EoMT,eomt_cityscapes,FS_LostFound_full,rba,1.0,,16.364439876551273,12.700015031932391,99,477664,168408155
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly21,msp,1.0,,70.76821826915815,13.600856687028159,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly21,maxlogit,1.0,,69.57143749671721,16.129666945103146,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly21,entropy,1.0,,68.83429912585008,14.98138698316053,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly21,rba,1.0,,58.43124564654474,20.511043107509472,10,1317063,7578030
+anomaly_baselines,EoMT,eomt_finetuned,RoadObsticle21,msp,1.0,,25.43563653062372,2.7168815100043595,30,129782,19106170
+anomaly_baselines,EoMT,eomt_finetuned,RoadObsticle21,maxlogit,1.0,,24.453605568870344,2.786775162159658,30,129782,19106170
+anomaly_baselines,EoMT,eomt_finetuned,RoadObsticle21,entropy,1.0,,24.158271028931864,2.783870341360932,30,129782,19106170
+anomaly_baselines,EoMT,eomt_finetuned,RoadObsticle21,rba,1.0,,18.318807018728453,3.3209952596464913,30,129782,19106170
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly,msp,1.0,,52.176485965836974,93.71364099766765,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly,maxlogit,1.0,,51.55369494533466,93.24204631825737,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly,entropy,1.0,,51.52712560576027,92.8493650308477,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_finetuned,RoadAnomaly,rba,1.0,,47.40939161259241,79.58764633295759,60,5446413,49849587
+anomaly_baselines,EoMT,eomt_finetuned,fs_static,msp,1.0,,70.01144998195834,13.678174965019094,20,732524,34420206
+anomaly_baselines,EoMT,eomt_finetuned,fs_static,maxlogit,1.0,,71.22701538617534,13.473885077852236,20,732524,34420206
+anomaly_baselines,EoMT,eomt_finetuned,fs_static,entropy,1.0,,71.60368061303286,12.617245811951271,20,732524,34420206
+anomaly_baselines,EoMT,eomt_finetuned,fs_static,rba,1.0,,71.04355692620362,12.46134320056074,20,732524,34420206
+anomaly_baselines,EoMT,eomt_finetuned,FS_LostFound_full,msp,1.0,,34.33762473053539,15.927807652782612,99,477664,168408155
+anomaly_baselines,EoMT,eomt_finetuned,FS_LostFound_full,maxlogit,1.0,,33.917335559180536,17.21182860770608,99,477664,168408155
+anomaly_baselines,EoMT,eomt_finetuned,FS_LostFound_full,entropy,1.0,,34.87416868566116,23.823490614216396,99,477664,168408155
+anomaly_baselines,EoMT,eomt_finetuned,FS_LostFound_full,rba,1.0,,24.802463227076196,19.11074852640004,99,477664,168408155
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly21,msp_t0.5,0.5,,38.457809246908035,78.23220546764792,10,1317063,7578030
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly21,msp_t0.75,0.75,,38.65369360856192,77.6521470619673,10,1317063,7578030
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly21,msp,1.0,,38.71805117931566,77.52158542523586,10,1317063,7578030
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly21,msp_t1.1,1.1,,38.733597305619924,77.49498220513775,10,1317063,7578030
+temperature_scaling,EoMT,eomt_coco,RoadObsticle21,msp_t0.5,0.5,,2.7397144031341454,99.97952493880248,30,129782,19106170
+temperature_scaling,EoMT,eomt_coco,RoadObsticle21,msp_t0.75,0.75,,2.748358824815808,99.97977616654725,30,129782,19106170
+temperature_scaling,EoMT,eomt_coco,RoadObsticle21,msp,1.0,,2.7516464707493493,99.97984944130613,30,129782,19106170
+temperature_scaling,EoMT,eomt_coco,RoadObsticle21,msp_t1.1,1.1,,2.7524835397365157,99.97987561086288,30,129782,19106170
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly,msp_t0.5,0.5,,18.78751202412698,92.01634509028129,60,5446413,49849587
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly,msp_t0.75,0.75,,18.830765733849248,91.89128688267768,60,5446413,49849587
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly,msp,1.0,,18.84913965664081,91.84777799663615,60,5446413,49849587
+temperature_scaling,EoMT,eomt_coco,RoadAnomaly,msp_t1.1,1.1,,18.853153731367705,91.83703768699228,60,5446413,49849587
+temperature_scaling,EoMT,eomt_coco,fs_static,msp_t0.5,0.5,,4.650297868407619,99.50114476363099,20,732524,34420206
+temperature_scaling,EoMT,eomt_coco,fs_static,msp_t0.75,0.75,,4.656531641325685,99.48592114759569,20,732524,34420206
+temperature_scaling,EoMT,eomt_coco,fs_static,msp,1.0,,4.659001916442816,99.48092117751997,20,732524,34420206
+temperature_scaling,EoMT,eomt_coco,fs_static,msp_t1.1,1.1,,4.6595207190819545,99.47975325888521,20,732524,34420206
+temperature_scaling,EoMT,eomt_coco,FS_LostFound_full,msp_t0.5,0.5,,3.780995858312887,97.21943275252912,99,477664,168408155
+temperature_scaling,EoMT,eomt_coco,FS_LostFound_full,msp_t0.75,0.75,,3.786944791596082,97.20641616197267,99,477664,168408155
+temperature_scaling,EoMT,eomt_coco,FS_LostFound_full,msp,1.0,,3.7893197100075646,97.20518225498047,99,477664,168408155
+temperature_scaling,EoMT,eomt_coco,FS_LostFound_full,msp_t1.1,1.1,,3.7899345486801717,97.2058484935008,99,477664,168408155
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly21,msp_t0.5,0.5,,67.94287082795915,30.416005215075685,10,1317063,7578030
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly21,msp_t0.75,0.75,,68.03052477901113,30.419475774046816,10,1317063,7578030
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly21,msp,1.0,,68.09563873063868,30.422128178431596,10,1317063,7578030
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly21,msp_t1.1,1.1,,68.11713492821228,30.42274839239222,10,1317063,7578030
+temperature_scaling,EoMT,eomt_cityscapes,RoadObsticle21,msp_t0.5,0.5,,94.1739371428372,0.3687133528069728,30,129782,19106170
+temperature_scaling,EoMT,eomt_cityscapes,RoadObsticle21,msp_t0.75,0.75,,94.18363437956744,0.3673839393243125,30,129782,19106170
+temperature_scaling,EoMT,eomt_cityscapes,RoadObsticle21,msp,1.0,,94.18765038906982,0.36658838479925593,30,129782,19106170
+temperature_scaling,EoMT,eomt_cityscapes,RoadObsticle21,msp_t1.1,1.1,,94.18862371408119,0.36643136745878424,30,129782,19106170
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly,msp_t0.5,0.5,,71.15141459722764,15.652207910970256,60,5446413,49849587
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly,msp_t0.75,0.75,,71.26776521365635,15.495346029647147,60,5446413,49849587
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly,msp,1.0,,71.35136317226491,15.447183143162249,60,5446413,49849587
+temperature_scaling,EoMT,eomt_cityscapes,RoadAnomaly,msp_t1.1,1.1,,71.38002483822842,15.436242230050972,60,5446413,49849587
+temperature_scaling,EoMT,eomt_cityscapes,fs_static,msp_t0.5,0.5,,58.16846372755191,43.45717744978052,20,732524,34420206
+temperature_scaling,EoMT,eomt_cityscapes,fs_static,msp_t0.75,0.75,,58.18494539856691,43.46418786685937,20,732524,34420206
+temperature_scaling,EoMT,eomt_cityscapes,fs_static,msp,1.0,,58.1919895951857,43.46787756005877,20,732524,34420206
+temperature_scaling,EoMT,eomt_cityscapes,fs_static,msp_t1.1,1.1,,58.19377861751567,43.48152651962629,20,732524,34420206
+temperature_scaling,EoMT,eomt_cityscapes,FS_LostFound_full,msp_t0.5,0.5,,16.381070492665785,13.019064308376278,99,477664,168408155
+temperature_scaling,EoMT,eomt_cityscapes,FS_LostFound_full,msp_t0.75,0.75,,16.388361659181815,13.014767010540554,99,477664,168408155
+temperature_scaling,EoMT,eomt_cityscapes,FS_LostFound_full,msp,1.0,,16.393849875477365,13.012709509227745,99,477664,168408155
+temperature_scaling,EoMT,eomt_cityscapes,FS_LostFound_full,msp_t1.1,1.1,,16.39583058005375,13.011993391887703,99,477664,168408155
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly21,msp_t0.5,0.5,,70.75451481813693,13.680230877945851,10,1317063,7578030
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly21,msp_t0.75,0.75,,70.76617364311109,13.619661046472501,10,1317063,7578030
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly21,msp,1.0,,70.76821826915815,13.600856687028159,10,1317063,7578030
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly21,msp_t1.1,1.1,,70.76845109363153,13.596290856594656,10,1317063,7578030
+temperature_scaling,EoMT,eomt_finetuned,RoadObsticle21,msp_t0.5,0.5,,25.427664167015863,2.715876599025341,30,129782,19106170
+temperature_scaling,EoMT,eomt_finetuned,RoadObsticle21,msp_t0.75,0.75,,25.43366349362101,2.719676418664756,30,129782,19106170
+temperature_scaling,EoMT,eomt_finetuned,RoadObsticle21,msp,1.0,,25.43563653062372,2.7168815100043595,30,129782,19106170
+temperature_scaling,EoMT,eomt_finetuned,RoadObsticle21,msp_t1.1,1.1,,25.435995267946687,2.7170385273448314,30,129782,19106170
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly,msp_t0.5,0.5,,52.14368550580401,93.86699231831147,60,5446413,49849587
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly,msp_t0.75,0.75,,52.16704003241111,93.75005855113704,60,5446413,49849587
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly,msp,1.0,,52.176485965836974,93.71364099766765,60,5446413,49849587
+temperature_scaling,EoMT,eomt_finetuned,RoadAnomaly,msp_t1.1,1.1,,52.1788577649354,93.70375525879481,60,5446413,49849587
+temperature_scaling,EoMT,eomt_finetuned,fs_static,msp_t0.5,0.5,,69.93456487361965,13.873667693912118,20,732524,34420206
+temperature_scaling,EoMT,eomt_finetuned,fs_static,msp_t0.75,0.75,,69.98777248602161,13.736355906760117,20,732524,34420206
+temperature_scaling,EoMT,eomt_finetuned,fs_static,msp,1.0,,70.01144998195834,13.678174965019094,20,732524,34420206
+temperature_scaling,EoMT,eomt_finetuned,fs_static,msp_t1.1,1.1,,70.01755818701479,13.66349172924764,20,732524,34420206
+temperature_scaling,EoMT,eomt_finetuned,FS_LostFound_full,msp_t0.5,0.5,,34.23933214525758,15.688113797102046,99,477664,168408155
+temperature_scaling,EoMT,eomt_finetuned,FS_LostFound_full,msp_t0.75,0.75,,34.30795483856447,15.71877383253798,99,477664,168408155
+temperature_scaling,EoMT,eomt_finetuned,FS_LostFound_full,msp,1.0,,34.33762473053539,15.927807652782612,99,477664,168408155
+temperature_scaling,EoMT,eomt_finetuned,FS_LostFound_full,msp_t1.1,1.1,,34.345245238800985,15.984221191663789,99,477664,168408155
diff --git a/step8_eomt_mask_baselines/eomt_anomaly_report_table.csv b/step8_eomt_mask_baselines/eomt_anomaly_report_table.csv
new file mode 100644
index 0000000..f95a389
--- /dev/null
+++ b/step8_eomt_mask_baselines/eomt_anomaly_report_table.csv
@@ -0,0 +1,13 @@
+checkpoint,method,miou,RoadAnomaly21_auprc,RoadAnomaly21_fpr95,RoadAnomaly21_num_images,RoadObstacle21_auprc,RoadObstacle21_fpr95,RoadObstacle21_num_images,FS_LostFound_full_auprc,FS_LostFound_full_fpr95,FS_LostFound_full_num_images,fs_static_auprc,fs_static_fpr95,fs_static_num_images,RoadAnomaly_auprc,RoadAnomaly_fpr95,RoadAnomaly_num_images
+eomt_coco,msp,,38.71805117931566,77.52158542523586,10,2.7516464707493493,99.97984944130613,30,3.7893197100075646,97.20518225498047,99,4.659001916442816,99.48092117751997,20,18.84913965664081,91.84777799663615,60
+eomt_coco,maxlogit,,39.02551286843461,77.03345064614419,10,2.7464072718046686,99.98043563937723,30,3.7790941803294524,97.2029567095489,99,4.666524546335602,99.4622199530125,20,19.007785043618085,91.57654204838246,60
+eomt_coco,entropy,,43.08147659955639,63.545987545575834,10,8.288580376786761,99.98311016807659,30,3.7899036765901006,96.9060571918266,99,4.567454599694478,99.32666294908287,20,21.517097341625607,87.88909926174514,60
+eomt_coco,rba,,41.488960376064135,71.68304955245624,10,2.7704540952022634,99.94519571426403,30,2.823879131172171,95.17003852930995,99,4.077273696354527,93.92522810583992,20,21.304803231323415,82.39053414825683,60
+eomt_cityscapes,msp,,68.09563873063868,30.422128178431596,10,94.18765038906982,0.36658838479925593,30,16.393849875477365,13.012709509227745,99,58.1919895951857,43.46787756005877,20,71.35136317226491,15.447183143162249,60
+eomt_cityscapes,maxlogit,,67.49119536018513,31.590571692115233,10,94.22573713395946,0.3625164017696901,30,16.377438397172863,12.776338533012252,99,58.37164607207064,46.91512014774112,20,70.76849199382829,15.062925195348159,60
+eomt_cityscapes,entropy,,68.37170536621619,30.61906590499114,10,94.28295549674934,0.34885589314865306,30,18.962785205018154,12.8271359543129,99,56.9443329197621,43.803741325662024,20,74.19086462787737,14.68882380108786,60
+eomt_cityscapes,rba,,67.87210532148492,31.591587787327313,10,94.17803789166507,0.338356666982446,30,16.364439876551273,12.700015031932391,99,59.56913284052917,47.054889793512565,20,70.40796331367206,14.175485546149059,60
+eomt_finetuned,msp,,70.76821826915815,13.600856687028159,10,25.43563653062372,2.7168815100043595,30,34.33762473053539,15.927807652782612,99,70.01144998195834,13.678174965019094,20,52.176485965836974,93.71364099766765,60
+eomt_finetuned,maxlogit,,69.57143749671721,16.129666945103146,10,24.453605568870344,2.786775162159658,30,33.917335559180536,17.21182860770608,99,71.22701538617534,13.473885077852236,20,51.55369494533466,93.24204631825737,60
+eomt_finetuned,entropy,,68.83429912585008,14.98138698316053,10,24.158271028931864,2.783870341360932,30,34.87416868566116,23.823490614216396,99,71.60368061303286,12.617245811951271,20,51.52712560576027,92.8493650308477,60
+eomt_finetuned,rba,,58.43124564654474,20.511043107509472,10,18.318807018728453,3.3209952596464913,30,24.802463227076196,19.11074852640004,99,71.04355692620362,12.46134320056074,20,47.40939161259241,79.58764633295759,60
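Reviewer note: the wide report table above appears to be a pivot of the long-format rows in `step8_eomt_mask_baselines/eomt_anomaly_results.csv`, with one row per (checkpoint, method) pair and one `<dataset>_<metric>` column group per dataset. The script that produced it is not shown in this part of the diff, so the following is only a minimal sketch of such a pivot using the standard library. `pivot_results` and `DATASET_RENAMES` are hypothetical names, and the `RoadObsticle21` to `RoadObstacle21` rename is an assumption inferred from the differing spellings in the two files.

```python
import csv
import io

# Assumption: the long-format CSV spells the dataset "RoadObsticle21" (matching
# the dataset directory), while the report table header uses "RoadObstacle21".
DATASET_RENAMES = {"RoadObsticle21": "RoadObstacle21"}


def pivot_results(rows):
    """Group long-format rows into one wide dict per (checkpoint, method)."""
    table = {}
    for row in rows:
        key = (row["checkpoint"], row["method"])
        dataset = DATASET_RENAMES.get(row["dataset"], row["dataset"])
        wide = table.setdefault(
            key,
            {"checkpoint": row["checkpoint"], "method": row["method"], "miou": row["miou"]},
        )
        # Copy the per-dataset metrics into prefixed columns.
        for metric in ("auprc", "fpr95", "num_images"):
            wide[f"{dataset}_{metric}"] = row[metric]
    return list(table.values())


# Two rows copied from eomt_anomaly_results.csv.
LONG_CSV = """model,checkpoint,dataset,method,temperature,miou,auprc,fpr95,num_images,num_ood_pixels,num_ind_pixels
EoMT,eomt_cityscapes,RoadAnomaly21,entropy,1.0,,68.37170536621619,30.61906590499114,10,1317063,7578030
EoMT,eomt_cityscapes,RoadObsticle21,entropy,1.0,,94.28295549674934,0.34885589314865306,30,129782,19106170
"""

wide_rows = pivot_results(csv.DictReader(io.StringIO(LONG_CSV)))
print(sorted(wide_rows[0]))
```

Values stay as strings here so that the pivot does not alter the precision written by the evaluation script.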
diff --git a/step8_eomt_mask_baselines/eomt_anomaly_results.csv b/step8_eomt_mask_baselines/eomt_anomaly_results.csv
new file mode 100644
index 0000000..870247d
--- /dev/null
+++ b/step8_eomt_mask_baselines/eomt_anomaly_results.csv
@@ -0,0 +1,61 @@
+model,checkpoint,dataset,method,temperature,miou,auprc,fpr95,num_images,num_ood_pixels,num_ind_pixels
+EoMT,eomt_coco,RoadAnomaly21,msp,1.0,,38.71805117931566,77.52158542523586,10,1317063,7578030
+EoMT,eomt_coco,RoadAnomaly21,maxlogit,1.0,,39.02551286843461,77.03345064614419,10,1317063,7578030
+EoMT,eomt_coco,RoadAnomaly21,entropy,1.0,,43.08147659955639,63.545987545575834,10,1317063,7578030
+EoMT,eomt_coco,RoadAnomaly21,rba,1.0,,41.488960376064135,71.68304955245624,10,1317063,7578030
+EoMT,eomt_coco,RoadObsticle21,msp,1.0,,2.7516464707493493,99.97984944130613,30,129782,19106170
+EoMT,eomt_coco,RoadObsticle21,maxlogit,1.0,,2.7464072718046686,99.98043563937723,30,129782,19106170
+EoMT,eomt_coco,RoadObsticle21,entropy,1.0,,8.288580376786761,99.98311016807659,30,129782,19106170
+EoMT,eomt_coco,RoadObsticle21,rba,1.0,,2.7704540952022634,99.94519571426403,30,129782,19106170
+EoMT,eomt_coco,RoadAnomaly,msp,1.0,,18.84913965664081,91.84777799663615,60,5446413,49849587
+EoMT,eomt_coco,RoadAnomaly,maxlogit,1.0,,19.007785043618085,91.57654204838246,60,5446413,49849587
+EoMT,eomt_coco,RoadAnomaly,entropy,1.0,,21.517097341625607,87.88909926174514,60,5446413,49849587
+EoMT,eomt_coco,RoadAnomaly,rba,1.0,,21.304803231323415,82.39053414825683,60,5446413,49849587
+EoMT,eomt_coco,fs_static,msp,1.0,,4.659001916442816,99.48092117751997,20,732524,34420206
+EoMT,eomt_coco,fs_static,maxlogit,1.0,,4.666524546335602,99.4622199530125,20,732524,34420206
+EoMT,eomt_coco,fs_static,entropy,1.0,,4.567454599694478,99.32666294908287,20,732524,34420206
+EoMT,eomt_coco,fs_static,rba,1.0,,4.077273696354527,93.92522810583992,20,732524,34420206
+EoMT,eomt_coco,FS_LostFound_full,msp,1.0,,3.7893197100075646,97.20518225498047,99,477664,168408155
+EoMT,eomt_coco,FS_LostFound_full,maxlogit,1.0,,3.7790941803294524,97.2029567095489,99,477664,168408155
+EoMT,eomt_coco,FS_LostFound_full,entropy,1.0,,3.7899036765901006,96.9060571918266,99,477664,168408155
+EoMT,eomt_coco,FS_LostFound_full,rba,1.0,,2.823879131172171,95.17003852930995,99,477664,168408155
+EoMT,eomt_cityscapes,RoadAnomaly21,msp,1.0,,68.09563873063868,30.422128178431596,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadAnomaly21,maxlogit,1.0,,67.49119536018513,31.590571692115233,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadAnomaly21,entropy,1.0,,68.37170536621619,30.61906590499114,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadAnomaly21,rba,1.0,,67.87210532148492,31.591587787327313,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadObsticle21,msp,1.0,,94.18765038906982,0.36658838479925593,30,129782,19106170
+EoMT,eomt_cityscapes,RoadObsticle21,maxlogit,1.0,,94.22573713395946,0.3625164017696901,30,129782,19106170
+EoMT,eomt_cityscapes,RoadObsticle21,entropy,1.0,,94.28295549674934,0.34885589314865306,30,129782,19106170
+EoMT,eomt_cityscapes,RoadObsticle21,rba,1.0,,94.17803789166507,0.338356666982446,30,129782,19106170
+EoMT,eomt_cityscapes,RoadAnomaly,msp,1.0,,71.35136317226491,15.447183143162249,60,5446413,49849587
+EoMT,eomt_cityscapes,RoadAnomaly,maxlogit,1.0,,70.76849199382829,15.062925195348159,60,5446413,49849587
+EoMT,eomt_cityscapes,RoadAnomaly,entropy,1.0,,74.19086462787737,14.68882380108786,60,5446413,49849587
+EoMT,eomt_cityscapes,RoadAnomaly,rba,1.0,,70.40796331367206,14.175485546149059,60,5446413,49849587
+EoMT,eomt_cityscapes,fs_static,msp,1.0,,58.1919895951857,43.46787756005877,20,732524,34420206
+EoMT,eomt_cityscapes,fs_static,maxlogit,1.0,,58.37164607207064,46.91512014774112,20,732524,34420206
+EoMT,eomt_cityscapes,fs_static,entropy,1.0,,56.9443329197621,43.803741325662024,20,732524,34420206
+EoMT,eomt_cityscapes,fs_static,rba,1.0,,59.56913284052917,47.054889793512565,20,732524,34420206
+EoMT,eomt_cityscapes,FS_LostFound_full,msp,1.0,,16.393849875477365,13.012709509227745,99,477664,168408155
+EoMT,eomt_cityscapes,FS_LostFound_full,maxlogit,1.0,,16.377438397172863,12.776338533012252,99,477664,168408155
+EoMT,eomt_cityscapes,FS_LostFound_full,entropy,1.0,,18.962785205018154,12.8271359543129,99,477664,168408155
+EoMT,eomt_cityscapes,FS_LostFound_full,rba,1.0,,16.364439876551273,12.700015031932391,99,477664,168408155
+EoMT,eomt_finetuned,RoadAnomaly21,msp,1.0,,70.76821826915815,13.600856687028159,10,1317063,7578030
+EoMT,eomt_finetuned,RoadAnomaly21,maxlogit,1.0,,69.57143749671721,16.129666945103146,10,1317063,7578030
+EoMT,eomt_finetuned,RoadAnomaly21,entropy,1.0,,68.83429912585008,14.98138698316053,10,1317063,7578030
+EoMT,eomt_finetuned,RoadAnomaly21,rba,1.0,,58.43124564654474,20.511043107509472,10,1317063,7578030
+EoMT,eomt_finetuned,RoadObsticle21,msp,1.0,,25.43563653062372,2.7168815100043595,30,129782,19106170
+EoMT,eomt_finetuned,RoadObsticle21,maxlogit,1.0,,24.453605568870344,2.786775162159658,30,129782,19106170
+EoMT,eomt_finetuned,RoadObsticle21,entropy,1.0,,24.158271028931864,2.783870341360932,30,129782,19106170
+EoMT,eomt_finetuned,RoadObsticle21,rba,1.0,,18.318807018728453,3.3209952596464913,30,129782,19106170
+EoMT,eomt_finetuned,RoadAnomaly,msp,1.0,,52.176485965836974,93.71364099766765,60,5446413,49849587
+EoMT,eomt_finetuned,RoadAnomaly,maxlogit,1.0,,51.55369494533466,93.24204631825737,60,5446413,49849587
+EoMT,eomt_finetuned,RoadAnomaly,entropy,1.0,,51.52712560576027,92.8493650308477,60,5446413,49849587
+EoMT,eomt_finetuned,RoadAnomaly,rba,1.0,,47.40939161259241,79.58764633295759,60,5446413,49849587
+EoMT,eomt_finetuned,fs_static,msp,1.0,,70.01144998195834,13.678174965019094,20,732524,34420206
+EoMT,eomt_finetuned,fs_static,maxlogit,1.0,,71.22701538617534,13.473885077852236,20,732524,34420206
+EoMT,eomt_finetuned,fs_static,entropy,1.0,,71.60368061303286,12.617245811951271,20,732524,34420206
+EoMT,eomt_finetuned,fs_static,rba,1.0,,71.04355692620362,12.46134320056074,20,732524,34420206
+EoMT,eomt_finetuned,FS_LostFound_full,msp,1.0,,34.33762473053539,15.927807652782612,99,477664,168408155
+EoMT,eomt_finetuned,FS_LostFound_full,maxlogit,1.0,,33.917335559180536,17.21182860770608,99,477664,168408155
+EoMT,eomt_finetuned,FS_LostFound_full,entropy,1.0,,34.87416868566116,23.823490614216396,99,477664,168408155
+EoMT,eomt_finetuned,FS_LostFound_full,rba,1.0,,24.802463227076196,19.11074852640004,99,477664,168408155
diff --git a/step8_eomt_mask_baselines/eomt_temperature_results.csv b/step8_eomt_mask_baselines/eomt_temperature_results.csv
new file mode 100644
index 0000000..bc09fb3
--- /dev/null
+++ b/step8_eomt_mask_baselines/eomt_temperature_results.csv
@@ -0,0 +1,61 @@
+model,checkpoint,dataset,method,temperature,miou,auprc,fpr95,num_images,num_ood_pixels,num_ind_pixels
+EoMT,eomt_coco,RoadAnomaly21,msp_t0.5,0.5,,38.457809246908035,78.23220546764792,10,1317063,7578030
+EoMT,eomt_coco,RoadAnomaly21,msp_t0.75,0.75,,38.65369360856192,77.6521470619673,10,1317063,7578030
+EoMT,eomt_coco,RoadAnomaly21,msp,1.0,,38.71805117931566,77.52158542523586,10,1317063,7578030
+EoMT,eomt_coco,RoadAnomaly21,msp_t1.1,1.1,,38.733597305619924,77.49498220513775,10,1317063,7578030
+EoMT,eomt_coco,RoadObsticle21,msp_t0.5,0.5,,2.7397144031341454,99.97952493880248,30,129782,19106170
+EoMT,eomt_coco,RoadObsticle21,msp_t0.75,0.75,,2.748358824815808,99.97977616654725,30,129782,19106170
+EoMT,eomt_coco,RoadObsticle21,msp,1.0,,2.7516464707493493,99.97984944130613,30,129782,19106170
+EoMT,eomt_coco,RoadObsticle21,msp_t1.1,1.1,,2.7524835397365157,99.97987561086288,30,129782,19106170
+EoMT,eomt_coco,RoadAnomaly,msp_t0.5,0.5,,18.78751202412698,92.01634509028129,60,5446413,49849587
+EoMT,eomt_coco,RoadAnomaly,msp_t0.75,0.75,,18.830765733849248,91.89128688267768,60,5446413,49849587
+EoMT,eomt_coco,RoadAnomaly,msp,1.0,,18.84913965664081,91.84777799663615,60,5446413,49849587
+EoMT,eomt_coco,RoadAnomaly,msp_t1.1,1.1,,18.853153731367705,91.83703768699228,60,5446413,49849587
+EoMT,eomt_coco,fs_static,msp_t0.5,0.5,,4.650297868407619,99.50114476363099,20,732524,34420206
+EoMT,eomt_coco,fs_static,msp_t0.75,0.75,,4.656531641325685,99.48592114759569,20,732524,34420206
+EoMT,eomt_coco,fs_static,msp,1.0,,4.659001916442816,99.48092117751997,20,732524,34420206
+EoMT,eomt_coco,fs_static,msp_t1.1,1.1,,4.6595207190819545,99.47975325888521,20,732524,34420206
+EoMT,eomt_coco,FS_LostFound_full,msp_t0.5,0.5,,3.780995858312887,97.21943275252912,99,477664,168408155
+EoMT,eomt_coco,FS_LostFound_full,msp_t0.75,0.75,,3.786944791596082,97.20641616197267,99,477664,168408155
+EoMT,eomt_coco,FS_LostFound_full,msp,1.0,,3.7893197100075646,97.20518225498047,99,477664,168408155
+EoMT,eomt_coco,FS_LostFound_full,msp_t1.1,1.1,,3.7899345486801717,97.2058484935008,99,477664,168408155
+EoMT,eomt_cityscapes,RoadAnomaly21,msp_t0.5,0.5,,67.94287082795915,30.416005215075685,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadAnomaly21,msp_t0.75,0.75,,68.03052477901113,30.419475774046816,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadAnomaly21,msp,1.0,,68.09563873063868,30.422128178431596,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadAnomaly21,msp_t1.1,1.1,,68.11713492821228,30.42274839239222,10,1317063,7578030
+EoMT,eomt_cityscapes,RoadObsticle21,msp_t0.5,0.5,,94.1739371428372,0.3687133528069728,30,129782,19106170
+EoMT,eomt_cityscapes,RoadObsticle21,msp_t0.75,0.75,,94.18363437956744,0.3673839393243125,30,129782,19106170
+EoMT,eomt_cityscapes,RoadObsticle21,msp,1.0,,94.18765038906982,0.36658838479925593,30,129782,19106170
+EoMT,eomt_cityscapes,RoadObsticle21,msp_t1.1,1.1,,94.18862371408119,0.36643136745878424,30,129782,19106170
+EoMT,eomt_cityscapes,RoadAnomaly,msp_t0.5,0.5,,71.15141459722764,15.652207910970256,60,5446413,49849587
+EoMT,eomt_cityscapes,RoadAnomaly,msp_t0.75,0.75,,71.26776521365635,15.495346029647147,60,5446413,49849587
+EoMT,eomt_cityscapes,RoadAnomaly,msp,1.0,,71.35136317226491,15.447183143162249,60,5446413,49849587
+EoMT,eomt_cityscapes,RoadAnomaly,msp_t1.1,1.1,,71.38002483822842,15.436242230050972,60,5446413,49849587
+EoMT,eomt_cityscapes,fs_static,msp_t0.5,0.5,,58.16846372755191,43.45717744978052,20,732524,34420206
+EoMT,eomt_cityscapes,fs_static,msp_t0.75,0.75,,58.18494539856691,43.46418786685937,20,732524,34420206
+EoMT,eomt_cityscapes,fs_static,msp,1.0,,58.1919895951857,43.46787756005877,20,732524,34420206
+EoMT,eomt_cityscapes,fs_static,msp_t1.1,1.1,,58.19377861751567,43.48152651962629,20,732524,34420206
+EoMT,eomt_cityscapes,FS_LostFound_full,msp_t0.5,0.5,,16.381070492665785,13.019064308376278,99,477664,168408155
+EoMT,eomt_cityscapes,FS_LostFound_full,msp_t0.75,0.75,,16.388361659181815,13.014767010540554,99,477664,168408155
+EoMT,eomt_cityscapes,FS_LostFound_full,msp,1.0,,16.393849875477365,13.012709509227745,99,477664,168408155
+EoMT,eomt_cityscapes,FS_LostFound_full,msp_t1.1,1.1,,16.39583058005375,13.011993391887703,99,477664,168408155
+EoMT,eomt_finetuned,RoadAnomaly21,msp_t0.5,0.5,,70.75451481813693,13.680230877945851,10,1317063,7578030
+EoMT,eomt_finetuned,RoadAnomaly21,msp_t0.75,0.75,,70.76617364311109,13.619661046472501,10,1317063,7578030
+EoMT,eomt_finetuned,RoadAnomaly21,msp,1.0,,70.76821826915815,13.600856687028159,10,1317063,7578030
+EoMT,eomt_finetuned,RoadAnomaly21,msp_t1.1,1.1,,70.76845109363153,13.596290856594656,10,1317063,7578030
+EoMT,eomt_finetuned,RoadObsticle21,msp_t0.5,0.5,,25.427664167015863,2.715876599025341,30,129782,19106170
+EoMT,eomt_finetuned,RoadObsticle21,msp_t0.75,0.75,,25.43366349362101,2.719676418664756,30,129782,19106170
+EoMT,eomt_finetuned,RoadObsticle21,msp,1.0,,25.43563653062372,2.7168815100043595,30,129782,19106170
+EoMT,eomt_finetuned,RoadObsticle21,msp_t1.1,1.1,,25.435995267946687,2.7170385273448314,30,129782,19106170
+EoMT,eomt_finetuned,RoadAnomaly,msp_t0.5,0.5,,52.14368550580401,93.86699231831147,60,5446413,49849587
+EoMT,eomt_finetuned,RoadAnomaly,msp_t0.75,0.75,,52.16704003241111,93.75005855113704,60,5446413,49849587
+EoMT,eomt_finetuned,RoadAnomaly,msp,1.0,,52.176485965836974,93.71364099766765,60,5446413,49849587
+EoMT,eomt_finetuned,RoadAnomaly,msp_t1.1,1.1,,52.1788577649354,93.70375525879481,60,5446413,49849587
+EoMT,eomt_finetuned,fs_static,msp_t0.5,0.5,,69.93456487361965,13.873667693912118,20,732524,34420206
+EoMT,eomt_finetuned,fs_static,msp_t0.75,0.75,,69.98777248602161,13.736355906760117,20,732524,34420206
+EoMT,eomt_finetuned,fs_static,msp,1.0,,70.01144998195834,13.678174965019094,20,732524,34420206
+EoMT,eomt_finetuned,fs_static,msp_t1.1,1.1,,70.01755818701479,13.66349172924764,20,732524,34420206
+EoMT,eomt_finetuned,FS_LostFound_full,msp_t0.5,0.5,,34.23933214525758,15.688113797102046,99,477664,168408155
+EoMT,eomt_finetuned,FS_LostFound_full,msp_t0.75,0.75,,34.30795483856447,15.71877383253798,99,477664,168408155
+EoMT,eomt_finetuned,FS_LostFound_full,msp,1.0,,34.33762473053539,15.927807652782612,99,477664,168408155
+EoMT,eomt_finetuned,FS_LostFound_full,msp_t1.1,1.1,,34.345245238800985,15.984221191663789,99,477664,168408155
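Reviewer note: the rows above vary only the softmax temperature used by the MSP score, and the metrics move by fractions of a point. As a rough illustration of why that can happen, here is a minimal pure-Python sketch of a temperature-scaled MSP anomaly score for a single pixel. It follows the same formula as `compute_anomaly_score` in `run_eomt_anomaly.py` (one minus the maximum softmax probability of `logits / temperature`), but `msp_anomaly_score` and the two logit vectors are hypothetical and not taken from the evaluation data.

```python
import math


def msp_anomaly_score(logits, temperature=1.0):
    """Return 1 - max softmax probability for one pixel's class logits."""
    if temperature <= 0:
        raise ValueError("Temperature must be positive.")
    scaled = [value / temperature for value in logits]
    shift = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - shift) for value in scaled]
    return 1.0 - max(exps) / sum(exps)


# Hypothetical logits: one confidently classified pixel, one ambiguous pixel.
confident_pixel = [8.0, 1.0, 0.5]
uncertain_pixel = [2.0, 1.8, 1.5]

for temperature in (0.5, 0.75, 1.0, 1.1):
    confident = msp_anomaly_score(confident_pixel, temperature)
    uncertain = msp_anomaly_score(uncertain_pixel, temperature)
    print(f"T={temperature}: confident={confident:.6f} uncertain={uncertain:.6f}")
```

In this toy case the temperature rescales both scores but never changes which pixel looks more anomalous. AuPRC and FPR95 depend only on the ranking of pixel scores, so small ranking changes would be consistent with the small metric differences reported above, although this sketch does not prove that for the real logits.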
diff --git a/step8_eomt_mask_baselines/eomt_temperature_smoke_results.csv b/step8_eomt_mask_baselines/eomt_temperature_smoke_results.csv
new file mode 100644
index 0000000..d5b17cb
--- /dev/null
+++ b/step8_eomt_mask_baselines/eomt_temperature_smoke_results.csv
@@ -0,0 +1,4 @@
+model,checkpoint,dataset,method,temperature,miou,auprc,fpr95,num_images,num_ood_pixels,num_ind_pixels
+EoMT,eomt_coco,RoadAnomaly21,msp_t0.5,0.5,,5.85757918040927,66.40280889543419,1,15288,890884
+EoMT,eomt_coco,RoadAnomaly21,msp_t0.75,0.75,,5.8728084362174835,66.40146191872343,1,15288,890884
+EoMT,eomt_coco,RoadAnomaly21,msp_t1.1,1.1,,5.880683055117789,66.40011494201265,1,15288,890884
diff --git a/step8_eomt_mask_baselines/run_all_eomt_anomaly.sh b/step8_eomt_mask_baselines/run_all_eomt_anomaly.sh
new file mode 100755
index 0000000..bef5b11
--- /dev/null
+++ b/step8_eomt_mask_baselines/run_all_eomt_anomaly.sh
@@ -0,0 +1,49 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# Keep datasets and checkpoints out of git. Override these env vars if your
+# local paths differ.
+DATASET_ROOT="${DATASET_ROOT:-Validation_Dataset}"
+OUTPUT_CSV="${OUTPUT_CSV:-step8_eomt_mask_baselines/eomt_anomaly_results.csv}"
+DEVICE="${DEVICE:-auto}"
+PYTHON="${PYTHON:-.venv/bin/python}"
+MAX_IMAGES_ARG=()
+
+if [[ -n "${MAX_IMAGES:-}" ]]; then
+  MAX_IMAGES_ARG=(--max-images "${MAX_IMAGES}")
+fi
+
+DATASETS=(
+  RoadAnomaly21
+  RoadObsticle21
+  RoadAnomaly
+  fs_static
+  FS_LostFound_full
+)
+
+# Format per line:
+# checkpoint_name|config_path|checkpoint_path|miou
+CHECKPOINTS=(
+  "eomt_coco|eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml|eomt_checkpoints/eomt_coco.bin|"
+  "eomt_cityscapes|eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml|eomt_checkpoints/eomt_cityscapes.bin|"
+  "eomt_finetuned|eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml|eomt_checkpoints/epoch=9-step=1850.ckpt|"
+)
+
+for checkpoint_spec in "${CHECKPOINTS[@]}"; do
+  IFS="|" read -r checkpoint_name config_path checkpoint_path miou <<< "${checkpoint_spec}"
+
+  for dataset in "${DATASETS[@]}"; do
+    echo "Running ${checkpoint_name} on ${dataset} (all methods)"
+    "${PYTHON}" step8_eomt_mask_baselines/run_eomt_anomaly.py \
+      --config "${config_path}" \
+      --checkpoint "${checkpoint_path}" \
+      --checkpoint-name "${checkpoint_name}" \
+      --dataset-root "${DATASET_ROOT}" \
+      --dataset "${dataset}" \
+      --method all \
+      --miou "${miou}" \
+      --device "${DEVICE}" \
+      --output-csv "${OUTPUT_CSV}" \
+      ${MAX_IMAGES_ARG[@]+"${MAX_IMAGES_ARG[@]}"}
+  done
+done
diff --git a/step8_eomt_mask_baselines/run_eomt_anomaly.py b/step8_eomt_mask_baselines/run_eomt_anomaly.py
new file mode 100755
index 0000000..4c0cd88
--- /dev/null
+++ b/step8_eomt_mask_baselines/run_eomt_anomaly.py
@@ -0,0 +1,996 @@
+#!/usr/bin/env python3
+"""Run Step 8 EoMT mask-based anomaly segmentation baselines.
+
+This script adapts the Step 7 anomaly evaluation pattern to EoMT. It keeps
+dataset handling, binary anomaly masks, AuPRC/FPR95, and CSV output consistent
+with the ERFNet baselines while replacing ERFNet inference with EoMT mask-query
+inference.
+"""
+
+from __future__ import annotations
+
+import argparse
+import csv
+import glob
+import importlib
+import inspect
+import sys
+import warnings
+from contextlib import nullcontext
+from dataclasses import dataclass
+from pathlib import Path
+from typing import Any, Iterable
+
+
+DEFAULT_DATASETS = [
+    "RoadAnomaly21",
+    "RoadObsticle21",
+    "RoadAnomaly",
+    "fs_static",
+    "FS_LostFound_full",
+]
+
+DEFAULT_METHODS = ["msp", "maxlogit", "entropy", "rba"]
+METHOD_CHOICES = DEFAULT_METHODS + ["all"]
+IGNORE_LABEL = 255
+
+
+@dataclass(frozen=True)
+class DatasetResult:
+    checkpoint: str
+    dataset: str
+    method: str
+    temperature: float
+    auprc: float
+    fpr95: float
+    num_images: int
+    num_ood_pixels: int
+    num_ind_pixels: int
+    miou: str | float = ""
+
+
+def parse_img_size(value: str) -> tuple[int, int]:
+    parts = value.lower().replace(",", "x").split("x")
+    if len(parts) != 2:
+        raise argparse.ArgumentTypeError("Expected image size as HxW, for example 640x640.")
+    try:
+        height, width = int(parts[0]), int(parts[1])
+    except ValueError as exc:
+        raise argparse.ArgumentTypeError("Image size must contain integers.") from exc
+    if height <= 0 or width <= 0:
+        raise argparse.ArgumentTypeError("Image size values must be positive.")
+    return height, width
+
+
+def parse_int_list(value: str | None) -> list[int] | None:
+    if value is None or value.strip() == "":
+        return None
+    return [int(item.strip()) for item in value.split(",") if item.strip()]
+
+
+def parse_float_list(value: str | None) -> list[float] | None:
+    if value is None or value.strip() == "":
+        return None
+    parsed = [float(item.strip()) for item in value.split(",") if item.strip()]
+    if any(item <= 0 for item in parsed):
+        raise argparse.ArgumentTypeError("Temperature values must be positive.")
+    return parsed
+
+
+def ensure_eomt_on_path(repo_root: Path) -> None:
+    eomt_root = repo_root / "eomt"
+    if str(eomt_root) not in sys.path:
+        sys.path.insert(0, str(eomt_root))
+
+
+def import_from_class_path(class_path: str) -> type:
+    module_name, class_name = class_path.rsplit(".", 1)
+    module = importlib.import_module(module_name)
+    return getattr(module, class_name)
+
+
+def filter_kwargs(callable_obj: Any, kwargs: dict[str, Any]) -> dict[str, Any]:
+    signature = inspect.signature(callable_obj)
+    if any(param.kind == inspect.Parameter.VAR_KEYWORD for param in signature.parameters.values()):
+        return kwargs
+    return {key: value for key, value in kwargs.items() if key in signature.parameters}
+
+
+def load_yaml(path: Path) -> dict[str, Any]:
+    try:
+        import yaml
+    except ImportError as exc:
+        raise RuntimeError("PyYAML is required to read EoMT config files.") from exc
+
+    with path.open("r", encoding="utf-8") as file:
+        return yaml.safe_load(file)
+
+
+def infer_num_classes(config: dict[str, Any], explicit_num_classes: int | None) -> int:
+    if explicit_num_classes is not None:
+        return explicit_num_classes
+
+    data_args = config.get("data", {}).get("init_args", {})
+    stuff_classes = data_args.get("stuff_classes")
+    if stuff_classes:
+        return max(stuff_classes) + 1
+
+    raise ValueError(
+        "Could not infer num_classes from the config. Pass --num-classes explicitly "
+        "(19 for Cityscapes semantic, usually 133 for COCO panoptic)."
+    )
+
+
+def infer_num_classes_from_state_dict(state_dict: dict[str, Any]) -> int | None:
+    class_head = state_dict.get("network.class_head.weight")
+    if class_head is None:
+        return None
+    return int(class_head.shape[0] - 1)  # last output is the "no object" class
+
+
+def infer_num_q_from_state_dict(state_dict: dict[str, Any]) -> int | None:
+    query_weights = state_dict.get("network.q.weight")
+    if query_weights is None:
+        return None
+    return int(query_weights.shape[0])
+
+
+def infer_img_size_from_state_dict(state_dict: dict[str, Any]) -> tuple[int, int] | None:
+    import math
+
+    pos_embed = state_dict.get("network.encoder.backbone.pos_embed")
+    patch_embed = state_dict.get("network.encoder.backbone.patch_embed.proj.weight")
+    if pos_embed is None or patch_embed is None:
+        return None
+
+    num_patches = int(pos_embed.shape[1])
+    grid_size = int(math.sqrt(num_patches))
+    if grid_size * grid_size != num_patches:
+        return None
+
+    patch_size = int(patch_embed.shape[-1])
+    image_size = grid_size * patch_size
+    return image_size, image_size
+
+
+def select_device(device_name: str):
+    import torch
+
+    if device_name == "auto":
+        if torch.cuda.is_available():
+            return torch.device("cuda")
+        if hasattr(torch.backends, "mps") and torch.backends.mps.is_available():
+            return torch.device("mps")
+        return torch.device("cpu")
+
+    device = torch.device(device_name)
+    if device.type == "cuda" and not torch.cuda.is_available():
+        raise RuntimeError("CUDA was requested but is not available.")
+    if device.type == "mps" and not (hasattr(torch.backends, "mps") and torch.backends.mps.is_available()):
+        raise RuntimeError("MPS was requested but is not available.")
+    return device
+
+
+def load_checkpoint_state(path: Path, device):
+    import torch
+
+    try:
+        state = torch.load(path, map_location=device, weights_only=False)
+    except TypeError:
+        state = torch.load(path, map_location=device)
+    if isinstance(state, dict):
+        for key in ("state_dict", "model", "model_state_dict"):
+            if key in state and isinstance(state[key], dict):
+                state = state[key]
+                break
+    if not isinstance(state, dict):
+        raise ValueError(f"Checkpoint did not contain a state dict: {path}")
+    return state
+
+
+def build_eomt_model(
+    *,
+    repo_root: Path,
+    config_path: Path,
+    checkpoint_path: Path,
+    device_name: str,
+    img_size: tuple[int, int] | None,
+    num_classes: int | None,
+    stuff_classes: list[int] | None,
+    load_ckpt_class_head: bool,
+):
+    import torch
+
+    ensure_eomt_on_path(repo_root)
+    config = load_yaml(config_path)
+    device = select_device(device_name)
+    state_dict = load_checkpoint_state(checkpoint_path, device)
+    resolved_num_classes = (
+        num_classes
+        or infer_num_classes_from_state_dict(state_dict)
+        or infer_num_classes(config, None)
+    )
+    resolved_img_size = img_size or infer_img_size_from_state_dict(state_dict) or (640, 640)
+    checkpoint_num_q = infer_num_q_from_state_dict(state_dict)
+
+    warnings.filterwarnings(
+        "ignore",
+        message=r".*Attribute 'network' is an instance of `nn\.Module` and is already saved during checkpointing.*",
+    )
+
+    encoder_cfg = config["model"]["init_args"]["network"]["init_args"]["encoder"]
+    encoder_cls = import_from_class_path(encoder_cfg["class_path"])
+    encoder_kwargs = dict(encoder_cfg.get("init_args", {}))
+    # The local ViT wrapper uses ckpt_path only to decide whether timm should
+    # auto-download pretrained backbone weights. We load the full EoMT checkpoint
+    # below, so keep this non-null to make local smoke tests work offline.
+    encoder_kwargs.setdefault("ckpt_path", str(checkpoint_path))
+    encoder = encoder_cls(img_size=resolved_img_size, **encoder_kwargs)
+
+    network_cfg = config["model"]["init_args"]["network"]
+    network_cls = import_from_class_path(network_cfg["class_path"])
+    network_kwargs = {
+        key: value
+        for key, value in network_cfg.get("init_args", {}).items()
+        if key != "encoder"
+    }
+    network_kwargs["masked_attn_enabled"] = False
+    if checkpoint_num_q is not None:
+        network_kwargs["num_q"] = checkpoint_num_q
+    network = network_cls(
+        encoder=encoder,
+        num_classes=resolved_num_classes,
+        **filter_kwargs(network_cls, network_kwargs),
+    )
+
+    model_cfg = config["model"]
+    model_cls = import_from_class_path(model_cfg["class_path"])
+    model_kwargs = {
+        key: value
+        for key, value in model_cfg.get("init_args", {}).items()
+        if key != "network"
+    }
+
+    config_stuff_classes = config.get("data", {}).get("init_args", {}).get("stuff_classes")
+    if stuff_classes is not None:
+        model_kwargs["stuff_classes"] = stuff_classes
+    elif config_stuff_classes is not None:
+        model_kwargs["stuff_classes"] = config_stuff_classes
+
+    model_kwargs["load_ckpt_class_head"] = load_ckpt_class_head
+    model_kwargs = filter_kwargs(model_cls, model_kwargs)
+    model = model_cls(
+        network=network,
+        img_size=resolved_img_size,
+        num_classes=resolved_num_classes,
+        **model_kwargs,
+    )
+
+    incompatible = model.load_state_dict(state_dict, strict=False)
+    if incompatible.missing_keys:
+        print(f"Warning: missing checkpoint keys: {len(incompatible.missing_keys)}")
+    if incompatible.unexpected_keys:
+        print(f"Warning: unexpected checkpoint keys: {len(incompatible.unexpected_keys)}")
+
+    return model.eval().to(device), device
+
+
+def dataset_image_paths(dataset_root: Path, dataset: str, input_glob: str | None) -> list[Path]:
+    if input_glob:
+        return sorted(Path(path) for path in glob.glob(input_glob) if Path(path).is_file())
+
+    image_dir = dataset_root / dataset / "images"
+    if not image_dir.exists():
+        raise FileNotFoundError(f"Missing image directory: {image_dir}")
+
+    allowed_suffixes = {".png", ".jpg", ".jpeg", ".webp"}
+    return sorted(path for path in image_dir.iterdir() if path.suffix.lower() in allowed_suffixes)
+
+
+def infer_ground_truth_path(image_path: Path) -> Path:
+    path = str(image_path).replace("images", "labels_masks")
+    if "RoadObsticle21" in path:
+        path = path.replace(".webp", ".png")
+    if "fs_static" in path:
+        path = path.replace(".jpg", ".png").replace(".jpeg", ".png")
+    if "RoadAnomaly" in path:
+        path = path.replace(".jpg", ".png").replace(".jpeg", ".png")
+    return Path(path)
+
+
+def load_rgb_image(path: Path):
+    import numpy as np
+    import torch
+    from PIL import Image
+
+    image = Image.open(path).convert("RGB")
+    array = np.array(image, dtype=np.uint8)
+    return torch.from_numpy(array).permute(2, 0, 1)
+
+
+def resize_mask_if_needed(mask, shape: tuple[int, int]):
+    import numpy as np
+    from PIL import Image
+
+    if mask.shape == shape:
+        return mask
+
+    pil_mask = Image.fromarray(mask.astype(np.uint8))
+    resized = pil_mask.resize((shape[1], shape[0]), Image.NEAREST)
+    return np.array(resized)
+
+
+def prepare_ground_truth_mask(mask_path: Path, target_shape: tuple[int, int] | None = None):
+    import numpy as np
+    from PIL import Image
+
+    mask = np.array(Image.open(mask_path))
+    if mask.ndim == 3:
+        mask = mask[..., 0]
+    if target_shape is not None:
+        mask = resize_mask_if_needed(mask, target_shape)
+
+    path_text = str(mask_path)
+    if "RoadAnomaly" in path_text:
+        mask = np.where(mask == 2, 1, mask)
+
+    if "FS_LostFound_full" in path_text:
+        return mask
+
+    if "LostAndFound" in path_text:
+        mask = np.where(mask == 0, IGNORE_LABEL, mask)
+        mask = np.where(mask == 1, 0, mask)
+        mask = np.where((mask > 1) & (mask < 201), 1, mask)
+
+    if "Streethazard" in path_text:
+        mask = np.where(mask == 14, IGNORE_LABEL, mask)
+        mask = np.where(mask < 20, 0, mask)
+        mask = np.where(mask == IGNORE_LABEL, 1, mask)
+
+    return mask
+
+
+def fpr_at_95_tpr(scores, labels) -> float:
+    import numpy as np
+    from sklearn.metrics import roc_curve
+
+    fpr, tpr, _ = roc_curve(labels, scores)
+    if np.max(tpr) < 0.95:
+        return 1.0
+    return float(fpr[np.argmax(tpr >= 0.95)])
+
+
+def compute_anomaly_score(pixel_logits, method: str, temperature: float):
+    import torch
+
+    if temperature <= 0:
+        raise ValueError("Temperature must be positive.")
+
+    if pixel_logits.dim() != 3:
+        raise ValueError(f"Expected pixel logits with shape [C,H,W], got {tuple(pixel_logits.shape)}")
+
+    if method == "msp":
+        probs = torch.softmax(pixel_logits / temperature, dim=0)
+        return 1.0 - probs.max(dim=0).values
+
+    if method == "maxlogit":
+        return -pixel_logits.max(dim=0).values
+
+    if method == "entropy":
+        probs = torch.softmax(pixel_logits / temperature, dim=0)
+        return -(probs * torch.log(probs + 1e-12)).sum(dim=0)
+
+    raise ValueError(f"Unsupported pixel-logit anomaly method: {method}")
+
+
+def compute_rba_crop_scores(mask_logits, class_logits):
+    import torch
+
+    mask_probs = mask_logits.sigmoid()
+    class_probs = class_logits.softmax(dim=-1)[..., :-1]  # drop the trailing no-object class
+    known_scores = torch.einsum("bqhw,bqc->bchw", mask_probs, class_probs).clamp(0.0, 1.0)
+    return torch.exp(torch.log1p(-known_scores.clamp(max=1.0 - 1e-6)).sum(dim=1, keepdim=True))  # prod_c (1 - score_c)
+
+
+def infer_eomt_outputs(model, image, method: str, device, temperature: float):
+    import torch
+    import torch.nn.functional as functional
+
+    autocast_context = (
+        torch.amp.autocast(device_type="cuda", enabled=True)
+        if device.type == "cuda"
+        else nullcontext()
+    )
+
+    with torch.no_grad(), autocast_context:
+        imgs = [image.to(device)]
+        img_sizes = [img.shape[-2:] for img in imgs]
+        crops, origins = model.window_imgs_semantic(imgs)
+        mask_logits_per_layer, class_logits_per_layer = model(crops)
+        mask_logits = functional.interpolate(
+            mask_logits_per_layer[-1],
+            model.img_size,
+            mode="bilinear",
+        )
+        class_logits = class_logits_per_layer[-1]
+
+        if method == "rba":
+            crop_scores = compute_rba_crop_scores(mask_logits, class_logits)
+            scores = model.revert_window_logits_semantic(crop_scores, origins, img_sizes)[0][0]
+            return scores.float().cpu(), None
+
+        crop_logits = model.to_per_pixel_logits_semantic(mask_logits, class_logits)
+        logits = model.revert_window_logits_semantic(crop_logits, origins, img_sizes)[0]
+        scores = compute_anomaly_score(logits, method, temperature)
+        return scores.float().cpu(), logits.float().cpu()
+
+
+def infer_eomt_all_scores(model, image, device, temperature: float):
+    import torch
+    import torch.nn.functional as functional
+
+    autocast_context = (
+        torch.amp.autocast(device_type="cuda", enabled=True)
+        if device.type == "cuda"
+        else nullcontext()
+    )
+
+    with torch.no_grad(), autocast_context:
+        imgs = [image.to(device)]
+        img_sizes = [img.shape[-2:] for img in imgs]
+        crops, origins = model.window_imgs_semantic(imgs)
+        mask_logits_per_layer, class_logits_per_layer = model(crops)
+        mask_logits = functional.interpolate(
+            mask_logits_per_layer[-1],
+            model.img_size,
+            mode="bilinear",
+        )
+        class_logits = class_logits_per_layer[-1]
+
+        crop_logits = model.to_per_pixel_logits_semantic(mask_logits, class_logits)
+        logits = model.revert_window_logits_semantic(crop_logits, origins, img_sizes)[0]
+
+        rba_crop_scores = compute_rba_crop_scores(mask_logits, class_logits)
+        rba_scores = model.revert_window_logits_semantic(
+            rba_crop_scores,
+            origins,
+            img_sizes,
+        )[0][0]
+
+        return {
+            "msp": compute_anomaly_score(logits, "msp", temperature).float().cpu(),
+            "maxlogit": compute_anomaly_score(logits, "maxlogit", temperature).float().cpu(),
+            "entropy": compute_anomaly_score(logits, "entropy", temperature).float().cpu(),
+            "rba": rba_scores.float().cpu(),
+        }, logits.float().cpu()
+
+
+def save_logits(path: Path, image_path: Path, logits) -> None:
+    if logits is None:
+        return
+    import numpy as np
+
+    path.mkdir(parents=True, exist_ok=True)
+    safe_name = image_path.stem.replace(" ", "_")
+    np.savez_compressed(path / f"{safe_name}_pixel_logits.npz", pixel_logits=logits.numpy())
+
+
+def collect_scores_for_dataset(
+    *,
+    model,
+    device,
+    image_paths: Iterable[Path],
+    method: str,
+    temperature: float,
+    max_images: int | None,
+    logits_dir: Path | None,
+):
+    import numpy as np
+
+    score_parts = []
+    label_parts = []
+    processed = 0
+    num_ood_pixels = 0
+    num_ind_pixels = 0
+
+    for image_path in image_paths:
+        if max_images is not None and processed >= max_images:
+            break
+
+        gt_path = infer_ground_truth_path(image_path)
+        if not gt_path.exists():
+            print(f"Warning: missing ground-truth mask, skipping: {gt_path}")
+            continue
+
+        image = load_rgb_image(image_path)
+        scores, logits = infer_eomt_outputs(model, image, method, device, temperature)
+        score_array = scores.numpy()
+        gt = prepare_ground_truth_mask(gt_path, target_shape=score_array.shape)
+
+        if 1 not in np.unique(gt):
+            continue
+
+        ind_mask = gt == 0
+        ood_mask = gt == 1
+        ind_scores = score_array[ind_mask]
+        ood_scores = score_array[ood_mask]
+
+        if ind_scores.size == 0 or ood_scores.size == 0:
+            continue
+
+        score_parts.extend([ind_scores, ood_scores])
+        label_parts.extend(
+            [
+                np.zeros(ind_scores.shape[0], dtype=np.uint8),
+                np.ones(ood_scores.shape[0], dtype=np.uint8),
+            ]
+        )
+        num_ind_pixels += int(ind_scores.shape[0])
+        num_ood_pixels += int(ood_scores.shape[0])
+        processed += 1
+
+        if logits_dir is not None:
+            save_logits(logits_dir, image_path, logits)
+
+        print(f"[{processed}] processed {image_path.name}")
+
+    if not score_parts:
+        raise RuntimeError("No valid images with anomaly and in-distribution pixels were processed.")
+
+    return (
+        np.concatenate(score_parts),
+        np.concatenate(label_parts),
+        processed,
+        num_ood_pixels,
+        num_ind_pixels,
+    )
+
+
+def collect_all_scores_for_dataset(
+    *,
+    model,
+    device,
+    image_paths: Iterable[Path],
+    temperature: float,
+    max_images: int | None,
+    logits_dir: Path | None,
+):
+    import numpy as np
+
+    score_parts = {method: [] for method in DEFAULT_METHODS}
+    label_parts = []
+    processed = 0
+    num_ood_pixels = 0
+    num_ind_pixels = 0
+
+    for image_path in image_paths:
+        if max_images is not None and processed >= max_images:
+            break
+
+        gt_path = infer_ground_truth_path(image_path)
+        if not gt_path.exists():
+            print(f"Warning: missing ground-truth mask, skipping: {gt_path}")
+            continue
+
+        image = load_rgb_image(image_path)
+        scores_by_method, logits = infer_eomt_all_scores(model, image, device, temperature)
+        reference_shape = scores_by_method["maxlogit"].shape
+        gt = prepare_ground_truth_mask(gt_path, target_shape=reference_shape)
+
+        if 1 not in np.unique(gt):
+            continue
+
+        ind_mask = gt == 0
+        ood_mask = gt == 1
+        if not np.any(ind_mask) or not np.any(ood_mask):
+            continue
+
+        for method, scores in scores_by_method.items():
+            score_array = scores.numpy()
+            score_parts[method].extend([score_array[ind_mask], score_array[ood_mask]])
+
+        ind_count = int(np.count_nonzero(ind_mask))
+        ood_count = int(np.count_nonzero(ood_mask))
+        label_parts.extend(
+            [
+                np.zeros(ind_count, dtype=np.uint8),
+                np.ones(ood_count, dtype=np.uint8),
+            ]
+        )
+        num_ind_pixels += ind_count
+        num_ood_pixels += ood_count
+        processed += 1
+
+        if logits_dir is not None:
+            save_logits(logits_dir, image_path, logits)
+
+        print(f"[{processed}] processed {image_path.name}")
+
+    if not label_parts:
+        raise RuntimeError("No valid images with anomaly and in-distribution pixels were processed.")
+
+    labels = np.concatenate(label_parts)
+    return (
+        {method: np.concatenate(parts) for method, parts in score_parts.items()},
+        labels,
+        processed,
+        num_ood_pixels,
+        num_ind_pixels,
+    )
+
+
+def evaluate_dataset(
+    *,
+    model,
+    device,
+    dataset_root: Path,
+    dataset: str,
+    input_glob: str | None,
+    method: str,
+    temperature: float,
+    checkpoint_name: str,
+    miou: str | float,
+    max_images: int | None,
+    logits_dir: Path | None,
+) -> DatasetResult:
+    from sklearn.metrics import average_precision_score
+
+    image_paths = dataset_image_paths(dataset_root, dataset, input_glob)
+    if not image_paths:
+        raise FileNotFoundError(f"No images found for dataset {dataset}.")
+
+    scores, labels, processed, num_ood, num_ind = collect_scores_for_dataset(
+        model=model,
+        device=device,
+        image_paths=image_paths,
+        method=method,
+        temperature=temperature,
+        max_images=max_images,
+        logits_dir=logits_dir,
+    )
+
+    auprc = float(average_precision_score(labels, scores) * 100.0)
+    fpr95 = float(fpr_at_95_tpr(scores, labels) * 100.0)
+    method_label = method
+    if method in {"msp", "entropy"} and temperature != 1.0:
+        method_label = f"{method}_t{temperature:g}"
+
+    return DatasetResult(
+        checkpoint=checkpoint_name,
+        dataset=dataset,
+        method=method_label,
+        temperature=temperature,
+        auprc=auprc,
+        fpr95=fpr95,
+        num_images=processed,
+        num_ood_pixels=num_ood,
+        num_ind_pixels=num_ind,
+        miou=miou,
+    )
+
+
+def evaluate_dataset_all_methods(
+    *,
+    model,
+    device,
+    dataset_root: Path,
+    dataset: str,
+    input_glob: str | None,
+    temperature: float,
+    checkpoint_name: str,
+    miou: str | float,
+    max_images: int | None,
+    logits_dir: Path | None,
+) -> list[DatasetResult]:
+    from sklearn.metrics import average_precision_score
+
+    image_paths = dataset_image_paths(dataset_root, dataset, input_glob)
+    if not image_paths:
+        raise FileNotFoundError(f"No images found for dataset {dataset}.")
+
+    scores_by_method, labels, processed, num_ood, num_ind = collect_all_scores_for_dataset(
+        model=model,
+        device=device,
+        image_paths=image_paths,
+        temperature=temperature,
+        max_images=max_images,
+        logits_dir=logits_dir,
+    )
+
+    results = []
+    for method in DEFAULT_METHODS:
+        scores = scores_by_method[method]
+        method_label = method
+        if method in {"msp", "entropy"} and temperature != 1.0:
+            method_label = f"{method}_t{temperature:g}"
+        results.append(
+            DatasetResult(
+                checkpoint=checkpoint_name,
+                dataset=dataset,
+                method=method_label,
+                temperature=temperature,
+                auprc=float(average_precision_score(labels, scores) * 100.0),
+                fpr95=float(fpr_at_95_tpr(scores, labels) * 100.0),
+                num_images=processed,
+                num_ood_pixels=num_ood,
+                num_ind_pixels=num_ind,
+                miou=miou,
+            )
+        )
+
+    return results
+
+
+def collect_temperature_scores_for_dataset(
+    *,
+    model,
+    device,
+    image_paths: Iterable[Path],
+    method: str,
+    temperatures: list[float],
+    max_images: int | None,
+    logits_dir: Path | None,
+):
+    import numpy as np
+
+    if method not in {"msp", "entropy"}:
+        raise ValueError("--temperatures is only supported for MSP or entropy scores.")
+
+    score_parts = {temperature: [] for temperature in temperatures}
+    label_parts = []
+    processed = 0
+    num_ood_pixels = 0
+    num_ind_pixels = 0
+
+    for image_path in image_paths:
+        if max_images is not None and processed >= max_images:
+            break
+
+        gt_path = infer_ground_truth_path(image_path)
+        if not gt_path.exists():
+            print(f"Warning: missing ground-truth mask, skipping: {gt_path}")
+            continue
+
+        image = load_rgb_image(image_path)
+        _, logits = infer_eomt_outputs(model, image, method, device, temperature=1.0)
+        if logits is None:
+            raise RuntimeError("Temperature scaling requires per-pixel logits, but inference returned none.")
+
+        reference_shape = logits.shape[-2:]
+        gt = prepare_ground_truth_mask(gt_path, target_shape=reference_shape)
+
+        if 1 not in np.unique(gt):
+            continue
+
+        ind_mask = gt == 0
+        ood_mask = gt == 1
+        if not np.any(ind_mask) or not np.any(ood_mask):
+            continue
+
+        for temperature in temperatures:
+            score_array = compute_anomaly_score(logits, method, temperature).numpy()
+            score_parts[temperature].extend([score_array[ind_mask], score_array[ood_mask]])
+
+        ind_count = int(np.count_nonzero(ind_mask))
+        ood_count = int(np.count_nonzero(ood_mask))
+        label_parts.extend(
+            [
+                np.zeros(ind_count, dtype=np.uint8),
+                np.ones(ood_count, dtype=np.uint8),
+            ]
+        )
+        num_ind_pixels += ind_count
+        num_ood_pixels += ood_count
+        processed += 1
+
+        if logits_dir is not None:
+            save_logits(logits_dir, image_path, logits)
+
+        print(f"[{processed}] processed {image_path.name}")
+
+    if not label_parts:
+        raise RuntimeError("No valid images with anomaly and in-distribution pixels were processed.")
+
+    labels = np.concatenate(label_parts)
+    return (
+        {temperature: np.concatenate(parts) for temperature, parts in score_parts.items()},
+        labels,
+        processed,
+        num_ood_pixels,
+        num_ind_pixels,
+    )
+
+
+def evaluate_dataset_temperatures(
+    *,
+    model,
+    device,
+    dataset_root: Path,
+    dataset: str,
+    input_glob: str | None,
+    method: str,
+    temperatures: list[float],
+    checkpoint_name: str,
+    miou: str | float,
+    max_images: int | None,
+    logits_dir: Path | None,
+) -> list[DatasetResult]:
+    from sklearn.metrics import average_precision_score
+
+    image_paths = dataset_image_paths(dataset_root, dataset, input_glob)
+    if not image_paths:
+        raise FileNotFoundError(f"No images found for dataset {dataset}.")
+
+    scores_by_temperature, labels, processed, num_ood, num_ind = collect_temperature_scores_for_dataset(
+        model=model,
+        device=device,
+        image_paths=image_paths,
+        method=method,
+        temperatures=temperatures,
+        max_images=max_images,
+        logits_dir=logits_dir,
+    )
+
+    results = []
+    for temperature in temperatures:
+        scores = scores_by_temperature[temperature]
+        method_label = method if temperature == 1.0 else f"{method}_t{temperature:g}"
+        results.append(
+            DatasetResult(
+                checkpoint=checkpoint_name,
+                dataset=dataset,
+                method=method_label,
+                temperature=temperature,
+                auprc=float(average_precision_score(labels, scores) * 100.0),
+                fpr95=float(fpr_at_95_tpr(scores, labels) * 100.0),
+                num_images=processed,
+                num_ood_pixels=num_ood,
+                num_ind_pixels=num_ind,
+                miou=miou,
+            )
+        )
+
+    return results
+
+
+def append_result(path: Path, result: DatasetResult) -> None:
+    path.parent.mkdir(parents=True, exist_ok=True)
+    fieldnames = [
+        "model",
+        "checkpoint",
+        "dataset",
+        "method",
+        "temperature",
+        "miou",
+        "auprc",
+        "fpr95",
+        "num_images",
+        "num_ood_pixels",
+        "num_ind_pixels",
+    ]
+    write_header = not path.exists() or path.stat().st_size == 0
+    with path.open("a", newline="", encoding="utf-8") as file:
+        writer = csv.DictWriter(file, fieldnames=fieldnames)
+        if write_header:
+            writer.writeheader()
+        writer.writerow(
+            {
+                "model": "EoMT",
+                "checkpoint": result.checkpoint,
+                "dataset": result.dataset,
+                "method": result.method,
+                "temperature": result.temperature,
+                "miou": result.miou,
+                "auprc": result.auprc,
+                "fpr95": result.fpr95,
+                "num_images": result.num_images,
+                "num_ood_pixels": result.num_ood_pixels,
+                "num_ind_pixels": result.num_ind_pixels,
+            }
+        )
+
+
+def parse_args() -> argparse.Namespace:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--repo-root", default=Path(__file__).resolve().parents[1], type=Path)
+    parser.add_argument("--config", required=True, type=Path, help="EoMT YAML config path.")
+    parser.add_argument("--checkpoint", required=True, type=Path, help="Local EoMT checkpoint path.")
+    parser.add_argument("--checkpoint-name", required=True, help="Short checkpoint label for CSV rows.")
+    parser.add_argument("--dataset-root", required=True, type=Path, help="Root containing anomaly dataset folders.")
+    parser.add_argument("--dataset", required=True, choices=DEFAULT_DATASETS)
+    parser.add_argument("--input-glob", help="Optional explicit image glob. Overrides --dataset-root/--dataset discovery.")
+    parser.add_argument("--method", required=True, choices=METHOD_CHOICES)
+    parser.add_argument("--temperature", type=float, default=1.0, help="Positive softmax temperature for MSP/entropy; not applied to maxlogit or rba scores.")
+    parser.add_argument(
+        "--temperatures",
+        type=parse_float_list,
+        help="Comma-separated MSP/entropy temperatures to evaluate from one forward pass, for example 0.5,0.75,1.0,1.1.",
+    )
+    parser.add_argument("--miou", default="", help="Optional mIoU value to repeat in the output row.")
+    parser.add_argument("--output-csv", default=Path("step8_eomt_mask_baselines/eomt_anomaly_results.csv"), type=Path)
+    parser.add_argument("--img-size", type=parse_img_size, help="EoMT inference image size as HxW. Defaults to checkpoint shape.")
+    parser.add_argument("--num-classes", type=int, help="Number of model classes. Defaults to checkpoint class head if present.")
+    parser.add_argument("--stuff-classes", help="Comma-separated panoptic stuff classes if not present in config.")
+    parser.add_argument("--device", default="auto", help="auto, cpu, cuda, cuda:0, or mps.")
+    parser.add_argument("--max-images", type=int, help="Limit images for smoke tests.")
+    parser.add_argument("--save-logits", action="store_true", help="Save per-image pixel logits for non-RbA methods.")
+    parser.add_argument("--logits-dir", type=Path, default=Path("step8_eomt_mask_baselines/saved_logits"), help="Output directory used with --save-logits.")
+    parser.add_argument(
+        "--load-ckpt-class-head",
+        action=argparse.BooleanOptionalAction,
+        default=True,
+        help="Whether to load the checkpoint class head when constructing compatible EoMT modules.",
+    )
+    return parser.parse_args()
+
+
+def main() -> None:
+    args = parse_args()
+    if args.temperatures is not None and args.method not in {"msp", "entropy"}:
+        raise ValueError("--temperatures is only supported with --method msp or entropy.")
+    model, device = build_eomt_model(
+        repo_root=args.repo_root,
+        config_path=args.config,
+        checkpoint_path=args.checkpoint,
+        device_name=args.device,
+        img_size=args.img_size,
+        num_classes=args.num_classes,
+        stuff_classes=parse_int_list(args.stuff_classes),
+        load_ckpt_class_head=args.load_ckpt_class_head,
+    )
+
+    logits_dir = args.logits_dir if args.save_logits else None
+    if args.temperatures is not None:
+        results = evaluate_dataset_temperatures(
+            model=model,
+            device=device,
+            dataset_root=args.dataset_root,
+            dataset=args.dataset,
+            input_glob=args.input_glob,
+            method=args.method,
+            temperatures=args.temperatures,
+            checkpoint_name=args.checkpoint_name,
+            miou=args.miou,
+            max_images=args.max_images,
+            logits_dir=logits_dir,
+        )
+    elif args.method == "all":
+        results = evaluate_dataset_all_methods(
+            model=model,
+            device=device,
+            dataset_root=args.dataset_root,
+            dataset=args.dataset,
+            input_glob=args.input_glob,
+            temperature=args.temperature,
+            checkpoint_name=args.checkpoint_name,
+            miou=args.miou,
+            max_images=args.max_images,
+            logits_dir=logits_dir,
+        )
+    else:
+        results = [
+            evaluate_dataset(
+                model=model,
+                device=device,
+                dataset_root=args.dataset_root,
+                dataset=args.dataset,
+                input_glob=args.input_glob,
+                method=args.method,
+                temperature=args.temperature,
+                checkpoint_name=args.checkpoint_name,
+                miou=args.miou,
+                max_images=args.max_images,
+                logits_dir=logits_dir,
+            )
+        ]
+
+    for result in results:
+        append_result(args.output_csv, result)
+        print(
+            f"Saved {result.checkpoint} {result.dataset} {result.method}: "
+            f"AuPRC={result.auprc:.4f}, FPR95={result.fpr95:.4f}"
+        )
+
+
+if __name__ == "__main__":
+    main()
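Reviewer note: the scoring rules in `compute_anomaly_score` and the metric in `fpr_at_95_tpr` can be checked by hand on toy values. The block below is a minimal pure-Python sketch, not part of the patch. The helper names `softmax`, `msp_score`, `entropy_score`, and `toy_fpr_at_95_tpr`, as well as the toy logits, are illustrative assumptions, and the threshold loop is a simplified stand-in for `sklearn.metrics.roc_curve` rather than a reimplementation of it.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the class logits of one pixel."""
    if temperature <= 0:
        raise ValueError("Temperature must be positive.")
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def msp_score(logits, temperature=1.0):
    """Anomaly score 1 - max softmax probability (higher = more anomalous)."""
    return 1.0 - max(softmax(logits, temperature))


def entropy_score(logits, temperature=1.0):
    """Shannon entropy of the softmax distribution (higher = more anomalous)."""
    return -sum(p * math.log(p + 1e-12) for p in softmax(logits, temperature))


def toy_fpr_at_95_tpr(scores, labels):
    """Lowest false-positive rate among thresholds whose TPR reaches 0.95."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("Need at least one positive and one negative label.")
    best = 1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
        if tp / positives >= 0.95:
            best = min(best, fp / negatives)
    return best


# Toy pixels (assumed values): one confident prediction, one flat prediction.
confident_pixel = [8.0, 1.0, 0.5]
flat_pixel = [2.0, 1.8, 1.5]

# A flat distribution scores as more anomalous under both rules.
msp_gap = msp_score(flat_pixel) - msp_score(confident_pixel)
entropy_gap = entropy_score(flat_pixel) - entropy_score(confident_pixel)

# Raising the temperature flattens the softmax, so the MSP score grows.
msp_by_temperature = {t: msp_score(flat_pixel, t) for t in (0.5, 0.75, 1.0, 1.1)}

# Perfectly separated scores give an FPR95 of zero.
separated = toy_fpr_at_95_tpr([0.1, 0.2, 0.3, 0.8, 0.9], [0, 0, 0, 1, 1])
```

Because temperature rescales every pixel's logits by the same factor, it changes the ranking of pixels only through the softmax nonlinearity, which is consistent with the small AuPRC and FPR95 differences reported for the MSP sweep.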
diff --git a/step8_eomt_mask_baselines/run_temperature_sweep.sh b/step8_eomt_mask_baselines/run_temperature_sweep.sh
new file mode 100755
index 0000000..3630e56
--- /dev/null
+++ b/step8_eomt_mask_baselines/run_temperature_sweep.sh
@@ -0,0 +1,50 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+# MSP temperature sweep for report-ready temperature scaling rows.
+DATASET_ROOT="${DATASET_ROOT:-Validation_Dataset}"
+OUTPUT_CSV="${OUTPUT_CSV:-step8_eomt_mask_baselines/eomt_temperature_results.csv}"
+DEVICE="${DEVICE:-auto}"
+PYTHON="${PYTHON:-.venv/bin/python}"
+TEMPERATURES="${TEMPERATURES:-0.5,0.75,1.0,1.1}"
+
+DATASETS=(
+  RoadAnomaly21
+  RoadObsticle21
+  RoadAnomaly
+  fs_static
+  FS_LostFound_full
+)
+
+# One spec per line, fields separated by "|" (the trailing miou field may be left empty):
+# checkpoint_name|config_path|checkpoint_path|miou
+CHECKPOINTS=(
+  "eomt_coco|eomt/configs/dinov2/coco/panoptic/eomt_base_640_2x.yaml|eomt_checkpoints/eomt_coco.bin|"
+  "eomt_cityscapes|eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml|eomt_checkpoints/eomt_cityscapes.bin|"
+  "eomt_finetuned|eomt/configs/dinov2/cityscapes/semantic/eomt_base_640.yaml|eomt_checkpoints/epoch=9-step=1850.ckpt|"
+)
+
+for checkpoint_spec in "${CHECKPOINTS[@]}"; do
+  IFS="|" read -r checkpoint_name config_path checkpoint_path miou <<< "${checkpoint_spec}"
+
+  for dataset in "${DATASETS[@]}"; do
+    echo "Running ${checkpoint_name} ${dataset} MSP temperatures=${TEMPERATURES}"
+    run_args=(
+      "${PYTHON}" step8_eomt_mask_baselines/run_eomt_anomaly.py
+      --config "${config_path}"
+      --checkpoint "${checkpoint_path}"
+      --checkpoint-name "${checkpoint_name}"
+      --dataset-root "${DATASET_ROOT}"
+      --dataset "${dataset}"
+      --method msp
+      --temperatures "${TEMPERATURES}"
+      --miou "${miou}"
+      --device "${DEVICE}"
+      --output-csv "${OUTPUT_CSV}"
+    )
+    if [[ -n "${MAX_IMAGES:-}" ]]; then
+      run_args+=(--max-images "${MAX_IMAGES}")
+    fi
+    "${run_args[@]}"
+  done
+done
diff --git a/step8_eomt_mask_baselines/test.ipynb b/step8_eomt_mask_baselines/test.ipynb
new file mode 100644
index 0000000..e69de29