Add Visium breast-cancer tutorial #2
Merged
Synthesises the example progression curated by @asarigun in scverse/spatialdata-plot#590 (H&E + spots, gene-expression overlay, outline styling) into a single end-to-end tutorial. Inlines the load_visium_breast_cancer helper so the notebook is self-contained when downloaded. Uses scanpy.datasets.visium_sge for the data; pooch caches the ~100MB download across runs. Pre-executed and committed with outputs.
Lint: ruff-format was collapsing the fluent .pl chains onto single lines because they fit under 120 chars. Exclude *.ipynb from ruff-format; lint notebooks via nbqa-ruff only.

Execute drift: pooch's tqdm.notebook spawns Jupyter widgets whose UUIDs regenerate on every execution, breaking the diff-against-committed check. Add scripts/strip_widget_metadata.py and run it as both a pre-commit hook and a post-execute step in execute.yaml so committed and re-executed notebooks stay symmetric.

pre-commit.ci: add the ci: config block so PRs are auto-fixed by the bot (monthly autoupdate, autofix on every PR). Requires installing the pre-commit.ci GitHub App on the repo (one-time, repo settings -> Apps).

Preview URL: the github.io URL 301-redirected through the scverse-org-wide scverse.org CNAME, which made the displayed link confusingly different from the destination. Switch the comment body to the canonical https://scverse.org/<repo>/pr-N/gallery.html URL: same content, no redirect.
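The widget-stripping step can be sketched roughly as follows. This is a minimal illustration and not the contents of scripts/strip_widget_metadata.py: the function names are hypothetical, and it assumes the volatile data lives in the notebook-level `metadata.widgets` entry and in outputs carrying the `application/vnd.jupyter.widget-view+json` MIME type.

```python
import json
from pathlib import Path

# MIME type Jupyter uses for rendered widget views (they embed a per-run model id).
WIDGET_VIEW_MIME = "application/vnd.jupyter.widget-view+json"


def strip_widget_metadata(nb: dict) -> bool:
    """Remove widget state and widget-view outputs in place.

    Returns True if the notebook dict was modified.
    """
    changed = False

    # Notebook-level widget state is stored under metadata.widgets.
    if nb.get("metadata", {}).pop("widgets", None) is not None:
        changed = True

    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        kept = []
        for output in cell.get("outputs", []):
            data = output.get("data", {})
            if WIDGET_VIEW_MIME in data:
                # Drop only the widget representation; keep fallbacks such as text/plain.
                data.pop(WIDGET_VIEW_MIME)
                changed = True
                if not data:
                    continue  # nothing left to display, so drop the whole output
            kept.append(output)
        cell["outputs"] = kept
    return changed


def strip_file(path: Path) -> bool:
    """Hypothetical file-level wrapper: rewrite the notebook only if it changed."""
    nb = json.loads(path.read_text(encoding="utf-8"))
    changed = strip_widget_metadata(nb)
    if changed:
        path.write_text(json.dumps(nb, indent=1, ensure_ascii=False) + "\n", encoding="utf-8")
    return changed
```

Because the function is idempotent, running it from a pre-commit hook and again after execution leaves an already clean notebook untouched.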
The diff-against-committed step caught noise rather than regressions: matplotlib PNG bytes vary across machines (font hinting, dpi rendering), cell execution timestamps always differ, Python micro-version leaks into metadata.language_info. Two valid notebook executions on different machines never produce identical bytes, so the check failed on every schedule and most PRs.

The remaining test is the right one: `nbconvert --execute` raises on any cell error, which fails the job. Catches lib API breakage, syntax errors, and missing data, i.e., what we actually care about.

Drops nbdime from CI deps (no longer used). Widget-stripping stays in the pre-commit hook so committed notebooks remain clean.
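As a rough sketch of the remaining check, the snippet below wraps the `jupyter nbconvert --execute` command line in Python. The helper names are hypothetical and the actual workflow step in execute.yaml may be written differently; the sketch assumes `jupyter` is on the PATH and relies only on nbconvert exiting non-zero when a cell raises.

```python
import subprocess
from pathlib import Path


def build_execute_command(notebook: Path) -> list[str]:
    """Return the nbconvert invocation that re-executes a notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute",   # run every cell; a cell error makes nbconvert exit non-zero
        "--inplace",   # overwrite the input file with the executed result
        str(notebook),
    ]


def notebook_executes(notebook: Path) -> bool:
    """Run the notebook and report whether every cell completed without error."""
    result = subprocess.run(build_execute_command(notebook), check=False)
    return result.returncode == 0
```

A CI job could call `notebook_executes` for each changed notebook and fail if any call returns False, with no byte-level comparison against the committed file.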
Visium tutorial: rewrite to use squidpy.datasets.visium_hne_sdata which returns a ready SpatialData object. Removes ~50 lines of manual Image2DModel/ShapesModel/TableModel plumbing and the dependency on scanpy.datasets.visium_sge for downloads. Renames the file to visium_mouse_brain.ipynb to reflect what the dataset actually is (the squidpy fixture is mouse brain, not breast cancer). Updates color demos to mouse genes (Mbp for white-matter pattern) and the dataset's existing cluster annotations.

Both tutorials: replace the "Where to next" markdown sections with a "For reproducibility" cell using watermark to print version info for the relevant scientific stack. Adds watermark to the exec extras.

Strip em-dashes from all prose for consistency with the project's voice.
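To illustrate what a "For reproducibility" cell reports, here is a standard-library stand-in. It does not use the watermark package that the notebooks rely on; it only shows the same idea with `importlib.metadata`. The function name and the package list are assumptions chosen for illustration.

```python
import platform
from importlib.metadata import PackageNotFoundError, version


def collect_versions(packages: list[str]) -> dict[str, str]:
    """Map each distribution name to its installed version, plus the Python version."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            # Keep missing packages visible instead of failing the whole cell.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    # Assumed package list; adjust to whatever the notebook actually imports.
    stack = ["spatialdata", "spatialdata-plot", "squidpy", "matplotlib"]
    for name, ver in collect_versions(stack).items():
        print(f"{name:>18}: {ver}")
```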
… 6.0) nbqa 1.9.0 referenced ast.Str, which was removed in Python 3.12; pre-commit.ci runs on 3.12 and crashed on every nbqa-ruff invocation. 1.9.1 fixes that. Also rename id: ruff -> ruff-check (ruff-pre-commit deprecated the legacy alias in v0.10).
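For context on the `ast.Str` breakage: string literals have been represented as `ast.Constant` nodes since Python 3.8, and the old `ast.Str` alias was removed in 3.12. The sketch below shows the portable way to find string literals; it is an illustration only, not nbqa's actual code, and the function name is hypothetical.

```python
import ast


def string_literals(source: str) -> list[str]:
    """Collect every string literal in `source` without touching ast.Str."""
    tree = ast.parse(source)
    return [
        node.value
        for node in ast.walk(tree)
        # ast.Constant covers strings, numbers, bytes, None, etc., so filter by value type.
        if isinstance(node, ast.Constant) and isinstance(node.value, str)
    ]
```

Code that checks `isinstance(node, ast.Str)` raises `AttributeError` on Python 3.12 and later, which matches the crash described above.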
execute.yaml:
- Drop fetch-depth: 0 (no longer needed; tj-actions/changed-files fetches the right base internally).
- Add pip cache via setup-python (saves >1 min per PR run on the squidpy + scanpy install).
- Stable pooch cache key (was hashFiles('**/*.ipynb') which invalidated the dataset cache on every notebook edit; bump v* suffix when adding a new dataset URL).
- Replace hand-rolled `git diff | grep` notebook detection with tj-actions/changed-files. Cuts ~15 lines of brittle bash and handles rename/move events correctly.
- Drop the trailing "we do NOT diff" comment block; the rationale lives in commit history, not in the YAML.

preview.yaml: drop fetch-depth: 0 on the lib clone (only HEAD is built).

strip_widget_metadata.py: docstring referenced the removed diff check in execute.yaml; rewrite to the actual remaining justification (small committed notebooks, clean PR diffs).

.pre-commit-config.yaml: drop redundant `require_serial: false` (default value).
📖 Docs preview: https://scverse.org/spatialdata-plot-notebooks/pr-2/gallery.html
Built from 8e79eaa; redeployed on every push.
Tutorials are kept for entry-point material that teaches the API on synthetic data; examples/ hosts notebooks that demonstrate the API on real datasets you'd actually analyse. Visium fits the latter.
Summary
Adds tutorials/visium_breast_cancer.ipynb, the gallery's first real-data tutorial.

Synthesises the example progression curated by @asarigun in scverse/spatialdata-plot#590 into one coherent end-to-end notebook: load the 10x Visium Block A Section 1 dataset, render the H&E tissue, render the spots, overlay them, color by gene expression (ERBB2) and by categorical metadata (in_tissue), and finish with a publication-styled figure with thin white outlines.