[codex] Bundle cookbook example data and remove yfinance #729

Open
whenpoem wants to merge 2 commits into PyPortfolio:main from whenpoem:codex-bundle-cookbook-data-remove-yfinance

@whenpoem

Summary

  • add a packaged pypfopt.data module with loaders for bundled example prices, SPY prices, and Black-Litterman market caps
  • update the 5 cookbook notebooks to load bundled example data instead of downloading with yfinance
  • remove yfinance from the dev extra and add tests for the new data loaders
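
For illustration only, a loader of this kind could be sketched as below. The function names `parse_price_csv` and `load_bundled_prices`, the resource name passed to them, and the CSV layout (a `date` column followed by one column per ticker) are assumptions, not the actual `pypfopt.data` API added in this PR. The standard-library `csv` module stands in for whatever parser the real module uses.

```python
import csv
import io
from importlib import resources


def parse_price_csv(text):
    """Parse CSV text with a 'date' column and one column per ticker.

    Returns {ticker: {date: price}}; empty cells are skipped.
    The column layout is an assumption made for this sketch.
    """
    reader = csv.DictReader(io.StringIO(text))
    prices = {}
    for row in reader:
        date = row["date"]
        for ticker, value in row.items():
            # Skip the index column, overflow fields, and missing prices.
            if ticker is None or ticker == "date" or value in ("", None):
                continue
            prices.setdefault(ticker, {})[date] = float(value)
    return prices


def load_bundled_prices(package, resource):
    """Read a CSV shipped inside an installed package (hypothetical names)."""
    text = resources.files(package).joinpath(resource).read_text(encoding="utf-8")
    return parse_price_csv(text)


# Small in-memory sample so the sketch runs without any packaged file.
sample = "date,AAPL,MSFT\n2020-01-02,75.09,160.62\n2020-01-03,74.36,\n"
prices = parse_price_csv(sample)
print(sorted(prices))  # ['AAPL', 'MSFT']
```

Reading through `importlib.resources` rather than a path relative to `__file__` keeps the loader working when the package is installed from a wheel, which is the case the packaged CSV and JSON assets are meant to cover.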

Why

Issue #716 tracks the cookbook notebooks' dependency on live yfinance downloads. That makes notebook CI depend on external network calls and changing upstream data. Bundling the example datasets keeps the notebooks reproducible and allows nbmake to run offline.

Impact

  • cookbook notebooks now run against stable packaged example data
  • package distributions include the example CSV and JSON assets used by the notebooks
  • developers no longer need yfinance installed to run the notebook test suite

Validation

  • uv run --python .venv\Scripts\python.exe pytest ./tests
  • uv run --python .venv\Scripts\python.exe pytest --reruns 3 --nbmake --nbmake-timeout=3600 -vv cookbook
  • uv run --python .venv\Scripts\python.exe python -m build --sdist --wheel --outdir dist-check

whenpoem marked this pull request as ready for review April 27, 2026 03:15

atharvajoshi01 left a comment

Makes sense to bundle the data, same thing sklearn does with load_iris etc. Notebooks should just work now without needing a live yfinance connection. One thing I'd double check is whether the bundled CSVs match what the original yfinance calls were returning, just so the notebook outputs stay consistent.

@whenpoem
Author

whenpoem commented Apr 27, 2026

> Makes sense to bundle the data, same thing sklearn does with load_iris etc. Notebooks should just work now without needing a live yfinance connection. One thing I'd double check is whether the bundled CSVs match what the original yfinance calls were returning, just so the notebook outputs stay consistent.

Thanks for the careful check. The first version bundled data that let the notebooks run offline, but it did not fully preserve the original yfinance ticker universes and outputs, so I've pushed a follow-up commit that:

  • restores the original notebook ticker lists, constraints, and Black-Litterman views
  • adds bundled cookbook price data matching the original yfinance download calls
  • updates the SPY data and market-cap snapshot used by the Black-Litterman notebook
  • adds tests to ensure the bundled data covers all tickers required by the notebooks
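
A coverage test of the kind described in the last item could look roughly like this. The helper name `missing_tickers`, the ticker list, and the column names are hypothetical placeholders, not the tests or data actually added in the commit.

```python
def missing_tickers(required, available):
    """Return the required tickers absent from the bundled columns, sorted."""
    return sorted(set(required) - set(available))


# Hypothetical notebook ticker list and bundled CSV header.
notebook_tickers = ["AAPL", "MSFT", "JPM"]
bundled_columns = ["date", "AAPL", "JPM", "MSFT", "XOM"]

# The bundled data may contain extra tickers, but none may be missing.
assert missing_tickers(notebook_tickers, bundled_columns) == []
```

Returning the sorted list of missing names, rather than a bare boolean, makes a failing assertion message show exactly which tickers the bundled file lacks.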

I also re-ran the notebook suite and package checks:

  • pytest --nbmake --nbmake-timeout=3600 -q cookbook: 5 passed
  • pytest tests -q: 316 passed, 5 skipped
  • python -m build --sdist --wheel: successful

I additionally spot-checked the new bundled CSVs against fresh yfinance downloads with the same ticker/date contract, allowing only tiny floating-point/CSV round-trip differences (the maximum relative difference was around 2e-6).
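
A spot-check along these lines can be sketched as follows, assuming both datasets have already been loaded into dictionaries keyed by `(date, ticker)`. The function name, the sample values, and the `1e-5` tolerance are assumptions for illustration; only the roughly 2e-6 observed maximum comes from the check described above.

```python
import math


def max_relative_difference(bundled, fresh):
    """Largest relative gap between two {(date, ticker): price} mappings.

    Returns math.inf if a key is missing from `fresh` or if exactly one
    of the two values is NaN, so such mismatches cannot pass silently.
    """
    worst = 0.0
    for key, a in bundled.items():
        if key not in fresh:
            return math.inf
        b = fresh[key]
        if math.isnan(a) or math.isnan(b):
            if math.isnan(a) and math.isnan(b):
                continue  # both missing: treated as equal
            return math.inf
        scale = max(abs(a), abs(b))
        if scale == 0.0:
            continue  # both exactly zero
        worst = max(worst, abs(a - b) / scale)
    return worst


# Hypothetical values standing in for a bundled CSV and a fresh download.
bundled = {("2020-01-02", "AAPL"): 75.0875, ("2020-01-02", "MSFT"): 160.62}
fresh = {("2020-01-02", "AAPL"): 75.08751, ("2020-01-02", "MSFT"): 160.62}

diff = max_relative_difference(bundled, fresh)
assert diff < 1e-5  # tolerance chosen for this sketch only
```

Dividing by the larger magnitude keeps the measure symmetric in its two arguments, so it does not matter which dataset is treated as the reference.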

2 participants