Conversation
Fixes pypa#1998

Currently, linkcheck is blocking PRs if any link in the project isn't accessible. Due to the high number of links, this regularly blocks unrelated PRs (see pypa#2018 for a recent example).

This PR moves linkcheck to a separate, non-blocking check. It will still run on PRs and report a failure if a newly added link 404s (pypa#1998 (comment)), but an unrelated failure won't block PRs anymore. We also run the check on a daily cron, so that links going away are surfaced separately from PRs.
I'd rather block PRs on newly added broken links. How can we detect this? Maybe store the Sphinx builder's structured output as an artifact and exclude those links that existed prior to PRs?
I'm not aware of any tool or function that would support only blocking on new URLs, but we can add it separately. This PR is about not blocking otherwise-working PRs on a broken linkcheck, which affects things such as landing changes from accepted PEPs.
My 0.02c is that we can rely on maintainer discretion here -- maintainers should still review the linkcheck job when it fails (especially when a PR has new links being added), but can override it when it fails for transient reasons (like these recent kinds of cert issues).
Yes. But my concern is the additional maintenance burden. When a job isn't marked as required, contributors may be tempted to take the easy way out and rely on the maintainer to tell them that it actually should pass, as opposed to the UI showing them that they need to look into it, which may happen w/o the maintainer's involvement.
So linkcheck produces JSONL like this: https://github.com/ansible/awx-plugins/actions/runs/24041509767#summary-70114297278. If we have that pre-downloaded into the job from artifacts/cache, we could diff the current run's output against it and only fail on newly broken links.
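A minimal sketch of that comparison, assuming Sphinx linkcheck's `output.json` is JSONL with `uri` and `status` fields per entry (as in the linked run); the function names and file paths here are placeholders, not anything this repo already has:

```python
import json


def broken_uris(path):
    """Collect the URIs reported as "broken" in a linkcheck output.json (JSONL)."""
    uris = set()
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            entry = json.loads(line)
            if entry.get("status") == "broken":
                uris.add(entry["uri"])
    return uris


def newly_broken(previous_path, current_path):
    """URIs broken in the current run that were not broken in the previous one."""
    return broken_uris(current_path) - broken_uris(previous_path)
```

A PR job could then download the `main` run's `output.json` as `previous_path` and fail only when `newly_broken(...)` is non-empty, leaving pre-existing link rot to the cron run.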
As one of the other reviewers, I don't think pushing the burden of mitigating the risk of invalid links back to human reviewers is a good idea. Another potential source of inspiration for ways to keep link rot from blocking unrelated PRs would be diff cover.
The linkcheck still runs, and it is still reported. What this PR changes is that it stops blocking unrelated PRs in a way that can't be mitigated, as the check becomes optional instead of mandatory. The current situation is that all PEP authors and spec maintainers are randomly affected by this.

If we specifically want to ensure that linked content is still available, we should archive it in the Internet Archive and provide an archived link next to the live link, as e.g. Wikipedia does: https://en.wikipedia.org/wiki/Python_(programming_language)#References
In my previous comment, I offered an idea of exactly how to mitigate this case.
I can offer this PR as an improvement, because I'm trying to improve specs maintenance and this is a frequent problem, but implementing new linkcheck logic here is out of scope for me.
I agree with @konstin here -- I think in principle it's good for us to block on linkcheck failures, but in practice it's exactly the kind of maintenance burden we're discussing trying to avoid now.

Speaking personally, the stochastic linkcheck failures are a major drag on my willingness to contribute to this repo; they make every PR take longer (as in days-to-weeks instead of hours-to-days) than it should, and they act as a forced "context switch" whenever they pop up (making me go from doing whatever I was working on to figuring out why a specific link is now dead).

I think this is especially unfortunate because most of the links in this repo probably aren't being clicked by users; most of the breakage happens with what I suspect are our "coldest" links (old SourceForge pages, etc.) rather than the "hottest" ones (PEPs, GitHub, etc.).

TL;DR: I think we should evaluate another automated solution here (@ncoghlan's idea of doing it by diff cover makes a lot of sense to me!), but in the meantime I'm in favor of making linkcheck a soft-only failure on PRs. (If it helps convince anyone else, I'm willing to fix links when they fail on main -- I don't mind doing that at all! It's primarily the context switch in the middle of a PR that's frustrating.)
@woodruffw if you're willing to take on the burden, I guess we could do this. However, can anyone explain to me why the linkcheck ignore idea doesn't seem to be considered? I don't understand if I'm missing anything. Additionally, I think that having a workflow copy is the wrong way of implementing this.
Yeah, sorry if I was unclear @webknjaz -- your approach also sounds good to me! I'm in favor of just about anything that would automate this further; my main point with this PR in particular was that we shouldn't block it on a more general solution (since broken links are a current and ongoing maintenance drag).
📚 Documentation preview 📚: https://python-packaging-user-guide--2035.org.readthedocs.build/en/2035/