[linkcheck] Clear residual false positives in weekly lychee report #934
Merged
The weekly link checker (#933) flags 8 errors out of ~25k links, all false positives or harmless artifacts on non-content pages:

- IEEE Xplore returns "202 Accepted" (anti-bot) for a valid DOI cited in zreferences.html -> add 202 to --accept.
- genindex / search / prf-prf are auto-generated utility pages with no source notebook, so the theme's "Download Notebook" button points at a nonexistent _notebooks/<page>.ipynb and renders a second href="None" -> --exclude-path those three pages.
- A Journal of Derivatives DOI redirects into a login/paywall loop that exceeds max-redirects; the citation itself is valid -> --exclude it.

Configuration is kept inline in the workflow args (rather than a lychee.toml) because lychee runs against the gh-pages checkout, which does not contain repo-root config files.

Closes #933

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Pull request overview
This PR updates the scheduled Link Checker (lychee) GitHub Action configuration to eliminate recurring false positives in the weekly link-check report for the built gh-pages site, so that future non-zero reports better reflect real link issues.
Changes:
- Expands accepted HTTP status codes to include `202` to accommodate certain valid DOI targets.
- Excludes specific auto-generated utility pages (`genindex.html`, `search.html`, `prf-prf.html`) from link checking to avoid theme-generated "Download Notebook" artifacts.
- Excludes a specific DOI that redirects into a paywall/login loop and exceeds redirect limits.
📖 Netlify Preview Ready! Preview URL: https://pr-934--sunny-cactus-210e3e.netlify.app
lychee treats --exclude-path values as regular expressions, so the
unescaped dots in genindex.html / search.html / prf-prf.html were regex
wildcards and the patterns were unanchored. Escape the dot and anchor the
end ('<name>\.html$') so each matches only the intended generated page.
Addresses Copilot review feedback on #934.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
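
The difference between the loose and the escaped, anchored pattern can be illustrated with a minimal sketch. It uses Python's `re` module as a stand-in for lychee's own regex engine, and it assumes the pattern is searched anywhere in the path rather than matched against the whole string. The two extra page names are made up for illustration and do not exist in the site.

```python
import re

# Page paths for illustration; only genindex.html comes from the PR,
# the other two are hypothetical names chosen to trigger a false match.
paths = [
    "genindex.html",
    "genindex_html_notes.html",
    "genindexXhtml/intro.html",
]

loose = re.compile(r"genindex.html")     # '.' matches any character, no anchor
strict = re.compile(r"genindex\.html$")  # literal dot, anchored at the end


def excluded(pattern, path):
    # Assumption: a path is excluded when the pattern matches anywhere in it.
    return pattern.search(path) is not None


loose_hits = [p for p in paths if excluded(loose, p)]
strict_hits = [p for p in paths if excluded(strict, p)]

print("loose :", loose_hits)
print("strict:", strict_hits)
```

With the loose pattern all three paths are excluded, while the strict pattern excludes only `genindex.html`. The strict pattern is anchored at the end only, so a path that merely ends in `genindex.html` would still match.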
mmcky added a commit that referenced this pull request Jun 26, 2026
* [linkcheck] Clear residual false positives in weekly lychee report

The weekly link checker (#933) flags 8 errors out of ~25k links, all false positives or harmless artifacts on non-content pages:

- IEEE Xplore returns "202 Accepted" (anti-bot) for a valid DOI cited in zreferences.html -> add 202 to --accept.
- genindex / search / prf-prf are auto-generated utility pages with no source notebook, so the theme's "Download Notebook" button points at a nonexistent _notebooks/<page>.ipynb and renders a second href="None" -> --exclude-path those three pages.
- A Journal of Derivatives DOI redirects into a login/paywall loop that exceeds max-redirects; the citation itself is valid -> --exclude it.

Configuration is kept inline in the workflow args (rather than a lychee.toml) because lychee runs against the gh-pages checkout, which does not contain repo-root config files.

Closes #933

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* [linkcheck] Escape and anchor --exclude-path regexes

lychee treats --exclude-path values as regular expressions, so the unescaped dots in genindex.html / search.html / prf-prf.html were regex wildcards and the patterns were unanchored. Escape the dot and anchor the end ('<name>\.html$') so each matches only the intended generated page.

Addresses Copilot review feedback on #934.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* Force on-demand GPU runners (spot=false)

AWS spot reclamation has been interrupting the g4dn.2xlarge GPU notebook builds mid-run, discarding the whole build. Add spot=false to the RunsOn runner spec in all four GPU workflows (cache, ci, collab, publish) so they run on on-demand instances.

Rolls out the org-wide decision in QuantEcon/meta#330; mirrors QuantEcon/lecture-jax#327.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Summary
The weekly Link Checker action (lychee) filed #933 with 8 errors out of 24,997 links checked (24,934 OK). I verified every one against the live `python.quantecon.org` site and the live URLs: none is a broken link in any lecture. They are all false positives or harmless theme artifacts on non-content pages, in two groups, both cleared here by tightening the lychee flags in `.github/workflows/linkcheck.yml`.

For context, the previous report (#906) had 241 errors that were fixed by adding `--root-dir`; this PR clears the residual 8 so future weekly reports come back green and any non-empty report becomes a real signal rather than recurring noise.

The 8 errors and how each is handled

| Page | Flagged link | Cause | Fix |
|---|---|---|---|
| `zreferences.html` | `10.1109/TAC.1977.1101561` → `[202]` | IEEE Xplore returns `202 Accepted` (anti-bot) for a valid link | Add `202` to `--accept` |
| `zreferences.html` | `10.3905/jod.2012.20.1.038` → `[302]` | DOI redirects into a login/paywall loop that exceeds max-redirects; the citation itself is valid | `--exclude` that DOI |
| `genindex.html` | `_notebooks/genindex.ipynb` + `None` | Auto-generated page with no source notebook; the "Download Notebook" button points at a nonexistent `.ipynb` and renders a second `href="None"` | `--exclude-path` |
| `search.html` | `_notebooks/search.ipynb` + `None` | Same as `genindex.html` | `--exclude-path` |
| `prf-prf.html` | `_notebooks/prf-prf.ipynb` + `None` | Same as `genindex.html` | `--exclude-path` |

Why inline `args` instead of a `lychee.toml`

The workflow checks out the built `gh-pages` site into the workspace and runs lychee there, so a `lychee.toml` committed to `main` would not be present in that checkout and would be silently ignored. (Worth a heads-up: `lecture-python-advanced.myst` keeps its `lychee.toml` on `main` while lychee runs against `gh-pages`, so that config may currently be inert, which is likely why its reports stay open. Happy to file a follow-up there.) Keeping the flags inline in the workflow `args`, which are read from the default branch at trigger time independently of the gh-pages workspace, guarantees they are applied, consistent with the existing `--root-dir` / `--accept` flags.

Verification / caveat

The link check only runs on a weekly schedule (or manual `workflow_dispatch`), so it cannot be exercised by ordinary PR CI. I validated that the workflow YAML parses and that the folded `args` collapse to the intended single command line. The change is a strict superset of the current working flags (adds `202` to accept plus the excludes), so it cannot regress existing coverage. Suggest a manual "Run workflow" dispatch after merge to confirm a clean, zero-error report.

Closes #933
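
The folded-`args` check mentioned under Verification could be approximated locally along the following lines. This is only a sketch: it emulates YAML folding by joining non-blank lines with single spaces instead of parsing the real workflow file, and it assumes the resulting string is split shell-style. The flag names are real lychee options, but the `--accept` list apart from `202`, the `--root-dir` value, the exact form of the `--exclude` pattern, and the trailing glob are placeholders, not the values in `linkcheck.yml`.

```python
import shlex

# Stand-in for the folded `args` block. Placeholder values are marked;
# the three page patterns and the DOI come from the PR text.
folded_args = r"""
--accept 200,202
--root-dir /tmp/site
--exclude-path 'genindex\.html$'
--exclude-path 'search\.html$'
--exclude-path 'prf-prf\.html$'
--exclude '10\.3905/jod\.2012\.20\.1\.038'
'**/*.html'
"""
# Placeholders above: "200" in --accept, /tmp/site, the --exclude regex form,
# and the '**/*.html' input glob.


def collapse(block):
    # Emulates a YAML folded scalar: equally indented non-blank lines are
    # joined with single spaces into one line.
    return " ".join(line.strip() for line in block.splitlines() if line.strip())


command_line = collapse(folded_args)
tokens = shlex.split(command_line)  # shell-style split; quotes are removed

exclude_paths = [
    tokens[i + 1] for i, tok in enumerate(tokens) if tok == "--exclude-path"
]
accept_codes = {int(c) for c in tokens[tokens.index("--accept") + 1].split(",")}

print(command_line)
print(exclude_paths)
print(sorted(accept_codes))
```

Inspecting `exclude_paths` this way makes it easy to confirm that the backslashes and the `$` anchor survive quoting before the workflow is dispatched manually.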
🤖 Generated with Claude Code