
[linkcheck] Clear residual false positives in weekly lychee report #934

Merged
mmcky merged 2 commits into main from linkcheck-clear-933-false-positives
Jun 26, 2026

mmcky (Contributor) commented Jun 26, 2026

Summary

The weekly Link Checker action (lychee) filed #933 with 8 errors out of 24,997 links checked (24,934 OK). I verified every one against the live python.quantecon.org site and the live URLs — none is a broken link in any lecture. They are all false positives or harmless theme artifacts on non-content pages, in two groups, both cleared here by tightening the lychee flags in .github/workflows/linkcheck.yml.

For context, the previous report (#906) had 241 errors that were fixed by adding --root-dir; this PR clears the residual 8 so future weekly reports come back green and any non-empty report becomes a real signal rather than recurring noise.

The 8 errors and how each is handled

| Page | Reported | Reality (verified) | Fix |
| --- | --- | --- | --- |
| zreferences.html | 10.1109/TAC.1977.1101561 [202] | Resolves to IEEE Xplore, which returns 202 Accepted (anti-bot) for a valid link | add 202 to --accept |
| zreferences.html | 10.3905/jod.2012.20.1.038 [302] | Journal of Derivatives DOI redirects into a login/paywall loop (exceeds max-redirects); citation is valid | --exclude that DOI |
| genindex.html | _notebooks/genindex.ipynb + None | Auto-generated index page with no source notebook, so the theme's "Download Notebook" button points at a nonexistent .ipynb and renders a second href="None" | --exclude-path |
| search.html | _notebooks/search.ipynb + None | Auto-generated search page; same as above | --exclude-path |
| prf-prf.html | _notebooks/prf-prf.ipynb + None | Auto-generated proof index page; same as above | --exclude-path |
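
As a rough illustration of how the three fixes in the table combine, here is a minimal Python model. It is a sketch, not lychee's implementation: the `Report` type, the `is_suppressed` helper, the `https://doi.org/` prefix, the baseline accept code 200, and the exact regex strings are assumptions made for the example. Only the page names, DOIs, and status codes come from the table.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Report:
    page: str               # built page on which the link was found
    target: str             # link target as reported
    status: Optional[int]   # HTTP status, or None when no status was reported


# Hypothetical stand-ins for the workflow flags; exact values are assumptions.
ACCEPT = {200, 202}                                               # --accept
EXCLUDE_URL = [re.compile(r"10\.3905/jod\.2012\.20\.1\.038")]     # --exclude
EXCLUDE_PATH = [re.compile(p) for p in (                          # --exclude-path
    r"genindex\.html$", r"search\.html$", r"prf-prf\.html$")]


def is_suppressed(report: Report) -> bool:
    """True if the modeled flags would keep this report out of the error list."""
    if any(p.search(report.page) for p in EXCLUDE_PATH):
        return True  # the whole page is skipped
    if any(p.search(report.target) for p in EXCLUDE_URL):
        return True  # the link is never checked
    return report.status in ACCEPT  # the response counts as success


def generated_page(name: str) -> List[Report]:
    """The two artifacts reported for each auto-generated page."""
    page = f"{name}.html"
    return [Report(page, f"_notebooks/{name}.ipynb", None),
            Report(page, "None", None)]


REPORTS = [
    Report("zreferences.html", "https://doi.org/10.1109/TAC.1977.1101561", 202),
    Report("zreferences.html", "https://doi.org/10.3905/jod.2012.20.1.038", 302),
    *generated_page("genindex"),
    *generated_page("search"),
    *generated_page("prf-prf"),
]

remaining = [r for r in REPORTS if not is_suppressed(r)]
print(f"{len(REPORTS)} reported, {len(remaining)} remaining")
```

In this model a genuinely broken link on a lecture page (for example a 404) is still reported, which is the property the PR is after.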

Why inline args instead of a lychee.toml

The workflow checks out the built gh-pages site into the workspace and runs lychee there, so a lychee.toml committed to main would not be present in that checkout and would be silently ignored. (Worth a heads-up: lecture-python-advanced.myst keeps its lychee.toml on main while lychee runs against gh-pages, so that config may currently be inert — likely why its reports stay open. Happy to file a follow-up there.) Keeping the flags inline in the workflow args — which are read from the default branch at trigger time, independent of the gh-pages workspace — guarantees they are applied, consistent with the existing --root-dir / --accept flags.

Verification / caveat

The link check runs only on a weekly schedule (or a manual workflow_dispatch), so ordinary PR CI cannot exercise it. I validated that the workflow YAML parses and that the folded args collapse to the intended single command line. The change keeps every existing flag and only adds to them (202 in --accept, plus the excludes), so nothing that passes today can start failing. I suggest a manual Run workflow dispatch after merge to confirm a clean, zero-error report.
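
The "folded args collapse to a single command line" check can be approximated locally. The sketch below rests on stated assumptions: `ARGS_BLOCK` is a hypothetical stand-in for the workflow's folded args value (the real flag values are not shown on this page), `fold` only approximates YAML folding for the simple case of equally indented, non-blank lines, and shell-style tokenization via `shlex` is assumed rather than confirmed for the action.

```python
import shlex

# Hypothetical folded args block; paths and values are placeholders.
ARGS_BLOCK = r"""
  --root-dir /hypothetical/site
  --accept 200,202
  --exclude '10\.3905/jod\.2012\.20\.1\.038'
  --exclude-path 'genindex\.html$'
  --exclude-path 'search\.html$'
  --exclude-path 'prf-prf\.html$'
"""


def fold(block: str) -> str:
    """Join non-blank lines with single spaces (simple-case approximation of YAML folding)."""
    return " ".join(line.strip() for line in block.splitlines() if line.strip())


command_line = fold(ARGS_BLOCK)
tokens = shlex.split(command_line)  # single quotes keep backslashes and '$' literal

# Every flag should be followed by exactly one value.
flags = tokens[0::2]
values = tokens[1::2]
for flag, value in zip(flags, values):
    print(f"{flag:15} {value}")
```

Printing flag/value pairs makes it easy to spot a line that folded into its neighbor or a regex whose quoting was lost.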

Closes #933

🤖 Generated with Claude Code

The weekly link checker (#933) flags 8 errors out of ~25k links, all
false positives or harmless artifacts on non-content pages:

- IEEE Xplore returns "202 Accepted" (anti-bot) for a valid DOI cited in
  zreferences.html -> add 202 to --accept.
- genindex / search / prf-prf are auto-generated utility pages with no
  source notebook, so the theme's "Download Notebook" button points at a
  nonexistent _notebooks/<page>.ipynb and renders a second href="None"
  -> --exclude-path those three pages.
- A Journal of Derivatives DOI redirects into a login/paywall loop that
  exceeds max-redirects; the citation itself is valid -> --exclude it.

Configuration is kept inline in the workflow args (rather than a
lychee.toml) because lychee runs against the gh-pages checkout, which
does not contain repo-root config files.

Closes #933

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 26, 2026 05:32

Copilot AI left a comment


Pull request overview

This PR updates the scheduled Link Checker (lychee) GitHub Action configuration to eliminate recurring false positives in the weekly link-check report for the built gh-pages site, so that future non-zero reports better reflect real link issues.

Changes:

  • Expands accepted HTTP status codes to include 202 to accommodate certain valid DOI targets.
  • Excludes specific auto-generated utility pages (genindex.html, search.html, prf-prf.html) from link checking to avoid theme-generated “Download Notebook” artifacts.
  • Excludes a specific DOI that redirects into a paywall/login loop and exceeds redirect limits.

Comment thread .github/workflows/linkcheck.yml Outdated
github-actions Bot commented Jun 26, 2026

lychee treats --exclude-path values as regular expressions, so the
unescaped dots in genindex.html / search.html / prf-prf.html were regex
wildcards and the patterns were unanchored. Escape the dot and anchor the
end ('<name>\.html$') so each matches only the intended generated page.

Addresses Copilot review feedback on #934.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
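
The difference between the original and the escaped, end-anchored patterns can be seen with Python's `re` module standing in for lychee's regex engine. This is an approximation (lychee is written in Rust), though patterns this simple behave the same way in both. The file paths are made-up examples.

```python
import re

loose = re.compile(r"genindex.html")      # unescaped dot, unanchored
strict = re.compile(r"genindex\.html$")   # literal dot, anchored at the end

# Hypothetical paths: the first two are intended matches, the last two are not.
paths = [
    "genindex.html",
    "docs/genindex.html",
    "genindex-html-notes.html",   # '.' in the loose pattern matches '-'
    "genindex.html.bak",          # loose pattern matches a prefix of the name
]

results = {p: (bool(loose.search(p)), bool(strict.search(p))) for p in paths}
for path, (loose_hit, strict_hit) in results.items():
    print(f"{path:28} loose={loose_hit!s:5} strict={strict_hit}")
```

The strict pattern is anchored only at the end, so a name such as `mygenindex.html` would still match; that is harmless here only if no such page exists in the built site.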
@mmcky mmcky merged commit 63e137c into main Jun 26, 2026
1 check passed
@mmcky mmcky deleted the linkcheck-clear-933-false-positives branch June 26, 2026 06:58
mmcky added a commit that referenced this pull request Jun 26, 2026
* [linkcheck] Clear residual false positives in weekly lychee report

* [linkcheck] Escape and anchor --exclude-path regexes

* Force on-demand GPU runners (spot=false)

AWS spot reclamation has been interrupting the g4dn.2xlarge GPU notebook
builds mid-run, discarding the whole build. Add spot=false to the RunsOn
runner spec in all four GPU workflows (cache, ci, collab, publish) so they
run on on-demand instances.

Rolls out the org-wide decision in QuantEcon/meta#330; mirrors
QuantEcon/lecture-jax#327.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
