Fix Python code comments mistaken for embedded notebook titles by cderv · Pull Request #14643 · quarto-dev/quarto-cli

cderv · 2026-07-01T13:23:45Z

When an embedded Jupyter notebook ({{< embed notebook.ipynb >}}) has no real markdown heading, a Python comment inside a code cell (e.g. # plt.savefig(...)) could get picked up as the notebook's title in the sidebar and "Source" button.

This supersedes #14639, which fixed the same issue by blocking title promotion in findTitle whenever any content preceded the heading. That patches the symptom at one call site and also drops legitimate titles for embeds that have prose before a real heading. If this merges, #14639 should be closed instead of merged — its earlier commits are included in this branch's history, but the later commits here replace its core mechanism.

Root Cause

The shared scanner markdownWithExtractedHeading (src/core/pandoc/pandoc-partition.ts) treated any #-prefixed line as an ATX heading, with no awareness of fenced code blocks. findTitle scans every cell's markdown, not just code cells, so this also reaches markdown cells, whose authored content can contain a fenced block nested in a list item.

Fix

Make the scanner itself fence-aware: track CommonMark ``` / ~~~ fence open/close state and skip heading detection while inside a fence. Fence markers are recognized indented up to 3 spaces, matching CommonMark's own limit. A real heading anywhere in the notebook is still promoted, same as before this bug — only a fenced line that merely looks like a heading is skipped. This fixes every caller of the scanner, not just the embed path.

Fixes #14577

…tle (#14577) When embedding a notebook with no YAML title and no markdown heading cell, the "Source" link and sidebar showed a Python comment (e.g. `# plt.savefig(...)`) instead of the notebook filename. findTitle() in jupyter-embed.ts scans every cell's markdown, including code cells whose markdown is their fenced source. partitionMarkdown is not fence-aware, so a `#` comment line inside the fence matches the ATX-heading regex and was returned as the title. The non-embed code path already guards against this: the Dec-2023 "Improve title snipping in notebooks" work (24672a4, fixes #5363/#6411) made fixupFrontMatter only promote a heading to a notebook title when no content precedes it. findTitle predates that change and was never brought in line, so it still promoted any heading-like line. Apply the same `!contentBeforeHeading` gate here by exposing contentBeforeHeading (already computed by markdownWithExtractedHeading) through PartitionedMarkdown. A fenced code block's opening delimiter always counts as content before the comment, so the comment is no longer mistaken for a title and the filename is used instead.

The teardowns called raw Deno.removeSync, which throws NotFound when a preview artifact is already absent (e.g. a prior teardown partially ran, or a sibling test cleaned a shared path). A throwing teardown aborts the test and leaves the fixture directory half-cleaned, so the next run trips over stale state. safeRemoveSync (deno_ral/fs.ts) tolerates already-removed paths and only rethrows genuine errors, keeping cleanup idempotent.

postRenderCleanup registered entries were removed with a non-recursive Deno.removeSync, so an entry could only be a single file. Embedded-notebook tests produce a `*_files` support directory alongside the preview HTML, which could not be declared for cleanup and was left behind. Use safeRemoveSync with recursive removal so a registered entry can be a directory as well as a file.

The regression test for the embedded-notebook title fix started as a bespoke testRender block in render-embed.test.ts with manual teardown of the preview artifacts. smoke-all is the established home for issue-regression tests and its framework already handles output cleanup, so move it there: a 14577.qmd that embeds notebook.ipynb and asserts the Source link uses the filename, not the Python comment, with the preview artifacts declared for postRenderCleanup.

These asserted pre-existing, unchanged extraction behavior plus a trivial pass-through (partitionMarkdown now forwards contentBeforeHeading, a value markdownWithExtractedHeading already computed). They pinned behavior rather than exercising the fix. The fix lives in findTitle and is covered end-to-end by the smoke-all regression test, so the unit additions added no real coverage.

Record why findTitle can't mirror fixupFrontMatter's markdown-cells-only guard: it runs on rendered JupyterCellOutput, which carries no cell_type, so the !contentBeforeHeading check is the available equivalent for excluding a code cell's fenced source.

Add a smoke-all test for an embedded notebook whose only heading has prose before it: the heading must not become the embed title, which falls back to the filename. This is the same rule fixupFrontMatter applies to a notebook's own front-matter title since 24672a4 (#5363/#6411); the embed title now aligns with it. Broaden the changelog entry to describe this general behavior, not just the code-comment case.

…hared gate The comment and changelog claimed findTitle now derives the title the same way fixupFrontMatter does. Only the content-before-heading gate is shared: findTitle still scans every cell (fixupFrontMatter inspects only the first markdown cell) and cannot filter on cell_type, so the two are not equivalent. State that the gate is borrowed and list the divergences, so the comparison is not read as a guarantee of identical behavior.

…t heading extraction is fence-aware

…itles

…tionMarkdown Prevents a future reader from removing the seemingly-unused field and breaking fixupFrontMatter's own gate, flagged in final branch review.

…f the code Both comments explained the fix's history and cited the issue number rather than something non-obvious about the code itself; that context belongs in the commit trail, not the source.

…Heading's other cases Prior tests exercised the ATX heading path only indirectly through partitionMarkdown, and the fence work this branch added only covered fenced-block cases. Setext headings, the no-heading case, and contentBeforeHeading's false branch had no direct test.

CommonMark allows fences indented up to 3 spaces, but every fence this scanner sees comes from code cell source, which always starts at column 0.

posit-snyk-bot · 2026-07-01T13:24:14Z

✅ Snyk checks have passed. No issues have been found so far.

Status	Scan Engine	Critical	High	Medium	Low	Total (0)
✅	Open Source Security	0	0	0	0	0 issues
✅	Licenses	0	0	0	0	0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

markdownWithExtractedHeading scanned every notebook cell, not just code cells. Code cells always emit fences at column 0, but markdown cells carry arbitrary user-authored content, where a fenced block nested in a list item can have its fence markers indented while a line inside stays at column 0. That line was previously invisible to fence tracking and could still be mistaken for a heading.

Distinct from the indented-open case already covered: opening fence at column 0, closing marker indented, followed by a real heading. Confirms the indented fence-close fix also resolves the case where a heading after such a fence would otherwise never be reached.

cderv · 2026-07-01T14:56:09Z

closes #14639

cderv added 16 commits June 30, 2026 18:14

fix(embed): make markdownWithExtractedHeading fence-aware

076dfe2

fix(embed): drop the contentBeforeHeading gate from findTitle now tha…

a0f4440

…t heading extraction is fence-aware

test(embed): assert prose-before-heading is still promoted in embed t…

dea92f0

…itles

docs(changelog): describe the fence-aware #14577 fix

c800228

docs(pandoc): note why contentBeforeHeading is not forwarded by parti…

283629d

…tionMarkdown Prevents a future reader from removing the seemingly-unused field and breaking fixupFrontMatter's own gate, flagged in final branch review.

refactor(embed): trim inline comments that restate the diff instead o…

2967a28

…f the code Both comments explained the fix's history and cited the issue number rather than something non-obvious about the code itself; that context belongs in the commit trail, not the source.

docs(pandoc): note indented fences are intentionally out of scope

6b309f2

CommonMark allows fences indented up to 3 spaces, but every fence this scanner sees comes from code cell source, which always starts at column 0.

cderv added 4 commits July 1, 2026 15:29

Remove unnecessary comments

9d9284a

Adapt changelog

1a430de

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix Python code comments mistaken for embedded notebook titles#14643

Fix Python code comments mistaken for embedded notebook titles#14643
cderv wants to merge 20 commits into
mainfrom
fix/issue-14577-option-b

cderv commented Jul 1, 2026 •

edited

Loading

Uh oh!

posit-snyk-bot commented Jul 1, 2026 •

edited

Loading

Uh oh!

cderv commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

cderv commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Root Cause

Fix

Uh oh!

posit-snyk-bot commented Jul 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Snyk checks have passed. No issues have been found so far.

Uh oh!

cderv commented Jul 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cderv commented Jul 1, 2026 •

edited

Loading

posit-snyk-bot commented Jul 1, 2026 •

edited

Loading