Migrate to archive.sparkpost.com with host-conditional noindex by PauloJeunonSousa · Pull Request #855 · SparkPost/support-docs

PauloJeunonSousa · 2026-06-25T14:28:07Z

Summary

Migrate the docs site to archive.sparkpost.com so the existing support.sparkpost.com hostname is free to be repointed at the new CloudFront redirect distribution (messagebird-dev/bird#2382, merged) that 301s every legacy URL to its closest bird.com counterpart. End goal: consolidate SEO link equity onto bird.com instead of fragmenting between two sites.

This PR is half of a two-part change. The other half is the CloudFront distribution (bird#2382). For SEO the two halves are coupled: landing the noindex changes alone (or far ahead of CloudFront) would deindex support.sparkpost.com while it still has all its inbound Google traffic, losing link equity that would otherwise transfer to bird.com. The edge function added here exists specifically to bridge that gap, so the two pieces don't have to land simultaneously.

Changes

1. Base URL change + noindex (for `archive.sparkpost.com`)

The new archive hostname needs to be configured everywhere and emit clear noindex signals so search engines drop it (and stop competing with bird.com).

- `next-sitemap.js`: default `siteUrl` → `https://archive.sparkpost.com`; robots.txt generation adds `Disallow: /` to the `*` policy.
- `components/site/seo.tsx`: `<meta name="robots">` content → `noindex, nofollow` (was `index, follow, max-image-preview:large, ...`).
- `netlify.toml`: add `X-Robots-Tag: noindex, nofollow` to the global `[[headers]]` block; CSP `font-src` `support.sparkpost.com` → `archive.sparkpost.com` (the prior value was hardcoded but functionally unused).
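As a rough illustration of the robots.txt part of this change, the sketch below shows the shape such a configuration might take. `siteUrl`, `generateRobotsTxt`, and `robotsTxtOptions.policies` are next-sitemap options; the local `SitemapConfig` interface, the `buildSitemapConfig` helper, and the idea of an environment override are assumptions made for illustration and are not taken from the repository (the real file is plain JavaScript).

```typescript
// Hypothetical, typed stand-in for the relevant part of next-sitemap.js.
interface RobotsPolicy {
  userAgent: string;
  allow?: string | string[];
  disallow?: string | string[];
}

interface SitemapConfig {
  siteUrl: string;
  generateRobotsTxt: boolean;
  robotsTxtOptions: { policies: RobotsPolicy[] };
}

const DEFAULT_SITE_URL = "https://archive.sparkpost.com";

// envSiteUrl models an optional override (assumed; for example a value read
// from an environment variable). When absent, the archive hostname is used.
function buildSitemapConfig(envSiteUrl?: string): SitemapConfig {
  return {
    siteUrl: envSiteUrl || DEFAULT_SITE_URL,
    generateRobotsTxt: true,
    robotsTxtOptions: {
      // A wildcard policy with "Disallow: /" tells all crawlers to stay out.
      policies: [{ userAgent: "*", disallow: "/" }],
    },
  };
}
```

Note that this only covers the build-time robots.txt; the meta tag and the response header are emitted by `seo.tsx` and `netlify.toml` respectively.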

2. Host-conditional noindex (preserves `support.sparkpost.com` indexability during the gap)

Without this, the changes above would apply equally to both hostnames the Netlify deploy serves — including the current production support.sparkpost.com. That's incorrect during the period between this PR landing and the CloudFront cutover: support.sparkpost.com still has all its Google traffic, and deindexing it before the 301s exist loses equity rather than transferring it.

netlify/edge-functions/host-conditional-noindex.ts (new) — a Netlify Edge Function that runs on every request and, for Host: support.sparkpost.com only:
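- overrides X-Robots-Tag with `all` on response headers (a plain delete does not survive, because Netlify re-applies netlify.toml headers after the function returns)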
- strips <meta name="robots" content="noindex,..."> from HTML responses
- overrides /robots.txt with a permissive version (User-agent: *, no Disallow)

archive.sparkpost.com and all other hosts (deploy previews, *.netlify.app, etc.) pass through unmodified — noindex stays in place.
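The host branching and HTML rewrite described above could look roughly like the following. This is a minimal sketch of the decision logic only, not the contents of `host-conditional-noindex.ts`: the names `chooseAction`, `stripNoindexMeta`, and `PERMISSIVE_ROBOTS_TXT`, the exact robots.txt body, and the regular expression are all assumptions made for illustration.

```typescript
const SUPPORT_HOST = "support.sparkpost.com";

// Permissive robots.txt: a wildcard group with no Disallow line (assumed body).
const PERMISSIVE_ROBOTS_TXT = "User-agent: *\nAllow: /\n";

// Illustrative pattern: a robots <meta> tag (name attribute first) whose
// attributes include the word "noindex".
const NOINDEX_META = /<meta\s+name=["']robots["'][^>]*\bnoindex\b[^>]*>/gi;

type Action = "pass-through" | "serve-robots-txt" | "rewrite-response";

// Compares the Host header case-insensitively and ignores an optional port.
function isSupportHost(hostHeader: string | null): boolean {
  if (!hostHeader) return false;
  const hostname = hostHeader.toLowerCase().split(":")[0];
  return hostname === SUPPORT_HOST;
}

// Only the legacy production host is treated specially; every other host
// (archive, deploy previews, *.netlify.app) keeps the built-in noindex.
function chooseAction(hostHeader: string | null, pathname: string): Action {
  if (!isSupportHost(hostHeader)) return "pass-through";
  return pathname === "/robots.txt" ? "serve-robots-txt" : "rewrite-response";
}

// Removes noindex robots meta tags and leaves all other markup untouched.
function stripNoindexMeta(html: string): string {
  return html.replace(NOINDEX_META, "");
}
```

Keeping the decision in small pure functions makes the host × path matrix easy to check without a running deploy; a real edge function would call them from its request handler and use the result to decide whether to pass the request through unchanged.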

After CloudFront cutover (out of scope here)

Once support.sparkpost.com DNS flips to the CloudFront distribution, Netlify never sees those requests again — every viewer hits CloudFront and gets 301'd to bird.com. At that point the edge function is dead code on the Netlify side. Remove it (and the netlify.toml comment pointing at it) in a small follow-up PR.

Test plan

After deploy preview is up:

- `curl -I https://archive.sparkpost.com/docs/` → response includes `X-Robots-Tag: noindex, nofollow`
- `curl -I https://support.sparkpost.com/docs/` → response does NOT include a noindex `X-Robots-Tag` (the edge function sets `X-Robots-Tag: all`)
- `curl -s https://archive.sparkpost.com/robots.txt` → contains `Disallow: /`
- `curl -s https://support.sparkpost.com/robots.txt` → permissive (`User-agent: *`), no Disallow
- View source on https://archive.sparkpost.com/docs/... → contains `<meta name="robots" content="noindex, nofollow">`
- View source on https://support.sparkpost.com/docs/... → does NOT contain that meta tag
- Cypress suite passes in CI (no test asserts on the old robots meta string)
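The manual checks above all ask the same question per host: does any of the three layers still tell crawlers to stay away? A small evaluator along these lines could encode that question. This is a sketch under assumptions: the `Observed` shape and the `hasNoindexSignal` name are hypothetical, and the inputs would have to be collected separately (for example from the curl output).

```typescript
// What was observed for one host (hypothetical shape).
interface Observed {
  xRobotsTag: string | null; // value of the X-Robots-Tag response header, if any
  robotsTxt: string;         // body of /robots.txt
  html: string;              // body of a docs page
}

// True if the header, robots.txt, or the robots meta tag blocks indexing.
function hasNoindexSignal(o: Observed): boolean {
  const headerBlocks = (o.xRobotsTag ?? "").toLowerCase().includes("noindex");
  // Only a site-wide "Disallow: /" line counts; path-specific rules do not.
  const robotsBlocks = /^[ \t]*disallow:[ \t]*\/[ \t]*$/im.test(o.robotsTxt);
  const metaBlocks = /<meta\s+name=["']robots["'][^>]*\bnoindex\b/i.test(o.html);
  return headerBlocks || robotsBlocks || metaBlocks;
}
```

The expectation from the test plan is then that the evaluator returns `true` for archive.sparkpost.com and `false` for support.sparkpost.com.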

🤖 Generated with Claude Code

Note

Medium Risk
SEO and crawl behavior depend on correct Host-header branching in the edge function; a bug could deindex support.sparkpost.com or leave the archive indexed.

Overview
Repoints the docs site’s default canonical URL to archive.sparkpost.com and applies noindex everywhere the build emits SEO signals: robots meta in seo.tsx, global X-Robots-Tag in netlify.toml, Disallow: / in generated robots.txt via next-sitemap.js, plus CSP font-src updated from support.sparkpost.com to archive.sparkpost.com.

Because one Netlify deploy still serves support.sparkpost.com until CloudFront/DNS cutover, a new Netlify edge function (host-conditional-noindex.ts) runs only for that host: it serves a permissive /robots.txt, sets X-Robots-Tag: all, and strips the noindex robots <meta> from HTML so production support URLs stay crawlable and don’t lose equity before 301s to bird.com. archive.sparkpost.com and other hosts pass through with noindex unchanged.

tsconfig.json excludes the netlify/ folder so edge TypeScript isn’t typechecked with the Next app.

Reviewed by Cursor Bugbot for commit 54d49ea. Bugbot is set up for automated code reviews on this repo.

Moves the docs site from support.sparkpost.com to archive.sparkpost.com so the content can be archived (and later 301'd to bird.com) without competing with bird.com in search. Adds noindex/nofollow at three layers (robots.txt, <meta name="robots">, X-Robots-Tag header) so search engines drop the archive.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

netlify · 2026-06-25T14:28:14Z

✅ Deploy Preview for support-docs ready!

| Name | Link |
|---|---|
| 🔨 Latest commit | `54d49ea` |
| 🔍 Latest deploy log | https://app.netlify.com/projects/support-docs/deploys/6a3ee4d598178f000863e2d1 |
| 😎 Deploy Preview | https://deploy-preview-855--support-docs.netlify.app |

The base-URL + noindex change in this branch applies to every hostname this deploy serves. During the gap between this PR landing and the CloudFront redirect distribution going live for support.sparkpost.com, that's incorrect for SEO: archive.sparkpost.com SHOULD be noindex (it's the archived copy), but support.sparkpost.com SHOULD still be indexed so its existing link equity is preserved until CloudFront can 301 it to bird.com counterparts.

If support is noindex'd during the gap, Google can't see the eventual 301s (robots.txt Disallow blocks re-crawl), so equity that would otherwise transfer to bird.com is lost instead. The longer the gap, the more equity decays.

Add a Netlify Edge Function that runs on every request and, for requests with Host: support.sparkpost.com:
- serves a permissive /robots.txt (no Disallow: /)
- strips X-Robots-Tag from response headers
- strips the <meta name="robots" content="noindex,..."> tag from HTML

archive.sparkpost.com and all other hosts (deploy previews, *.netlify.app) pass through unmodified; noindex stays in place.

REMOVE this function (and the netlify.toml comment pointing at it) once the CloudFront cutover is complete. After that, support.sparkpost.com no longer hits Netlify, so the hostname-conditional logic becomes dead code.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 56fa499.

Discovered via local netlify dev testing: `response.headers.delete('X-Robots-Tag')` is silently ignored when the header is set in netlify.toml's [[headers]] block. Netlify re-applies the netlify.toml headers after the edge function returns, overriding any deletions, but it respects values the function explicitly sets.

Switch from delete() to set('X-Robots-Tag', 'all'). Google treats X-Robots-Tag: all as equivalent to no header (canonical "ignore any prior noindex; index normally"), so the effect is identical to the intended deletion. Other header sets (X-Edge-Probe sentinel during testing) confirmed mutations survive; only deletes of netlify.toml-sourced headers are eaten.

Verified locally with netlify dev across the full host × content-type matrix:
- archive.sparkpost.com HTML: X-Robots-Tag: noindex, nofollow + meta intact ✓
- support.sparkpost.com HTML: X-Robots-Tag: all + meta stripped ✓
- archive.sparkpost.com /robots.txt: Disallow: / served as built ✓
- support.sparkpost.com /robots.txt: permissive (User-agent: *) ✓
- support.sparkpost.com JSON: X-Robots-Tag: all + body untouched ✓
- archive.sparkpost.com JSON: X-Robots-Tag: noindex + body untouched ✓
- Deploy preview host (deploy-preview-*.netlify.app): noindex preserved ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
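The switch from delete() to set() described in this commit message can be sketched as follows. The helper names `allowIndexing` and `withIndexingAllowed` are hypothetical; only the header name and the `all` value come from the PR. The sketch uses the standard `Headers` and `Response` Web APIs and does not reproduce Netlify's header re-application, which is an observed platform behavior rather than something this code can demonstrate.

```typescript
// Header name as configured in netlify.toml's [[headers]] block.
const ROBOTS_HEADER = "X-Robots-Tag";

// Returns a copy of the headers with the robots directive neutralized.
// set() is used instead of delete() because, per the finding in this PR,
// deletions of netlify.toml-sourced headers are undone after the function returns.
function allowIndexing(original: Headers): Headers {
  const headers = new Headers(original);
  headers.set(ROBOTS_HEADER, "all");
  return headers;
}

// Wraps an upstream response so that it carries the overridden header while
// keeping the original body stream and status.
function withIndexingAllowed(response: Response): Response {
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers: allowIndexing(response.headers),
  });
}
```

Copying the headers before mutating them keeps the upstream response object unchanged, which also matters when the upstream headers are immutable.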

…m tsc

Two issues flagged in the PR:

1. Cursor Bugbot (high severity): after rewriting an HTML response body in the edge function for support.sparkpost.com, only `content-length` was removed from the copied headers. If the origin's response was gzip- or brotli-encoded (which Netlify's CDN does opportunistically), the `response.text()` call decoded the body to plain text but the stale `Content-Encoding` header survived. Clients would then try to decompress plain text and fail. Fix: also delete `content-encoding` when rebuilding the response.

2. Build failure under tsc: `import type { Context } from '@netlify/edge-functions'` fails Next.js's TypeScript check because the package is only present in the Netlify edge build environment, not in node_modules. Edge functions run on Deno with Netlify-provided globals, not Node, so they're a separate compilation unit. Add `netlify` to tsconfig.json's `exclude` so `next build` (and the Cypress CI's prebuild) stops scanning the directory. Netlify's edge function build still typechecks the files on the server side with the correct types.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
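The first fix in this commit message can be illustrated with a short sketch of how a rewritten HTML response might be rebuilt. The function name `rebuildHtmlResponse` is hypothetical and the code is not copied from the edge function; it relies only on the standard `Headers` and `Response` Web APIs.

```typescript
// Rebuilds an HTML response around a rewritten body.
// response.text() yields decoded text, so headers that describe the original
// byte stream (its length and its compression) no longer apply and are dropped.
function rebuildHtmlResponse(original: Response, html: string): Response {
  const headers = new Headers(original.headers);
  headers.delete("content-length");
  headers.delete("content-encoding");
  return new Response(html, {
    status: original.status,
    statusText: original.statusText,
    headers,
  });
}
```

Dropping `content-length` lets the runtime work out the new length, and dropping `content-encoding` stops clients from trying to decompress a body that is already plain text.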

github-actions Bot added the 👀 needs review label Jun 25, 2026

rafaelmatsumotomb approved these changes Jun 25, 2026

viniciusgiles approved these changes Jun 25, 2026

balupillai approved these changes Jun 25, 2026

PauloJeunonSousa changed the title ~~Switch base URL to archive.sparkpost.com and noindex the site~~ Migrate to archive.sparkpost.com with host-conditional noindex Jun 26, 2026

cursor Bot reviewed Jun 26, 2026

Comment thread netlify/edge-functions/host-conditional-noindex.ts

rafaelmatsumotomb approved these changes Jun 26, 2026

viniciusgiles approved these changes Jun 26, 2026

PauloJeunonSousa merged commit 5b4c068 into main Jun 26, 2026
5 of 6 checks passed

PauloJeunonSousa deleted the archive-sparkpost-noindex branch June 26, 2026 21:14

PauloJeunonSousa merged 4 commits into main from archive-sparkpost-noindex
