
SEO

How the PWA talks to search engines. The goal is that on cutover day we flip a single env var and Google starts indexing the new site with clean canonicals, rich results, and a sitemap — no last-minute scramble.

The indexability switch

Two env vars control everything:

SEO_SITE_URL=https://fruitplug.co.uk   # canonical host
SEO_INDEXABLE=0                         # flip to 1 on production

When SEO_INDEXABLE is off (default, used on dev):

  • robots.txt returns User-Agent: * followed by Disallow: /
  • Every page emits <meta name="robots" content="noindex, nofollow">
  • sitemap.xml renders empty

When it's on (production):

  • robots.txt allows crawling, disallows /api/, /cart, /checkout, /account
  • Pages emit index, follow
  • sitemap.xml lists static pages, every product, every category, every box-builder template
  • robots.txt advertises the sitemap
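
As a rough sketch of the switch described above, the robots rules can be derived from the two env vars with a pure function. The names isIndexable and buildRobots are hypothetical, the object shape is a local stand-in for what Next.js expects from app/robots.ts, and treating only the literal "1" as "on" is an assumption rather than confirmed behaviour.

```typescript
// Local stand-in for the object shape returned from app/robots.ts
// (declared here so the sketch has no dependencies).
type RobotsRule = {
  userAgent: string;
  allow?: string | string[];
  disallow?: string | string[];
};
type Robots = { rules: RobotsRule; sitemap?: string };

// Hypothetical helper. Assumption: only the literal "1" turns indexing on,
// so an unset or mistyped variable falls back to the safe default.
function isIndexable(raw: string | undefined): boolean {
  return raw === "1";
}

function buildRobots(siteUrl: string, indexableRaw: string | undefined): Robots {
  if (!isIndexable(indexableRaw)) {
    // Dev / default: block every crawler and advertise no sitemap.
    return { rules: { userAgent: "*", disallow: "/" } };
  }
  return {
    rules: {
      userAgent: "*",
      allow: "/",
      disallow: ["/api/", "/cart", "/checkout", "/account"],
    },
    // Strip trailing slashes so the sitemap URL never contains "//".
    sitemap: `${siteUrl.replace(/\/+$/, "")}/sitemap.xml`,
  };
}
```

In a real route the two arguments would presumably come from SEO_SITE_URL and SEO_INDEXABLE; they are passed in here to keep the function testable.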

SEO_SITE_URL is the canonical host everywhere — every <link rel="canonical">, og:url, JSON-LD url, sitemap entry, and the metadataBase in the root layout all come from it. Even on dev, canonicals point at fruitplug.co.uk, so if a dev host is ever crawled by accident, the ranking signals are credited to the real site.

Structured data (JSON-LD)

For rich-results eligibility, every schema below renders server-side as <script type="application/ld+json">.

| Surface | Schemas emitted |
| --- | --- |
| / (home) | Organization + WebSite (with SearchAction) |
| /p/[slug] (PDP) | Product (with Offer, AggregateRating if reviews exist) + BreadcrumbList |
| /shop/[category] | BreadcrumbList + CollectionPage |

Helpers:

  • Builders: apps/web/lib/seo/structured-data.ts exports organizationSchema(), websiteSchema(), productSchema(product), breadcrumbSchema(crumbs), collectionPageSchema(params).
  • Renderer: apps/web/components/seo/JsonLd.tsx exports <JsonLd data={...} />. Serializes, escapes < to prevent HTML injection, supports one schema or an array.
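
The renderer's escaping step can be illustrated with a small serializer. This is a minimal sketch of one common way to do it, not the actual JsonLd.tsx source; serializeJsonLd and JsonLdData are hypothetical names.

```typescript
// Hypothetical type: one schema object or an array of them.
type JsonLdData = Record<string, unknown> | Record<string, unknown>[];

function serializeJsonLd(data: JsonLdData): string {
  // JSON.stringify handles both a single object and an array.
  // Replacing "<" with its JSON unicode escape keeps the payload valid JSON
  // while preventing any value from closing the surrounding <script> tag.
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```

A JSON parser decodes \u003c back to <, so consumers of the structured data see the original strings.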

When you add a new page template that deserves structured data, build the schema in structured-data.ts, then drop <JsonLd data={...} /> at the top of the component.
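
For the builder half of that workflow, a schema function is typically a plain function that returns a schema.org object. The sketch below shows what a breadcrumb builder could look like; the Crumb type and the explicit siteUrl parameter are assumptions made to keep the example self-contained, and the real breadcrumbSchema(crumbs) may differ.

```typescript
// Hypothetical input type: a label plus a site-relative path.
type Crumb = { name: string; path: string };

function breadcrumbSchema(siteUrl: string, crumbs: Crumb[]) {
  const base = siteUrl.replace(/\/+$/, "");
  return {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    // schema.org ListItem positions are 1-based.
    itemListElement: crumbs.map((crumb, i) => ({
      "@type": "ListItem",
      position: i + 1,
      name: crumb.name,
      item: `${base}${crumb.path}`,
    })),
  };
}
```

The returned object is what would be passed to <JsonLd data={...} />.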

Canonicals + OG

Every SSR page (home, shop, category, PDP, box-builder) sets:

  • alternates: { canonical: canonicalUrl("/path") }
  • openGraph: { url, title, description, images, type, ... }

metadataBase is set once in the root layout from SEO_SITE_URL, so relative paths in openGraph.images resolve to absolute URLs automatically.
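
A minimal sketch of how the two metadata fields could be assembled for a category page follows. The canonicalUrl body, the optional siteUrl parameter, the SITE_URL constant, and the categoryMetadata helper are all assumptions for illustration; only the canonicalUrl("/path") call shape comes from this page.

```typescript
// Stands in for SEO_SITE_URL so the sketch runs without env access.
const SITE_URL = "https://fruitplug.co.uk";

function canonicalUrl(path: string, siteUrl: string = SITE_URL): string {
  const base = siteUrl.replace(/\/+$/, "");
  // Tolerate callers that forget the leading slash.
  const normalized = path.charAt(0) === "/" ? path : `/${path}`;
  return `${base}${normalized}`;
}

// Hypothetical helper returning the two fields every SSR page sets.
function categoryMetadata(slug: string, title: string, description: string) {
  const url = canonicalUrl(`/shop/${slug}`);
  return {
    alternates: { canonical: url },
    openGraph: { url, title, description, type: "website" },
  };
}
```

Computing the URL once and reusing it for both alternates.canonical and openGraph.url keeps the two values from drifting apart.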

Sitemap + robots

Both are App Router dynamic routes:

  • apps/web/app/sitemap.ts — reads products + categories from the Woo Store API, adds static pages and box-builder templates. 1h revalidate.
  • apps/web/app/robots.ts — rule-based on SEO_INDEXABLE.

The sitemap intentionally re-fetches from Woo on each revalidate, so newly published products appear within the hour without a deploy.
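
The assembly step of the sitemap can be sketched as a pure function over data that has already been fetched. Everything below is illustrative: SitemapInput, buildSitemap, and the idea of passing box-builder templates as ready-made paths are assumptions, and the Woo Store API fetch is deliberately left out.

```typescript
// Local stand-in for a sitemap entry; only the URL is filled in here.
type SitemapEntry = { url: string };

interface SitemapInput {
  siteUrl: string;
  indexable: boolean;
  staticPaths: string[];
  productSlugs: string[];
  categorySlugs: string[];
  templatePaths: string[]; // box-builder template paths, supplied by the caller
}

function buildSitemap(input: SitemapInput): SitemapEntry[] {
  // When the site is not indexable, the sitemap renders empty.
  if (!input.indexable) return [];

  const base = input.siteUrl.replace(/\/+$/, "");
  const paths = [
    ...input.staticPaths,
    ...input.productSlugs.map((slug) => `/p/${slug}`),
    ...input.categorySlugs.map((slug) => `/shop/${slug}`),
    ...input.templatePaths,
  ];
  // Drop duplicates while keeping first-seen order.
  const unique = paths.filter((p, i) => paths.indexOf(p) === i);
  return unique.map((path) => ({ url: `${base}${path}` }));
}
```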

Cutover checklist

When flipping DNS to the new PWA:

  1. Set SEO_INDEXABLE=1 on the production web.env.
  2. Restart the service; verify curl /robots.txt returns Allow: /.
  3. Verify curl /sitemap.xml returns >100 URLs (all products + categories + statics).
  4. Spot-check /p/mangosteen for <link rel="canonical"> and the two JSON-LD blocks.
  5. In Google Search Console: resubmit the sitemap and request indexing for home.
  6. Monitor Coverage report for crawl errors for 7 days.
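
Steps 2 to 4 lend themselves to small text checks that can be run against the curl output. The helpers below are hypothetical and use simple string matching rather than a real robots.txt, XML, or HTML parser, so treat them as a smoke test only.

```typescript
// True when the file has an "Allow: /" line and no blanket "Disallow: /".
function robotsAllowsAll(robotsTxt: string): boolean {
  const lines = robotsTxt.split(/\r?\n/).map((l) => l.trim().toLowerCase());
  return lines.indexOf("allow: /") !== -1 && lines.indexOf("disallow: /") === -1;
}

// Counts <loc> elements, one per sitemap URL.
function countSitemapUrls(xml: string): number {
  const matches = xml.match(/<loc>/g);
  return matches ? matches.length : 0;
}

// Counts JSON-LD script tags in a rendered page.
function countJsonLdBlocks(html: string): number {
  const matches = html.match(/<script[^>]*type="application\/ld\+json"/g);
  return matches ? matches.length : 0;
}
```

For the checklist above, the expectations would be robotsAllowsAll(...) === true, countSitemapUrls(...) > 100, and countJsonLdBlocks(...) === 2 on /p/mangosteen.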

Yoast passthrough

Product pages layer Yoast-authored metadata over the Woo defaults. The flow:

  1. The fruitplug-api WP plugin exposes GET /wp-json/fruitplug/v1/seo/product/{slug} (wp-plugin/fruitplug-api/includes/Rest/SeoController.php). It looks the product up via get_page_by_path, reads _yoast_wpseo_* postmeta directly (no WPSEO_* class dependency), and returns a flat JSON object with title/description/canonical/og/twitter/robots/schema fields. Each field is null when Yoast hasn't authored it. The endpoint caches per slug for 5 minutes via a transient and returns 404 only if the product itself doesn't exist.
  2. The PWA helper apps/web/lib/seo/yoast.ts (getYoastMeta(slug)) fetches that endpoint with next: { revalidate: 300 }. It never throws — any failure (no env, network error, non-2xx, malformed JSON) returns null.
  3. app/p/[slug]/page.tsx calls getYoastMeta alongside getProductBySlug inside generateMetadata. Yoast values win when present; Woo defaults (product name, stripped short_description, first image) fill the gaps. If Yoast sets robots.noindex === true, the page emits index: false regardless of SEO_INDEXABLE.

If Yoast is uninstalled, the endpoint still returns 200 with all-null fields, and the page renders from the Woo defaults alone, exactly as it would without the passthrough.
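
The precedence rules in step 3 can be sketched as a single merge function. The field names on YoastMeta (ogImage, noindex), the WooProduct shape, and the helper names are assumptions; the real endpoint payload and generateMetadata code may be structured differently.

```typescript
// Hypothetical subset of the endpoint payload; every field may be null.
interface YoastMeta {
  title: string | null;
  description: string | null;
  ogImage: string | null;
  noindex: boolean | null;
}

// Hypothetical subset of a Woo product.
interface WooProduct {
  name: string;
  short_description: string; // HTML
  images: { src: string }[];
}

// Naive tag stripper, sufficient for short descriptions in this sketch.
function stripHtml(html: string): string {
  return html.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();
}

function mergeProductMeta(
  yoast: YoastMeta | null, // null when the fetch failed for any reason
  product: WooProduct,
  siteIndexable: boolean,
) {
  const firstImage = product.images[0]?.src ?? null;
  return {
    // Yoast values win when present; Woo defaults fill the gaps.
    title: yoast?.title ?? product.name,
    description: yoast?.description ?? stripHtml(product.short_description),
    image: yoast?.ogImage ?? firstImage,
    // A Yoast noindex overrides the site-wide switch, never the reverse.
    index: siteIndexable && yoast?.noindex !== true,
  };
}
```

Because ?? only falls through on null or undefined, an all-null payload and a failed fetch both produce the same Woo-only result.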

Not shipped yet

  • hreflang. Single locale (en-GB) for now.
  • Image sitemaps. Product images are in the HTML; a dedicated image sitemap is a future optimization.
  • Review schema. AggregateRating is included when present, but individual Review objects aren't. Phase 2 task.
  • FAQ / HowTo schema on the preparation guide pages — once those land.