SEO & AEO · 13 March 2026 · 11 min read · The AI Prompt Architect Team

Optimizing React SPAs for AI Web Scrapers (GPTBot & ClaudeBot)

Your React Single Page Application (SPA) might look beautiful in the browser, but to AI web scrapers like GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended, and PerplexityBot, it's a blank HTML shell with a single <div id="root"></div> and a JavaScript bundle URL.

These bots don't execute JavaScript. They see only your initial HTML response. If your content, meta tags, and structured data are injected client-side by React, AI models cannot index any of it.

The Client-Side Rendering Problem

A typical React SPA (built with Vite or Create React App) serves this initial HTML:

<!DOCTYPE html>
<html>
  <head>
    <title>My App</title>
  </head>
  <body>
    <div id="root"></div>
    <script src="/assets/main.abc123.js"></script>
  </body>
</html>

Everything else — page content, meta descriptions, JSON-LD, Open Graph tags — is injected by JavaScript after the bundle loads. AI scrapers receive only the empty shell above.
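You can confirm this from the raw HTML alone. The sketch below (plain Node, no libraries, no network) applies a simple heuristic: strip the scripts and the empty root div, and see whether any visible markup remains. The function name and regexes are illustrative, not from any library.

```javascript
// Heuristic check: does this HTML look like an empty SPA shell?
// A crawler that doesn't run JavaScript sees exactly this string.
function looksLikeEmptyShell(html) {
  const body = (html.match(/<body[^>]*>([\s\S]*)<\/body>/i) || [])[1] || '';
  // Strip script tags and the empty root div; if nothing remains,
  // there is no crawlable content on the page.
  const visible = body
    .replace(/<script[\s\S]*?<\/script>/gi, '')
    .replace(/<div id="root"><\/div>/, '')
    .trim();
  return visible.length === 0;
}

// The shell from the example above, as a crawler would receive it.
const shell = `<!DOCTYPE html>
<html><head><title>My App</title></head>
<body><div id="root"></div><script src="/assets/main.abc123.js"></script></body></html>`;

console.log(looksLikeEmptyShell(shell)); // true
```

Running the same check against the post-render DOM of the same page returns false, which is exactly the gap the solutions below close.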

The result: your pages don't appear in AI-generated answers, Google AI Overviews can't cite your content, and ChatGPT with browsing reports that your page has no relevant content.

Solution 1: react-helmet-async for Meta Tag Management

react-helmet-async is a React library that manages the document <head>. While it still relies on client-side rendering, it ensures that meta tags are injected consistently and can be pre-rendered by server-side solutions.

Key implementation details:

  • Wrap your app in <HelmetProvider>
  • Use the data-rh="true" attribute to prevent duplicate tags — the library will manage deduplication
  • Set the same data-rh="true" on your static HTML fallback tags so Helmet replaces them cleanly

import { Helmet } from 'react-helmet-async';

const SEO = ({ title, description, canonicalUrl }) => (
  <Helmet>
    <title data-rh="true">{title}</title>
    <meta data-rh="true" name="description" content={description} />
    <link data-rh="true" rel="canonical" href={canonicalUrl} />
    <script type="application/ld+json">
      {JSON.stringify({
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": title,
        "description": description
      })}
    </script>
  </Helmet>
);
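To satisfy the first bullet, the provider wraps the app at the entry point. A minimal sketch (file layout and names assumed):

```jsx
// main.jsx (hypothetical entry file): HelmetProvider supplies the context
// that every <Helmet> instance, including the SEO component above, writes to.
import { createRoot } from 'react-dom/client';
import { HelmetProvider } from 'react-helmet-async';
import App from './App';

createRoot(document.getElementById('root')).render(
  <HelmetProvider>
    <App />
  </HelmetProvider>
);
```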

Important: react-helmet-async alone does not solve the AI scraper problem because it still requires JavaScript execution. You need to pair it with one of the pre-rendering solutions below.

Solution 2: Pre-rendering with Headless Browsers

Pre-rendering services like Prerender.io or self-hosted Puppeteer/Playwright instances detect bot user agents and serve a fully rendered HTML snapshot instead of the SPA shell.

The flow works like this:

  1. A bot requests /blog/my-article
  2. Your server detects the bot user agent (GPTBot, ClaudeBot, Googlebot, etc.)
  3. Instead of serving the SPA shell, the server sends the request to a headless browser
  4. The headless browser renders the React app, waits for content, and captures the final HTML
  5. The fully rendered HTML (with all meta tags, JSON-LD, and content) is returned to the bot
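Step 4 can be sketched with Puppeteer. This is an illustration, not production code: puppeteer is assumed to be installed (it is loaded lazily so the module parses without it), and the timeout and waitUntil values are starting points to tune for your app.

```javascript
// Render a URL in a headless browser and capture the final HTML,
// including everything React injected after the bundle loaded.
async function renderSnapshot(url) {
  const { default: puppeteer } = await import('puppeteer');
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    // networkidle0: wait until the React app has fetched its data
    // and the network has gone quiet before capturing the DOM.
    await page.goto(url, { waitUntil: 'networkidle0', timeout: 15000 });
    return await page.content(); // final DOM, meta tags and JSON-LD included
  } finally {
    await browser.close();
  }
}
```

In production you would cache these snapshots rather than launch a browser per request; this is the main value a hosted service like Prerender.io adds.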

User Agent Detection

The most common AI bot user agents to detect:

  • GPTBot — OpenAI's web crawler for ChatGPT
  • ChatGPT-User — ChatGPT browsing mode
  • ClaudeBot — Anthropic's web crawler
  • PerplexityBot — Perplexity AI's crawler
  • Google-Extended — Google's AI training crawler
  • Googlebot — Standard Google search crawler
  • Bingbot — Microsoft Bing crawler
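A detection helper plus Express-style middleware tie this list to the flow above. This is a sketch: the regex mirrors the user agents listed, and renderSnapshot is a placeholder for whatever produces the rendered HTML (Prerender.io, Puppeteer, Playwright).

```javascript
// Matches the crawler user agents listed above; extend as new bots appear.
const BOT_PATTERN =
  /GPTBot|ChatGPT-User|ClaudeBot|PerplexityBot|Google-Extended|Googlebot|Bingbot/i;

function isKnownBot(userAgent) {
  return BOT_PATTERN.test(userAgent || '');
}

// Express-style middleware factory: humans fall through to the SPA shell,
// known bots receive a pre-rendered snapshot of the requested URL.
function botMiddleware(renderSnapshot) {
  return async (req, res, next) => {
    if (!isKnownBot(req.headers['user-agent'])) return next();
    const html = await renderSnapshot(req.originalUrl);
    res.status(200).send(html);
  };
}
```

Matching on user agent alone is spoofable, but for serving identical content to bots and humans (which is all pre-rendering should do) that is acceptable; serving different content would risk cloaking penalties.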

Solution 3: Edge Functions for Dynamic Meta Tags

For hosting platforms like Firebase Hosting, Vercel, or Cloudflare Pages, you can use edge functions to intercept requests and inject meta tags and JSON-LD into the HTML response before it reaches the client.

On Firebase Hosting, this is done via firebase.json rewrites that route specific paths to a Cloud Function. The function reads the request path, looks up the page metadata, and injects it into the HTML template before returning the response.
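A firebase.json sketch for such a rewrite (the route patterns are assumptions; adjust them to your URL structure):

```json
{
  "hosting": {
    "rewrites": [
      { "source": "/blog/**", "function": "seo" },
      { "source": "**", "destination": "/index.html" }
    ]
  }
}
```

Order matters: the more specific blog pattern routes to the function, while everything else falls through to the SPA shell.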

// Firebase Cloud Function (simplified)
const functions = require('firebase-functions');

// getMetadataForPath and baseHtml are app-specific: a per-route metadata
// lookup and an HTML template containing the placeholder tokens below.
exports.seo = functions.https.onRequest((req, res) => {
  const path = req.path;
  const metadata = getMetadataForPath(path);

  // Each replace() swaps the first occurrence of its placeholder token.
  const html = baseHtml
    .replace('__TITLE__', metadata.title)
    .replace('__DESCRIPTION__', metadata.description)
    .replace('__JSONLD__', JSON.stringify(metadata.jsonLd));

  res.status(200).send(html);
});

Solution 4: Static Site Generation (SSG) for Key Pages

If your React SPA uses a build tool like Vite, you can pre-render critical pages at build time using plugins like vite-plugin-ssr or vite-ssg. This generates static HTML files for your most important pages (homepage, blog posts, product pages) while keeping the SPA experience for dynamic routes.

Our Approach at AI Prompt Architect

AI Prompt Architect is a React SPA built with Vite and react-helmet-async. We solve the AI scraper problem using a combination of:

  • Static fallback meta tags in index.html with data-rh="true" so they're available before JavaScript loads
  • JSON-LD injection via our unified <SEO> component on every page
  • A comprehensive sitemap.xml with all 161+ URLs for crawler discovery
  • Proper robots.txt that allows GPTBot, ClaudeBot, and all major crawlers
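The first bullet looks like this in index.html (the values shown are placeholders). Because react-helmet-async treats tags marked data-rh="true" as its own, it replaces these fallbacks on hydration instead of appending duplicates:

```html
<!-- Static fallbacks served before any JavaScript runs; Helmet swaps
     them for per-page values once the app hydrates. -->
<head>
  <title data-rh="true">AI Prompt Architect</title>
  <meta data-rh="true" name="description" content="Default description for crawlers that never run JavaScript." />
  <link data-rh="true" rel="canonical" href="https://example.com/" />
</head>
```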

The result: our pages are cited in AI-generated search results and our structured data validates in Google's Rich Results Test. See it in action: our entire platform is a living example of React SPA optimization for AI scrapers.

React SPA · GPTBot · ClaudeBot · AI scrapers · pre-rendering · react-helmet-async · JSON-LD · edge functions · SSR

