Run SEO audits locally with Gemma, Ollama, and React

Most SEO audit tools send your page data to a cloud API. That means your metadata, heading structure, internal links, schema markup, and content details leave your machine every time you run a scan. For agencies, freelancers working under NDA, or anyone auditing a site before launch, that’s a real concern.

Developer Avraham Aminov built a different approach: Local AI SEO Agent, an open source tool that runs AI analysis entirely on your own hardware using Gemma through Ollama. No external AI provider. No per-token billing. No remote inference dependency.

What It Does

The tool takes a public webpage URL, runs a deterministic technical SEO scan, sends a compact structured summary to a local Gemma model, validates the response, and renders a practical audit report in a React frontend.

The scanner extracts the following signals from any public page:

  • Title and meta description
  • Canonical, robots, and viewport tags
  • Heading structure (H1 through H6 counts)
  • Image alt text coverage
  • Internal, external, and empty link counts
  • Open Graph tags
  • JSON-LD schema count
  • Visible text length and word count

Gemma then takes those structured facts and produces:

  • An SEO score
  • A plain-language summary
  • Critical issues
  • Medium-priority issues
  • Actionable recommendations
  • A suggested title rewrite
  • A suggested meta description rewrite

How the Architecture Works

The separation between scanning and reasoning is the most interesting design decision here. Gemma never fetches a URL or parses HTML. The backend does all of that with Axios and Cheerio, then hands Gemma a compact structured summary rather than a raw HTML dump.
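The scanning half of that split can be sketched without any dependencies. The project itself parses with Cheerio; the regexes below are only meant to illustrate the kind of compact, structured summary the backend hands to the model, and the field names are illustrative rather than the project's actual code.

```javascript
// Dependency-free sketch of the deterministic scanning step.
// The real project uses Axios + Cheerio; regexes here only
// illustrate the shape of the summary passed to Gemma.
function scanHtml(html) {
  const pick = (re) => (html.match(re) || [])[1] || null;
  const count = (re) => (html.match(re) || []).length;
  return {
    title: pick(/<title[^>]*>([^<]*)<\/title>/i),
    metaDescription: pick(/<meta\s+name=["']description["']\s+content=["']([^"']*)["']/i),
    headings: {
      h1: count(/<h1[\s>]/gi),
      h2: count(/<h2[\s>]/gi),
    },
    images: count(/<img[\s>]/gi),
    imagesWithAlt: count(/<img[^>]*\salt=["'][^"']+["']/gi),
    jsonLdBlocks: count(/<script[^>]*application\/ld\+json/gi),
  };
}

const sample = `<html><head><title>Demo</title>
<meta name="description" content="A demo page"></head>
<body><h1>Hi</h1><h2>A</h2><h2>B</h2>
<img src="a.png" alt="logo"><img src="b.png"></body></html>`;

const summary = scanHtml(sample);
```

The model never sees raw HTML; it only sees an object like `summary`, which keeps prompts small and inference predictable.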

The full flow looks like this:

React UI
  -> Express API
  -> SEO scanner (Axios + Cheerio)
  -> prompt builder
  -> Ollama
  -> Gemma (gemma4:e4b)
  -> JSON validator (Zod)
  -> report UI

The frontend never communicates directly with Ollama. It only calls the Express backend. This keeps the Ollama service unexposed and makes the system easier to reason about.

The backend also rejects risky input before any fetch happens: localhost URLs, loopback IPs, private network addresses, malformed URLs, and unsupported protocols are all blocked. That matters because the backend is fetching user-provided URLs.
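A guard like that can be sketched with Node's built-in `URL` parser. This is an illustrative version of the checks described above, not the project's actual implementation, and the IPv4 ranges shown are the common private blocks rather than an exhaustive list.

```javascript
// Sketch of a pre-fetch URL guard against SSRF-style input.
// Function name and exact checks are assumptions.
function isSafeUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return false; // malformed URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;
  const host = url.hostname;
  if (host === 'localhost' || host === '[::1]') return false;
  // Block loopback, private, and link-local IPv4 ranges.
  const m = host.match(/^(\d+)\.(\d+)\.(\d+)\.(\d+)$/);
  if (m) {
    const [a, b] = [Number(m[1]), Number(m[2])];
    if (a === 127 || a === 10) return false;           // loopback, 10/8
    if (a === 192 && b === 168) return false;          // 192.168/16
    if (a === 172 && b >= 16 && b <= 31) return false; // 172.16/12
    if (a === 169 && b === 254) return false;          // link-local
  }
  return true;
}
```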

Why gemma4:e4b Specifically

Aminov chose gemma4:e4b because it’s stronger than the smallest edge variant while still being practical for local development. The model weighs in at 9.6GB. On his local Docker setup, a full audit takes around 1-2 minutes depending on whether the model is already loaded into memory.

The first request after a cold start is the slowest. Ollama needs to load the model before it can respond. The app handles this with an extended request timeout and a loading state that shows elapsed time so the user knows the process is still running.
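One minimal way to implement an extended timeout is a `Promise.race` wrapper; this is a generic sketch of the idea, not the app's actual code, and the 180-second budget in the usage comment is an assumption.

```javascript
// Minimal sketch of an extended-timeout wrapper for slow
// cold-start requests. The real app's handling may differ.
function withTimeout(promise, ms, label = 'request') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is always cleaned up.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage sketch: give the model call a generous cold-start budget.
// withTimeout(fetch('http://localhost:3001/api/audit', { /* ... */ }), 180_000);
```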

Prompt Design and Output Validation

The prompt instructs Gemma to return JSON only. The required output shape is strict:

{
  "score": 92,
  "summary": "Short SEO summary",
  "criticalIssues": [],
  "mediumIssues": [],
  "recommendations": [],
  "suggestedTitle": "",
  "suggestedMetaDescription": ""
}

The backend validates every response with Zod before it touches the frontend. If Gemma returns malformed JSON, missing required fields, or an invalid score, the API returns a clean error instead of rendering unreliable data.
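The project does this validation with Zod; the dependency-free sketch below performs equivalent checks on the required shape. The 0-100 score range is an assumption, since the article only says an invalid score is rejected.

```javascript
// Zod-free sketch of the report validation step. The project
// itself uses a Zod schema; the checks here mirror the required
// output shape. Score bounds (0-100) are an assumption.
function validateReport(data) {
  const errors = [];
  if (typeof data !== 'object' || data === null) {
    return { ok: false, errors: ['response is not an object'] };
  }
  if (!Number.isFinite(data.score) || data.score < 0 || data.score > 100) {
    errors.push('score must be a number between 0 and 100');
  }
  if (typeof data.summary !== 'string' || data.summary.length === 0) {
    errors.push('summary must be a non-empty string');
  }
  for (const key of ['criticalIssues', 'mediumIssues', 'recommendations']) {
    if (!Array.isArray(data[key]) || !data[key].every((s) => typeof s === 'string')) {
      errors.push(`${key} must be an array of strings`);
    }
  }
  for (const key of ['suggestedTitle', 'suggestedMetaDescription']) {
    if (typeof data[key] !== 'string') errors.push(`${key} must be a string`);
  }
  return { ok: errors.length === 0, errors };
}

const good = {
  score: 92,
  summary: 'Short SEO summary',
  criticalIssues: [],
  mediumIssues: [],
  recommendations: [],
  suggestedTitle: '',
  suggestedMetaDescription: '',
};
```

If any check fails, the API can return the `errors` list instead of rendering unreliable data.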

Aminov also trimmed the prompt by sending a scanner summary rather than the full raw scan object. Smaller prompts made local inference more predictable.
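A prompt builder along those lines might look like the following. Everything here, including field names and wording, is a hypothetical illustration of the summary-not-raw-HTML approach rather than the project's exact prompt.

```javascript
// Hypothetical prompt builder: turns the scanner summary into a
// compact, JSON-only instruction for the model. Field names and
// wording are illustrative, not the project's actual prompt.
function buildPrompt(scan) {
  const facts = [
    `title: ${scan.title ?? 'MISSING'}`,
    `meta description: ${scan.metaDescription ?? 'MISSING'}`,
    `h1 count: ${scan.h1Count}`,
    `images without alt text: ${scan.imagesMissingAlt}`,
    `internal links: ${scan.internalLinks}`,
    `word count: ${scan.wordCount}`,
  ].join('\n');
  return [
    'You are an SEO auditor. Based only on the facts below,',
    'return JSON only, with keys: score, summary, criticalIssues,',
    'mediumIssues, recommendations, suggestedTitle, suggestedMetaDescription.',
    '',
    facts,
  ].join('\n');
}

const prompt = buildPrompt({
  title: 'Demo',
  metaDescription: null,
  h1Count: 1,
  imagesMissingAlt: 3,
  internalLinks: 12,
  wordCount: 850,
});
```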

Running It Locally with Docker

The project runs with Docker Compose. Three services spin up: frontend, backend, and Ollama.

docker compose up -d --build

Default ports after startup:

  • Frontend: http://localhost:5174
  • Backend: http://localhost:3001
  • Ollama: http://localhost:11435
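A compose file matching those ports might look like the sketch below. The service names, build contexts, volume, and `OLLAMA_URL` environment variable are all assumptions, not the project's actual file; only the host ports come from the article.

```yaml
# Hypothetical docker-compose.yml matching the ports above.
# Service names, build paths, and env vars are assumptions.
services:
  frontend:
    build: ./frontend
    ports:
      - "5174:5174"
    depends_on:
      - backend
  backend:
    build: ./backend
    ports:
      - "3001:3001"
    environment:
      - OLLAMA_URL=http://ollama:11434   # assumed env var name
    depends_on:
      - ollama
  ollama:
    image: ollama/ollama
    ports:
      - "11435:11434"
    volumes:
      - ollama-data:/root/.ollama
volumes:
  ollama-data:
```

Note the port mapping pattern: inside the Compose network the backend reaches Ollama on its default port 11434, while the host sees it on 11435.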

After the containers are running, pull the model into the Ollama service:

docker compose exec ollama ollama pull gemma4:e4b

That’s the full setup. No API keys, no accounts, no external services.

The Pattern Worth Stealing

The architectural lesson here isn’t specific to SEO. It’s a general pattern for using local models in developer tools:

Deterministic extraction + local AI reasoning + strict output validation

The scanner does the work that needs to be precise. Gemma does the interpretation work that benefits from natural language. Zod catches anything malformed before it reaches the UI. Each layer has a single, bounded job.

Aminov notes this works well because the task is clearly bounded. Gemma doesn’t need to browse the web or guess what’s on the page. It receives structured facts and focuses on interpretation.
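The pattern in miniature is three composed stages. The stage bodies below are deliberately trivial stubs, with a fake function standing in for Gemma via Ollama; only the composition is the point.

```javascript
// The pattern: deterministic extraction -> local model reasoning
// -> strict validation. Stage bodies are stubs; the composition
// is what the article's architecture demonstrates.
function runAudit(html, model) {
  const facts = extract(html);              // deterministic scan
  const raw = model(JSON.stringify(facts)); // model interprets structured facts
  return validate(raw);                     // reject anything malformed
}

function extract(html) {
  return { h1Count: (html.match(/<h1[\s>]/gi) || []).length };
}

function validate(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch {
    throw new Error('model returned non-JSON');
  }
  if (!Number.isFinite(data.score)) throw new Error('report is missing a score');
  return data;
}

// A fake "model" standing in for Gemma behind Ollama:
const fakeModel = (factsJson) =>
  JSON.stringify({ score: 80, summary: `based on ${factsJson.length} chars of facts` });

const report = runAudit('<h1>Hello</h1>', fakeModel);
```

Swap the stubs for Cheerio, an Ollama call, and a Zod schema and you have the article's architecture; swap the domain and the same skeleton works for any bounded local-model tool.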

The Catch

This is an MVP with intentional scope limits. It audits one page at a time. There’s no crawling, no sitemap support, no report history, and no PDF export. The developer lists multi-page crawling, Lighthouse integration, a browser extension, and a WordPress plugin as potential future additions, but none of those exist yet.

Performance is also hardware-dependent. A 9.6GB model will run differently on a mid-range laptop than on a machine with a dedicated GPU. The 1-2 minute audit time is from the developer's own Docker setup.

Where to Get It

The project is open source on GitHub. MIT licensed, so you can fork it, modify it, or wire it into your own tooling.
