Most SEO audit tools send your page data to a cloud API. That means your metadata, heading structure, internal links, schema markup, and content details leave your machine every time you run a scan. For agencies, freelancers working under NDA, or anyone auditing a site before launch, that’s a real concern.
Developer Avraham Aminov built a different approach: Local AI SEO Agent, an open source tool that runs AI analysis entirely on your own hardware using Gemma through Ollama. No external AI provider. No per-token billing. No remote inference dependency.
What It Does
The tool takes a public webpage URL, runs a deterministic technical SEO scan, sends a compact structured summary to a local Gemma model, validates the response, and renders a practical audit report in a React frontend.
The scanner extracts the following signals from any public page (a rough extraction sketch follows the list):
- Title and meta description
- Canonical, robots, and viewport tags
- Heading structure (H1 through H6 counts)
- Image alt text coverage
- Internal, external, and empty link counts
- Open Graph tags
- JSON-LD schema count
- Visible text length and word count
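As a rough illustration, a scanner like this can be built with just Axios and Cheerio. The sketch below is hypothetical (the function and field names are illustrative, not the project's actual code), but it shows how a few of the listed signals map onto simple DOM queries:

```typescript
import axios from "axios";
import * as cheerio from "cheerio";

// Hypothetical sketch covering a few of the signals listed above;
// field names are illustrative, not the project's actual schema.
async function scanPage(url: string) {
  const { data: html } = await axios.get<string>(url, { timeout: 15_000 });
  const $ = cheerio.load(html);

  const images = $("img");
  const links = $("a[href]");

  return {
    title: $("title").first().text().trim(),
    metaDescription: $('meta[name="description"]').attr("content") ?? null,
    canonical: $('link[rel="canonical"]').attr("href") ?? null,
    h1Count: $("h1").length,
    imageCount: images.length,
    imagesMissingAlt: images.filter((_, el) => !$(el).attr("alt")).length,
    internalLinks: links.filter((_, el) =>
      ($(el).attr("href") ?? "").startsWith("/")
    ).length,
    jsonLdCount: $('script[type="application/ld+json"]').length,
    wordCount: $("body").text().trim().split(/\s+/).length,
  };
}
```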
Gemma then takes those structured facts and produces:
- An SEO score
- A plain-language summary
- Critical issues
- Medium-priority issues
- Actionable recommendations
- A suggested title rewrite
- A suggested meta description rewrite

How the Architecture Works
The separation between scanning and reasoning is the most interesting design decision here. Gemma never fetches a URL or parses HTML. The backend does all of that with Axios and Cheerio, then hands Gemma a compact structured summary rather than a raw HTML dump.
The full flow looks like this:
```
React UI
  -> Express API
  -> SEO scanner (Axios + Cheerio)
  -> prompt builder
  -> Ollama
  -> Gemma (gemma3n:e4b)
  -> JSON validator (Zod)
  -> report UI
```

The frontend never communicates directly with Ollama. It only calls the Express backend. This keeps the Ollama service unexposed and makes the system easier to reason about.
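To make that boundary concrete, here's a minimal sketch of the backend chain in Express. The route path, helper names, internal Ollama host, and error codes are all assumptions for illustration, not the project's actual code:

```typescript
import express from "express";
import { z } from "zod";

// Helpers sketched elsewhere in this article, treated here as given.
declare function scanPage(url: string): Promise<Record<string, unknown>>;
declare function buildPrompt(scan: Record<string, unknown>): string;
declare const reportSchema: z.ZodTypeAny;

const app = express();
app.use(express.json());

// Hypothetical route: the React UI only ever talks to this endpoint.
app.post("/api/audit", async (req, res) => {
  const scan = await scanPage(req.body.url); // deterministic extraction
  const prompt = buildPrompt(scan);          // compact structured summary

  // Ollama's /api/generate endpoint; stream: false yields one JSON body.
  const ollamaRes = await fetch("http://ollama:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "gemma3n:e4b", prompt, stream: false }),
  });
  const { response } = await ollamaRes.json();

  let parsed: unknown;
  try {
    parsed = JSON.parse(response);
  } catch {
    return res.status(502).json({ error: "Model did not return valid JSON" });
  }

  const result = reportSchema.safeParse(parsed);
  if (!result.success) {
    return res.status(502).json({ error: "Model output failed validation" });
  }
  res.json(result.data);
});

app.listen(3001);
```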
The backend also rejects risky input before any fetch happens: localhost URLs, loopback IPs, private network addresses, malformed URLs, and unsupported protocols are all blocked. That matters because the backend is fetching user-provided URLs.
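A guard like the one described could look something like this; the project's actual block list may be broader:

```typescript
// Hypothetical sketch of the pre-fetch URL guard described above.
function parseUrl(raw: string): URL | null {
  try {
    return new URL(raw);
  } catch {
    return null;
  }
}

function isSafeUrl(raw: string): boolean {
  const url = parseUrl(raw);
  if (!url) return false; // malformed URL

  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false; // unsupported protocol (file:, ftp:, ...)
  }

  const host = url.hostname;
  if (host === "localhost" || host === "::1" || /^127\./.test(host)) {
    return false; // loopback
  }

  // RFC 1918 private ranges: 10/8, 172.16/12, 192.168/16
  if (
    /^10\./.test(host) ||
    /^172\.(1[6-9]|2\d|3[01])\./.test(host) ||
    /^192\.168\./.test(host)
  ) {
    return false; // private network address
  }

  return true;
}
```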
Why gemma3n:e4b Specifically
Aminov chose gemma3n:e4b because it's stronger than the smallest edge variant while still being practical for local development. The model weighs in at 9.6GB. On his local Docker setup, a full audit takes around 1-2 minutes depending on whether the model is already loaded into memory.
The first request after a cold start is the slowest. Ollama needs to load the model before it can respond. The app handles this with an extended request timeout and a loading state that shows elapsed time so the user knows the process is still running.
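On the frontend, that combination of a long timeout and a visible elapsed-time counter might be sketched like this (the five-minute ceiling and callback shape are assumptions, not the project's actual values):

```typescript
// Hypothetical frontend helper: a generous timeout plus an elapsed-seconds
// callback so a cold-start model load doesn't look like a hang.
async function runAudit(url: string, onTick: (seconds: number) => void) {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), 5 * 60_000); // assumed 5 min
  const started = Date.now();
  const ticker = setInterval(() => {
    onTick(Math.round((Date.now() - started) / 1000));
  }, 1_000);

  try {
    const res = await fetch("/api/audit", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url }),
      signal: controller.signal,
    });
    return await res.json();
  } finally {
    clearTimeout(timeout);
    clearInterval(ticker);
  }
}
```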
Prompt Design and Output Validation
The prompt instructs Gemma to return JSON only. The required output shape is strict:
```json
{
  "score": 92,
  "summary": "Short SEO summary",
  "criticalIssues": [],
  "mediumIssues": [],
  "recommendations": [],
  "suggestedTitle": "",
  "suggestedMetaDescription": ""
}
```

The backend validates every response with Zod before it touches the frontend. If Gemma returns malformed JSON, missing required fields, or an invalid score, the API returns a clean error instead of rendering unreliable data.
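That shape implies a Zod schema along these lines; this is a sketch, and the assumed 0-100 score range is one reading of what makes a score "invalid":

```typescript
import { z } from "zod";

// Sketch of a schema matching the shape above; the project's actual
// schema may add further constraints.
const reportSchema = z.object({
  score: z.number().min(0).max(100),
  summary: z.string().min(1),
  criticalIssues: z.array(z.string()),
  mediumIssues: z.array(z.string()),
  recommendations: z.array(z.string()),
  suggestedTitle: z.string(),
  suggestedMetaDescription: z.string(),
});

type Report = z.infer<typeof reportSchema>;
```

Because safeParse returns a result object instead of throwing, a failed validation can be turned into a clean API error rather than a crash.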
Aminov also trimmed the prompt by sending a scanner summary rather than the full raw scan object. Smaller prompts made local inference more predictable.
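The trimming step might look like the following sketch, where the scan result is condensed into a few labeled lines rather than serialized wholesale; the field names mirror the earlier scanner sketch and are assumptions:

```typescript
// Hypothetical prompt builder: the scan result is reduced to a compact,
// line-oriented summary before it reaches Gemma.
interface ScanSummary {
  title: string;
  metaDescription: string | null;
  h1Count: number;
  imageCount: number;
  imagesMissingAlt: number;
  internalLinks: number;
  wordCount: number;
}

function buildPrompt(scan: ScanSummary): string {
  return [
    "You are an SEO auditor. Respond with JSON only, matching the required schema.",
    `Title: ${scan.title || "(missing)"}`,
    `Meta description: ${scan.metaDescription ?? "(missing)"}`,
    `H1 count: ${scan.h1Count}`,
    `Images missing alt text: ${scan.imagesMissingAlt} of ${scan.imageCount}`,
    `Internal links: ${scan.internalLinks}`,
    `Word count: ${scan.wordCount}`,
  ].join("\n");
}
```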

Running It Locally with Docker
The project runs with Docker Compose. Three services spin up: frontend, backend, and Ollama.
```
docker compose up -d --build
```

Default ports after startup:
- Frontend: http://localhost:5174
- Backend: http://localhost:3001
- Ollama: http://localhost:11435
After the containers are running, pull the model into the Ollama service:
```
docker compose exec ollama ollama pull gemma3n:e4b
```

That's the full setup. No API keys, no accounts, no external services.
The Pattern Worth Stealing
The architectural lesson here isn’t specific to SEO. It’s a general pattern for using local models in developer tools:
Deterministic extraction + local AI reasoning + strict output validation
The scanner does the work that needs to be precise. Gemma does the interpretation work that benefits from natural language. Zod catches anything malformed before it reaches the UI. Each layer has a single, bounded job.
Aminov notes this works well because the task is clearly bounded. Gemma doesn’t need to browse the web or guess what’s on the page. It receives structured facts and focuses on interpretation.
The Catch
This is an MVP with intentional scope limits. It audits one page at a time. There’s no crawling, no sitemap support, no report history, and no PDF export. The developer lists multi-page crawling, Lighthouse integration, a browser extension, and a WordPress plugin as potential future additions, but none of those exist yet.
Performance is also hardware-dependent. A 9.6GB model running locally on a mid-range laptop will behave differently than on a machine with a dedicated GPU. The 1-2 minute audit time is from the developer’s own Docker setup.
Where to Get It
The project is open source on GitHub. MIT licensed, so you can fork it, modify it, or wire it into your own tooling.

