FAQ for Technical SEO/GEO Professionals
What we deploy, how it interacts with your existing SEO, and what it actually achieves.
What FAIND is
What does FAIND actually deploy?
A parallel, machine-readable knowledge layer for your brand — a brand-specific database for LLMs, built from a copy of your website. It lives on llms.yourdomain.com (via CNAME) or a hosted subdomain, and contains entity-structured pages, intent-based Q&A clusters, and JSON-LD that matches the visible content one-to-one. Your main site, CMS, templates, and existing structured data are untouched. Setup is a DNS record, not a project.
Why a parallel layer instead of optimizing the main site?
Because your website can't be perfect for humans, SEO, and AI at once. Pages tuned for conversion and brand are rarely the ideal retrieval target for an LLM, and restructuring them for machines means design fights, CMS migrations, and SEO trade-offs. The layer lets your site keep doing its job for humans and classic search while machines get a representation built for how retrieval-augmented systems actually consume content: explicit entities, direct question/answer structure, and high factual density.
Is this just GEO rebranded, and how is GEO different from SEO?
Yes, this is Generative Engine Optimization. SEO optimizes for ranked links in a SERP; GEO optimizes for what AI systems say — whether your brand is mentioned, your domain is cited as a source, and your product is recommended inside generated answers in ChatGPT, Claude, and Google AI Overviews. The retrieval mechanics differ (RAG over search indexes plus the model's parametric knowledge), so the inputs and the KPIs differ.
Isn't this just llms.txt?
No. llms.txt is a discovery pointer with limited confirmed adoption by the major consumer engines. We ship actual crawlable content — full HTML plus structured data that retrieval systems can index and ground on — together with monthly measurement. The layer is discoverable through standard means (sitemaps, links) and is compatible with llms.txt, not dependent on it.
How it interacts with Google and your existing SEO
Is this cloaking?
No, and this is a hard architectural invariant, not a policy preference: same bytes for every requester. No user-agent, IP, or referrer branching on content; no sneaky redirects; no bot-only overlays. The canonical tag a page carries is identical for Googlebot, GPTBot, and a human in Chrome. Patterns like “redirect humans, serve bots the optimized page” were explicitly evaluated and rejected because they put the customer's domain at manual-action risk.
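The same-bytes claim can be spot-checked from the outside. The following is a minimal sketch, not FAIND tooling: the user-agent strings are shortened placeholders, and `body_digests`, `serves_same_bytes`, and `http_fetch` are hypothetical helper names. It fetches one URL under several user agents and compares SHA-256 digests of the response bodies.

```python
import hashlib
import urllib.request
from typing import Callable, Dict

# Shortened placeholder user-agent strings (assumption); real crawler strings are longer.
USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0",
    "googlebot": "Googlebot/2.1",
    "gptbot": "GPTBot/1.0",
}


def http_fetch(url: str, user_agent: str) -> bytes:
    """Plain HTTP GET with a chosen User-Agent header (standard library only)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read()


def body_digests(url: str, fetch: Callable[[str, str], bytes] = http_fetch) -> Dict[str, str]:
    """Fetch `url` once per user agent and return a SHA-256 digest of each body."""
    return {
        name: hashlib.sha256(fetch(url, ua)).hexdigest()
        for name, ua in USER_AGENTS.items()
    }


def serves_same_bytes(url: str, fetch: Callable[[str, str], bytes] = http_fetch) -> bool:
    """True when every user agent received a byte-identical body."""
    return len(set(body_digests(url, fetch).values())) == 1
```

Calling `serves_same_bytes("https://llms.yourdomain.com/some-page")` runs the check over the network. Changing the User-Agent header only tests user-agent branching; IP or referrer branching would need requests from different networks or with different headers, and a digest mismatch is a prompt to diff the bodies rather than proof of cloaking.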
Doesn't a parallel layer create duplicate content — or worse, cannibalize my rankings?
This is the right question, and it's handled with a per-URL canonical policy. Any layer URL that would compete with an existing page on your site carries rel="canonical" pointing at your original, and is simultaneously removed from the layer's sitemap and IndexNow submissions — we never tell an engine “index this” while the tag says “credit that.” Purely additive URLs (topics your site doesn't cover) stay self-canonical. Google and Bing consolidate signals to your domain. The fail-safe direction is always self-canonical, never an unverified target.
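As a rough illustration of the published rule only (not the decision mechanics, which are not disclosed), the per-URL outcome can be sketched as a small pure function. The names `CanonicalDecision` and `decide`, and the `verified` flag, are hypothetical. The sketch also assumes that self-canonical URLs are the ones submitted for indexing; the text above only states that canonicalized URLs are withheld.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CanonicalDecision:
    canonical: str         # URL placed in rel="canonical"
    in_sitemap: bool       # whether the layer URL is listed in the layer sitemap
    submit_indexnow: bool  # whether the layer URL is submitted via IndexNow


def decide(layer_url: str, competing_original: Optional[str], verified: bool) -> CanonicalDecision:
    """Apply the published rule to one layer URL.

    `competing_original` is the page on the customer's site that the layer URL
    would compete with, if any. How that is determined is out of scope here.
    """
    if competing_original and verified:
        # Competing URL: credit the original and never ask an engine to index the copy.
        return CanonicalDecision(competing_original, in_sitemap=False, submit_indexnow=False)
    # Additive or ambiguous URL: fail safe to self-canonical.
    return CanonicalDecision(layer_url, in_sitemap=True, submit_indexnow=True)
```

The point of the shape is that the canonical target and the submission flags come from one decision, so a URL cannot end up canonicalized away and submitted for indexing at the same time.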
When exactly does a layer page canonicalize to my site — and when not?
The principle is deliberately simple, and you can verify it in any crawl: a layer URL that covers the same intent as an existing page on your site — and would compete with it in search — carries a canonical to your original and is withheld from the layer's sitemap and IndexNow submissions. A layer URL covering ground your site doesn't cover stays self-canonical: there is nothing to consolidate and no equivalent target. Every ambiguous case resolves to self-canonical, because a wrong canonical is worse than none.
What we don't publish are the decision mechanics — which search-performance signals classify a URL as competing, the thresholds, and the validation pipeline that re-evaluates every decision as your data changes. That's the operational core of the product. What matters for governance is this: every canonical is a deterministic, per-URL entry in a single decision table — visible in page source, logged, auditable, and reversible — never an ad-hoc edit. Your SEO team can crawl the layer and audit every decision against this rule at any time, and we'd encourage exactly that.
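An audit of that kind could look roughly like this. It is a sketch under assumptions: the crawl is supplied as `(url, html)` pairs, the sitemap as a set of URLs, and `extract_canonical` and `audit` are hypothetical names. It checks only the externally visible invariant: a page whose canonical points elsewhere must not appear in the layer's sitemap.

```python
from html.parser import HTMLParser
from typing import Iterable, List, Optional, Set, Tuple


class CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.canonical: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        is_canonical = tag == "link" and (a.get("rel") or "").lower() == "canonical"
        if is_canonical and self.canonical is None:
            self.canonical = a.get("href")


def extract_canonical(html: str) -> Optional[str]:
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def audit(pages: Iterable[Tuple[str, str]], sitemap_urls: Set[str]) -> List[str]:
    """Return human-readable violations of the canonical/sitemap rule."""
    violations = []
    for url, html in pages:
        canonical = extract_canonical(html)
        if canonical is None:
            violations.append(f"{url}: no canonical tag")
        elif canonical != url and url in sitemap_urls:
            violations.append(f"{url}: canonicalized to {canonical} but listed in sitemap")
    return violations
```

An empty result means every crawled page either is self-canonical or is canonicalized and absent from the sitemap. URL comparison here is exact string equality; a real audit would likely normalize trailing slashes and case first.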
If layer pages canonicalize to my site, what's left for the AI engines?
The content. A canonical is a consolidation hint for search indexers; AI crawlers fetch and read page content regardless of the tag. So the layer remains fully readable input for LLM systems while ranking and citation credit route to your domain — which is exactly the point. Your domain is the brand entity; it should collect the credit. A vendor whose own subdomain accumulated your citations would be a problem, not a feature.
Is this scaled content abuse under Google's spam policies?
No. The layer is derived from your own substantive content — restructured, not spun — and the surface that could compete in ranked search is canonicalized to your originals and withheld from index submission. It isn't built to rank in classic SERPs at your expense; it's built to be retrieved and read by answer engines, with search-ranking credit deliberately consolidated to you.
Does FAIND ever touch my actual site?
Only one thing, and only opt-in: for specific pages we can add a visible FAQ section plus matching FAQPage JSON-LD, generated from the same Q&A content — with explicit consent and a preview first. Visible content and schema are two renderings of one source; we never inject schema-only claims a human can't see. If you never opt in, your site is never modified.
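"Two renderings of one source" can be pictured as two functions reading the same Q&A list. This is an illustrative sketch, not the production renderer: the function names and the HTML markup (`<dl>` with a `faq` class) are assumptions, while the JSON-LD follows the schema.org FAQPage shape (`mainEntity` of `Question` items with an `acceptedAnswer`).

```python
import html
import json
from typing import List, Tuple

QA = List[Tuple[str, str]]  # (question, answer) pairs: the single source


def render_faq_html(qa: QA) -> str:
    """Visible rendering: an escaped definition list."""
    items = "".join(
        f"<dt>{html.escape(q)}</dt><dd>{html.escape(a)}</dd>" for q, a in qa
    )
    return f'<section class="faq"><dl>{items}</dl></section>'


def render_faq_jsonld(qa: QA) -> str:
    """Structured rendering: FAQPage JSON-LD built from the same pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": q,
                "acceptedAnswer": {"@type": "Answer", "text": a},
            }
            for q, a in qa
        ],
    }
    return json.dumps(data, ensure_ascii=False, indent=2)
```

Because neither renderer accepts any input other than the shared list, the schema cannot carry a claim that the visible section lacks.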
What about crawl budget and index bloat?
The layer is a separate host, so it doesn't consume your domain's crawl budget. Canonicalized URLs are excluded from the layer's own submission surfaces, so we're not inflating anyone's index with URLs we've told engines to consolidate away.
What happens if we cancel?
You remove the CNAME and the layer is gone. Your site never depended on it. Any opt-in enrichment content lives on your pages and stays yours. There is no lock-in mechanism by design.
Do AI systems actually use it?
Which crawlers can access the layer?
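All of them, by policy: GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, Bingbot, and the rest. Google-Extended is allowed as well; note that it is a robots.txt control token rather than a separate user agent, so it will not appear in logs under that name. No user-agent discrimination exists anywhere in the stack (it can't, per the no-cloaking invariant), and crawler access is observable in server logs.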
Do AI answers cite the llms.* pages directly?
That's not the goal, and it's not how we score success. The architecture deliberately consolidates credit to your domain; we measure whether your brand and your domain get mentioned, cited, and recommended in AI answers — not whether our infrastructure host appears in a source list. The layer is an input the systems read; your domain is the entity that should win the answer.
How do you know the engines are reading it rather than ignoring it?
Three independent observables: crawler hits on the layer in server logs, indexation of the layer's additive pages by the search indexes that feed retrieval, and — the one that matters — movement in the answer-level KPIs on tracked queries, measured per engine, month over month. We treat the last one as the only real proof and report it as such. We have also verified instances of Google's AI Overviews citing layer pages directly as answer sources — not the goal (your domain should collect that credit), but a clean existence proof that the layer is retrieved and grounded on, established by resolving Google's own grounding redirects rather than by inference.
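The first observable is the easiest to reproduce. A minimal sketch, assuming access-log lines that include the user-agent string; `count_crawler_hits` is a hypothetical name and the token list is illustrative, not exhaustive:

```python
from collections import Counter
from typing import Iterable

# Substrings that appear in the respective crawlers' user-agent strings.
# Google-Extended is omitted: it is a robots.txt token, not a user agent.
CRAWLER_TOKENS = [
    "GPTBot",
    "OAI-SearchBot",
    "ClaudeBot",
    "PerplexityBot",
    "bingbot",
    "Googlebot",
]


def count_crawler_hits(log_lines: Iterable[str]) -> Counter:
    """Count log lines per crawler token (case-insensitive, first match wins)."""
    hits: Counter = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in CRAWLER_TOKENS:
            if token.lower() in lowered:
                hits[token] += 1
                break
    return hits
```

User-agent strings can be spoofed, so counts like these show claimed crawler traffic. Confirming that a hit really came from a given operator requires checking the source IP against that operator's published ranges or reverse DNS.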
Security, data protection & enterprise operations
What can the snippet technically do — and what can't it?
One line, two jobs: detect content changes and measure AI-driven visits. It sets no cookies and no cross-site trackers, reads no form inputs, ships no third-party tags, and adds nothing to page load. Under a strict CSP it fails closed: the snippet is blocked and your page keeps working without it. The standing invitation applies here too: the script loads in the open, so have your security team read it.
An automated system publishes under our brand. Who controls what it says?
The generation is grounded: layer pages are built from facts on your own site, not free-form model output — if your site doesn't support a claim, the layer doesn't make it. Thin topics aren't published at all. Structured data mirrors visible content one-to-one, every page is deterministically re-rendered from source, and every change is logged and reversible. You can crawl the layer and audit any page against your own source at any time.
What's the blast radius if FAIND has an outage — or a breach?
Architecturally small, and deliberately so. Your main site has zero runtime dependency on FAIND: if the layer is unreachable, your site doesn't notice. FAIND holds no credentials to your systems — no CMS logins, no API keys, no database access — because the product never needed them; there is nothing of yours to leak that isn't already public. The one piece of FAIND code on your pages is the snippet, which is exactly why its scope is locked down as described above.
Measurement and results
How do you measure impact?
Standardized monthly query panels per engine — ChatGPT, Claude, Google AI Overviews — run as unshaped baselines and scored on three distinct KPI classes: mentions (your brand is named in the answer), citations (your domain is linked as a source — including correct resolution of engine-specific quirks like Google's grounding redirect URLs, which naive parsers misattribute), and recommendations (your product is advised as the answer). Reported per provider and per query, against the pre-engagement baseline.
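To make the three KPI classes concrete, here is a simplified scoring sketch. It is not the production scorer. It assumes cited URLs have already been resolved to their final destinations (the grounding-redirect step is not shown), that the recommendation label comes from a separate review step, and that subdomains count toward the brand's domain. `Answer`, `is_mention`, `is_citation`, and `score` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List
from urllib.parse import urlsplit


@dataclass
class Answer:
    engine: str
    query: str
    text: str
    cited_urls: List[str]  # assumed already resolved to final destinations
    recommended: bool      # assumed to come from a separate labeling step


def is_mention(answer: Answer, brand: str) -> bool:
    """The brand is named anywhere in the answer text."""
    return brand.lower() in answer.text.lower()


def is_citation(answer: Answer, domain: str) -> bool:
    """Some cited URL's host is the domain or one of its subdomains."""
    domain = domain.lower()
    for url in answer.cited_urls:
        host = (urlsplit(url).hostname or "").lower()
        if host == domain or host.endswith("." + domain):
            return True
    return False


def score(answers: List[Answer], brand: str, domain: str) -> Dict[str, Dict[str, int]]:
    """Tally queries, mentions, citations, and recommendations per engine."""
    out: Dict[str, Dict[str, int]] = {}
    for a in answers:
        row = out.setdefault(
            a.engine,
            {"queries": 0, "mentions": 0, "citations": 0, "recommendations": 0},
        )
        row["queries"] += 1
        row["mentions"] += int(is_mention(a, brand))
        row["citations"] += int(is_citation(a, domain))
        row["recommendations"] += int(a.recommended)
    return out
```

Matching on the parsed hostname rather than on a substring of the URL avoids counting lookalike domains or URLs that merely contain the brand's domain in a path or query string.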
What does it actually achieve?
Two things. First, the measured lifts: across customers we currently see an average ~7× improvement in AI visibility on tracked commercial queries, with per-engine breakdowns available in our case studies. Second, the structural shift those numbers come from: on most commercially relevant queries today, 90%+ of AI-answer citations go to third parties — directories, publishers, aggregators, competitors — not to the brand being asked about. We measure exactly that gap for your queries, then close it by making your domain the source the engines ground on.
How fast does it work?
Honestly: indexing of the layer happens within days to weeks depending on the engine; answer-level effects accrue over weeks to months as retrieval indexes refresh and the monthly panels re-run. We don't promise fixed timelines — we show you the monthly per-engine deltas and let the trendline make the argument.
Can I verify any of this myself?
Yes, and you should. Query the engines for your category's commercial questions and look at who gets mentioned and which domains get cited. Crawl the layer yourself — same bytes you'd see as any bot. Check the canonical tags, the sitemap exclusions, the JSON-LD-to-visible parity. The whole approach is built to survive exactly this inspection.
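The JSON-LD-to-visible parity check can be approximated with the standard library alone. This is a rough sketch under assumptions: it only handles FAQPage blocks whose `mainEntity` is a list, it treats all text outside `<script>` and `<style>` as visible, and `faq_parity_gaps` is a hypothetical name.

```python
import json
from html.parser import HTMLParser
from typing import List


class PageParser(HTMLParser):
    """Separates JSON-LD script bodies from visible text."""

    def __init__(self) -> None:
        super().__init__()
        self.mode = "visible"  # "visible", "jsonld", or "hidden"
        self.buffer: List[str] = []
        self.jsonld: List[str] = []
        self.visible: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            script_type = (dict(attrs).get("type") or "").lower()
            self.mode = "jsonld" if script_type == "application/ld+json" else "hidden"
            self.buffer = []
        elif tag == "style":
            self.mode = "hidden"

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self.mode == "jsonld":
                self.jsonld.append("".join(self.buffer))
            self.mode = "visible"

    def handle_data(self, data):
        if self.mode == "jsonld":
            self.buffer.append(data)
        elif self.mode == "visible":
            self.visible.append(data)


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase for a tolerant comparison."""
    return " ".join(text.split()).lower()


def faq_parity_gaps(html_doc: str) -> List[str]:
    """Return FAQ questions/answers present in JSON-LD but not in visible text."""
    parser = PageParser()
    parser.feed(html_doc)
    parser.close()
    visible = normalize("".join(parser.visible))
    gaps = []
    for raw in parser.jsonld:
        data = json.loads(raw)
        if not isinstance(data, dict) or data.get("@type") != "FAQPage":
            continue
        for question in data.get("mainEntity", []):
            answer_text = question.get("acceptedAnswer", {}).get("text", "")
            for claim in (question.get("name", ""), answer_text):
                if normalize(claim) not in visible:
                    gaps.append(claim)
    return gaps
```

An empty list means every question and answer in the schema also appears in the page text. The comparison is a whitespace-insensitive substring match, so answers whose JSON-LD text contains markup may be flagged even when the visible wording matches.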
Looking for the plain-language version? See the buyer FAQ — or get your free AI Visibility Audit to see where AI puts you today.