Auditing your LLM presence means querying AI engines with the right questions, checking what your robots.txt and llms.txt expose, validating your structured data, tracking your brand citations — then iterating. Most businesses skip all five steps.
90% of businesses don't exist for LLMs — and they don't even know it. While everyone obsesses over Google rankings, a parallel web is forming: one where users ask ChatGPT, Perplexity or Gemini a question and get a direct answer with two or three cited sources. If your brand isn't in those citations, you're invisible to a fast-growing segment of your audience. The problem is that most businesses have never even checked their LLM presence. They've optimized for Google, maybe for social — but nobody's asked the question: 'What does Claude say when someone looks for what we do?' This article is the exact playbook I walk my clients through. Not theory — a step-by-step audit you can start today.
Step 1 — Query the LLMs directly, with the right questions
The audit starts with the engines themselves. Open ChatGPT, Claude, Perplexity, and Gemini, and ask the questions your prospects actually ask: 'Who are the best providers of [your service] in [your market]?', 'How do I solve [the problem you solve]?', 'What is [your brand name]?'. For each answer, record three things: whether you're cited as a source, merely mentioned, or absent entirely; which competitors appear instead; and what, if anything, the engine gets wrong about you. Save this list of queries — it becomes the benchmark you'll re-run every month to measure progress.
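Whatever queries you settle on, score every run the same way so month-over-month movement is visible. Here's a minimal sketch of such a log in Python; the query list, engine names, and the `llm_benchmark.csv` filename are placeholders to adapt to your own audit:

```python
import csv
from datetime import date

# Placeholder benchmark sheet: swap in the questions your prospects actually ask.
QUERIES = [
    "best GEO consultant for small businesses",
    "how do I get my company cited by ChatGPT",
]
ENGINES = ["ChatGPT", "Claude", "Perplexity", "Gemini"]

def record_run(results, path="llm_benchmark.csv"):
    """Append one dated row per (query, engine) pair.

    results: {(query, engine): 'cited' | 'mentioned' | 'absent'}
    """
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for (query, engine), status in results.items():
            writer.writerow([date.today().isoformat(), engine, query, status])

# Manually score each answer, then log the run.
record_run({
    (QUERIES[0], "ChatGPT"): "absent",
    (QUERIES[0], "Perplexity"): "cited",
})
```

A spreadsheet works just as well; the point is a consistent three-level score ('cited', 'mentioned', 'absent') applied identically on every run.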
Step 2 — Analyze your llms.txt and robots.txt
Once you know where you stand in LLM responses, the next question is: can AI crawlers even access your site? This is where most teams get a shock.

Start with your robots.txt. Open yourdomain.com/robots.txt and look for rules targeting GPTBot, ClaudeBot, PerplexityBot, Google-Extended, and Bytespider. If these bots are absent from your rules, their access depends on your default User-agent: * directive — and on whatever your CDN is doing upstream. If you're behind Cloudflare, go to Security → Bots in your dashboard: since mid-2025, Cloudflare has blocked AI crawlers by default for newly onboarded domains, which means millions of sites are blocking AI crawlers without ever having made a deliberate decision to do so.

Now check your llms.txt. Open yourdomain.com/llms.txt. If you get a 404, you're missing one of the highest-leverage files you can create in under an hour. llms.txt is a Markdown file that introduces your organization to LLMs in a format they natively understand — who you are, what you do, your key pages, your contact details. Think of it as a structured business card for AI. Full details on what to put in it, and why it matters, are in this dedicated article on llms.txt and robots.txt.

The combination of a permissive robots.txt and a well-structured llms.txt is the technical foundation of LLM visibility. Without it, even perfect content can remain invisible.
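The robots.txt half of this check is easy to automate with Python's standard library. A minimal sketch using the bot list above; `audit_robots` takes the raw robots.txt text, so you can point it at any domain's file:

```python
from urllib.robotparser import RobotFileParser

# The AI crawlers mentioned above; extend this list as new bots appear.
AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "Bytespider"]

def audit_robots(robots_txt: str, url: str = "https://example.com/") -> dict:
    """Return {bot_name: allowed} for each known AI crawler."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return {bot: rp.can_fetch(bot, url) for bot in AI_BOTS}

# Example: GPTBot is explicitly blocked; everyone else falls through to '*'.
sample = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""
print(audit_robots(sample))
```

To audit a live site, fetch yourdomain.com/robots.txt (for example with urllib.request) and pass the text in; a separate request to /llms.txt that returns a 404 tells you the second half of the story.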
Step 3 — Check your structured data
Structured data is one of the most direct signals you can give LLMs about who you are and what you offer. Schema.org JSON-LD isn't just for Google anymore: it's how AI engines parse the semantic context of your pages, rather than just their raw text.

Start by auditing what's already in place. Open Chrome DevTools on your homepage, go to Sources and search for application/ld+json, or use Google's Rich Results Test. The critical schemas for LLM visibility:

- Organization: your name, description, URL, logo, contact
- Person: if you're a consultant or founder building a personal brand
- Service or Product: what you offer, with pricing if applicable
- Article or BlogPosting: for content pages
- FAQPage: frequently underused, yet FAQs are goldmines for LLM citation

Research suggests that adding relevant JSON-LD can triple an LLM's accuracy when summarizing your content. An LLM without structured data has to infer your identity from prose. An LLM with a complete Organization schema has a verified, machine-readable summary of who you are.

Common mistakes I find in audits: schemas with empty or placeholder values, schemas that exist on the homepage but nowhere else, outdated information that contradicts the page content, and missing schemas on the pages that matter most — service pages and blog posts. Each schema gap is a missed opportunity for an LLM to represent your brand accurately.
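To audit pages beyond the homepage at scale, you can extract the JSON-LD blocks programmatically. A small sketch with Python's standard library; the sample HTML and the 'Example Co' organization are made-up placeholders:

```python
import json
from html.parser import HTMLParser

class JSONLDExtractor(HTMLParser):
    """Collect and parse every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._buf = None   # accumulates text while inside a JSON-LD script
        self.blocks = []   # parsed schema objects

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._buf = []

    def handle_data(self, data):
        if self._buf is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._buf is not None:
            self.blocks.append(json.loads("".join(self._buf)))
            self._buf = None

# Placeholder page: in practice, feed in the fetched HTML of each key page.
html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization",
 "name": "Example Co", "url": "https://example.com"}
</script></head></html>"""

parser = JSONLDExtractor()
parser.feed(html)
types = [block.get("@type") for block in parser.blocks]
print(types)
```

Run this over your service pages and blog posts, not just the homepage; an empty `types` list on a page that matters is exactly the schema gap described above.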
Step 4 — Track citations and attributions
Being present in answers is one thing; being credited is another. Re-run your benchmark queries on a regular schedule and log, for each engine, whether your brand is cited with a link, named without one, or absent — and which of your pages earn the citation. Also review third-party content that mentions you: AI engines often cite directories, reviews, and articles about your brand rather than your own site, so those mentions are part of your LLM footprint too. On the server side, resources like Dark Visitors maintain a list of known AI agents, which helps you identify in your logs which crawlers are actually visiting your site.
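Consistent classification is what makes citation tracking comparable across engines and across months. A hypothetical helper that scores a pasted LLM answer against your brand name and domain; 'Example Co' and example.com are placeholders:

```python
import re

def find_brand_mentions(response_text: str, brand: str, domain: str) -> dict:
    """Classify an LLM answer: cited (domain appears), mentioned (brand named), or absent."""
    cited = domain.lower() in response_text.lower()
    mentioned = re.search(re.escape(brand), response_text, re.IGNORECASE) is not None
    if cited:
        status = "cited"
    elif mentioned:
        status = "mentioned"
    else:
        status = "absent"
    return {"status": status, "cited": cited, "mentioned": mentioned}

answer = "According to example.com, Example Co offers GEO audits."
print(find_brand_mentions(answer, "Example Co", "example.com"))
# {'status': 'cited', 'cited': True, 'mentioned': True}
```

A 'cited' answer outranks a 'mentioned' one: a mention builds awareness, but a citation with your domain is what sends traffic and compounds authority.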
Step 5 — Measure, iterate, and use the right tools
An audit is a snapshot; what you actually need is a monitoring loop. LLM citation landscapes shift: models get updated, new competitors publish content, your own site changes. The businesses that win at LLM visibility treat it as an ongoing discipline, not a one-time project.

Set up a monthly rhythm: re-run your benchmark queries, check that robots.txt and llms.txt are still intact after any site deployments, verify structured data after CMS or template updates, and review any new third-party content mentioning your brand.

For the technical checks, our free SEO & GEO audit tool covers the key automated signals in one pass: whether AI bots are allowed by your robots.txt, whether your llms.txt is present and well-formed, whether your structured data is valid and complete, plus a range of additional GEO signals across 40+ criteria. It takes under 30 seconds and gives you a GEO score alongside your SEO score. Run it after any significant site change, and bookmark the results to compare over time.

On the content side, the iteration loop is about filling gaps. When your benchmark queries show competitors being cited for a problem you solve, the answer is to create content that directly answers that question: factual, structured, with a clear answer in the first paragraph. LLMs extract from the beginning of content, so write for extraction, not just for reading.

Finally, don't treat GEO as separate from your broader content strategy. The same content that gets cited by Perplexity also ranks in Google. The same structured data that helps ChatGPT understand your business also powers rich results. The investments compound.
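The monthly rhythm is easy to script as a dated checklist. A sketch, assuming you wire real check functions (like the robots.txt or JSON-LD checks from earlier steps) into the placeholder lambdas:

```python
from datetime import date

def run_monthly_audit(checks: dict) -> dict:
    """Run each named check and return a dated pass/fail snapshot."""
    return {
        "date": date.today().isoformat(),
        "results": {name: bool(fn()) for name, fn in checks.items()},
    }

# Placeholder lambdas: replace with your real technical checks.
checks = {
    "robots_allows_ai_bots": lambda: True,
    "llms_txt_present": lambda: True,
    "structured_data_valid": lambda: False,
}
snapshot = run_monthly_audit(checks)
print(snapshot["results"])
# {'robots_allows_ai_bots': True, 'llms_txt_present': True, 'structured_data_valid': False}
```

Save each month's snapshot (as JSON, or a row in your benchmark spreadsheet) so a failed check after a deployment stands out immediately.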
Auditing your LLM presence isn't complicated — it's just something almost nobody is doing yet. That's the window. The businesses investing now in their AI visibility will capture citation share while competition is low, exactly as early SEO adopters dominated search rankings for years before their competitors caught on. If you want to go deeper on why this matters and the broader strategy behind it, read the full GEO article — it covers how AI engines choose their sources and the exact levers that drive citations. And if you'd rather have me run this audit for your business and turn it into a concrete action plan, let's talk.
Further reading
- GEO: Generative Engine Optimization — Princeton, Georgia Tech, Allen AI, 2023
- llms.txt — A proposal to standardise LLM-friendly site information
- Schema.org — Full Hierarchy
- Rich Results Test — Google Search Console
- Dark Visitors — A list of known AI agents on the internet
- Robots.txt Specification — Google Search Central
