Google is still important. But in 2025, ChatGPT, Perplexity, Claude and Gemini are becoming the first point of contact between users and information. If AI spiders don't understand your site, you don't exist, regardless of how well you rank on Google.
Traditional search engines use crawlers that read HTML, follow links and analyze keywords. The new generation of AI spiders does something radically different: they try to understand the meaning of your content, not just the words.
To do this, they rely on structured data, semantic markup and open standards. A site without these elements is ignored or, worse, misinterpreted by the language models that millions of users query every day.
The result: technically correct sites, well-ranked on Google, but completely invisible to AI.
Just as robots.txt talks to crawlers and sitemap.xml describes the structure, the llms.txt file is the emerging standard for telling language models directly who you are, what you do and how you'd like to be cited.
You define how you want to be described by ChatGPT, Claude and Perplexity when they answer questions in your sector.
Language models use this information to correctly cite the source, increasing the likelihood that your site is mentioned in responses.
A simple text file. No plugins, no code. Just upload it to your site root or to /.well-known/.
Adopted by OpenAI, Anthropic and Perplexity as a voluntary AI optimization signal. The sooner you implement it, the sooner you benefit.
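The format itself is deliberately simple. The llms.txt proposal (llmstxt.org) uses plain Markdown: an H1 with the site name, a blockquote summary, then sections of annotated links. A minimal sketch for a hypothetical company site (all names and URLs are placeholders):

```markdown
# Example Company

> Example Company builds invoicing software for small businesses.

## About
- [Who we are](https://example.com/about): company history and team
- [Products](https://example.com/products): full product catalogue

## Docs
- [API reference](https://example.com/docs/api): developer documentation
```

Language models that fetch the file use the summary and link descriptions as a curated map of your site, instead of inferring them from raw HTML.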
llms.click analyzes your site across 6 fundamental categories, each with a direct impact on AI visibility and search engine positioning.
Schema.org is the shared vocabulary among Google, Bing, Yahoo and Yandex to describe page content in a machine-readable way. AI spiders use it to build knowledge graphs and answer user questions.
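In practice, Schema.org markup is usually embedded as a JSON-LD script in the page head. A minimal sketch for a hypothetical organization (names and URLs are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Company",
  "url": "https://example.com",
  "logo": "https://example.com/logo.png",
  "sameAs": ["https://www.linkedin.com/company/example"]
}
</script>
```

The `sameAs` links help crawlers connect your site to the same entity on other platforms, which is exactly the kind of signal knowledge graphs are built from.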
Language models like GPT-4 and Claude read your HTML like a human reader would, but are much more sensitive to structure. Hierarchical headings, clear paragraphs and alternative text are not optional: they're the grammar AI uses to understand who you are.
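What "structure as grammar" means concretely: one H1 per page, H2s that actually subdivide it, and descriptive alt text on every image. A hedged sketch of a well-structured fragment (content is illustrative):

```html
<article>
  <h1>Who we are</h1>
  <p>Example Company builds invoicing software for small businesses.</p>

  <h2>Our team</h2>
  <!-- alt text is what an AI spider "sees" instead of the image -->
  <img src="team.jpg" alt="The Example Company team of twelve at the Milan office">
</article>
```

A page whose headings skip levels or whose images lack alt attributes forces the model to guess, and guesses are where misrepresentation starts.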
A well-formed XML sitemap is the most direct way to communicate to spiders, both traditional and AI, which pages you want indexed, with what priority and update frequency. The robots.txt file defines the access rules.
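A single-entry sitemap showing the three signals mentioned above (the URL and date are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2025-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>
```

Note that `changefreq` and `priority` are hints, not commands; `lastmod` is the field most crawlers actually trust.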
Open Graph Protocol metadata (ogp.me) controls how your site appears when shared on social media, but it also influences previews generated by AI assistants. A correct title, description and image increase CTR and improve the models' understanding of context.
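The core OGP tags are a handful of `<meta>` elements in the page head. A sketch with placeholder content:

```html
<meta property="og:title" content="Example Company: Invoicing for small businesses">
<meta property="og:description" content="Create and send compliant invoices in minutes.">
<meta property="og:image" content="https://example.com/og-image.png">
<meta property="og:url" content="https://example.com/">
<meta property="og:type" content="website">
```

These five cover the required properties; platforms fall back to scraping the page (often badly) when they are missing.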
W3C WCAG 2.1 guidelines are not just a moral obligation: in Europe they are a legal requirement for public bodies and large companies (EU Directive 2016/2102, EN 301 549). Google uses accessibility signals as a ranking factor. Inaccessible sites lose positions.
Google made Core Web Vitals an official ranking factor in 2021. LCP, FID/INP and CLS measure real user experience. High loading times increase bounce rate and reduce the likelihood that crawlers complete page scanning.
The European Union has introduced in recent years a series of regulations that directly impact the technical structure of websites. Ignorance is no excuse, and sanctions can be significant.
The General Data Protection Regulation requires explicit consent for profiling cookies, clear and accessible privacy notices, and DPO appointment for certain categories of controllers. Non-compliant sites risk fines up to 4% of global annual turnover.
The world's first AI regulation. Entered into force in August 2024, it imposes transparency obligations on AI-generated content (watermarking, disclosure). Sites using AI to generate content without disclosure risk growing sanctions from 2025.
Requires public bodies and large private companies to ensure WCAG 2.1 AA accessibility of their websites and apps. From 2025 it extends to new categories of private entities with fines up to 5% of turnover.
The Digital Services Act imposes transparency and accountability obligations on digital platforms. For sites with more than 45M EU users very broad obligations, but even smaller sites must ensure content reporting mechanisms and clear disclosures.
The Network and Information Security Directive extends cybersecurity obligations to many more categories of companies. It imposes minimum technical measures (HTTPS, vulnerability management, incident response) with fines up to €10M or 2% of turnover.
The ePrivacy directive regulates the use of cookies and tracking technologies. The Italian DPA has issued numerous enforcement measures. The new ePrivacy Regulation, still under negotiation, will introduce even stricter rules on consent.
These are the main active AI crawlers in 2025. Each has a specific user-agent and different access policies. Your robots.txt must manage them knowingly.
| Bot | Company | User-Agent | Used for | Respects robots.txt |
|---|---|---|---|---|
| GPTBot | OpenAI | GPTBot/1.0 | Training ChatGPT, browsing | Yes |
| ClaudeBot | Anthropic | ClaudeBot/1.0 | Training Claude, search | Yes |
| PerplexityBot | Perplexity AI | PerplexityBot/1.0 | Real-time answers | Yes |
| Google-Extended | Google | Google-Extended | Training Gemini, SGE | Yes |
| Applebot-Extended | Apple | Applebot-Extended | Training Apple Intelligence | Yes |
| CCBot | Common Crawl | CCBot/2.0 | Open datasets, many LLMs | Partially |
| Bytespider | ByteDance | Bytespider | Training TikTok models | Partially |
| OAI-SearchBot | OpenAI | OAI-SearchBot/1.0 | ChatGPT Search (live) | Yes |
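"Managing them knowingly" means deciding, bot by bot, who gets in. One possible policy, sketched as a robots.txt fragment (whether to allow training crawlers is a business choice, not a technical one; the sitemap URL is a placeholder):

```text
# Allow answer engines that cite sources back to you
User-agent: PerplexityBot
Allow: /

User-agent: OAI-SearchBot
Allow: /

# Block training-only crawlers, if you don't want your content in training sets
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

Sitemap: https://example.com/sitemap.xml
```

Note the trade-off: blocking training bots protects your content but may also reduce how often models mention you.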
Google penalizes sites with multiple URLs serving the same content without canonical tags. PageRank disperses across versions and none ranks well.
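The fix is one line in the `<head>` of every duplicate variant, pointing at the version you want indexed (URL is a placeholder):

```html
<link rel="canonical" href="https://example.com/product/blue-widget">
```

With this in place, link equity from `?utm_source=...`, trailing-slash and www/non-www variants consolidates onto the canonical URL.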
Since 2018 Chrome marks HTTP sites as "not secure". Google has used HTTPS as a ranking factor since 2014. AI crawlers reject or penalize content from unencrypted sites.
LCP > 4s, CLS > 0.25 or INP > 500ms trigger Google's Page Experience penalty. Slow sites lose positions compared to faster competitors with similar content.
Rich results (review stars, FAQ, breadcrumbs in SERPs) require valid structured data. Incorrect markup is ignored or, worse, can lead to removal of rich snippets.
Images without alt attribute are invisible to AI spiders and image search engines. They also violate WCAG 2.1 and can expose you to legal sanctions for accessibility.
Without a sitemap, crawlers must discover pages by following links, a slow and incomplete process. Orphan pages (those without internal links pointing to them) are never indexed.
Beyond the technical score, llms.click Gold queries real AI systems with questions in your site's language and measures your actual presence in AI-generated answers.
5 questions generated specifically for your site's topic and geographic focus, not generic queries that any major site could answer.
Italian site → Italian questions. French site → French questions. AI answers are checked in the same language your audience uses.
Each query shows cited/not cited, the sources returned, and a snippet of the AI response. Fully actionable.