AI Bots and User Agents
AI crawlers identify themselves through user-agent strings. Keeping the matching user-agent entries in your robots.txt current lets you tell each crawler what it may fetch, and so guide how language models interact with your work.
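As a rough illustration of how per-crawler rules are matched by user agent, the sketch below feeds a sample robots.txt to Python's standard `urllib.robotparser` and checks which agents may fetch a page. The policy itself is hypothetical, `example.com` is a placeholder, and `SomeOtherBot` is a made-up name. `GPTBot` and `PerplexityBot` are the tokens named on the vendor pages listed further down, so check those pages for the current strings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy for illustration only: block GPTBot,
# allow PerplexityBot and every other crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Allow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url="https://example.com/"):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file content as an iterable of lines,
    # so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# "SomeOtherBot" is an invented agent that falls through to the "*" group.
result = allowed_agents(ROBOTS_TXT, ["GPTBot", "PerplexityBot", "SomeOtherBot"])
for agent, allowed in result.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

A check like this can be rerun whenever a vendor renames or adds a crawler, to confirm that the rules still apply to the agents you intend.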
https://momenticmarketing.com/blog/ai-search-crawlers-bots
Robots.txt in the Age of AI: Which Bots to Allow (and Why It Matters) - tiptop SEO Agency
C.L.A.R.I.T.Y.: The AI Optimization Framework Your Brand Needs - tiptop SEO Agency
How Perplexity Crawls and Indexes Your Website
List of AI Bots
ai-robots-txt/ai.robots.txt: A list of AI agents and robots to block.
This repository provides a block list; the same list can, of course, also be used as an allow list.
Vendor Pages
Overview of OpenAI Crawlers - OpenAI API
Perplexity Crawlers - Perplexity