1. Overview

Agent discovery infrastructure is fragmenting into four overlapping standards, each solving a different layer of the "how does an autonomous system find and trust a service" problem. The current state is pre-consolidation: llms.txt has emerged as a lightweight content-pointer file, OpenAPI 3.1 remains the de facto machine-readable contract for REST endpoints, agents.json (a Wildcard AI proposal) attempts to layer agent-specific semantics on top of OpenAPI, and semantic HTML (Schema.org, microdata, JSON-LD) continues to serve as the fallback for sites that haven't published explicit agent metadata. None of these formats yet encode the trust, pricing, and freshness signals that production agent fleets actually require before integrating a new service.

2. Key findings

  • llms.txt is the lowest-friction discovery primitive. Proposed by Jeremy Howard in September 2024 (https://llmstxt.org), the spec defines a Markdown file at /llms.txt containing a curated index of a site's most LLM-relevant content; many publishers also serve an optional /llms-full.txt carrying the full corpus. Adoption to date appears concentrated among developer-tooling and documentation vendors, though to our knowledge no systematic census of deployment exists. The format is designed to be human-curated rather than auto-generated, and it contains no machine-readable trust or pricing signals: it is purely a content discovery aid.

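  • OpenAPI 3.1 is the only widely machine-consumed contract format. Anthropic's Model Context Protocol (MCP, https://modelcontextprotocol.io) and OpenAI's function-calling interface both describe tool inputs in JSON Schema, the schema language at OpenAPI 3.1's core. Neither interface ingests an OpenAPI document as such, but the parameter and request-body schemas of an OpenAPI operation map onto tool definitions with little translation. This makes OpenAPI the de facto execution-time discovery format even when it is not the publication-time discovery format.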

  • agents.json layers agent-task semantics on OpenAPI. The agents.json proposal (Wildcard AI, https://docs.wild-card.ai/agentsjson/introduction) extends OpenAPI with flows (multi-step sequences an agent can execute) and explicit authentication chains. Adoption is early. A speculative reading: the design pattern (augmenting OpenAPI with agent-task semantics rather than replacing it) is the direction the ecosystem appears to be converging on.

  • Semantic HTML remains the universal fallback. Schema.org JSON-LD blocks, Open Graph tags, and microdata predate the agent era and have the broadest parser support of any structured-data format on the web; crawler-based agents inherit that support. Sites that publish neither llms.txt nor OpenAPI are still discoverable, but discovery quality collapses to whatever the agent's general HTML-parsing capability extracts. For structured-data domains (products, articles, organizations, datasets) the Schema.org vocabulary already covers most needs; for API services it covers almost none.
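
The fallback ordering described in the findings above can be made concrete with a small sketch. It is illustrative only: it assumes the caller has already fetched the /llms.txt body and the page HTML, it covers just the llms.txt and JSON-LD layers (OpenAPI and agents.json are left out), and the names parse_llms_txt, JsonLdExtractor, and discover are hypothetical helpers, not part of any published spec or library. The llms.txt parsing follows the layout described at llmstxt.org (an H1 title, a blockquote summary, and H2 sections of Markdown links) and ignores anything else in the file.

```python
import json
import re
from html.parser import HTMLParser

# Matches an llms.txt list entry such as
# "- [Title](https://example.com/doc.md): optional notes".
LINK_RE = re.compile(
    r"^\s*[-*]\s*\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)(?::\s*(?P<notes>.*))?$"
)


def parse_llms_txt(text):
    """Parse an llms.txt body into title, summary, and per-section link lists."""
    doc = {"title": None, "summary": None, "sections": {}}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("# ") and doc["title"] is None:
            doc["title"] = stripped[2:].strip()
        elif stripped.startswith("> ") and doc["summary"] is None:
            doc["summary"] = stripped[2:].strip()
        elif stripped.startswith("## "):
            section = stripped[3:].strip()
            doc["sections"][section] = []
        elif section is not None:
            match = LINK_RE.match(line)
            if match:
                doc["sections"][section].append(
                    {
                        "title": match["title"],
                        "url": match["url"],
                        "notes": match["notes"] or "",
                    }
                )
    return doc


class JsonLdExtractor(HTMLParser):
    """Collect every parseable <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than fail the whole page.


def discover(llms_txt=None, html=None):
    """Return (layer, payload), preferring llms.txt over embedded JSON-LD."""
    if llms_txt:
        doc = parse_llms_txt(llms_txt)
        if doc["title"]:
            return "llms.txt", doc
    if html:
        extractor = JsonLdExtractor()
        extractor.feed(html)
        extractor.close()
        if extractor.blocks:
            return "json-ld", extractor.blocks
    return "unstructured", None
```

Calling discover(llms_txt=body) with a well-formed file returns the "llms.txt" layer and its parsed index; calling discover(html=page) on a page that only embeds Schema.org JSON-LD returns the "json-ld" layer. When neither source yields anything, the "unstructured" result marks the point at which an agent would be left with general HTML parsing.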