What Is WebMCP? AI Agent Actions Explained (2026)

Summary

WebMCP is the draft browser API letting sites expose tools (search, cart, booking) to in-browser AI agents. The spec, who invokes it today, and how to ship it.

The read-only era of AI on the web is starting to give way to a read-and-do era. AI agents have spent the last year fetching pages, parsing content, and summarizing what they find — but stopping short of clicking, submitting, or buying anything on the user's behalf. WebMCP is the proposed browser API that lets a site offer those action surfaces to an agent that knows how to ask.

The honest framing for mid-2026: WebMCP is real as a spec, prototyped in Chromium-based browsers and in agent-first browsers like Perplexity Comet, and actively used by a growing set of browser extensions and custom-built agents. It is not yet how ChatGPT and Claude's first-party apps operate — those still use citation rendering or screen-control. So adding a WebMCP snippet today is a forward investment: you become invocable by the WebMCP-aware agents that exist now, and you're ready when the larger consumer agents add support.

This explainer covers what the spec does, who invokes it today and who doesn't, what an agent action looks like, the safety model, and how to add it to your site without rewriting anything.

What WebMCP actually changes — the shift from "read" to "do"

For the past year, the AI-on-the-web playbook has been about being readable. Ship llms.txt. Make sure your meta descriptions are clean. Render server-side so agents don't choke on JavaScript. Optimize for citation.

WebMCP moves the goalposts. It lets your site register tools (JavaScript functions with structured inputs and outputs) that an in-browser AI agent can invoke. A tool can be anything: searchProducts(query), addToCart(sku, qty), requestQuote(name, email, project), bookAppointment(slot, contact). A WebMCP-aware agent reads the tool catalog, decides which one matches the user's intent, and calls it. The browser shows the user a confirmation. The action happens.

llms.txt made you readable. WebMCP makes you actionable, for agents that know how to act. Different layer, different upside, different timeline on adoption.

The spec in four sentences

  1. navigator.modelContext is the entry point. A browser-provided object that exposes registerTool(), unregisterTool(), and a tool registry.
  2. A tool is a JSON Schema + a handler function. The schema describes the inputs (and the expected output). The handler is your normal site code — it runs in your page's JavaScript context with your normal session, cookies, and APIs.
  3. A WebMCP-aware AI agent reads the registered tools and invokes them. The agent has to be implemented against the API — not every browser-resident agent is.
  4. The browser renders a confirmation UI for the user before the tool runs. The site does not write its own consent dialog — the browser owns it, which is what makes the trust model work.

That's the whole API surface a developer needs to think about. The complexity is on the browser and agent side, where the integration, sandboxing, and consent UI live.
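As a rough sketch of points 1 and 2, the snippet below defines one tool as a schema plus a handler and registers it. The descriptor field names (name, description, inputSchema, execute), the assumption that unregisterTool() takes the tool name, and the lookupOrder example are all illustrative. The draft is still moving, so check the current spec before relying on them. The modelContext object is passed in as a parameter so the function can be exercised without a WebMCP-capable browser.

```javascript
// Hypothetical stand-in for existing site code that looks up an order.
function findOrder(orderId) {
  const orders = { "A-1001": { status: "shipped", eta: "2026-06-12" } };
  return orders[orderId] ?? null;
}

// A tool is a JSON Schema for the inputs plus a handler function.
// Field names are assumed from the draft and may change.
const lookupOrderTool = {
  name: "lookupOrder",
  description: "Look up the status of an order by its order ID.",
  inputSchema: {
    type: "object",
    properties: {
      orderId: { type: "string", description: "Order ID, e.g. A-1001" },
    },
    required: ["orderId"],
  },
  // Runs in the page's own JavaScript context, like any other site code.
  async execute({ orderId }) {
    const order = findOrder(orderId);
    return order ? { found: true, ...order } : { found: false };
  },
};

// Registers each tool and returns a function that unregisters them all.
// Assumes unregisterTool() accepts the tool's name.
function registerTools(modelContext, tools) {
  for (const tool of tools) {
    modelContext.registerTool(tool);
  }
  return () => tools.forEach((tool) => modelContext.unregisterTool(tool.name));
}
```

In a page, you would pass navigator.modelContext as the first argument once you have confirmed it exists.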

Who actually invokes WebMCP today (and who doesn't)

This is the section most WebMCP coverage skips. The honest reality for mid-2026:

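Agents that invoke WebMCP today (small but real)

Agent-first browsers like Perplexity Comet, agent browser extensions, and custom-built agents such as enterprise buying agents.

Agents that don't invoke WebMCP today (most consumer flows)

The first-party ChatGPT and Claude apps, which still rely on citation rendering or screen-control.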

Browser-side support

Chromium-based browsers expose navigator.modelContext behind a flag or origin trial in current builds. Safari and Firefox have been evaluating but have not shipped. Stable, default-on, every-browser support is still ahead.
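Because support is uneven, registration code should feature-detect before touching the API. Here is a minimal sketch, assuming the entry point is navigator.modelContext with a registerTool() method as described above. The helper names getModelContext and registerWhenAvailable are made up for this example.

```javascript
// Returns the modelContext object if the browser exposes one, else null.
// `nav` defaults to the global navigator; it is a parameter so the check
// can be exercised outside a browser.
function getModelContext(nav = globalThis.navigator) {
  const ctx = nav ? nav.modelContext : undefined;
  return ctx && typeof ctx.registerTool === "function" ? ctx : null;
}

// Registers tools only when the API is present.
// Returns the number of tools registered (0 on unsupported browsers).
function registerWhenAvailable(tools, nav = globalThis.navigator) {
  const ctx = getModelContext(nav);
  if (!ctx) return 0; // unsupported browser: the page behaves as before
  tools.forEach((tool) => ctx.registerTool(tool));
  return tools.length;
}
```

On a browser without the flag enabled, registerWhenAvailable() returns 0 and nothing else happens.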

Why ship the snippet anyway

Three reasons, ordered from immediate to eventual return:

  1. Today (small but real): the agents listed above can invoke your tools right now. If your customers use Comet, an agent extension, or a custom buying agent, you become actionable to them.
  2. Within 6-12 months (the realistic adoption window): as the spec stabilizes and consumer agents add WebMCP support, sites that already registered tools start showing up as the actionable choice. Being early in directory listings of WebMCP-enabled sites matters.
  3. As the better path than screen-control: agents that don't use WebMCP today rely on screen-control, which is messy, slow, and breaks when you change your CSS. If you make yourself easy to invoke via the API, agents shift to it because it's more reliable. You're shaping the path of least resistance.

What an agent action looks like — three examples

Specs are abstract. Concrete examples are not. Here's what three flows look like end-to-end, when invoked by a WebMCP-aware agent. (Important context: these scenarios run today on Comet, agent extensions, and custom agents — not yet on first-party ChatGPT or Claude.)

Example 1 — Product search

User in Comet: "I'm looking for a wireless dog fence for a 2-acre yard, around $300."

  1. Comet navigates to a pet-supply site that has registered a searchProducts tool.
  2. Comet reads the tool schema: inputs are query, category, maxPrice, features; output is an array of products with name, URL, price, and short description.
  3. Comet calls searchProducts({ query: "wireless dog fence", maxPrice: 300, features: ["2-acre range"] }).
  4. The browser shows: "Comet wants to search products on petsupplies.com. Allow?" User clicks allow.
  5. The tool runs your normal product search code, returns three results.
  6. Comet renders the three results inline in its chat, with your product names and links.

That's a faster, better path than "the agent scraped your category page and guessed." You control the ranking, the description, and what the agent shows.
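The site side of that flow can be sketched as a schema plus a filter function. The input names (query, category, maxPrice, features) and output fields come from the example above. The three-item catalog, the matching rules, and the URLs are invented for illustration, and a real handler would call your existing search code instead.

```javascript
// Hypothetical catalog standing in for the site's real product data.
const catalog = [
  { name: "Wireless Dog Fence Pro", url: "/products/fence-pro", price: 289,
    category: "fencing", features: ["2-acre range", "rechargeable collar"],
    description: "Covers up to 2 acres." },
  { name: "Wireless Dog Fence Mini", url: "/products/fence-mini", price: 149,
    category: "fencing", features: ["half-acre range"],
    description: "For small yards." },
  { name: "Wireless Dog Fence Max", url: "/products/fence-max", price: 449,
    category: "fencing", features: ["2-acre range", "GPS"],
    description: "GPS boundary, 2 acres." },
];

// JSON Schema describing the inputs the agent may send.
const searchProductsSchema = {
  type: "object",
  properties: {
    query: { type: "string" },
    category: { type: "string" },
    maxPrice: { type: "number" },
    features: { type: "array", items: { type: "string" } },
  },
  required: ["query"],
};

// Every query word must appear in the product name; the other filters
// apply only when the agent supplied them.
function searchProducts({ query, category, maxPrice, features = [] }) {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  return catalog
    .filter((p) => words.every((w) => p.name.toLowerCase().includes(w)))
    .filter((p) => category === undefined || p.category === category)
    .filter((p) => maxPrice === undefined || p.price <= maxPrice)
    .filter((p) => features.every((f) => p.features.includes(f)))
    // Return only the fields the agent is meant to show.
    .map(({ name, url, price, description }) => ({ name, url, price, description }));
}
```

Because the handler decides which fields to return, you keep control over ranking and wording rather than leaving it to a scraper.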

Example 2 — Booking an appointment

User in a Chrome agent extension: "Find me a roof inspection appointment in Dallas next Tuesday morning."

  1. The agent lands on a roofing company's site with a registered getAvailableSlots and bookAppointment tool pair.
  2. The agent calls getAvailableSlots({ city: "Dallas", date: "2026-06-09", timeOfDay: "morning" }). Browser confirms. Tool returns three slots.
  3. Agent tells the user the three options. User picks one.
  4. Agent calls bookAppointment({ slot, name, phone, email }). Browser confirms with the user, showing the details to be submitted.
  5. The tool runs the actual booking transaction. Returns a confirmation number.

For local-service businesses, this is the kind of flow that will eventually move from "the agent surfaces a phone number" to "the agent books the appointment." The infrastructure to do it cleanly exists; the consumer-agent invocation that drives volume is still catching up.
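Here is a sketch of the tool pair behind that flow, assuming an in-memory slot list in place of a real scheduling backend. The slot IDs, the "RI-" confirmation format, and the morning cutoff at noon are invented for the example.

```javascript
// Hypothetical schedule standing in for the real booking system.
const slots = [
  { id: "s1", city: "Dallas", start: "2026-06-09T08:00", booked: false },
  { id: "s2", city: "Dallas", start: "2026-06-09T10:30", booked: false },
  { id: "s3", city: "Dallas", start: "2026-06-09T14:00", booked: false },
  { id: "s4", city: "Austin", start: "2026-06-09T09:00", booked: false },
];

// Read-only tool: lists open slots for a city, date, and part of day.
function getAvailableSlots({ city, date, timeOfDay }) {
  return slots
    .filter((s) => {
      const hour = Number(s.start.slice(11, 13));
      const inWindow = timeOfDay === "morning" ? hour < 12 : hour >= 12;
      return !s.booked && s.city === city && s.start.startsWith(date) && inWindow;
    })
    .map(({ id, start }) => ({ slot: id, start }));
}

let nextConfirmation = 1001;

// Write tool: books one slot and returns a confirmation number.
function bookAppointment({ slot, name, phone, email }) {
  const match = slots.find((s) => s.id === slot);
  if (!match || match.booked) {
    return { ok: false, error: "Slot is no longer available." };
  }
  if (!name || !(phone || email)) {
    return { ok: false, error: "Name and a phone or email are required." };
  }
  match.booked = true;
  return { ok: true, confirmation: `RI-${nextConfirmation++}` };
}
```

Splitting the read and the write into two tools means the agent can show options without booking anything, and a stale slot fails cleanly instead of double-booking.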

Example 3 — Lead capture / quote request

User in a custom enterprise buying agent: "Get me a quote for a 1,500 sqft kitchen remodel in Plano."

  1. Agent lands on a remodeler's site with a requestQuote tool registered.
  2. Agent calls requestQuote({ projectType: "kitchen remodel", squareFeet: 1500, location: "Plano, TX", name, email, phone }).
  3. Browser confirms with the user, who reviews the data being submitted.
  4. Tool runs your existing lead-capture logic — writes to CRM, fires email, triggers Slack notification.

The agent did the work the user would have done by filling out a form. The form code on the page didn't change.
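One way to keep the form code unchanged is to have the tool handler call the same function the form already calls. The sketch below assumes a submitLead() function standing in for your existing lead-capture logic. The required-field list and the source tag are illustrative choices.

```javascript
// Hypothetical stand-in for the existing lead-capture logic
// (CRM write, email, notifications).
const capturedLeads = [];
function submitLead(lead) {
  capturedLeads.push(lead);
  return { leadId: capturedLeads.length };
}

// Tool handler: checks required fields, then hands off to the same
// function the on-page form uses.
function requestQuote(input) {
  const required = ["projectType", "location", "name", "email"];
  const missing = required.filter((field) => !input[field]);
  if (missing.length > 0) {
    return { ok: false, missing };
  }
  // Tag the lead so agent-submitted quotes can be told apart later.
  const { leadId } = submitLead({ ...input, source: "webmcp" });
  return { ok: true, leadId };
}
```

Returning the list of missing fields gives the agent something concrete to ask the user for, instead of a generic failure.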

The safety model: who gets to do what

WebMCP can ship in a browser without opening a thousand abuse vectors because of its consent model. Three things hold the line:

1. In-browser confirmation, not in-page

The page cannot draw its own consent dialog. The browser renders the confirmation — same trust surface as a permission prompt for location or camera. A malicious page can't fake an "allow" click.

2. Per-invocation, not blanket consent

Approval is per-tool-per-action by default. Users can mark a specific tool as "always allow on this site" but that's a deliberate setting, not a one-time-blanket-OK. The default is: every call shows a prompt.
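To make that default concrete, here is a toy model of the rule: prompt on every call unless the user has marked that specific tool on that specific site as always allowed. It is not a browser API and not how any browser implements consent. The class and method names are invented purely to illustrate the logic.

```javascript
// Toy model only; this is not a browser API.
class ConsentPolicy {
  constructor() {
    // Holds "origin::tool" pairs the user explicitly marked as always allowed.
    this.alwaysAllowed = new Set();
  }

  key(origin, tool) {
    return `${origin}::${tool}`;
  }

  // The deliberate, per-tool, per-site opt-in.
  allowAlways(origin, tool) {
    this.alwaysAllowed.add(this.key(origin, tool));
  }

  // True when a prompt must be shown before this call runs.
  needsPrompt(origin, tool) {
    return !this.alwaysAllowed.has(this.key(origin, tool));
  }
}
```

Note that the opt-in is keyed on both the site and the tool, so allowing searchProducts on one site says nothing about addToCart there, or about searchProducts anywhere else.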

3. Payment and authentication are carved out

The spec explicitly forbids tools that take credit-card or password fields. The browser refuses to invoke them. Payment integrations (Stripe, PayPal, Apple Pay, Shop Pay) work by handing the agent a "ready-to-pay" URL that the user has to click through. The agent assembles the cart; the human authorizes payment.

That carve-out is what makes "agentic commerce" not terrifying. The agent shops, the user buys.
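In practice the carve-out suggests two habits on the site side: keep payment and credential fields out of your tool schemas, and have the checkout tool return a link instead of charging anything. The sketch below shows both under loud assumptions. The field-name pattern is a rough heuristic, the shop.example URL and its items parameter are made up, and real payment providers each have their own handoff format.

```javascript
// Property names that suggest a payment or credential field.
// An illustrative list, not an exhaustive or official one.
const SENSITIVE = /card|cvv|cvc|password|passcode/i;

// Returns the schema properties that look like they ask for card or
// password data, so they can be removed before registering the tool.
function findSensitiveFields(inputSchema) {
  return Object.keys(inputSchema.properties ?? {}).filter((k) => SENSITIVE.test(k));
}

// Checkout handoff: the tool assembles the cart into a URL and stops.
// The user opens the link and authorizes payment themselves.
function buildCheckoutHandoff(items) {
  const url = new URL("https://shop.example/checkout"); // hypothetical endpoint
  url.searchParams.set("items", items.map((i) => `${i.sku}:${i.qty}`).join(","));
  return { checkoutUrl: url.toString(), requiresUserPayment: true };
}
```

The handler never sees a card number, which is the point: the agent's part ends at the link.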

Setting it up: one script tag, the tool packs

You can write WebMCP integrations from scratch using the raw navigator.modelContext.registerTool() API. For most sites, that's not the right play — you'd be writing the same five or six tools (search, add-to-cart, checkout-handoff, book-appointment, request-quote, lookup-order) that every other site is writing.

The cleaner pattern is a tool pack: a hosted snippet that ships a library of common tools, each backed by a config block where you wire it to your existing site code or platform API. One tag, the tools register, the WebMCP-aware agents that visit your page can invoke them.

For ecommerce on Shopify, BigCommerce, or WooCommerce, the snippet auto-wires search, cart, and order-lookup to your platform APIs without any custom code. For service businesses on Calendly, Acuity, or HubSpot, booking and lead-capture wire to those. For custom apps, you point each tool at the function or endpoint it should call.
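For the custom-app case, a config block might look something like the sketch below: each tool name maps to an endpoint, and a small builder turns the config into tool descriptors. The config shape, the /api/... paths, and the buildTools helper are hypothetical and are not the format of any particular product. The fetch function is injectable so the wiring can be checked without a network.

```javascript
// Hypothetical config shape: one entry per tool.
const toolPackConfig = {
  searchProducts: {
    description: "Search the product catalog.",
    endpoint: "/api/search",
    method: "GET",
    inputSchema: { type: "object", properties: { query: { type: "string" } }, required: ["query"] },
  },
  requestQuote: {
    description: "Submit a quote request.",
    endpoint: "/api/leads",
    method: "POST",
    inputSchema: { type: "object", properties: { name: { type: "string" } }, required: ["name"] },
  },
};

// Turns each config entry into a tool descriptor whose handler calls
// the configured endpoint. GET inputs go in the query string; other
// methods send a JSON body.
function buildTools(config, fetchImpl = globalThis.fetch) {
  return Object.entries(config).map(([name, entry]) => ({
    name,
    description: entry.description,
    inputSchema: entry.inputSchema,
    async execute(input) {
      const isGet = entry.method === "GET";
      const url = isGet ? `${entry.endpoint}?${new URLSearchParams(input)}` : entry.endpoint;
      const response = await fetchImpl(url, {
        method: entry.method,
        headers: isGet ? {} : { "Content-Type": "application/json" },
        body: isGet ? undefined : JSON.stringify(input),
      });
      return response.json();
    },
  }));
}
```

The resulting array can be handed to whatever registration code you use, so adding a tool becomes a config change, not new handler code.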

Crawlytics' Commerce tier ($49.99/mo) ships exactly this pattern: a single tag, a config block, and a dashboard that shows which agents invoked which tools and when. The trade-off: integrating yourself is doable, but the spec is still evolving, and rolling your own means tracking those changes manually.

What this means for conversion attribution

When a WebMCP-aware agent takes an action on your site, you want to know which agent, which session, and whether the action converted. Otherwise the channel is a black box.

The emerging attribution pattern:

This connects back to the broader AI-attribution problem. ChatGPT, Claude, and Perplexity in-app browsers already strip the Referer header on outbound clicks — most sites are losing AI referral attribution to "(direct)" in Google Analytics. We covered the fix in our piece on ChatGPT direct traffic. WebMCP attribution is the same problem at a different layer.
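If you register tools yourself, one simple way to open up the black box is to wrap each handler so every invocation is recorded. The sketch below logs the tool name, a site-supplied session ID, whether the handler succeeded, and how long it took. The withAttribution name and the event shape are invented for the example. How an agent identifies itself to the page is not covered here, so the "which agent" part is left out and would need to come from whatever the browser or agent actually exposes.

```javascript
// Wraps a tool so each call is reported to `record`, whether it
// succeeds or throws. The original handler's result or error is
// passed through unchanged.
function withAttribution(tool, sessionId, record) {
  return {
    ...tool,
    async execute(input) {
      const startedAt = Date.now();
      try {
        const result = await tool.execute(input);
        record({ tool: tool.name, sessionId, ok: true, ms: Date.now() - startedAt });
        return result;
      } catch (error) {
        record({ tool: tool.name, sessionId, ok: false, ms: Date.now() - startedAt });
        throw error;
      }
    },
  };
}
```

In production, record would typically send the event to your analytics endpoint. Joining those events to orders or leads by session ID is what tells you whether an agent action converted.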

Does WebMCP replace llms.txt?

No, and the order of investment matters: ship llms.txt first. The audience that benefits from llms.txt — every AI client that fetches your pages — is much larger today than the audience that invokes WebMCP. WebMCP is the next layer on top.

llms.txt tells an agent what is on your site — the catalog, the order, the descriptions. WebMCP tells the agent what it can do on your site — the actions, the inputs, the outputs.

A WebMCP-aware agent picking between two sites will use llms.txt to read them and WebMCP to act on the one that lets it complete the user's task. If you have llms.txt but not WebMCP, the agent reads you and refers the user back to manual action. If you have both, the agent reads you and (if it's a WebMCP-aware one) completes the task. The llms.txt setup guide is here — ship that first, then come back for WebMCP.

Where this leaves you

WebMCP is the next layer of the AI-readiness stack, the layer where agents stop referring users and start completing tasks. But the realistic 2026 picture is that adoption is in an early-prototype phase. Today's WebMCP invokers are a small set: Comet, agent extensions, custom buying agents. The major consumer agents (ChatGPT, Claude in their first-party apps) use other approaches today and may take 6-12 months or more to add WebMCP support.

So the honest framing: adding the snippet is a forward investment, not a 2026 conversion engine. The integration is small. The upside compounds as adoption grows. The downside is close to zero: on browsers without WebMCP support, a feature-detected registration call does nothing and the page renders normally.

If you're already shipping llms.txt, WebMCP is the natural next layer. If you're not, ship that first.

Written by Crawlytics Team. Crawlytics tracks AI bots, generates llms.txt, and powers WebMCP commerce, all from one snippet on any stack.

This page is part of Crawlytics.app.