One for all the models out there!

TL;DR

Today we’re releasing a new MCP server Canarytoken on canarytokens.org to detect agentic attackers rifling through your systems. This token generates mcp.json files of the kind used by LLM development toolchains such as VSCode, Cursor, Claude Code, and Copilot CLI. The tokens can be configured to alert either on connection to the MCP server or only when an offered MCP tool is called. Head over to canarytokens.org and give it a shot!

Introduction

Unless you’ve been under a rock for the last year or two, you’ve heard the breathless predictions of impending LLM-driven cyber attacks. Our new MCP token aims to give AI-native attacks a simple AI-native detection: an MCP server that alerts when used.

MCP, the Model Context Protocol, is a standardized protocol that lets LLM agents access and consume tools, resources, and specialized prompts. While the hip vibe-coders have largely moved to Skills, MCP servers are still a popular way to consume APIs and operate tools with agents. MCP clients, such as Claude Code or VSCode Copilot, connect to local servers over standard I/O or to remote servers over HTTP, exchanging JSON-RPC messages in both cases. These servers expose a standard set of endpoints for the offered functions and data, and help manage data formatting between the models and the backend tools.

Very roughly, an MCP client will first connect to an MCP server and fetch a list of possible tools. That’s the first detection opportunity. Once the client has the list of available tools, it may choose to call one of them. That’s the second detection opportunity we expose: the moment a specific tool is called.
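The exchange above can be sketched as the JSON-RPC messages a client sends. This is a minimal illustration, assuming the method names from the public MCP specification (`initialize`, `tools/list`, `tools/call`); the `rpc_request` helper, the client name, and the protocol version string are placeholders chosen for this example, and the tool name is borrowed from the alert example later in this post.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session

def rpc_request(method, params=None):
    """Build one JSON-RPC 2.0 request as a plain dict (illustrative helper)."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message

# The client opens a session with the server...
initialize = rpc_request("initialize", {
    "protocolVersion": "2025-06-18",  # assumed version string
    "capabilities": {},
    "clientInfo": {"name": "example-client", "version": "0.0.1"},  # placeholder
})
# (A real client also sends a notifications/initialized notification here;
# it is omitted to keep the sketch short.)

# ...and asks which tools exist. Connecting and listing tools together
# make up the first detection opportunity.
list_tools = rpc_request("tools/list")

# Calling a tool by name is the second detection opportunity.
call_tool = rpc_request("tools/call", {"name": "fetch_kubeconfig", "arguments": {}})

if __name__ == "__main__":
    for message in (initialize, list_tools, call_tool):
        print(json.dumps(message))
```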

Deception is a great way to detect automated attacks: by inserting juicy-looking data into places where only agents will look, it’s possible to get a strong signal of agentic presence. As models have evolved, their context windows have grown, and they will gladly ingest any data they come across. With the option to alert only on tool calls, the token can further home in on intent and reduce noise by alerting only when a client tries to access tools no one should be using. Our MCP server offers a variety of tools that appear to hand out compelling credentials for other services.

If you’re using MCP already, we suggest configuring the token to alert on tool calls (and adding the token server entry to your existing MCP configuration without enabling it). This will catch agents (or their operators) trying to gain access to the tools’ [fake] credentials. If you’re not using MCP, alerting on connections can provide early warning that your repositories are being accessed by agentic tools that pick up the mcp.json file.
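To make the two alerting modes concrete, here is a rough sketch of the decision a server could make for each incoming request. It is not the actual Canarytokens implementation: the `AlertMode` enum and the `should_alert` function are hypothetical names invented for this example, and only the method names come from the MCP specification.

```python
from enum import Enum

class AlertMode(Enum):
    CONNECTION = "connection"  # alert as soon as a client connects
    TOOL_CALL = "tool_call"    # alert only when an offered tool is called

def should_alert(mode: AlertMode, method: str) -> bool:
    """Decide whether one incoming MCP method should raise an alert."""
    if mode is AlertMode.CONNECTION:
        # Any request made with the generated credential shows that the
        # configuration file was read and acted on.
        return True
    # In tool-call mode, session setup and tool enumeration stay silent.
    return method == "tools/call"

if __name__ == "__main__":
    for method in ("initialize", "tools/list", "tools/call"):
        print(method,
              should_alert(AlertMode.CONNECTION, method),
              should_alert(AlertMode.TOOL_CALL, method))
```

The asymmetry is the point: in tool-call mode a legitimate client that merely lists tools stays quiet, so the decoy entry can sit alongside real ones.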

Using a similar alerting model, Harshad Sadashiv Kadam used our Kubernetes, DNS and Web Bug tokens to create an AI honeypot that would alert only after a client started using the results from tool calls (check it out!). Our approach is different in that you don’t need to deploy any infrastructure; you just need to add (or edit) your mcp.json file.

Giving it a go

Over on canarytokens.org, you can see the MCP token in the list of available options. Select that tile and you’ll see the following creation modal:

Here you can choose to have the token alert on either a valid connection to the /mcp endpoint with the generated credential, or only when the client establishes a session, enumerates the available tools, and attempts to call one. As a reminder, “Connection” is useful when you don’t use MCP at all, and “MCP Tool Call” is useful when you do already use MCP but want to include decoy entries. Select the alerting option best-suited for your placement, put in an email address for alerts, and a good reminder of where you’ve placed the configuration file. You’ll be presented with your very own MCP configuration JSON!

Screenshot of a generated MCP JSON configuration entry.

You can either download it as a standalone JSON file, or copy this token and add it to an existing configuration (i.e. copy the “cloud-auth-broker” object from the generated Canarytoken into your own MCP configuration file). This file can go into a private repository or code directory to provide an alert if it is ever accessed by an overly-curious agent. We recommend placing the file at the path where popular AI development tools store it:

Toolchain	Workspace-relative path
Claude Code	`.claude/mcp.json`
Copilot CLI	`.copilot/mcp-config.json`
Cursor	`.cursor/mcp.json`
VSCode	`.mcp.json`
VSCode (legacy)	`.vscode/mcp.json` (the `mcpServers` key must be renamed to `servers`)
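Merging the generated entry into an existing configuration, and adapting it for the legacy VSCode location, could be scripted along these lines. This is a sketch under assumptions: the fields inside `DECOY_ENTRY` are placeholders rather than real token output (copy the actual object from your generated Canarytoken), the "docs" server is made up, and `add_decoy` and `to_legacy_vscode` are hypothetical helper names.

```python
import copy
import json

# Placeholder standing in for the generated "cloud-auth-broker" object.
# The field names and values here are assumptions, not real token output.
DECOY_ENTRY = {
    "type": "http",
    "url": "https://example.invalid/mcp",
    "headers": {"Authorization": "Bearer <generated-credential>"},
}

def add_decoy(config: dict, name: str = "cloud-auth-broker") -> dict:
    """Return a copy of an MCP config with the decoy server entry added."""
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})[name] = copy.deepcopy(DECOY_ENTRY)
    return merged

def to_legacy_vscode(config: dict) -> dict:
    """Rename the top-level `mcpServers` key to `servers` for .vscode/mcp.json."""
    converted = copy.deepcopy(config)
    if "mcpServers" in converted:
        converted["servers"] = converted.pop("mcpServers")
    return converted

if __name__ == "__main__":
    existing = {"mcpServers": {"docs": {"command": "docs-server"}}}  # made-up entry
    merged = add_decoy(existing)
    print(json.dumps(merged, indent=2))
    print(json.dumps(to_legacy_vscode(merged), indent=2))
```

Both helpers return copies, so the original configuration is left untouched until you decide to write the result to disk.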

If an agent consumes this file and connects to (or, depending on your token config, calls) the MCP server, you will get an alert:

Example of an alert triggered by an MCP client connecting to the token

This alert provides the source IP of the MCP client, the MCP method called, and, if a tool was called, the name of the tool (in this case fetch_kubeconfig). Based on the memo, you now have a high-confidence signal that the location of this MCP configuration has been accessed.

A helpful model bonus

MCP now allows for additional checks to ensure that an MCP server is trusted before it is accessed, since an untrusted server could pollute a consumer’s context window or be used to spawn arbitrary code. Organizations can even disable MCP entirely in their GitHub Copilot settings, in theory preventing VSCode or Copilot CLI from triggering the token.

Fortunately for our token (and an unfortunate headache for those trying to limit the risk added by AI development tools), models will happily work around this. As a “helpful coding assistant”, an agent in VSCode chat, once it’s realized MCP is disabled, will gladly access the mcp.json file, and generate and execute curl commands to interface with it. This means the token can still provide value even in environments that disable MCP. If your organization restricts MCP, by putting these configuration files into your repositories you can get an alert when an agent tries to go around the restrictions.
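For context, the kind of request an agent ends up issuing by hand is an ordinary HTTP POST carrying a JSON-RPC body. The sketch below builds (but does not send) such a request with Python's standard library; the URL, the bearer-token header, and the `build_mcp_request` helper are assumptions made for illustration, not values taken from a real token.

```python
import json
import urllib.request

def build_mcp_request(url, credential, method, params=None, request_id=1):
    """Build, without sending, an HTTP POST that carries one JSON-RPC request."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # MCP's HTTP transport lets a server answer with plain JSON
            # or with an event stream, so the client accepts both.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {credential}",  # assumed auth scheme
        },
        method="POST",
    )

if __name__ == "__main__":
    req = build_mcp_request("https://example.invalid/mcp", "<credential>", "tools/list")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Nothing here touches the network. Sending a request like this to a token's server is what would trigger the alert, which is why disabling MCP in the client does not silence the token.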

Here’s an example of a VSCode agent chat session where MCP is disabled by organizational policy, and the agent still enumerates the tools:

Screenshot of a Copilot chat where the model generates command-line calls to access the MCP server despite having MCP disabled.

Simply asking the agent to call one of these tools succeeds and triggers an alert.

Conclusion

While the jury may still be out on the end result of agentic attackers, we think deception will always be a powerful way to either detect badness or stymie attackers (human or machine) by making them fear detection before they achieve all their objectives. If you’re worried about agents tearing through your repos or filesystems, the MCP Canarytoken is a new tool to detect them. We encourage you to take the new MCP token for a spin (along with our other token types)!
