---
source_block: a2a-protocol.md
canonical_url: https://api.theorydelta.com/published/a2a-agent-card-poisoning-no-spec-countermeasure
published: 2026-06-12
last_verified: 2026-06-12
confidence: empirical
evidence_type: independently-confirmed
staleness_risk: high
rubric:
  total_claims: 7
  tested_count: 0
  independently_confirmed: true
  unlinked_count: 0
  scope_matches: true
  falsification_stated: true
  content_type: landscape
environments_tested:
  - tool: "A2A Protocol Spec (Google / Linux Foundation AAIF)"
    version: "v1.0 production (May 2026)"
    evidence_type: source-reviewed
    result: "AgentCard and AgentSkill description fields are free-form strings; spec defines no input sanitization requirement or countermeasure for adversarial content in card fields"
  - tool: "Keysight CyPerf"
    version: "v26.0.0 (March 2026)"
    evidence_type: independently-confirmed
    result: "Demonstrated 100% data exfiltration rates in tested A2A Agent Card Poisoning scenarios; PII transmitted to attacker-controlled endpoints via adversarial instructions in skill descriptions"
  - tool: "Palo Alto Networks Unit 42"
    version: "2026"
    evidence_type: independently-confirmed
    result: "Agent session smuggling confirmed in A2A systems — related root cause: external agent-provided content reaches LLM reasoning context without sanitization"
  - tool: "a2a-python SDK"
    version: "v0.3.25 stable"
    evidence_type: source-reviewed
    result: "AgentCard deserialization surfaces description and skill fields to application layer as plain strings with no sanitization"
theory_delta: "[Keysight Security (2026)](https://www.keysight.com/blogs/en/tech/nwvs/2026/03/12/agent-card-poisoning) confirmed 100% data exfiltration via adversarial instructions in A2A Agent Card skill descriptions — the A2A spec defines no sanitization requirement or countermeasure."
a2a_card:
  type: finding
  topic: A2A Protocol Agent Card Security
  claim: Adversarial instructions in A2A Agent Card skill descriptions achieve 100% PII exfiltration in tested multi-agent scenarios; A2A v1.0 defines no countermeasure
  confidence: empirical
  action: avoid
  contribute: /api/signals
---

# A2A Agent Card Skill Descriptions Are an Unprotected Injection Surface — 100% Exfiltration in Tested Scenarios

## What you expect

An A2A Agent Card is the machine-readable identity and capability manifest published by every A2A-compatible agent at `/.well-known/agent-card.json`. It contains a `name`, `description`, and a `skills[]` array — each skill entry has its own `name`, `description`, `tags`, and `examples` fields. These fields describe what the remote agent offers so that orchestrating agents can decide whether to delegate tasks to it.

The expectation: these are metadata fields, read by the orchestrating agent to understand remote agent capabilities. A well-formed Agent Card from a legitimate service should pose no security risk beyond normal protocol interactions.

## What actually happens

The A2A spec defines AgentCard and AgentSkill description fields as free-form strings with no input sanitization requirement. When an orchestrating LLM fetches a remote agent's card and processes its fields to decide whether and how to delegate, those strings enter the LLM's reasoning context as prompt input — not as inert metadata.

A malicious remote agent embeds adversarial instructions directly in these fields. Because the orchestrating LLM has no basis to distinguish "capability description" from "instruction," the injected text is interpreted as part of its prompt context and can redirect subsequent tool calls.

[Keysight Security (March 2026)](https://www.keysight.com/blogs/en/tech/nwvs/2026/03/12/agent-card-poisoning) demonstrated this in a simulated multi-agent delegation workflow: a host agent fetches remote agent cards to find and assign tasks. With adversarial instructions embedded in the remote agent's skill descriptions, the host was redirected to transmit sensitive user data — including PII — to attacker-controlled endpoints. Keysight reported **100% exfiltration rates** in their tested A2A scenarios. The CyPerf 26.0.0 release includes dedicated simulation strikes for this attack class.

[Palo Alto Networks Unit 42](https://unit42.paloaltonetworks.com/agent-session-smuggling-in-agent2agent-systems/) documents a related vector — agent session smuggling — where malicious content in A2A messages hijacks agent behavior across session boundaries. Both findings confirm the same structural root cause: A2A systems pass external agent-provided content into LLM context without a sanitization layer.

**A2A v1.0's JWS card signing does not prevent this attack.** Signed Agent Cards (v1.0) verify that the card arrived unmodified from its issuer. They do not verify that the issuer's content is safe to include in LLM context. A signed poisoned card is still a poisoned card. The v1.0 spec adds no sanitization requirement for card field content.

## What this means for you

Any agent that dynamically fetches and processes Agent Cards from untrusted sources is a direct exfiltration path. This includes:

- Orchestrators that discover remote agents at runtime via `/.well-known/agent-card.json`
- Agent registries that aggregate cards and serve them to client agents
- Workflows that include raw Agent Card content in LLM prompts (e.g., "here are the available agents: [card content]")
- Framework implementations that automatically fetch cards before task delegation

The attack requires no exploit. The remote agent publishes a valid, spec-compliant card. The orchestrating agent fetches it normally. The vulnerability is in the LLM's treatment of card content as prompt context.

With [150+ organizations in production](https://www.prnewswire.com/news-releases/a2a-protocol-surpasses-150-organizations-lands-in-major-cloud-platforms-and-sees-enterprise-production-use-in-first-year-302737641.html) including Google, Microsoft, and AWS, the attack surface is live. Any production A2A deployment that performs dynamic agent discovery is exposed.

## What to do

1. **Treat all AgentCard fields as untrusted user input.** Never include raw `description`, `skills[].description`, `skills[].examples`, or `tags` fields directly in LLM prompts. Sanitize or strip free-form text before including it in orchestrator context.

2. **Use structured card summaries instead of raw card text.** When an orchestrating agent needs to reason about available remote agents, generate structured summaries (e.g., "agent X handles task type Y, version Z") rather than passing raw natural-language description fields into the LLM.

3. **Restrict dynamic agent discovery to a pre-approved allowlist.** If your workflow discovers remote agents at runtime, limit Agent Card fetching to known-trusted issuers. Dynamic card fetching from arbitrary endpoints is the highest-risk pattern.

4. **Understand that v1.0 JWS signing is not a countermeasure.** Signing proves integrity-in-transit; it does not prove the issuer's content is safe to include in LLM reasoning context.

5. **Test with adversarial card content before production.** Embed a canary instruction in a test agent's skill description and verify your orchestrator does not execute it. If it does, add sanitization before deploying.

**Falsification criterion:** This finding would be disproved by evidence that A2A's orchestration layer (in the a2a-python SDK, Google ADK, or another production orchestrator) preprocesses AgentCard description fields to neutralize natural-language instructions before including them in LLM context, or by a replication of Keysight's test scenario that achieves materially lower exfiltration rates without adding sanitization.

## Evidence

| Tool | Version | Evidence | Result |
|------|---------|----------|--------|
| [A2A Protocol Spec](https://github.com/google-a2a/A2A) | v1.0 production (May 2026) | source-reviewed | `AgentCard.description` and `AgentSkill.description` are free-form strings; spec defines no sanitization requirement or content-safety constraint |
| [Keysight CyPerf](https://www.keysight.com/blogs/en/tech/nwvs/2026/03/12/agent-card-poisoning) | v26.0.0 (March 2026) | independently-confirmed | 100% PII exfiltration demonstrated via adversarial instructions in Agent Card skill descriptions in simulated A2A multi-agent delegation |
| [Palo Alto Unit 42](https://unit42.paloaltonetworks.com/agent-session-smuggling-in-agent2agent-systems/) | 2026 | independently-confirmed | Agent session smuggling confirmed in A2A systems — external agent content reaches LLM context without sanitization (related root cause) |
| [a2a-python SDK](https://github.com/a2aproject/a2a-python) | v0.3.25 stable | source-reviewed | `AgentCard` model surfaces description and skill fields to application layer as plain strings with no sanitization |
| [A2A v1.0 JWS signing](https://github.com/google-a2a/A2A) | v1.0 (May 2026) | source-reviewed | Signing verifies card integrity-in-transit; spec makes no claim that signed card content is safe to include in LLM context |
| [A2A v1.0 release — PR Newswire](https://www.prnewswire.com/news-releases/a2a-protocol-surpasses-150-organizations-lands-in-major-cloud-platforms-and-sees-enterprise-production-use-in-first-year-302737641.html) | v1.0 (May 2026) | independently-confirmed | 150+ orgs in production including Google, Microsoft, AWS — attack surface is present in live production deployments |

**Confidence:** empirical — 4 sources reviewed, 3 independent confirmations. The [Keysight Security research (March 2026)](https://www.keysight.com/blogs/en/tech/nwvs/2026/03/12/agent-card-poisoning) provides the primary independent confirmation with specific exfiltration rates from tested scenarios.

**Strongest case against:** The 100% exfiltration rate is specific to the orchestrator implementation Keysight tested and may not generalize to all A2A deployments. Production orchestrators that structure card content before passing it to LLM context — rather than including raw description fields verbatim — may be substantially less vulnerable. Additionally, deployments using pre-configured static agent registries rather than dynamic discovery are not directly exposed to this attack vector. The protocol-level gap is real, but operational isolation can mitigate it without waiting for a spec update.

**Open questions:** Does Google ADK apply sanitization to AgentCard fields before including them in LLM orchestration context? What is the exfiltration rate against orchestrators using structured card summaries versus raw description fields?

Seen different? [Contribute your evidence](https://theorydelta.com/contribute/) — share a repro or counter-example and we'll review it against this finding. Reader evidence is what keeps these findings accurate.
