---
source_block: openai-agents-sdk.md
canonical_url: https://api.theorydelta.com/published/openai-agents-sdk-streaming-guardrails-broken
published: 2026-03-17
last_verified: 2026-03-17
confidence: medium
evidence_type: tested
rubric:
  total_claims: 3
  tested_count: 1
  independently_confirmed: false
  unlinked_count: 0
  scope_matches: true
  falsification_stated: true
  content_type: finding
trust:
  provenance: "sourced + first-party"
  rigor: source-verified
  sources: "OpenAI GitHub issues, SDK source code, NOT_PLANNED label"
  unlinked_claims: 0
environments_tested:
  - tool: "openai-agents-python"
    version: "v0.11.1"
    evidence_type: source-reviewed
    result: "Guardrails execute after streaming; NOT_PLANNED confirmed"
theory_delta: "Streaming guardrails are architecturally broken by design -- OpenAI has marked the fix NOT_PLANNED."
tags:
  - openai
  - agents-sdk
  - guardrails
  - streaming
tasks:
  - task: agent-framework
    phase: handle-streaming
---

# Enabling streaming in the OpenAI Agents SDK means your guardrails no longer block anything

*From [Theory Delta](https://theorydelta.com) | [Methodology](https://theorydelta.com/methodology/) | Published 2026-03-17*

## What you expect

The OpenAI Agents SDK provides a guardrails system to define safety checks on agent inputs and outputs. You enable streaming for latency. You enable guardrails for safety. Both are documented features that work together.

## What actually happens

**Guardrails and streaming are architecturally incompatible.** When you use `Runner.run_streamed()`, content is delivered to the user as it is generated. Guardrails run as a parallel check — but they complete after content has already been streamed. By the time a guardrail trips and raises `InputGuardrailTripwireTriggered`, the user has already seen the content it was supposed to block.

This is not a bug. OpenAI closed Issue #300 as **NOT_PLANNED** — the streaming architecture cannot support pre-delivery content filtering without buffering the entire response, which defeats the purpose of streaming. The incompatibility is by design and will not be fixed.

**Mixed-model handoff pipelines crash in a separate, unrelated way.** When a reasoning model (o1, o3, or GPT-5) hands off to a non-reasoning agent, the reasoning model's internal items (chain-of-thought traces with `rs_` ID prefixes) are passed in the conversation context. Non-reasoning agents cannot process these items and crash with "Item with id `rs_` not found." The SDK does not sanitize reasoning items at handoff boundaries. This makes heterogeneous agent pipelines — where you want a reasoning model for planning and a faster model for execution — unreliable without manual context sanitization.

## What this means for you

If you are using `Runner.run_streamed()` with guardrails enabled, your guardrails are not blocking content — they are reporting after the fact. For safety requirements, this is the same as having no guardrails. For compliance requirements, this is worse: you have the appearance of content filtering without the guarantee.

The two failure modes combine to constrain your SDK configuration:
- If you want guardrails, you cannot stream.
- If you want streaming, your guardrails are decorative.
- If you want mixed-model pipelines, you need to manually strip reasoning items between handoffs.

There is no middle path within the SDK. OpenAI's stated answer is: run the guardrail check serially before calling the streaming runner, or use non-streaming execution and accept the latency cost.

## What to do

1. **Do not rely on SDK guardrails in streaming mode.** If content safety is a requirement, use `Runner.run()` (non-streaming) and accept the latency cost. The guardrail guarantee holds in non-streaming mode.

2. **Implement guardrails at the transport layer** as an alternative to accepting latency: filter content after the SDK but before your UI renders it. This adds latency but preserves the safety guarantee without requiring non-streaming execution throughout your pipeline.

3. **For mixed-model pipelines**, add explicit context sanitization between handoffs. Before passing conversation history from a reasoning model to a non-reasoning agent, strip any items with `rs_` ID prefixes.

4. **Test your guardrails in the exact execution mode you use in production.** A guardrail that blocks content correctly in non-streaming mode will be completely ineffective in streaming mode. If you have not explicitly tested the streaming path, you do not know whether guardrails are blocking anything.

**Falsification criterion:** This finding would be disproved by demonstrating that the OpenAI Agents SDK can enforce guardrails on streamed content before it reaches the user, or by OpenAI removing the NOT_PLANNED label and shipping a fix.

## Evidence

| Tool | Version | Result |
|---|---|---|
| openai-agents-python | v0.11.1 | source-reviewed: guardrails execute after streaming; NOT_PLANNED confirmed ([Issue #300](https://github.com/openai/openai-agents-python/issues/300), closed NOT_PLANNED) |
| openai-agents-python | v0.11.1 | source-reviewed: reasoning model items (rs_ prefix) crash non-reasoning downstream agents at handoff ([Issues #1397, #1660, #569](https://github.com/openai/openai-agents-python/issues/1397)) |

**Confidence:** medium — the streaming/guardrail incompatibility is confirmed through source code review and the NOT_PLANNED label on the GitHub issue. The mixed-model handoff crash is confirmed through issue reports. No runtime reproduction was performed.

**Open questions:** Will OpenAI add a buffered-streaming mode that supports guardrails? Are there community workarounds for the mixed-model handoff issue beyond manual context stripping?

Seen different? [Contribute your evidence](https://theorydelta.com/contribute/) — share a repro or counter-example and we'll review it against this finding. Reader evidence is what keeps these findings accurate.