---
source_block: localagi.md
canonical_url: https://api.theorydelta.com/published/localagi-infrastructure-bugs-fifty-percent-tool-call-failure
published: 2026-06-05
last_verified: 2026-05-09
confidence: empirical
evidence_type: independently-confirmed
staleness_risk: low
rubric:
  total_claims: 8
  tested_count: 0
  independently_confirmed: true
  unlinked_count: 0
  scope_matches: true
  falsification_stated: true
  content_type: landscape
environments_tested:
  - tool: "LocalAI (mudler/LocalAI)"
    version: "v4.1.3 and earlier (issues through May 2026)"
    evidence_type: source-reviewed
    result: "Response serialization gap: backend executes tool calls and confirms in logs, HTTP API returns empty ToolCalls[] or error — confirmed across gemma-3-4b, cogito-v1-preview, hermes-4-14b, Gemma 4"
  - tool: "LocalAGI (mudler/LocalAGI)"
    version: "Dec 2025 – May 2026 (issues through Feb 2026)"
    evidence_type: source-reviewed
    result: "Zero-parameter tool schema encoding bug causes ~99% failure rate for tools like list_memory; HTTP MCP transport enters permanent broken state after any connection reset"
  - tool: "LocalAI streaming parser"
    version: "Current (issue #9722, May 8 2026)"
    evidence_type: source-reviewed
    result: "Two uncoordinated parsers (C++ autoparser + Go JSON parser) both fire on same content, emitting duplicate tool calls at different indices with no deduplication"
# theory_delta renders as a visible "The delta" TL;DR block on the finding page.
# Voice matches evidence_type (source-reviewed): "The receipts are public"
theory_delta: "The receipts are public: six open GitHub issues confirm LocalAGI's ~50% MCP tool-call failure rate is an infrastructure defect — backend execution logs show tool calls completing while the HTTP response layer drops or corrupts the result — not a model capability problem."
a2a_card:
  type: finding
  topic: local-ai mcp tool-calling reliability
  claim: LocalAGI's ~50% MCP tool-call failure rate is caused by six independent infrastructure bugs in LocalAI and LocalAGI's HTTP response serialization, streaming parser coordination, and connection management layers — not by model capability limitations.
  confidence: empirical
  action: avoid
  contribute: /api/signals
---

# LocalAGI's 50% Tool-Call Failure Rate Is an Infrastructure Bug, Not a Model Problem

## What you expect

LocalAGI is a local agent framework backed by LocalAI for inference. When a tool call fails, the natural assumption is model capability: the LLM produced malformed output, or the chosen model (gemma-3-4b, cogito, hermes) is too weak for reliable function calling. Switching to a larger or better-instruction-tuned model should improve reliability.

## What actually happens

Six independent infrastructure bugs in LocalAI and LocalAGI compound to produce a ~50% MCP tool-call failure floor. Backend logs show tool execution completing successfully while the HTTP response layer drops or corrupts the result. Switching models does not fix any of them.

### Bug 1: Response Serialization Gap (50–70% failure rate)

[mudler/LocalAI#7772](https://github.com/mudler/LocalAI/issues/7772) — OPEN, reported December 2025.

Backend execution logs confirm tool calls completing and results being retrieved. The LocalAI HTTP response returns `"Invalid http method"` with no choices and no completion data in 50–70% of cases. Confirmed across gemma-3-4b-it-qat, cogito-v1-preview-qwen-14B, and nousresearch_hermes-4-14b. The defect is in LocalAI's MCP response path — not in model output.

### Bug 2: Zero-Parameter Tool Schema Bug (~99% failure rate)

[mudler/LocalAGI#362](https://github.com/mudler/LocalAGI/issues/362) — OPEN, reported November 2025.

Tools with no required parameters — `list_memory`, `list_reminders`, and similar — fail with most models. The reporter observed successful execution only once across many attempts. Adding a dummy required parameter as a workaround restores functionality, which confirms the bug is in parameter schema encoding at the LocalAGI layer, not in model comprehension of the tool's purpose.

### Bug 3: Streaming Parser Duplication

[mudler/LocalAI#9722](https://github.com/mudler/LocalAI/issues/9722) — OPEN, reported May 8, 2026.

When streaming `/v1/chat/completions` with tool calls, the same function call appears at multiple index values. The root cause: two concurrent parsers operate without coordination — a C++ chat-template autoparser and a Go iterative JSON parser both fire independently on the same accumulated content. No deduplication exists between them. This is the most recently reported evidence, confirming the issues persist in the current codebase.

### Bug 4: Tool-Choice Grammar Silent Failure (partially fixed)

[mudler/LocalAI#9508](https://github.com/mudler/LocalAI/issues/9508) — PARTIALLY FIXED (PR #9509 covers one of four call sites), reported April 23, 2026.

Specifying `tool_choice: {type: "function", function: {name: "X"}}` silently fails to enforce grammar constraints. The model receives the tool list but no constraint, and produces free-text output instead of a tool call. Root cause: four code locations in LocalAI use `SetFunctionCallString` (which sets mode) where `SetFunctionCallNameString` (which sets the function name) is required:

- `core/http/middleware/request.go:620`
- `core/http/endpoints/anthropic/messages.go:883`
- `core/http/endpoints/openai/realtime_model.go:171`
- `core/http/endpoints/openresponsei/responses.go:776`

A separate bug causes string-format `tool_choice` values (e.g., `"required"`) to be silently dropped due to unmarshaling errors without error propagation. Three of the four setter sites remain unfixed as of the issue date.

### Bug 5: Empty ToolCalls[] in API Response

[mudler/LocalAI#9334](https://github.com/mudler/LocalAI/issues/9334) — OPEN, reported April 13, 2026.

With Gemma 4 in LocalAI v4.1.3, tool execution completes and is visible in backend traces, but the API response returns `ToolCalls: []`. The retry mechanism triggers five times before failing. This is independently confirmed as the same serialization gap as Bug 1 — on a different model — establishing that the response serialization defect is not model-specific.

### Bug 6: MCP HTTP Transport No-Recovery

[mudler/LocalAGI#418](https://github.com/mudler/LocalAGI/issues/418) — OPEN, reported February 16, 2026.

After any temporary network disruption, HTTP-based MCP transport connections enter a permanently broken state. Subsequent tool listing fails with `"client is closing: standalone SSE stream: failed to connect: Bad Request"`. Recovery requires restarting the entire LocalAGI agent. There is no automatic reconnection or connection-state monitoring for HTTP MCP transports.

### Additional: Plan Re-Evaluation Nil Crash

[mudler/LocalAGI#428](https://github.com/mudler/LocalAGI/issues/428) — OPEN, reported February 23, 2026.

When a subtask exhausts retries, the plan re-evaluation callback in `plan.go:230` receives a nil pointer reference, triggering a `runtime error: invalid memory address or nil pointer dereference`. The agent crashes rather than attempting alternative strategies.

## What this means for you

A builder diagnosing 50% tool-call failures in LocalAGI is likely chasing the wrong root cause. The failure is not in the model layer — it is in the HTTP response serialization, streaming parser coordination, and connection resilience layers of LocalAI and LocalAGI.

The bugs **compound**: a request that survives the serialization gap (Bug 1) may still fail due to grammar constraint dropping (Bug 4) or duplicate emission (Bug 3). No single fix resolves the aggregate failure floor. Four of the six bugs (Bugs 1, 4, 5, 6) have no application-layer workaround and require upstream fixes in LocalAI itself.

The most recent confirmed evidence is [Issue #9722 from May 8, 2026](https://github.com/mudler/LocalAI/issues/9722), showing the streaming parser duplication persists in the current codebase. All evidence comes from development and homelab contexts; production reliability at scale is unconfirmed.

## What to do

Diagnose in this order to isolate which bug is affecting your deployment:

1. **Check for zero-parameter tools** — if any of your tools have no required parameters, add a dummy required parameter as a temporary workaround (fixes Bug 2).
2. **Disable streaming** — set streaming to false and check whether the failure rate drops (Bug 3 is streaming-only).
3. **Check if `tool_choice` is forced** — test with `tool_choice: auto` to isolate grammar-enforcement failures (Bug 4). If reliability improves, you are hitting Bug 4.
4. **Check backend logs** — if logs show successful tool execution but API returns empty or errored results, you are hitting the response serialization gap (Bugs 1 and 5). These have no application-layer workaround; wait for upstream fixes.
5. **For MCP transport failures** — any connection drop requires a full agent restart (Bug 6). Design for agent restarts as a normal operational event, not an exception.

If you need reliable tool calling from local models today, consider using Ollama with a model that has strong native function-calling support rather than LocalAI's MCP layer — Ollama's tool-call implementation is documented in a separate block ([ollama-local-inference.md](https://theorydelta.com/findings/)) and has a different failure surface.

**Falsification criterion:** This finding would be disproved by evidence that LocalAI's HTTP response serialization layer correctly returns tool-call results in all cases (i.e., the issue reporters' backend-confirms-but-HTTP-drops pattern was due to user misconfiguration, not a LocalAI defect), OR by a LocalAI release that resolves Issues #7772, #9334, #9722, and #9508 and shows the aggregate tool-call success rate rising above 80% across multiple models in independent testing.

## Evidence

| Tool | Version | Evidence | Result |
|------|---------|----------|--------|
| [LocalAI](https://github.com/mudler/LocalAI) | v4.1.3 and earlier ([#7772](https://github.com/mudler/LocalAI/issues/7772)) | source-reviewed | Backend executes tool, HTTP API returns error or empty ToolCalls[]; confirmed across 3+ models |
| [LocalAGI](https://github.com/mudler/LocalAGI) | Nov 2025 – Feb 2026 ([#362](https://github.com/mudler/LocalAGI/issues/362)) | source-reviewed | Zero-param tools fail ~99% of the time; dummy param workaround confirms schema encoding bug |
| [LocalAI streaming](https://github.com/mudler/LocalAI) | Current ([#9722](https://github.com/mudler/LocalAI/issues/9722), May 8 2026) | source-reviewed | C++ autoparser + Go JSON parser both fire independently, emitting duplicate tool calls |
| [LocalAI grammar enforcement](https://github.com/mudler/LocalAI) | Current ([#9508](https://github.com/mudler/LocalAI/issues/9508), Apr 2026) | source-reviewed | `SetFunctionCallString` used at 4 locations where `SetFunctionCallNameString` required; 3 of 4 unfixed |
| [LocalAI](https://github.com/mudler/LocalAI) | v4.1.3 ([#9334](https://github.com/mudler/LocalAI/issues/9334), Apr 2026) | independently-confirmed | Empty ToolCalls[] with Gemma 4 — independently confirms Bug 1 serialization gap across different model |
| [LocalAGI HTTP transport](https://github.com/mudler/LocalAGI) | Feb 2026 ([#418](https://github.com/mudler/LocalAGI/issues/418)) | source-reviewed | Permanent broken state after connection reset; restart required; no auto-reconnect |
| [LocalAGI plan re-eval](https://github.com/mudler/LocalAGI) | Feb 2026 ([#428](https://github.com/mudler/LocalAGI/issues/428)) | source-reviewed | Nil pointer dereference at plan.go:230 on retry exhaustion; agent crashes |

**Confidence:** empirical — 7 GitHub issues reviewed, 2 independently confirming the same serialization gap on different models. All evidence from mudler/LocalAI and mudler/LocalAGI issue trackers (toolmaker-filed, highest-confidence evidence class for behavioral claims).

**Strongest case against:** LocalAGI is early-stage open-source software maintained by a small team. The ~50% failure rate comes from reporters working in development and homelab contexts — not production telemetry. Some of these bugs may have been fixed in releases not yet reflected in the issue tracker. Bug 8 (FastMCP Accept-header mismatch, [#283](https://github.com/mudler/LocalAGI/issues/283)) was already resolved via PR #318 in October 2025. A builder who keeps dependencies current and disables streaming may see materially better reliability than the ~50% floor. The aggregate figure is a worst-case composition of independent bugs, not a measured end-to-end rate from a standardized benchmark.

**Open questions:** What is the actual failure rate for non-streaming HTTP-only paths after Bug 3 (streaming-specific) is excluded? Have Bugs 1 and 5 been confirmed to share the same code path, or are they two independent serialization defects? Does the failure rate improve significantly on LocalAI versions after v4.1.3?

Seen different? [Contribute your evidence](https://theorydelta.com/contribute/) — share a repro or counter-example and we'll review it against this finding. Reader evidence is what keeps these findings accurate.