Security & Content Safety

Wire containers are used by AI agents, so hostile content in your context can hijack an agent’s tool use or steer its output. Wire screens every piece of content entering a container for prompt-injection patterns before it is stored or made searchable.

What gets scanned

Every inbound write is scanned, including:

File uploads — the processed text content of each file
Agent writes — every wire_write tool call

Scanning runs on the full processed content (for structured files like CSV and JSON, string values are extracted first). Results are evaluated against a severity scale.
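
As a rough illustration of what "string values are extracted first" could mean for structured files, the sketch below pulls the text out of a JSON or CSV payload so a scanner sees the strings rather than the markup. The function names (`iter_json_strings`, `extract_scannable_text`) are hypothetical, and the choices to skip JSON keys and to join values with newlines are assumptions. This is not Wire's actual extraction code.

```python
import csv
import io
import json
from typing import Any, Iterator


def iter_json_strings(value: Any) -> Iterator[str]:
    """Yield every string value found in a decoded JSON document."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        # Assumption: only values are scanned, not keys.
        for item in value.values():
            yield from iter_json_strings(item)
    elif isinstance(value, list):
        for item in value:
            yield from iter_json_strings(item)
    # Numbers, booleans, and null carry no text to scan.


def extract_scannable_text(raw: str, kind: str) -> str:
    """Return the text a scanner would inspect for a JSON or CSV payload."""
    if kind == "json":
        strings = iter_json_strings(json.loads(raw))
    elif kind == "csv":
        rows = csv.reader(io.StringIO(raw))
        strings = (cell for row in rows for cell in row)
    else:
        # Unstructured content is scanned as-is.
        return raw
    # Skip empty cells/values and join the rest, one per line.
    return "\n".join(s for s in strings if s.strip())
```

Extracting values first means an instruction hidden in a single CSV cell or a deeply nested JSON field is scanned as ordinary text instead of being obscured by delimiters and quoting.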

What Wire looks for

Wire looks for patterns commonly associated with prompt injection — for example, instructions that try to override an agent’s behavior, markers that mimic system prompts, or attempts to manipulate an agent’s assumed role.

Each detection is scored on a three-level severity scale (low, medium, high). The scoring combines multiple detection layers and accounts for common evasion techniques, including invisible Unicode characters and visually similar homoglyph substitutions.
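
To make the evasion techniques concrete, here is a minimal sketch of how text might be normalized before pattern matching so that invisible characters and look-alike letters do not hide a phrase. It uses only Python's standard `unicodedata` module. The `HOMOGLYPHS` table and the `normalize_for_scan` function are hypothetical and deliberately tiny; Wire's real detection layers are not documented here and are likely more extensive.

```python
import unicodedata

# Hypothetical, deliberately small table: Cyrillic letters that resemble Latin ones.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})


def normalize_for_scan(text: str) -> str:
    """Return a copy of `text` with common obfuscation removed, for matching only."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) into plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf): zero-width space/joiner, BOM, bidi controls.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Case-fold first so uppercase look-alikes are lowered before the table lookup.
    return text.casefold().translate(HOMOGLYPHS)
```

The normalized copy would be used only for matching. It does not replace the stored content, which is consistent with Wire storing accepted content byte-for-byte.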

What happens when something is flagged

low severity — ingested normally. These patterns have frequent legitimate uses (e.g. the word “jailbreak” appearing in a blog post).
medium / high severity — the content is rejected. Files are marked as failed; agent writes return an error. Your existing container data is untouched.
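
The policy above can be sketched as a small decision function. The `Severity` enum and `decide` function are hypothetical names, and the rule that the most severe detection determines the outcome is an assumption made for illustration. The source only states that low is ingested and medium or high is rejected.

```python
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def decide(findings: list[Severity]) -> str:
    """Return "accept" or "reject" for one piece of inbound content."""
    # Assumption: the single worst detection decides the verdict.
    worst = max(findings, default=None)
    if worst is None or worst == Severity.LOW:
        # No detections, or only low-severity ones: ingest normally.
        return "accept"
    # Medium or high: reject the file or write; existing data is not touched.
    return "reject"
```

Because the verdict is all-or-nothing, a rejected file or write never results in partially stored content.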

When a file is rejected, you’ll see the failure reason in the container UI. When an agent write is rejected, the agent receives a structured error it can surface to you.

What Wire does not do

Wire does not modify your content. Scanning produces a verdict; the content itself is stored byte-for-byte or rejected.
Wire does not make editorial or quality judgments about your context. Injection detection is a security boundary (protecting agents from hijack), not a content-quality filter.
Wire does not share flagged content with anyone. Detection happens entirely within your container’s processing pipeline.

False positives

The detection layers are tuned to minimize false positives on normal business content. If you believe a legitimate file or write was blocked in error, the rejection notification links back to the container, where you can review the flagged item.

If you have a persistent false positive on important content, contact support. We tune the detector based on real-world feedback.