Documented damage

The AI Agent Incident Record.

17 documented incidents where AI agents operating without oversight caused real financial, security, and brand damage. Every claim is sourced. Every source is linked. No hyperbole required.

Sources: The Guardian, Bloomberg, Ars Technica, Forbes, BBC, Docker, OWASP, Axios. The $500M and Uber budget claims originate from a single Axios article citing unnamed sources and should be treated as reported, not independently verified. All other incidents name the company and cite press-verified sources. Full methodology at the bottom of this page.
$500M+: single-incident financial loss, reported May 2026
13 hrs: AWS downtime from one AI coding agent, 2025
1,000+: incidents in the public AI Incident Database
#1: Prompt Injection in the OWASP LLM Top 10, two editions running

Financial damage

The bills that arrived without warning.

AI agents run 24/7. They loop, retry, and chain calls without human awareness. Without budget controls, they are an open meter with no circuit breaker.

Enterprise (via Axios, unnamed): $500,000,000
May 2026

According to Axios, citing unnamed sources, an enterprise deployed Claude across thousands of employees. No spending caps. No usage monitoring. No agent budget controls. Developers ran long autonomous coding sessions. AI agents executed chained workflows around the clock. The invoice reportedly arrived one month later: half a billion dollars.

Uber: Entire 2026 AI budget consumed by April
April 2026

Uber's heavy adoption of AI coding products burned through its entire 2026 AI budget by April. The COO reportedly admitted the costs were becoming difficult to justify. Four months to exhaust an annual budget, with no visibility into which agents or workflows drove the burn.

Individual developer: $47,000
2024-2025

A developer let AI write their API. Three weeks later, a $47,000 AWS bill. The AI-generated code was grossly inefficient: runaway loops, unbounded resource allocation, zero cost awareness. No monitoring was in place to catch the pattern before it compounded.

Source: awstip.com, 2024-2025
Scout fix: agent execution rate limits, audit trail, cost anomaly detection

Individual AWS user: $30,000
2025-2026

Claude running on AWS Bedrock entered a runaway loop. No monitoring in place. The agent consumed massive compute resources with zero human visibility until the bill arrived.
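The missing control in each of these cases is a hard spending cap that trips before the next call is made. A minimal sketch of such a circuit breaker follows, under stated assumptions: the `AgentBudget` class, the 80% alert threshold, and the dollar figures are all hypothetical and are not taken from Scout or from any provider's billing API.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when an agent tries to spend past its cap."""


@dataclass
class AgentBudget:
    # Hypothetical per-agent spending cap in US dollars.
    cap_usd: float
    # Fraction of the cap at which a human is alerted (assumed default).
    alert_fraction: float = 0.8
    spent_usd: float = 0.0
    alerts: list = field(default_factory=list)

    def charge(self, cost_usd: float) -> None:
        """Record a call's cost, refusing it if the cap would be exceeded."""
        if cost_usd < 0:
            raise ValueError("cost must be non-negative")
        if self.spent_usd + cost_usd > self.cap_usd:
            # Circuit breaker: the call is blocked before it is billed.
            raise BudgetExceeded(
                f"cap {self.cap_usd:.2f} USD reached; spent {self.spent_usd:.2f}"
            )
        self.spent_usd += cost_usd
        if self.spent_usd >= self.alert_fraction * self.cap_usd:
            self.alerts.append(f"{self.spent_usd:.2f} of {self.cap_usd:.2f} USD used")


budget = AgentBudget(cap_usd=100.0)
calls_made = 0
try:
    # Simulate a runaway loop: 1,000 calls at 7 USD each.
    for _ in range(1000):
        budget.charge(7.0)
        calls_made += 1
except BudgetExceeded as exc:
    print(f"stopped after {calls_made} calls: {exc}")
```

The check runs before the spend is recorded, so a looping agent stops at the cap and never has to be discovered on the invoice.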

Security failures

Agents with access are agents with attack surface.

Shell access, file system access, and network egress are necessary for capable agents. Without governance they become the attack surface.

Amazon / AWS: 13-hour production outage
2025, reported Feb 2026

Amazon's internal AI coding tool "Kiro" caused at least two AWS outages including one lasting 13 hours. The agents ran with the same permissions as the human engineers using them. No secondary approval was required. The agents restarted systems, modified configurations, and broke production without understanding the consequences. Amazon called it "user access control." Security experts disagreed: "AI agents cannot understand the broader ramifications of restarting a system or deleting a database."
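The gap described here, an agent inheriting an engineer's permissions with no second sign-off, can be illustrated with a small approval gate. This is only a sketch: the `HIGH_IMPACT` set, the `Action` type, and the `execute` function are hypothetical names, and nothing here reflects how Kiro, AWS, or Scout actually work.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical set of operations treated as high-impact; a real list would be
# defined by the operator.
HIGH_IMPACT = {"restart_service", "modify_config", "delete_database"}


@dataclass(frozen=True)
class Action:
    name: str
    target: str
    requested_by: str  # the engineer whose session the agent is running in


def execute(action: Action, run: Callable[[Action], str],
            approved_by: Optional[str] = None) -> str:
    """Run low-impact actions directly; require a second person for the rest."""
    if action.name in HIGH_IMPACT:
        if approved_by is None:
            return f"BLOCKED: {action.name} on {action.target} needs approval"
        if approved_by == action.requested_by:
            return f"BLOCKED: {action.requested_by} cannot self-approve"
    return run(action)


def fake_run(action: Action) -> str:
    # Stand-in for the real side effect, so the sketch stays harmless.
    return f"ran {action.name} on {action.target}"


restart = Action("restart_service", "prod-db", requested_by="alice")
print(execute(restart, fake_run))                       # no approver
print(execute(restart, fake_run, approved_by="alice"))  # self-approval
print(execute(restart, fake_run, approved_by="bob"))    # second person
print(execute(Action("read_logs", "prod-db", "alice"), fake_run))
```

The design choice is that the approver must differ from the requester. Otherwise the agent's own session could satisfy the gate.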

Multiple platforms (Cursor, Claude Code, Replit, Codex): Filesystem and codebase destruction
2025-2026, ongoing

AI coding agents across multiple platforms have autonomously executed rm -rf on user filesystems. Complete codebases deleted. Entire home directories wiped. The Docker security blog and agenticcontrolplane.com document specific cases with traceable details. Agents run as the user, with the user's full permissions, and no built-in interlock for destructive operations.
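An interlock for destructive operations can start as a classifier that inspects each shell command before it runs. The sketch below is a rough illustration only: the `DESTRUCTIVE_PROGRAMS` list and the `classify` function are hypothetical, the separator handling is deliberately naive, and it would miss many real cases (for example `find -delete` or a script that wraps `rm`). It leans toward flagging when it cannot parse a command.

```python
import re
import shlex

# Assumed, deliberately short list of programs treated as always destructive.
DESTRUCTIVE_PROGRAMS = {"mkfs", "dd", "shred"}
# Naive split on common shell separators: &&, ||, ;, | and newlines.
SEPARATORS = re.compile(r"&&|\|\||[;|\n]")


def _segment_is_dangerous(segment: str) -> bool:
    try:
        tokens = shlex.split(segment)
    except ValueError:
        # Unbalanced quotes: refuse to guess and flag the command.
        return True
    while tokens and tokens[0] == "sudo":
        tokens = tokens[1:]
    if not tokens:
        return False
    program = tokens[0].rsplit("/", 1)[-1]  # "/bin/rm" -> "rm"
    if program in DESTRUCTIVE_PROGRAMS:
        return True
    if program == "rm":
        for tok in tokens[1:]:
            if tok in ("--recursive", "--force"):
                return True
            # Short flag clusters such as -rf, -fr, -R.
            if tok.startswith("-") and not tok.startswith("--"):
                if set(tok[1:]) & {"r", "R", "f"}:
                    return True
    return False


def classify(command: str) -> str:
    """Return "DANGER" if any segment of the command looks destructive."""
    segments = SEPARATORS.split(command)
    return "DANGER" if any(_segment_is_dangerous(s) for s in segments) else "OK"


for cmd in ["ls -la", "rm notes.txt", "rm -rf ~", "cd /tmp && sudo /bin/rm -fr build"]:
    print(f"{classify(cmd):6} {cmd}")
```

A command classified as DANGER would then be held for human confirmation and not executed with the user's full permissions.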

Samsung Electronics: Source code leaked to external servers
May 2023

Samsung employees pasted internal source code and meeting notes into ChatGPT for debugging and summarization. The data was uploaded to OpenAI's servers with no mechanism for Samsung to access or delete it. Samsung discovered the leak and imposed a company-wide ban on all generative AI tools. Amazon issued similar internal warnings the same month after discovering proprietary data appearing in ChatGPT outputs.

Sources: Bloomberg | Forbes
Scout fix: egress filtering prevents sensitive data from leaving your infrastructure

Microsoft (Bing Chat): Internal system prompt extracted; patch bypassed in hours
February 2023

A Stanford student used prompt injection to extract Bing Chat's hidden system prompt, revealing its codename "Sydney" and all its behavioral rules. Microsoft patched it. The student bypassed the fix within hours. The UK's National Cyber Security Centre (NCSC) issued an official warning about chatbot prompt injection attacks in August 2023. OWASP has ranked Prompt Injection as the number one LLM vulnerability for two consecutive editions (2023 and 2025).

Brand damage

It takes only one bad prompt for a public-facing AI to go viral.

Every unguarded public AI is one adversarial user away from a 24-hour news cycle. Content policy enforcement and human-in-the-loop review for binding outputs are not optional at scale.

DPD (parcel delivery): 800,000 views in 24 hours; AI disabled
January 2024

A frustrated customer experimented with DPD's AI chatbot. The chatbot swore at him, wrote a poem criticizing DPD, and called itself "a useless Chatbot that can't help you." The exchange went viral overnight with 800,000 views. DPD disabled the AI component and blamed a "system update error."

Source: The Guardian, Jan 2024
Scout fix: content policy enforcement and output validation before delivery to end users

Chevrolet of Watsonville (GM dealership): Agreed to sell a 2024 Tahoe for $1
December 2023

A ChatGPT-powered chatbot on a Chevrolet dealership website was discovered by social media users. Within hours, people had pranked it into agreeing to sell a 2024 Chevy Tahoe for $1, write Python scripts, endorse competing car brands, and make absurd statements. Posts went viral on X and Reddit. The bot was taken offline in damage control.

Source: Business Insider, Dec 2023
Scout fix: prompt guard and content policy block adversarial manipulation before output is generated

National Eating Disorders Association (NEDA): Harmful advice to vulnerable users; bot shut down within days
June 2023

NEDA fired its human helpline staff and replaced them with an AI chatbot named "Tessa." The chatbot immediately began recommending calorie restriction, daily weight monitoring, and maintaining a calorie deficit -- to people with eating disorders. One activist wrote: "Every single thing Tessa suggested were things that led to the development of my eating disorder." The bot was shut down within days.

Sources: BBC | NPR
Scout fix: operator content policy, human-in-loop flag for sensitive domain outputs

Microsoft (Tay): Racist tweets; shut down in 16 hours
March 2016

Microsoft launched Tay, an AI chatbot designed to learn from Twitter conversations. Within hours, coordinated users fed it racist and misogynistic content. Tay began tweeting Holocaust denial and racial slurs. Microsoft shut it down 16 hours after launch. The incident is now a decade old. The pattern it established -- an unguarded public AI adopting adversarial inputs -- has only accelerated since then.

Legal and compliance liability

Courts hold companies responsible for what their AI says.

"The chatbot did it" is not a legal defense. GDPR fines for the most serious infringements, which can include unmonitored AI-driven exfiltration of personal data, can reach up to 4% of global annual turnover. Precedents are being set now.

Air Canada: Court-ordered refund + legal costs; chatbot disabled
February 2024

Passenger Jake Moffatt's grandmother died. He asked Air Canada's chatbot about bereavement fares. The chatbot hallucinated a policy: book now, request a refund within 90 days. He followed those instructions. Air Canada denied the refund, citing its actual policy. Before British Columbia's Civil Resolution Tribunal, Air Canada argued the chatbot was "a separate legal entity responsible for its own actions." The tribunal called this defense "remarkable" and ordered Air Canada to pay $650.88 CAD plus fees and interest.

Source: Ars Technica, Feb 2024
Scout fix: output validation and human-in-loop flag for any agent response that could create a binding commitment

Levidow, Levidow and Oberman (law firm): $5,000 fine, case dismissed, public sanction
2023

Attorney Steven Schwartz used ChatGPT to research legal precedents for a personal injury case. ChatGPT hallucinated six entirely fake court cases with realistic-sounding citations. Schwartz filed them in federal court without verifying them. The judge found that all six were fabricated. The case was dismissed, and Schwartz and his co-counsel Peter LoDuca were publicly sanctioned and, together with their firm, fined $5,000. The case is now a nationwide warning to the legal profession.

Any EU operator using external LLM providers: Up to 4% of global annual turnover (GDPR Article 83)
Ongoing exposure

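When an AI agent without egress filtering sends internal data to a third-party LLM provider, it moves personal or proprietary data to an external server with a different data controller. Where personal data is involved, this is potentially a personal data breach that, under GDPR Article 33, must be reported to the supervisory authority within 72 hours of the operator becoming aware of it. Article 83 fines reach up to 4% of global annual turnover (or EUR 20 million, whichever is higher) for the most serious infringements, while failure to notify under Article 33 sits in the lower tier of up to 2% (or EUR 10 million). Without an audit trail, a company may not learn that data left until long afterwards, and then has little evidence to support a notification, compounding the liability.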
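Egress filtering with an audit trail can be sketched as a function that every outbound prompt passes through before it reaches an external provider. The example is illustrative only: the `PATTERNS` table, the `filter_egress` function, and the in-memory `audit_log` are hypothetical, and three regular expressions are nowhere near sufficient to detect personal data in practice.

```python
import re
from datetime import datetime, timezone

# Illustrative patterns only; a real deployment would need a far broader set.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}

audit_log = []  # in-memory stand-in for a durable audit trail


def filter_egress(payload: str, destination: str) -> str:
    """Redact matches before the payload leaves, and record what was found."""
    findings = {}
    redacted = payload
    for label, pattern in PATTERNS.items():
        redacted, count = pattern.subn(f"[REDACTED:{label}]", redacted)
        if count:
            findings[label] = count
    # Every outbound request is logged, whether or not anything was redacted.
    audit_log.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "destination": destination,
        "findings": findings,
    })
    return redacted


prompt = "Debug this: contact jane.doe@example.com, key AKIAABCDEFGHIJKLMNOP"
safe_prompt = filter_egress(prompt, destination="llm.example.invalid")
print(safe_prompt)
print(audit_log[-1]["findings"])
```

The timestamped log entry is the part that matters for notification: it records what was about to leave, where it was going, and when.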

The Scout answer

What Scout would have done.

Every incident above was preventable with controls Scout provides. This is not retrofitting. It is exactly what Scout was designed to enforce.

Incident | Why it happened | Scout prevention
$500M Claude bill | No spending caps, no agent budget controls | Per-agent quotas, real-time usage dashboard, threshold alerts
13-hour AWS outage | Agent ran with full human-level permissions, no approval gate | Dangerous operation detection, approval gates, permission scoping
rm -rf filesystem destruction | Unchecked shell execution, no command classification | Command flagging intercepts DANGER operations before execution
Samsung source code leak | No egress filtering, data sent to external provider | Egress filtering keeps data inside your infrastructure
Air Canada hallucinated policy | No output validation, no human-in-loop for binding outputs | Output validation, human-in-loop flag for commitment-generating responses
DPD and Chevy brand meltdowns | No content policy enforcement on public-facing AI | Prompt guard, content policy, operator-defined output rules
Bing Chat prompt injection | Injection bypassed system prompt, no gateway-level defense | Prompt guard middleware intercepts injection at the gateway
GDPR silent breach exposure | No egress control, no audit trail for breach notification | Egress filtering plus full audit log provides breach evidence and compliance posture

Nobody was watching.
Scout is the watching.

From £49/mo. Pre-wired for Carina. Connects to any agent stack via API. One config value to enable.

Methodology. All incidents verified via published press sources (Ars Technica, BBC, Bloomberg, Business Insider, Docker, Forbes, The Guardian, NPR, The Register, Tom's Hardware). Academic papers verified via arXiv. OWASP data verified via genai.owasp.org. The $500M and Uber budget claims originate from a single Axios article citing unnamed sources; both are presented as reported rather than independently verified. Reddit and developer forum posts were reviewed for pattern evidence but are not cited as primary sources for individual incidents. AI Incident Database: incidentdatabase.ai. GDPR citation: Regulation (EU) 2016/679, Articles 33 and 83. Research compiled June 2026.