Documented damage

The AI Agent Incident Record.

14 documented incidents where AI agents operating without oversight caused real financial, security, and brand damage. Every claim is sourced. Every source is linked. No hyperbole required.

Sources include The Guardian, Bloomberg, Ars Technica, Forbes, BBC, Docker, OWASP, and Axios. The $500M and Uber budget claims originate from a single Axios article citing unnamed sources and should be treated as reported, not independently verified. All other incidents cite press-verified sources, and those involving a company name it. Full methodology at the bottom of this page.

$500M+

single-incident financial loss, reported May 2026

13 hrs

AWS downtime from one AI coding agent, 2025

1,000+

incidents in the public AI Incident Database

#1

Prompt Injection in the OWASP LLM Top 10, two editions running

Financial damage

The bills that arrived without warning.

AI agents run 24/7. They loop, retry, and chain calls without human awareness. Without budget controls, they are an open meter with no circuit breaker.
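A budget control of the kind described here can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: the `AgentBudget` class, the `BudgetExceeded` exception, and the 80% alert threshold are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a call would push an agent past its spending cap."""


@dataclass
class AgentBudget:
    """Hypothetical per-agent spending cap (illustrative, not a real API)."""
    cap_usd: float
    alert_fraction: float = 0.8  # assumed threshold: alert at 80% of the cap
    spent_usd: float = 0.0

    def charge(self, cost_usd: float) -> str:
        """Record a call's cost, or refuse it if it would breach the cap."""
        if self.spent_usd + cost_usd > self.cap_usd:
            raise BudgetExceeded(
                f"call of ${cost_usd:.2f} would exceed cap of ${self.cap_usd:.2f}"
            )
        self.spent_usd += cost_usd
        if self.spent_usd >= self.alert_fraction * self.cap_usd:
            return "alert"
        return "ok"


def run_agent_loop(budget: AgentBudget, cost_per_call: float, max_calls: int) -> int:
    """Simulate a looping agent; return how many calls ran before the breaker tripped."""
    calls = 0
    for _ in range(max_calls):
        try:
            budget.charge(cost_per_call)
        except BudgetExceeded:
            break  # the circuit breaker: stop instead of retrying forever
        calls += 1
    return calls
```

The point of the design is that the check happens before the spend, so a runaway loop stops at the cap rather than at the invoice.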

Enterprise (via Axios, unnamed) $500,000,000

May 2026

According to Axios, citing unnamed sources, an enterprise deployed Claude across thousands of employees. No spending caps. No usage monitoring. No agent budget controls. Developers ran long autonomous coding sessions. AI agents executed chained workflows around the clock. The invoice reportedly arrived one month later: half a billion dollars.

Uber Entire 2026 AI budget consumed by April

April 2026

According to Axios, Uber's heavy adoption of AI coding products burned through its entire 2026 AI budget by April. The COO reportedly admitted the costs were becoming difficult to justify. Four months to exhaust an annual budget, with no visibility into which agents or workflows drove the burn.

Individual developer $47,000

2024-2025

A developer let AI write their API. Three weeks later, a $47,000 AWS bill. The AI-generated code was grossly inefficient: runaway loops, unbounded resource allocation, zero cost awareness. No monitoring was in place to catch the pattern before it compounded.

Individual AWS user $30,000

2025-2026

Claude running on Amazon Bedrock entered a runaway loop. No monitoring was in place. The agent consumed massive compute resources with zero human visibility until the bill arrived.

Security failures

Agents with access are agents with attack surface.

Shell access, file system access, and network egress are necessary for capable agents. Without governance, they become the attack surface.

Amazon / AWS 13-hour production outage

2025, reported Feb 2026

Amazon's internal AI coding tool "Kiro" caused at least two AWS outages, including one lasting 13 hours. The agents ran with the same permissions as the human engineers using them. No secondary approval was required. The agents restarted systems, modified configurations, and broke production without understanding the consequences. Amazon called it a matter of "user access control." Security experts disagreed: "AI agents cannot understand the broader ramifications of restarting a system or deleting a database."
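The two missing controls in this incident, scoped permissions and a secondary approval step, can be sketched as a single check. Everything here is an assumption made for illustration: the action names, the `execute` function, and the `approver` callback are hypothetical and do not describe Amazon's tooling or any product's API.

```python
from typing import Callable

# Hypothetical action names, chosen for this example only.
DESTRUCTIVE_ACTIONS = {"restart_service", "delete_database", "modify_config"}


def execute(action: str, agent_scopes: set[str],
            approver: Callable[[str], bool]) -> str:
    """Run an action only if it is in scope and, when destructive, approved."""
    # Permission scoping: the agent gets its own scopes, not the engineer's.
    if action not in agent_scopes:
        return "denied: out of scope"
    # Approval gate: destructive actions need a second decision-maker.
    if action in DESTRUCTIVE_ACTIONS and not approver(action):
        return "blocked: approval required"
    return "executed"
```

In practice the `approver` would be a human reviewer or a ticketing step. For example, `execute("restart_service", {"restart_service"}, lambda a: False)` stays blocked until someone approves it.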

Multiple platforms (Cursor, Claude Code, Replit, Codex) Filesystem and codebase destruction

2025-2026, ongoing

AI coding agents across multiple platforms have autonomously executed rm -rf on user filesystems. Complete codebases deleted. Entire home directories wiped. The Docker security blog and agenticcontrolplane.com document specific cases with traceable details. Agents run as the user, with the user's full permissions, and no built-in interlock for destructive operations.
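An interlock for destructive operations can start as a classifier that runs before any shell command is executed. The sketch below is a deliberately crude assumption, not a complete defense: the `DANGEROUS_PROGRAMS` list is illustrative and incomplete, and the function flags a command if any token names a listed program, so it errs toward false positives (it would also flag `echo rm`).

```python
import os
import shlex

# Illustrative and incomplete list of destructive programs (an assumption).
DANGEROUS_PROGRAMS = {"rm", "mkfs", "dd", "shred"}


def classify_command(command: str) -> str:
    """Return "DANGER" or "OK" for a shell command string."""
    # punctuation_chars=True splits on shell operators such as ; && | so that
    # "git status;rm -rf ." yields "rm" as its own token.
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    try:
        tokens = list(lexer)
    except ValueError:
        return "DANGER"  # unparseable input (e.g. unclosed quote) is treated as unsafe
    for token in tokens:
        # basename catches absolute paths such as /bin/rm
        if os.path.basename(token) in DANGEROUS_PROGRAMS:
            return "DANGER"
    return "OK"
```

A command classified as DANGER would then be held for human confirmation instead of being run with the user's full permissions.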

Samsung Electronics Source code leaked to external servers

May 2023

Samsung employees pasted internal source code and meeting notes into ChatGPT for debugging and summarization. The data was uploaded to OpenAI's servers with no mechanism for Samsung to access or delete it. Samsung discovered the leak and imposed a company-wide ban on all generative AI tools. Amazon had issued similar internal warnings in January 2023 after discovering proprietary data appearing in ChatGPT outputs.
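The simplest form of egress control is a host allowlist checked before any outbound request leaves the network. This is a minimal sketch under stated assumptions: the hostnames in `ALLOWED_HOSTS` are placeholders, and `egress_allowed` is a hypothetical helper, not part of any real gateway.

```python
from urllib.parse import urlsplit

# Placeholder internal hosts (assumptions); replace with your own.
ALLOWED_HOSTS = frozenset({"llm.internal.example", "git.internal.example"})


def egress_allowed(url: str, allowed_hosts: frozenset[str] = ALLOWED_HOSTS) -> bool:
    """Return True only if the URL's host is on the allowlist."""
    host = urlsplit(url).hostname  # None when the string has no network location
    return host is not None and host in allowed_hosts
```

Under this rule, a paste destined for `https://api.openai.com/...` is refused, while traffic to an internally hosted model passes.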

Microsoft (Bing Chat) Internal system prompt extracted; patch bypassed in hours

February 2023

A Stanford student used prompt injection to extract Bing Chat's hidden system prompt, revealing its codename "Sydney" and all its behavioral rules. Microsoft patched it. The student bypassed the fix within hours. The UK's National Cyber Security Centre (NCSC) issued an official warning about chatbot prompt injection attacks in August 2023. OWASP has ranked Prompt Injection as the number one LLM vulnerability for two consecutive editions (2023 and 2025).

Brand damage

Public-facing AI needs only one bad prompt to go viral.

Every unguarded public AI is one adversarial user away from a 24-hour news cycle. Content policy enforcement and human-in-the-loop review for binding outputs are not optional at scale.
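Human-in-the-loop review for binding outputs needs some way to decide which replies look binding. The sketch below uses a keyword heuristic purely for illustration: the patterns in `COMMITMENT_PATTERNS` and the `needs_human_review` function are assumptions, and a production system would need far more than a handful of regular expressions.

```python
import re

# Illustrative patterns for replies that may commit the company to something.
COMMITMENT_PATTERNS = [
    re.compile(r"\brefund\b", re.IGNORECASE),
    re.compile(r"\blegally binding\b", re.IGNORECASE),
    re.compile(r"\bwe (will|can) (sell|offer|guarantee)\b", re.IGNORECASE),
    re.compile(r"[$£€]\s?\d"),  # any stated price or amount
]


def needs_human_review(reply: str) -> bool:
    """Return True if a draft chatbot reply should be held for a human."""
    return any(pattern.search(reply) for pattern in COMMITMENT_PATTERNS)
```

A reply such as "You can request a refund within 90 days." would be held for review, while "Our office opens at 9am." would go straight out.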

DPD (parcel delivery) 800,000 views in 24 hours; AI disabled

January 2024

A frustrated customer experimented with DPD's AI chatbot. The chatbot swore at him, wrote a poem criticizing DPD, and called itself "a useless Chatbot that can't help you." The exchange went viral overnight with 800,000 views. DPD disabled the AI component and blamed a "system update error."

Chevrolet of Watsonville (GM dealership) Agreed to sell a 2024 Tahoe for $1

December 2023

A ChatGPT-powered chatbot on a Chevrolet dealership website was discovered by social media users. Within hours, people had pranked it into agreeing to sell a 2024 Chevy Tahoe for $1, write Python scripts, endorse competing car brands, and make absurd statements. Posts went viral on X and Reddit. The bot was taken offline in damage control.

National Eating Disorders Association (NEDA) Harmful advice to vulnerable users; bot shut down within days

June 2023

NEDA fired its human helpline staff and replaced them with an AI chatbot named "Tessa." The chatbot immediately began recommending calorie restriction, daily weight monitoring, and maintaining a calorie deficit -- to people with eating disorders. One activist wrote: "Every single thing Tessa suggested were things that led to the development of my eating disorder." The bot was shut down within days.

Microsoft (Tay) Racist tweets; shut down in 16 hours

March 2016

Microsoft launched Tay, an AI chatbot designed to learn from Twitter conversations. Within hours, coordinated users fed it racist and misogynistic content, and Tay began tweeting Holocaust denial and racial slurs. Microsoft shut it down 16 hours after launch. The incident is now a decade old. The pattern it established -- an unguarded public AI adopting adversarial inputs -- has only accelerated since then.

Legal and compliance liability

Courts hold companies responsible for what their AI says.

"The chatbot did it" is not a legal defense. GDPR fines for unmonitored AI-driven data exfiltration can reach €20 million or 4% of global annual turnover, whichever is higher (Article 83). Precedents are being set now.

Air Canada Court-ordered refund + legal costs; chatbot disabled

February 2024

Passenger Jake Moffatt's grandmother died. He asked Air Canada's chatbot about bereavement fares. The chatbot hallucinated a policy: book now, request a refund within 90 days. He followed those instructions. Air Canada denied the refund, citing its actual policy. Before British Columbia's Civil Resolution Tribunal, Air Canada argued in effect that the chatbot was "a separate legal entity that is responsible for its own actions." The tribunal called this submission "remarkable" and ordered Air Canada to pay $650.88 CAD plus fees and interest.

Levidow, Levidow and Oberman (law firm) $5,000 fine, case dismissed, public sanction

2023

Attorney Steven Schwartz used ChatGPT to research legal precedents for a personal injury case. ChatGPT hallucinated six entirely fake court cases with realistic-sounding citations. The citations were filed in federal court without verification. The judge discovered all six were fabricated. The case was dismissed, and Schwartz, co-counsel Peter LoDuca, and their firm were publicly sanctioned and fined $5,000. The case is now a nationwide warning to the legal profession.

The Scout answer

What Scout would have done.

Every incident above was preventable with controls Scout provides. This is not retrofitting. It is exactly what Scout was designed to enforce.

Incident	Why it happened	Scout prevention
$500M Claude bill	No spending caps, no agent budget controls	Per-agent quotas, real-time usage dashboard, threshold alerts
13-hour AWS outage	Agent ran with full human-level permissions, no approval gate	Dangerous operation detection, approval gates, permission scoping
rm -rf filesystem destruction	Unchecked shell execution, no command classification	Command flagging intercepts DANGER operations before execution
Samsung source code leak	No egress filtering, data sent to external provider	Egress filtering keeps data inside your infrastructure
Air Canada hallucinated policy	No output validation, no human-in-loop for binding outputs	Output validation, human-in-loop flag for commitment-generating responses
DPD and Chevy brand meltdowns	No content policy enforcement on public-facing AI	Prompt guard, content policy, operator-defined output rules
Bing Chat prompt injection	Injection bypassed system prompt, no gateway-level defense	Prompt guard middleware intercepts injection at the gateway
GDPR silent breach exposure	No egress control, no audit trail for breach notification	Egress filtering plus full audit log provides breach evidence and compliance posture

Nobody was watching.
Scout is the watching.

From £49/mo. Pre-wired for Carina. Connects to any agent stack via API. One config value to enable.


Methodology. All incidents verified via published press sources (Ars Technica, BBC, Bloomberg, Business Insider, Docker, Forbes, The Guardian, NPR, The Register, Tom's Hardware). Academic papers verified via arXiv. OWASP data verified via genai.owasp.org. The $500M and Uber budget claims originate from a single Axios article citing unnamed sources; both are presented as reported rather than independently verified. Reddit and developer forum posts were reviewed for pattern evidence but are not cited as primary sources for individual incidents. AI Incident Database: incidentdatabase.ai. GDPR citation: Regulation (EU) 2016/679, Articles 33 and 83. Research compiled June 2026.
