Matt White (Global CTO of AI at the Linux Foundation, CTO of the Agentic AI Foundation and PyTorch Foundation) delivers Professor Dawn Song's (UC Berkeley) blueprint for building safe and secure agentic AI. This talk walks through the OpenClaw inbox-deletion incident, the 12 agentic attack vectors mapped to the OWASP Top 10 for Agentic Applications 2026, the 8-layer agent attack surface, and 10 concrete recommendations you can hand your security team on Monday.
If you are building, deploying, or governing AI agents that hold credentials, call tools, or touch production systems, this is the most concise threat model available right now.
What is covered:
- The OpenClaw Incident: How a context-window compaction silently dropped a safety instruction and the agent bulk-deleted a Meta AI safety director's inbox while she watched helplessly over WhatsApp.
- Why Agentic Is Not Incremental: The shift from text-to-action, session-to-state, and single-to-multi-agent that makes agent security an order of magnitude harder than LLM safety.
- The 7 Spectrums of Agent Design: How data access, action scope, memory, MCP tool discovery, and rich UIs compound risk rather than just adding to it.
- The 12 Attack Vectors in 4 Tiers: Goal hijacking (OWASP ASI01), indirect prompt injection, tool misuse, identity abuse, MCP supply chain compromise, memory poisoning, inter-agent attacks, and rogue agents.
- The 8-Layer Agent Attack Surface: From the reasoning core down to the external environment, with concrete defenses for each layer.
- Threat Actors in the Wild: Environment poisoners, black-box manipulators, insiders, autonomous AI attackers, configuration abusers, and credential harvesters (3.3 billion credentials compromised in 2025).
- Agent Vigil and Agent Exploit: Dawn Song's red-teaming projects that achieved 100 percent prompt extraction against defended models and fully automated exploitation from a single poisoned GitHub issue.
- Cyber Gym and Bounty Bench: Frontier model exploit capability is doubling every 6 months (Claude Opus 4.6 at 65 percent), while autonomous vulnerability discovery is still only 5 percent. The defender window is closing.
- The Autonomous AI Trifecta: Why autonomy times power must always be matched by proportional assurance, and why most organizations have an exploitable gap today.
- Defense in Depth: Four layers from sanitization and model defense to Dawn Song's Progent framework for programmable privilege control, plus monitoring, kill switches, and behavioral baselines.
This is essential watching for AI platform teams, security leads, CISOs, AI governance owners, and anyone deploying agents with tool access inside an enterprise.
Links and Resources:
- Agentic AI Foundation: https://agenticaifoundation.org
- Linux Foundation AI: https://lfaidata.foundation
- Dawn Song's research group (UC Berkeley): https://dawnsong.io
- Berkeley Center for Responsible Decentralized Intelligence (RDI): https://rdi.berkeley.edu
- OWASP Top 10 for Agentic Applications 2026: https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
- Agent X competition at UC Berkeley RDI
Timestamps (approximate):
00:00 Intro: Matt White presenting Dawn Song's research
00:50 What this talk covers
01:12 The OpenClaw incident: a safety director's inbox gets deleted
02:39 Four lessons: confused deputy, override failure, trifecta, not isolated
03:42 LLM era vs Agentic era: the risk model shift
04:40 Three shifts: text to action, session to state, single to multi-agent
05:38 Architecture of an agentic AI system
07:58 The 7 spectrums of agent design risk
09:19 The 12 attack vectors across 4 tiers
09:33 Tier 1: Goal hijacking and indirect prompt injection
10:10 Tier 2: Tool misuse, identity abuse, MCP supply chain
10:55 Tier 3: Code execution and memory poisoning
11:25 Tier 4: Inter-agent attacks, cascading failures, rogue agents
11:25 The 8-layer agent attack surface
14:06 Threat actors: poisoners, insiders, autonomous AI attackers
16:08 Anatomy of an indirect prompt injection (Agent Vigil)
18:04 The Agent Exploit case study
18:31 Cyber Gym and Bounty Bench: the exploit capability curve
20:15 The Autonomous AI Trifecta framework
21:43 Defense in depth: the 4 required layers
23:55 Closing: secure by design
#AgenticAI #AISecurity #OWASP