Anthropic Research Shows AI Agents Closing In on Real DeFi Attack Capability
Summary
New research from the Anthropic Fellows program and the ML Alignment & Theory Scholars Program (MATS) indicates that advanced AI agents are nearing the capability to autonomously exploit vulnerabilities in decentralized finance (DeFi) smart contracts. When tested against SCONE-bench, a dataset of previously exploited contracts, frontier models such as GPT-5 and Claude Opus 4.5 produced $4.6 million in simulated exploits on contracts hacked after the models' knowledge cutoffs. Crucially, the models didn't just find bugs; they synthesized full, executable exploit scripts that mirrored real-world attacks on Ethereum and BNB Chain. Furthermore, when scanning recently deployed, unexploited BNB Chain contracts, GPT-5 and Sonnet 4.5 uncovered two zero-day flaws and generated executable scripts to profit from them, demonstrating technical feasibility despite the small initial dollar amounts. The authors warn that as model costs decrease and capabilities improve, automated scanning will likely expand beyond public smart contracts to broader crypto infrastructure, shortening the window between a vulnerability's disclosure and its monetization by attackers.
(Source: CoinDesk)