Anthropic study says AI agents developed $4.6M worth of smart contract exploits
Summary
Research by Anthropic's red team and Machine Learning Alignment & Theory Scholars (MATS) revealed that current commercial AI models, including Anthropic's Claude Opus 4.5, Claude Sonnet 4.5, and OpenAI's GPT-5, are highly capable of exploiting smart contracts. When tested against historical vulnerabilities present in their training data, these models collectively developed exploits worth $4.6 million. Furthermore, testing on 2,849 recently deployed contracts uncovered two novel zero-day vulnerabilities, yielding exploits worth $3,694 and demonstrating that profitable, autonomous exploitation is technically feasible.
The researchers also created the Smart Contracts Exploitation (SCONE) benchmark, on which 10 models collectively produced exploits for 207 contracts, representing $550.1 million in simulated losses. The study highlights rapid improvement in AI hacking capabilities: over one year, the share of vulnerabilities successfully exploited jumped from 2% to 55.88%, translating to a massive increase in potential exploit revenue. The research also suggests that the cost (in tokens) and time required for an AI to produce an exploit are both decreasing, shrinking the window developers have to patch vulnerabilities before they are exploited.
(Source: Cointelegraph)