‘Replacing humans is not close’: BlockSec challenges EVMBench on AI auditing
Summary
Researchers at BlockSec have challenged the findings of EVMBench, a benchmark for AI smart contract auditing developed by OpenAI and Paradigm, arguing that its initial results were overly optimistic. EVMBench reported high success rates in detecting and exploiting vulnerabilities, but BlockSec’s re-testing, which added more model configurations and real-world attack incidents, showed a 0% exploit success rate. The researchers attribute the discrepancy to potential issues with the original testing conditions, including data contamination and a limited range of model configurations. BlockSec found that AI agents reliably detect well-known vulnerability patterns but struggle with novel ones, which underscores the continued need for human judgment in auditing. In their view, the future of smart contract auditing lies in human-AI collaboration, with AI handling broad scans and humans providing in-depth analysis and adversarial reasoning.
(Source: The Block)