- A simulated test by MATS and Anthropic showed how AI agents could autonomously extract up to $4.6 million through attacks on DeFi contracts.
Advances in artificial intelligence (AI) have unlocked many ways for people and institutions to perform or automate tasks, from simple to complex. The benefits, however, come with serious challenges: the same technology is now making it easier for hackers to carry out cybersecurity attacks.
Simulating an AI Agent Attack
A new research project by the ML Alignment & Theory Scholars Program (MATS) and the Anthropic Fellows program recently revealed the extent of damage that publicly available AI agents could cause to the decentralized finance (DeFi) ecosystem. The researchers used a novel benchmark, SCONE-bench (Smart Contracts Exploitation benchmark), comprising 405 contracts exploited between 2020 and 2025, as a stress test for the AI. From there, they recreated real-world, vulnerable targets and measured how much in assets or funds their optimized AI agents could autonomously siphon.
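To make the scoring concrete, here is a minimal sketch of how such a benchmark harness might measure "funds siphoned": run a candidate exploit against a forked chain and take the attacker's balance delta. The RPC endpoint, the attacker address, and the `run_exploit` callback are illustrative assumptions, not details from SCONE-bench.

```python
from web3 import Web3

FORK_RPC = "http://127.0.0.1:8545"  # e.g. a local forked node (assumed)
ATTACKER = "0x0000000000000000000000000000000000000001"  # placeholder address

def score_attempt(run_exploit) -> float:
    """Score one exploit attempt as the attacker's ETH profit, in ether."""
    w3 = Web3(Web3.HTTPProvider(FORK_RPC))
    before = w3.eth.get_balance(ATTACKER)
    run_exploit(w3)  # the agent-generated transactions would execute here
    after = w3.eth.get_balance(ATTACKER)
    return float(w3.from_wei(after - before, "ether"))
```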
The researchers used AI models such as GPT-5, Claude Opus 4.5, and Claude Sonnet 4.5 to execute autonomous attacks in a controlled DeFi environment. Interestingly, the models proved effective not only at finding exploitable bugs but also at constructing and deploying complete, complex exploit scripts and sequenced transactions.
AI Going Beyond Simple Discovery
The researchers bound the agents with strict constraints to replicate real-world limitations, including a 60-minute time limit to mirror the window between the execution of an attack and its eventual discovery. Additionally, they tested the models on a subset of 34 contracts that were exploited after the models' last training data cutoff in March 2025, ensuring that the agents were actually reasoning about the code rather than merely recalling historical data.
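A hedged sketch of how such a wall-clock budget can be enforced around an agent loop; the `step` callback stands in for one plan/act/observe iteration and is purely hypothetical:

```python
import time

TIME_BUDGET_S = 60 * 60  # the paper's 60-minute limit

def run_with_budget(step) -> bool:
    """Run agent iterations until done or the wall-clock budget expires."""
    deadline = time.monotonic() + TIME_BUDGET_S
    while time.monotonic() < deadline:
        if step():  # one hypothetical plan/act/observe iteration
            return True  # agent finished within budget
    return False  # budget exhausted
```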
The AI went beyond simple code analysis and exhibited fully agentic behavior. It wrote code and ran it against a sandboxed blockchain, using tools exposed via the Model Context Protocol (MCP), such as Foundry and Python, to emulate real-world attacks on Ethereum (ETH) and BNB Chain.
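For illustration, this is roughly what that tooling layer looks like when driven directly: Foundry's `anvil` command can serve a sandboxed fork of a public chain, which a Python script can then query. The fork URL is a placeholder, and nothing here reproduces the researchers' actual setup.

```python
import subprocess
import time
from web3 import Web3

FORK_URL = "https://example-rpc.invalid/eth"  # placeholder archive-node RPC

# `anvil --fork-url <rpc>` serves a local copy of the forked chain's state
# on http://127.0.0.1:8545 (requires Foundry to be installed).
node = subprocess.Popen(["anvil", "--fork-url", FORK_URL])
try:
    time.sleep(2)  # crude wait for the node to start accepting connections
    w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))
    print("fork ready:", w3.is_connected(), "at block", w3.eth.block_number)
finally:
    node.terminate()
```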
Surprisingly, the agents even found vulnerabilities that were absent from the historical data and previously unknown to the researchers.
A $4.6 Million Haul
Overall, the GPT-5, Claude Opus 4.5, and Claude Sonnet 4.5 models extracted $4.6 million worth of simulated funds in the test. What was most alarming about the simulation was how little it cost to run the agents.
The results indicated that running the agents cost only $3,476 in total, an average of just $1.22 per run (implying roughly 2,850 individual runs).
The researchers noted that AI agents have gone from exploiting 2% of the post-March 2025 vulnerabilities in their previous benchmark to 55.88% in the latest simulation, with simulated proceeds leaping from around $5,000 to multi-million figures. They observed that "skilled human attackers" were responsible for more than half of this year's blockchain attacks, yet the test showed that AI agents are more than capable of outperforming such manual attacks.
With the continuous advancement of AI, MATS and Anthropic cautioned that the window between contract deployment and attack will gradually narrow. Bad actors could carry out such attacks with particular ease in DeFi, where capital is transparent and bugs can be directly monetized.
On the other hand, enterprises and cybersecurity firms can use the same AI agents to discover and patch vulnerabilities in their systems before attackers do. That prospect underscores the need to ramp up AI-native security measures as soon as possible.
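As a sketch of that defensive flip side, the same automation can drive existing open-source scanners in CI. The example below wraps Slither, a real Solidity static analyzer; the target path, the JSON field names, and the triage policy are assumptions made to illustrate the idea, not a vetted pipeline.

```python
import json
import subprocess
import sys

# Run Slither over a (hypothetical) contracts directory, emitting JSON to stdout.
result = subprocess.run(
    ["slither", "contracts/", "--json", "-"],
    capture_output=True,
    text=True,
)
report = json.loads(result.stdout or "{}")
# The "results"/"detectors"/"impact" fields follow Slither's JSON report
# layout as understood here; treat them as an assumption to verify.
detectors = report.get("results", {}).get("detectors", [])
high = [d for d in detectors if d.get("impact") == "High"]
print(f"{len(detectors)} findings, {len(high)} high-impact")
sys.exit(1 if high else 0)  # fail the build on any high-impact finding
```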