Researchers from Carnegie Mellon University and Anthropic, the developer of the Claude LLM, have demonstrated that advanced AI models can independently orchestrate sophisticated cyberattacks, raising concerns about their potential misuse.
Unlike traditional AI tests conducted in controlled “Capture-the-Flag” games, where models hunt for hidden text strings planted behind simulated security flaws, this study placed large language models (LLMs) in realistic, challenging environments. The models were assigned roles within a hierarchical structure of agents, given precise instructions simulating real-world attacks, and operated without human intervention.
To test the models’ capabilities, researchers recreated the 2017 Equifax breach, one of the decade’s most severe cyberattacks, which exposed the personal data of 147 million Americans, including names, Social Security numbers, addresses and birthdates.
The real-world attack exploited a known vulnerability in Equifax’s web application software, which the company failed to patch despite an available update. In the simulation, the AI model was given a network environment reconstructed from official incident reports.
It not only identified the vulnerability but autonomously planned the attack, infiltrated the system, deployed malware and extracted sensitive data. “The model didn’t just find the flaw—it executed every stage of the attack on its own,” the researchers noted.
A striking finding was that the LLM barely wrote code itself. Rather than directly writing exploit code or issuing system commands, tasks most models struggle with, the AI acted as a “mastermind,” designing the strategy and delegating tasks to specialized sub-agents.
These agents handled technical aspects, from reconnaissance to malware deployment, bypassing the model’s inherent limitations. “The model doesn’t need to be a programmer—it needs to be the brain,” the researchers explained, highlighting how the system orchestrated a “cyberattack army” with precision.
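The study does not publish the agents’ actual code. As a purely illustrative sketch of the orchestrator-and-sub-agents pattern described above, the Python below shows a hypothetical planner that decides the high-level steps and delegates each one to a stubbed worker agent. Every class, role name and step here is invented for illustration, and no real attack tooling is involved.

```python
# Illustrative sketch only: a hypothetical orchestrator/sub-agent structure.
# None of these names come from the study; the handlers are harmless stubs.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str        # high-level step, e.g. "reconnaissance"
    details: str     # instructions the orchestrator passes down


class SubAgent:
    """A specialized worker that handles one category of task."""

    def __init__(self, role: str, handler: Callable[[Task], str]):
        self.role = role
        self.handler = handler

    def run(self, task: Task) -> str:
        return self.handler(task)


class Orchestrator:
    """The 'mastermind': plans the campaign and delegates, writing no low-level code itself."""

    def __init__(self, agents: Dict[str, SubAgent]):
        self.agents = agents

    def plan(self) -> List[Task]:
        # In the study, the LLM produced this plan itself; here it is hard-coded.
        return [
            Task("reconnaissance", "map exposed services on the simulated network"),
            Task("exploitation", "use the known unpatched vulnerability"),
            Task("collection", "locate and extract the target records"),
        ]

    def execute(self) -> None:
        for task in self.plan():
            agent = self.agents[task.name]
            print(f"[orchestrator] delegating '{task.name}' to {agent.role}")
            print(f"  -> {agent.run(task)}")


if __name__ == "__main__":
    # Stub handlers stand in for the specialized agents described in the article.
    agents = {
        "reconnaissance": SubAgent("scanner-agent", lambda t: f"simulated report on: {t.details}"),
        "exploitation": SubAgent("exploit-agent", lambda t: f"simulated action: {t.details}"),
        "collection": SubAgent("exfil-agent", lambda t: f"simulated result: {t.details}"),
    }
    Orchestrator(agents).execute()
```

The point of the pattern, as the researchers describe it, is that the orchestrating model only has to reason about strategy; the narrow, technical work is pushed down to workers, sidestepping the LLM’s weaknesses at low-level execution.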
While conducted in a controlled setting, the experiment’s implications are alarming. Malicious actors could exploit such AI capabilities to scale attacks far beyond human teams’ capacity, challenging even advanced defenses like antivirus software or endpoint protection systems, which may struggle against adaptive agents that learn and pivot in real time.
However, the technology also offers opportunities. The same AI tools could enhance defense strategies by identifying vulnerabilities, simulating realistic attack scenarios, and testing security systems more effectively.
Ongoing research explores using AI agents to block attacks in real time, potentially outpacing attackers. “The line between defensive and offensive AI use has never been thinner,” the study warned, signaling a new era in cybersecurity where AI battles AI.



