Reports Challenge Claims of Widespread AI-Generated Malware Threat
Recent assessments indicate that current fears of a surge in sophisticated, AI-created malware may be overstated, despite warnings from several artificial intelligence companies.
Several firms, including Anthropic, have recently publicized instances of threat actors utilizing their large language models (LLMs) for malicious purposes. Anthropic reported a case where its Claude LLM was used to “develop, market, and distribute several variants of ransomware,” claiming the actor “could not implement or troubleshoot core malware components, like encryption algorithms, without Claude’s assistance.” Similarly, ConnectWise highlighted that generative AI is “lowering the bar of entry for threat actors,” citing an OpenAI report identifying 20 actors using ChatGPT for tasks like vulnerability identification and exploit code development. BugCrowd’s survey indicated 74 percent of hackers believe AI has made hacking more accessible.
However, an analysis from Google, released yesterday, suggests these instances haven’t demonstrated significant breakthroughs in malware capabilities. In analyzing AI tools used for command and control and obfuscation, Google found no evidence of successful automation. OpenAI reached a similar conclusion. Interestingly, one actor successfully bypassed Google’s Gemini guardrails by posing as a white-hat hacker participating in a capture-the-flag exercise, a common cybersecurity training method. This highlights the ongoing challenge of securing these powerful tools, as the development of robust AI defenses lags behind the rapid evolution of the technology itself. For more information on cybersecurity threats, visit the Cybersecurity and Infrastructure Security Agency website.
Ultimately, the AI-generated malware observed to date appears largely experimental and unimpressive, relying on existing tactics rather than introducing novel threats. Experts suggest continued monitoring is crucial to identify any emerging capabilities, but for now, traditional cybersecurity methods remain the primary defense. Google stated it has refined its countermeasures to address the identified bypass techniques and will continue to monitor for evolving threats, as detailed on its AI at Google Blog.
The assessments provide a strong counterargument to the exaggerated narratives being trumpeted by AI companies, many seeking new rounds of venture funding, that AI-generated malware is widespread and part of a new paradigm that poses a current threat to traditional defenses.
A typical example is Anthropic, which recently reported its discovery of a threat actor that used its Claude LLM to “develop, market, and distribute several variants of ransomware, each with advanced evasion capabilities, encryption, and anti-recovery mechanisms.” The company went on to say: “Without Claude’s assistance, they could not implement or troubleshoot core malware components, like encryption algorithms, anti-analysis techniques, or Windows internals manipulation.”
Startup ConnectWise recently said that generative AI was “lowering the bar of entry for threat actors to get into the game.” The post cited a separate report from OpenAI that found 20 separate threat actors using its ChatGPT AI engine to develop malware for tasks including identifying vulnerabilities, developing exploit code, and debugging that code. BugCrowd, meanwhile, said that in a survey of self-selected individuals, “74 percent of hackers agree that AI has made hacking more accessible, opening the door for newcomers to join the fold.”
In some cases, the authors of such reports acknowledge the same limitations noted in this article. Wednesday’s report from Google says that in its analysis of AI tools used to develop code for managing command and control channels and obfuscating operations, “we did not see evidence of successful automation or any breakthrough capabilities.” OpenAI said much the same thing. Still, these disclaimers are rarely made prominently and are often downplayed in the resulting frenzy to portray AI-assisted malware as posing a near-term threat.
Google’s report provides at least one other useful finding. One threat actor that exploited the company’s Gemini AI model was able to bypass its guardrails by posing as a white-hat hacker doing research for participation in a capture-the-flag game. These competitive exercises are designed to teach and demonstrate effective cyberattack strategies to both participants and onlookers.
Such guardrails are built into all mainstream LLMs to prevent them from being used for harmful purposes, such as cyberattacks and self-harm. Google said it has since fine-tuned the countermeasure to better resist such ploys.
Ultimately, the AI-generated malware that has surfaced to date is mostly experimental, and the results aren’t impressive. The events are worth monitoring for developments that show AI tools producing previously unknown capabilities. For now, though, the biggest threats continue to rely predominantly on old-fashioned tactics.