DARPA announces $4 million winner of AI code review competition at DEF CON
LAS VEGAS — The U.S. Defense Department announced the winner of its two-year competition among researchers to create the best artificial intelligence systems that can find and fix vulnerabilities.
The winner, known as Team Atlanta, was announced on Friday at the DEF CON cybersecurity conference and is composed of tech experts from Georgia Tech, Samsung Research, the Korea Advanced Institute of Science & Technology (KAIST) and the Pohang University of Science and Technology (POSTECH).
“The world is different because the AI Cyber Challenge (AIxCC) has fundamentally changed our understanding of what is possible in terms of automatically finding, but more importantly, fixing vulnerabilities in software,” said AIxCC Program Manager Andrew Carney.
The two-year AIxCC competition was run through the Defense Advanced Research Projects Agency (DARPA) and pitted dozens of teams against each other in a contest to see who could use AI to create systems that can automatically secure the critical code that undergirds prominent systems used across the globe.
The seven semifinal winners were announced at last year’s DEF CON and were awarded $2 million each to continue their work into the final round.
Taesoo Kim, a professor at Georgia Tech and leader of Team Atlanta, said his group was a mix of security researchers like himself, as well as engineers and programmers.
Kim imagined a future where developers effectively have an AI agent with them that can serve as a de facto security expert, offering proactive advice and feedback on code from its conception.
Trail of Bits, a New York City-based cybersecurity firm, won second place, and Theori, comprising AI researchers and security professionals in the U.S. and South Korea, won third place.
The top three teams will receive $4 million, $3 million and $1.5 million, respectively. Kim said his team decided to donate a large portion of their winnings back to Georgia Tech so that they can continue to perform their research.
Carney lauded all of the participants for successfully demonstrating that novel autonomous systems using AI could competently find and patch vulnerabilities.
“Quality patching is a crucial accomplishment that demonstrates the value of combining AI with other cyber defense techniques,” Carney said. “What’s more, we see evidence that the process of a cyber reasoning system finding a vulnerability may empower patch development in situations where other code synthesis techniques struggle.”
DARPA and other U.S. government agencies also added $1.4 million in prizes for the other teams that competed in the final round, in an effort to help them make their systems usable for real-world critical infrastructure organizations. Carney said the $1.4 million will be made available to teams that demonstrate they’ve actually deployed their technology into critical infrastructure projects.
Traditional Team Atlanta
The final competition saw teams attempt to find synthetic vulnerabilities buried in 54 million lines of code and generate patches for them. Teams were judged based on the ability of their systems to create patches for the bugs that were found.
DARPA officials said Team Atlanta “performed best at finding and proving vulnerabilities, generating patches, pairing vulnerabilities and patches, and scoring with the highest rate of accurate and quality submissions.”
Carney was tight-lipped on specifically why Team Atlanta won the competition, telling Recorded Future News that more information would be released at a later date explaining the decision.
Kim said his team’s system married more traditional threat hunting tools with AI, somewhat separating it from other teams that leaned more heavily on artificial intelligence.
“There is a huge value in traditional software analysis tools that we’ve been working with over the last decade,” he said.
“AI can leverage those tools in terms of navigating the source code. AI increases the bar significantly for the team, and giving up on traditional tools is not the way to go.”
Overall, competitors found 54 unique synthetic vulnerabilities, representing 77% of the synthetic vulnerabilities introduced, and were able to patch 43 of them. In the semifinal competition last year, just 37% were found.
Leveraging it for healthcare
The AIxCC competition saw the Defense Department partner with the Health and Human Services Department (HHS) as well as AI companies like Anthropic, Google and OpenAI — each of which provided technical support and $350,000 in large language model credits. Microsoft and the Linux Foundation’s Open Source Security Foundation also provided assistance to the challenge’s organizers.
DARPA Director Stephen Winchell told the DEF CON audience that the agency is releasing four of the seven cyber reasoning systems immediately, making the tools available for cyber defenders. The other three will be released in the coming weeks.
“Finding vulnerabilities and patching codebases using current methods is slow, expensive, and depends on a limited workforce – especially as adversaries use AI to amplify their exploits,” Winchell said. “AIxCC-developed technology will give defenders a much-needed edge in identifying and patching vulnerabilities at speed and scale.”
HHS officials said they are eager to deploy the systems in an effort to immediately address vulnerabilities that impact the healthcare system. Advanced Research Projects Agency for Health (ARPA-H) senior official Jennifer Roberts added that she was most excited by the results of the competition because she believes the tools can “move us toward a reality where ransomware attacks across hospitals become a thing of the past.”
Jim O'Neill, deputy HHS secretary, told DEF CON that last year’s ransomware attack on healthcare giant Ascension likely cost up to $1.6 billion “in operational paralysis, lost revenue and recovery efforts.”
DARPA said it plans to release other data from the competition to promote the use of AI as a pivotal tool for vulnerability discovery in other critical infrastructure industries.
AI code review has become a major effort by a number of tech giants, with both Microsoft and Google announcing recent initiatives that have borne fruit in terms of discovering bugs.
Kim told reporters that the cybersecurity community may benefit most by combining many of the competitors’ systems to leverage the best aspects of each one.
“If we can combine all these AI agents together, we’re going to see a ridiculously high performing system,” he said. “We can design an even more powerful one.”
Jonathan Greig is a Breaking News Reporter at Recorded Future News. Jonathan has worked across the globe as a journalist since 2014. Before moving back to New York City, he worked for news outlets in South Africa, Jordan and Cambodia. He previously covered cybersecurity at ZDNet and TechRepublic.