DARPA Cyber Grand Challenge recap: How bots will help lift the security game

Johanna Curiel, Co-founder, Ossecsoft

This year, DEF CON 24 attendees in Las Vegas had the opportunity to witness the conclusion of the DARPA Cyber Grand Challenge: a fierce competition, begun in 2013, among 100 cyber reasoning systems (CRSs). The top seven CRS finalists exhibited some amazing skills programmed by their creators. DEF CON attendees experienced the last hours of the challenge through a visual display showing how these machines were finding, attacking, and patching bugs all by themselves.

The indisputable winner, Mayhem, a CRS created by ForAllSecure of Pittsburgh, displayed very interesting behavior. After performing well in the first part of the event, it crashed and did nothing for most of the last part of the competition, until it recovered, seemingly from nowhere, and got back into the game. In fact, Mayhem had performed so well that it had accumulated enough points that the other bots could not catch up even after it stopped competing.

Despite Mayhem's enormous edge over the other competitors, it was not alone in putting in a remarkable performance. For example, Mechanical Phish, the third-place bot, was the only contestant to discover the Crackaddr bug. The Mechanical Phish team, called Shellphish, consisted mostly of international students from the University of California, Santa Barbara, but also included a high school whiz kid not even old enough to drink. Xandra, which finished in second place, found a bug that the DARPA challenge designers didn't even know about. Xandra was built by the TECHx team of Ithaca, New York, and Charlottesville, Virginia. Rubeus, from a team called Deep Red in Arlington, Virginia, had a promising beginning but fell to fourth place because of a patch that damaged the performance of the system it was trying to secure.

Here are the top takeaways from the challenge.

Bots might be able to strategize

This event proved the amazing potential of automated systems and their capabilities for autonomous reasoning. During the 97-round challenge, the bots behaved in unpredictable ways. Mayhem, for example, was the only bot built with the capability of weighing the risk of patching against performance, and that may have helped it avoid Rubeus' fate. Part of the challenge involved patching vulnerabilities on the systems that the bots had to safeguard, but patching required taking a system down, which carries a risk of a crash and therefore a high chance of losing points. Mayhem, then, may have remained silent at the end of the competition because it concluded that the risks of patching or being hacked by the other bots were too high.
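The trade-off described above can be illustrated with a toy expected-loss comparison. This is only a sketch under stated assumptions, not Mayhem's actual logic, which the article does not describe: the `PatchCandidate` fields, the point values, and the `should_patch` rule are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PatchCandidate:
    """Hypothetical estimates a CRS might hold for one candidate patch."""
    exploit_probability: float  # chance per round that a rival exploits the unpatched service
    exploit_loss: float         # points lost in a round where the service is exploited
    overhead_loss: float        # points lost per round to the patch's performance overhead
    break_probability: float    # chance the patch breaks the service's functionality
    break_loss: float           # points lost per round if the patched service is broken
    downtime_loss: float        # one-time points lost while the service is taken down


def expected_loss_unpatched(c: PatchCandidate, rounds_left: int) -> float:
    """Expected points lost by leaving the service as it is."""
    return c.exploit_probability * c.exploit_loss * rounds_left


def expected_loss_patched(c: PatchCandidate, rounds_left: int) -> float:
    """Expected points lost by deploying the patch now."""
    per_round = c.overhead_loss + c.break_probability * c.break_loss
    return c.downtime_loss + per_round * rounds_left


def should_patch(c: PatchCandidate, rounds_left: int) -> bool:
    """Patch only when doing so is expected to cost fewer points than waiting."""
    return expected_loss_patched(c, rounds_left) < expected_loss_unpatched(c, rounds_left)


if __name__ == "__main__":
    candidate = PatchCandidate(
        exploit_probability=0.5, exploit_loss=10.0,
        overhead_loss=1.0, break_probability=0.1, break_loss=10.0,
        downtime_loss=5.0,
    )
    for rounds_left in (20, 1):
        print(rounds_left, should_patch(candidate, rounds_left))
```

With these made-up numbers, patching pays off when many rounds remain, because the one-time downtime cost is spread over a long horizon. With a single round left, the downtime cost outweighs the expected exploit loss, so the rule recommends doing nothing. That is one plausible reading of why a bot might go quiet late in a game.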

The Mechanical Phish team also indicated that their creation had become independent of them. On the last day of DEF CON, the Shellphish team told the story of their bot's development in great detail. Racing against time, the team was making last-minute code changes but finally became frustrated with dealing with test cases and verifying patching capabilities. At that point, they decided to cross their fingers and let the CRS make its own decisions—as they explained it, it was like raising a child and finally hoping that all you have taught her will be sufficient as she confronts problems by herself.

Bots reflect their creators

The creators, maybe without being aware of it, put human characteristics into their bots that seemed to reflect themselves. Mayhem's programmers, all experienced cybersecurity researchers, used a more systematic and methodical strategy than the young Shellphish team. Mayhem exhibited cool and calculating behavior during the competition—it made one think of HAL 9000 from 2001: A Space Odyssey. If Mechanical Phish reminded anyone of a movie robot, it might have been those from WALL-E or Chappie.

The components used in the CRSs are also interesting. The Mechanical Phish team based their programming on an open-source binary analysis framework being developed at UC Santa Barbara, called angr, and a fuzzer called Driller, as well as other open-source components. Xandra used Peasoup, an automatic software-hardening technology that integrates symbolic execution. The Mayhem team used two major components: the Mayhem symbolic executor and a directed fuzzer they called Murphy. For defense, they used a hot patching method, combined with risk weight analysis, and named this system Major Tom. They used a component called Captain Crunch to estimate scores. The military echoes in the team's naming conventions are interesting.

Will bots help us get better at information security?

What can we deduce from these methodologies, and how can they change and lift our security game? First of all, being disciplined pays high rewards. The Mayhem team went home with $2 million, and their bot showed the most sophisticated defense performance. Their approach demonstrated that being organized pays off. On the other hand, a focus on exploitation and a more liberal modus operandi led the Mechanical Phish team to accomplish something that not even Mayhem with its military strength could do: find complex vulnerabilities in the system. And Xandra's CRS, using a combination of technologies that integrated complex mathematical analysis, was able to make a discovery that not even the human designers of the challenge were aware of.

These bots have proved that autonomous systems will play a big role in cybersecurity in the future, but CRSs can't yet match the capabilities of human hackers and penetration testers. Still, such promising results make more research and development in this area very likely. The Shellphish team will soon release their bot's source code on GitHub, a move that should motivate innovation and the eventual incorporation of more effective ways to secure systems.

Share your thoughts or reflections on the DARPA Cyber Grand Challenge in the comments below. I'd love to hear them.