

The state of machine learning and the SOC: Leverage the data deluge

Jaikumar Vijayan Freelance writer

Machine learning-powered tools promise to transform threat detection and threat hunting capabilities in security operations centers (SOCs).

Industry experts see ML as helping SOCs automate and improve analysis of event and incident data gathered from enterprise security devices and myriad other network-connected systems.

The massive and growing volumes of data from these devices in recent years have made it increasingly difficult for security operations (SecOps) teams to detect, triage, prioritize, and respond to threats—resulting in heightened risk exposure. Traditional security information and event management (SIEM) systems and other alerting mechanisms that use static rules and thresholds, while effective against known threats, have run into challenges with new, low-and-slow, and targeted attacks.

In a survey that Crowd Research Partners conducted last year, more than half of the respondents (55%) cited their inability to detect advanced threats as the biggest challenge for SOCs.

Here's how machine learning is changing the game on threat detection for the SOC—and what your SecOps team needs to know to put it to work for your organization.

Dealing with the deluge

Rapidly mounting alert data has heightened the need for an automated way for organizations to quickly sift through it all and identify deeply hidden threats. That sifting needs to happen both through static rules and by looking for deviations from normal behavior in the traffic.
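As a rough, hypothetical sketch of the difference between those two modes of detection, the snippet below compares a fixed SIEM-style threshold against a simple statistical baseline. The event counts, threshold, and user activity are invented for illustration; real products use far more sophisticated models:

```python
from statistics import mean, stdev

# Hypothetical example: hourly failed-login counts for one user.
# A static SIEM rule uses a fixed threshold; a behavioral check
# compares each new value against that user's own history.

STATIC_THRESHOLD = 50  # fixed rule, tuned for noisy brute-force attacks

def static_rule_fires(count):
    """Known-threat detection: alert only above a fixed threshold."""
    return count > STATIC_THRESHOLD

def zscore(history, value):
    """How many standard deviations 'value' sits from the baseline."""
    mu, sd = mean(history), stdev(history)
    return 0.0 if sd == 0 else (value - mu) / sd

history = [2, 3, 2, 3, 2, 4, 3, 2]   # this user's normal behavior
low_and_slow = 40                     # targeted attack under the threshold

print(static_rule_fires(low_and_slow))      # False: static rule misses it
print(zscore(history, low_and_slow) > 3.0)  # True: baseline check flags it
```

The static rule stays silent because the activity never crosses its fixed threshold, while the baseline check flags the same activity as a sharp deviation from this user's normal pattern.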

Mario Daigle, VP of products at Interset, a network behavioral analytics company that Micro Focus recently purchased, said there’s simply too much security data in the average enterprise for a human analyst team to analyze quickly enough.

"Machine learning can accelerate discovery through the analysis of this data, distilling billions of events into a shorter list of threats."
Mario Daigle

ML-enabled threat detection tools are designed to analyze large volumes of data and to "learn" to recognize the patterns associated with normal behavior and the patterns associated with different threats.

Supervised and unsupervised ML

Some ML models are "supervised," while others are "unsupervised." Supervised ML involves learning by example from an existing dataset and then applying that knowledge to new data. For instance, by analyzing data associated with known malware traffic, a supervised ML tool learns how traffic deviates from normal so it can recognize the same pattern in new data without being explicitly programmed.
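The supervised idea can be sketched with a toy nearest-centroid classifier: learn a summary of each labeled class from known examples, then assign new traffic to the closest class. The feature values and labels below are invented, and real tools use far richer features and models:

```python
from math import dist

# Hypothetical labeled examples: (bytes_out_kb, connections_per_min)
# drawn from known-benign and known-malware traffic captures.
training = {
    "benign":  [(10.0, 2.0), (12.0, 3.0), (9.0, 2.5)],
    "malware": [(300.0, 40.0), (280.0, 35.0), (320.0, 45.0)],
}

def centroid(points):
    """Average each feature across the class's training examples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

centroids = {label: centroid(pts) for label, pts in training.items()}

def classify(sample):
    """Label new traffic by its nearest learned class centroid."""
    return min(centroids, key=lambda label: dist(sample, centroids[label]))

print(classify((290.0, 38.0)))  # "malware": matches the learned pattern
print(classify((11.0, 2.2)))    # "benign"
```

No rule was ever written for the value 290.0; the classifier recognizes the pattern because it resembles the labeled malware examples it learned from.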

An unsupervised ML-powered tool works by observing traffic over a period of time, learning what "normal" behavior on the network looks like and investigating deviations from that baseline. Unsupervised ML "really shows its power" when it comes to finding insider threats, advanced persistent threats, and other targeted attacks, Daigle said.

Unsupervised ML doesn't rely on rules and thresholds, but instead learns continuously and automatically based on patterns within the data, and at scale. That means that it can create a baseline for "normal" behavior for each entity in an organization instead of applying the same baseline to all.

The ability to make that kind of distinction can help improve threat detection and reduce false positives, he said.

"What is normal behavior for Sheila in IT may not be normal behavior for Bob in human resources."
Mario Daigle
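A minimal sketch of that per-entity distinction might look like the following. The users, record counts, and threshold are hypothetical, and production UEBA models far more signals than a single metric:

```python
from statistics import mean, stdev

# Hypothetical per-entity baselines: daily records accessed per user.
# The same raw activity can be routine for one entity and anomalous
# for another, so each entity gets its own baseline.
baselines = {
    "sheila_it": [200, 220, 210, 230, 215],  # IT admin: high access volume
    "bob_hr":    [5, 6, 4, 5, 7],            # HR staffer: low access volume
}

def is_anomalous(entity, value, sigmas=3.0):
    """Flag a value only if it deviates sharply from THIS entity's norm."""
    history = baselines[entity]
    mu, sd = mean(history), stdev(history)
    return sd > 0 and abs(value - mu) > sigmas * sd

print(is_anomalous("sheila_it", 225))  # False: routine for Sheila
print(is_anomalous("bob_hr", 225))     # True: far outside Bob's baseline
```

A single global threshold would either drown Sheila's team in false positives or miss Bob's anomaly entirely; per-entity baselines avoid both.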

The use case that's driving demand

ML's primary use case in threat detection is for automated identification of activity that deviates from a baseline, said Daniel Kennedy, an analyst at 451 Research.

"Alert fatigue and false positives are an issue for SecOps teams, and anything that moves toward alerting based on behavioral anomalies, in addition to known bad traffic, provides an additional point of detection."
Daniel Kennedy

Current estimates of the demand for ML- and artificial intelligence-enabled products for addressing cybersecurity challenges tend to vary widely, but most point to strong growth over the next few years.

According to Progressive Markets, demand for AI-powered cybersecurity tools (of which ML is a subset) will top $11 billion by 2025. Market Research Engine, on the other hand, expects that challenges posed by the IoT, mobile malware threats, and regulations will drive demand to over $35 billion by 2024.

For most typical SecOps teams, leveraging ML techniques will happen within a specific product, Kennedy said. As one example, he pointed to ML being applied to user and entity behavior analytics (UEBA) within a SIEM.

Watch out for the hype (and other caveats)

Organizations must be wary about the hype surrounding the use of ML and AI approaches in threat detection, Kennedy warned.

Machine learning certainly has potential in threat detection, he said. But, he noted, it comes with the usual dose of hyperbole.

"There's a good deal of marketing fluff floating around at the moment. So teams should keep in mind the need to test these products with their own data to validate claims about detection."
Daniel Kennedy

There are other caveats that organizations need to keep in mind when looking to exploit AI- and ML-driven capabilities in the SOC. One of them is the misconception that these new approaches supersede or replace signature- and correlation-based detection technologies that have already proved effective at what they are designed to do, said Chas Clawson, former product marketing manager of ArcSight ESM at Micro Focus.

These existing tools are more easily implemented, require fewer computing resources, and are still effective at detecting indicators of compromise.  

Once an organization has a mature log aggregation and correlation program, it can start to layer on more advanced behavior and ML methods of detection to find those previously undetected threats, Clawson said. Only approaches that combine AI with human analysis and validation can scale fast enough to keep pace with the demands of investigating modern threats, he said.
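That layered approach could be sketched as follows: a cheap, precise signature pass handles known indicators of compromise first, then a behavioral score ranks what remains for a human analyst. The IOC hashes, event fields, and scoring function are all hypothetical stand-ins:

```python
# Hypothetical layered triage: signature matching first, then a
# behavioral score over the remainder, with top results queued for
# an analyst to validate rather than auto-blocked.

KNOWN_BAD_HASHES = {"deadbeef01", "feedface02"}  # invented IOC list

def triage(events, anomaly_score, top_n=3):
    """Signature pass first, then rank the rest by anomaly score."""
    confirmed, remainder = [], []
    for ev in events:
        (confirmed if ev["hash"] in KNOWN_BAD_HASHES else remainder).append(ev)
    for_review = sorted(remainder, key=anomaly_score, reverse=True)[:top_n]
    return confirmed, for_review

events = [
    {"id": 1, "hash": "deadbeef01", "bytes_out": 10},
    {"id": 2, "hash": "cafe000003", "bytes_out": 9000},
    {"id": 3, "hash": "cafe000004", "bytes_out": 12},
]
score = lambda ev: ev["bytes_out"]  # stand-in for an ML anomaly model
confirmed, for_review = triage(events, score, top_n=1)
print([e["id"] for e in confirmed])   # [1]: known IOC, handled by rules
print([e["id"] for e in for_review])  # [2]: unusual, escalated to a human
```

The signature layer does the easy, high-confidence work, so the ML layer and the analyst only see what the rules could not already explain.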

More factors to consider

Organizations exploring ML-driven security products also need to ensure that the training data they have is clean and large enough to represent a known normal, Clawson said.

"Throwing algorithms at problems to see what sticks isn't effective and often requires in-house data science expertise."
Chas Clawson

Daigle points to other key success contributors when it comes to leveraging ML and AI in threat detection. One of them is the need to prioritize targeted use cases for the technology and to document desired outcomes before phasing in the use of ML.

"Understand your target use cases, and then select the right machine learning for the job," Daigle said. "Be selective when choosing a technology partner, and ask questions to understand what type of machine learning they are using and how it will address your unique needs."

Like Clawson, Daigle emphasizes the need for organizations to embrace an approach that combines ML with human analysts. Use the technology to automate data analysis, and rely on security professionals to apply the context needed to fully understand and investigate a potential threat, he said.

Use the tools wisely

For SOCs, ML- and AI-based security tools hold the potential to deliver substantial improvements in threat hunting and detection. The key is to avoid getting distracted by the hype, understand how these technologies work, and identify what makes the best sense for your particular environment.

ML-enabled tools, like any other security products, are not a silver bullet for all security woes. Not all ML is created equal, and many organizations are leveraging a type of ML that isn't ideally suited for their needs, Daigle said.

"One of the biggest challenges is matching expectations with reality."
Mario Daigle
