The cybersecurity industry, much like other sectors, is dealing with an influx of data. In response, security experts hope to harness the power of artificial intelligence and machine learning to effectively and efficiently augment their ability to detect, predict, and contain threats to their networks. The Cipher Brief spoke with Shehzad Merchant, the Chief Technology Officer of Gigamon, about where the cybersecurity industry currently stands, and what role machine learning and artificial intelligence can play in enhancing the industry’s capabilities to defend sensitive networks against an increasingly sophisticated and numerous pool of malicious actors.
The Cipher Brief: What is the general state of the cybersecurity industry at the moment?
Shehzad Merchant: As an industry, we find ourselves at a point where the advantage today lies in the hands of the attacker. However, we have a unique opportunity in a pivotal, perhaps historical, moment to reverse that and restore the advantage to the defender. We can shift the paradigm by focusing inside the network infrastructure, with the assumption that there is no such thing as “secure” anymore, and start applying machine learning and artificial intelligence (AI) techniques. We actually may be at a time where the advantage is flipped back to the defender.
TCB: So why do attackers currently hold the advantage over defenders in cybersecurity?
SM: One factor is really the speed of information. We often talk about volumes of information (lots of data and varieties of data), but we don’t quite talk about the speed of information. This is critical. When we think about networking technologies today, hundred-gigabit networks are being deployed. On a hundred-gigabit network link, the time from the first packet to the next is about 6.7 nanoseconds, or 6.7 billionths of a second. That is the speed at which data is moving through our infrastructures. This does not really allow much time to do anything intelligent or meaningful with the data, particularly when looking at application data and determining whether that data has any malware embedded within it. Consequently, a lot of the threats are going to slip through because organizations don’t have any time for effective analysis. This is one major reason attackers have the advantage.
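For reference, one way to arrive at a figure of about 6.7 nanoseconds is sketched below in Python. The interview does not say how the number was derived; the sketch assumes minimum-size Ethernet frames (64 bytes) plus the standard 8-byte preamble and 12-byte inter-frame gap, and the function name `packet_interval_ns` is purely illustrative.

```python
# Assumption (not stated in the interview): back-to-back minimum-size
# Ethernet frames, each occupying 64 + 8 + 12 = 84 bytes on the wire.
MIN_FRAME_BYTES = 64
PREAMBLE_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def packet_interval_ns(link_bits_per_second: float) -> float:
    """Time between the start of one minimum-size frame and the next, in ns."""
    bits_on_wire = (MIN_FRAME_BYTES + PREAMBLE_BYTES + INTERFRAME_GAP_BYTES) * 8
    return bits_on_wire / link_bits_per_second * 1e9


interval_100g = packet_interval_ns(100e9)  # 100 gigabits per second
print(f"{interval_100g:.2f} ns")  # prints 6.72 ns
```

Under these assumptions, a device inspecting every packet at line rate has under seven nanoseconds per packet, which is the time budget the answer above refers to.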
Another reason is the democratization of malware. Consider an advanced persistent threat or the cyber attack kill chain, which is basically a sophisticated attack that occurs in stages. It used to be that in order to craft such an advanced attack, a malicious actor actually had to be a brilliant coder who put together all the pieces of the cyber attack kill chain. Today, malicious actors don’t have to be brilliant; they can simply go to the dark web, pick up different pieces of that attack cycle, and piece them together. They don’t have to create a zero-day exploit for a previously unknown vulnerability, they don’t have to create an infrastructure for phishing and social engineering, and they don’t have to create a command and control infrastructure. These are all available for rent through the dark web. As a result, the number of malicious actors is increasing significantly because they can easily leverage the disparate pieces out there and put them together for sophisticated attacks. It is essentially a volume game, and that is the reason why the number of breaches is increasing.
While malicious actors are leveraging frameworks that they can rent, the defenders are still stuck in a world of manual intervention. The industry’s processes and products are siloed, and IT staff don’t work well across boundaries; everything is manual today. The combination of these three facts (the speed of data, the democratization of malware, and the fact that defenders are encumbered by manual processes) really gives the advantage to the attacker.
TCB: In what areas can artificial intelligence augment these manual processes to automate them?
SM: In order to restore the advantage to the defender, we must step back and rethink our security model. The new security model looks inside our infrastructure as part of the detection strategy. Because the advantage is with the attacker, breaches are inevitable. If companies start with the assumption that there is no such thing as “secure” anymore, and that malicious actors are already in their infrastructure, then there is an opportunity to reverse the advantage and bring it back into the hands of the defender. Once organizations start looking inside their infrastructure, the attacker has to take every step possible to evade detection, whereas the defender only has to find one footprint that leads to the attacker.
Suddenly, the advantage comes back to the defender.
So if organizations reverse the paradigm, stop thinking of security as a model where they can keep malicious actors out, and instead start from the premise that malicious actors are already in and have to be detected, then they have a chance at actually restoring the advantage to the defense.
From here, there are three phases in which organizations can put together a security model that actually brings the advantage back to the defender. The first phase is detection, where organizations look across their entire infrastructure to identify anomalous behavior. This is the realm of machine learning. We often hear about security as a big data problem, and this is why: organizations have to look across their entire infrastructure (the virtual cloud, user devices, and applications) and apply machine learning techniques to look through those large quantities of data to surface behavioral anomalies.
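To make "surfacing behavioral anomalies" concrete, here is a deliberately small Python sketch. It is an illustration only: the interview does not name a specific technique, so a simple z-score test stands in for the machine learning methods described, and the host names, traffic figures, and the function `find_anomalies` are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(baseline, observed, threshold=3.0):
    """Return (host, value) pairs that deviate from the baseline mean
    by more than `threshold` standard deviations."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    anomalies = []
    for host, value in observed.items():
        # Guard against a flat baseline, where the z-score is undefined.
        z = (value - mu) / sigma if sigma > 0 else 0.0
        if abs(z) > threshold:
            anomalies.append((host, value))
    return anomalies


# Hypothetical outbound traffic per host, in megabytes per hour.
baseline_mb_per_hour = [120, 135, 110, 128, 140, 125, 132, 118]
observed = {"laptop-17": 130, "db-server-2": 122, "kiosk-4": 900}

flagged = find_anomalies(baseline_mb_per_hour, observed)
print(flagged)  # prints [('kiosk-4', 900)]
```

A production system would learn far richer baselines across many signals, but the principle is the same: model what normal looks like, then flag what falls well outside it.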
Once this first phase is complete, companies move to the second and try to predict where the malware is going to go next. This is the realm of artificial intelligence technologies. Organizations say, “I have seen this kind of anomalous behavior before, I have learned about it, and based on this, I think the next steps that are going to happen are this, this, and this.” Once we can predict those next steps, we can move to the third phase and put in processes to contain them.
So once organizations reverse the mentality of “secure,” and start with the assumption that the adversaries are already inside their infrastructure, then they can begin applying the machine learning technologies for detection, AI-based technologies for prediction, and conduct rapid containment.
But all of this is predicated on the fact that organizations have access to all the right information inside their infrastructure, which means they must have pervasive visibility, otherwise this model fails. So there is an underlying foundational layer of visibility that enables this shift towards machine learning detection, predictive AI, and a containment-based paradigm.
TCB: Can artificial intelligence help with behavioral and forensic analysis to conduct attribution?
SM: Absolutely, artificial intelligence will play a key role in attribution. One central aspect of artificial intelligence is memory. When AI systems start seeing known bad behavior recur, they can very quickly point out that this is a framework previously seen and this is the malicious actor that developed it, so that organizations will be able to apply policies to mitigate that kind of behavior.
AI will play a strong role in this behavioral detection.
TCB: How does information sharing fall into this new paradigm?
SM: There are two vectors that are going to be very important to information sharing. Threat intelligence sharing is one. Only when organizations see what known bad behavior looks like can they apply predictive techniques to determine the next steps in the attack cycle. So threat intelligence sharing is going to be a critical aspect of the whole AI solution.
The second aspect is determining what is normal behavior. So there is bad behavior and threat vectors, but what does the normal behavior of an organization look like? In other words, organizations need context into what their normal behavior is, and that requires looking inside their infrastructure and becoming very familiar with everyday behavior within their networks. Once organizations have both, they can triangulate user behavior against normal behavior and bad behavior within their systems. All of that data feeds into the AI solutions.
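The idea of triangulating behavior against both a normal baseline and shared threat intelligence can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the destination names are placeholders on reserved example domains, and `classify` is a hypothetical helper, not part of any product or feed format mentioned in the interview.

```python
def classify(event, normal_destinations, known_bad_destinations):
    """Label an event by comparing it with the organization's baseline
    and with shared threat intelligence."""
    destination = event["destination"]
    if destination in known_bad_destinations:
        return "known-bad"  # matches shared threat intelligence
    if destination in normal_destinations:
        return "normal"     # matches everyday behavior
    return "unknown"        # neither: a candidate for closer inspection


# Hypothetical inputs: a learned baseline and a shared indicator list.
normal_destinations = {"mail.example.com", "intranet.example.com"}
known_bad_destinations = {"c2.example.net"}

events = [
    {"user": "alice", "destination": "mail.example.com"},
    {"user": "bob", "destination": "c2.example.net"},
    {"user": "carol", "destination": "files.example.org"},
]

labels = [classify(e, normal_destinations, known_bad_destinations) for e in events]
print(labels)  # prints ['normal', 'known-bad', 'unknown']
```

The "unknown" bucket is where the two data sources meet: behavior that is neither familiar nor already catalogued as malicious is what detection and prediction need to examine.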