AI in cyber security

Rajat Mohanty, co-founder and CEO of Paladion, outlines the important but limited role of AI in cyber security, and discusses some of its limitations and its key areas of application in cyber defense.

AI’s Primary Task (and Primary Limitation)

In the past, cyber defense was all about analyzing system logs and alerts from IPS/AV products. Today, we face advanced attacks that produce far more data to analyze. We now need to analyze network traffic, endpoint internals, application and transaction data, user access data, cloud data, data from a variety of security products, threat intel data, social media data, dark web data… and the list goes on.

Modern cyber defense requires the ability to quickly analyze large volumes of data. Artificial Intelligence is far superior to Human Intelligence at applying the complex math required to detect threats in data at that scale. In fact, a single computer today can perform more mathematical calculations per second than the entire human population combined.

But human intelligence is more than calculations and mathematical operations. And cyber security requires more than data analysis.

Where Humans Still Outperform AI

What Human Intelligence lacks in calculation, it more than makes up for in several other aspects of cognition.

We did not evolve to perform large numbers of calculations quickly. Instead, we evolved the ability to reason, hypothesize, explore, deduce and predict, and to perform each of these cognitive tasks under ambiguity and with insufficient data. Our brains are complex biological computers that can perform certain cognitive tasks that even the fastest modern supercomputer can’t simulate. This is because each of our brains is a massively parallel processor with roughly 100 billion neurons, each connecting to 1,000 or more other neurons.

Every cyber security expert will tell you that the ability to make reasoned, intuitive decisions, with lots of ambiguity thrown in, is critical to detecting and responding to threats. When you are evaluating a risk, making a judgement on an alert, or determining an appropriate response, you need these aspects of Human Intelligence. Current AI technologies haven’t yet evolved the ability to replicate them.

What current AI technologies can do is bring their fast mathematical calculations to augment these critical capabilities of human intelligence. This is the area of application where AI produces the greatest benefit for cyber security.

Scenarios of AI Augmentation

AI might not be able to decide if an alert is an actual attack, as those sorts of tasks require a human’s broad cognition skills. But AI can hasten the detection of an attack by augmenting a human analyst’s ability to make that call. AI can present potential threats to human analysts, answer questions analysts throw at it, prove or disprove a hypothesis put forth by a human analyst, and execute tasks that human analysts have ratified.

More specific scenarios of augmentation include:

Triaging: All rule-based detection systems suffer from the problem of false positives. This is not a problem of poorly designed or engineered products. It’s a problem within the inherent logic of cyber security.

Attacks are few and far between. But in our domain, there is a heavy penalty for producing a false negative: if an attack happens and the product fails to detect it, the consequences are severe. So every security product tries to push false negatives as close to zero as possible by alerting on every potential attack. The side effect is that the number of false positives rises. If you absolutely don’t want to miss a wolf, you will need to cry wolf at every possible opportunity.
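The arithmetic behind this trade-off can be sketched in a few lines. The numbers below are purely hypothetical (100 real attacks hidden among roughly a million benign events, and two invented detector settings); they are not measurements from any product, and the function name is made up for illustration.

```python
def alert_stats(attacks, benign, detection_rate, false_alarm_rate):
    """Return (expected number of alerts, fraction of alerts that are real attacks)."""
    true_alerts = attacks * detection_rate      # real attacks that raise an alert
    false_alerts = benign * false_alarm_rate    # benign events that raise an alert
    total_alerts = true_alerts + false_alerts
    precision = true_alerts / total_alerts if total_alerts else 0.0
    return total_alerts, precision


# Hypothetical environment: attacks are rare compared to benign activity.
attacks, benign = 100, 999_900

# A conservative detector: misses 10% of attacks, rarely cries wolf.
strict_alerts, strict_precision = alert_stats(attacks, benign, 0.90, 0.001)

# A "never miss a wolf" detector: misses only 1%, but alerts far more readily.
paranoid_alerts, paranoid_precision = alert_stats(attacks, benign, 0.99, 0.05)

print(f"strict:   {strict_alerts:,.0f} alerts, {strict_precision:.1%} real")
print(f"paranoid: {paranoid_alerts:,.0f} alerts, {paranoid_precision:.1%} real")
```

Under these assumed rates, catching nine more attacks raises the alert volume from roughly 1,090 to roughly 50,000, and the share of alerts that are real falls from about 8% to about 0.2%.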

This deluge of mostly false alerts overwhelms human analysts. Remember, we humans do not naturally perform large-scale detailed data analysis but are better tuned to seeing the big picture. Faced with such a large number of alerts, analysts in a SOC end up creating rules of thumb for triaging them, and then perform detailed analysis only on the alerts that pass the filter. Other alerts are dropped in the process. This approach is not very effective given the nature of advanced threats today. That innocuous-looking SNMP alert could be the real attack, while the alert picked up by a SQL injection rule of thumb may be false.

AI techniques can be used to augment human analysts here. AI can apply machine learning methods such as historical pattern analysis, clustering, association rules and data visualization to quickly pick out the most relevant alerts, and present only these triaged and enriched alerts for human analysts to investigate further.
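As a rough illustration of triage driven by historical patterns rather than a fixed rule of thumb, the sketch below ranks alerts by a smoothed historical hit rate, the rarity of the signature in the current batch, and asset criticality. The field names, the scoring formula, and the sample data are all assumptions made for this example, not a description of any particular product.

```python
from collections import Counter


def triage(alerts, history, top_n=3):
    """Return the ids of the top_n alerts most worth an analyst's time.

    alerts:  list of dicts with hypothetical keys "id", "signature",
             "asset_criticality" (1 = low, 5 = high)
    history: dict mapping signature -> (confirmed_incidents, total_past_alerts)
    """
    counts = Counter(a["signature"] for a in alerts)
    scored = []
    for a in alerts:
        confirmed, total = history.get(a["signature"], (0, 0))
        # Smoothed historical hit rate; a never-seen signature starts at 0.5.
        hit_rate = (confirmed + 1) / (total + 2)
        # A signature that fires once is more interesting than one firing constantly.
        rarity = 1 / counts[a["signature"]]
        scored.append((hit_rate * rarity * a["asset_criticality"], a["id"]))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [alert_id for _, alert_id in scored[:top_n]]


# Hypothetical batch: noisy SQL injection alerts, one brute-force, one odd SNMP alert.
alerts = [
    {"id": "sqli-1", "signature": "sql-injection", "asset_criticality": 2},
    {"id": "sqli-2", "signature": "sql-injection", "asset_criticality": 2},
    {"id": "sqli-3", "signature": "sql-injection", "asset_criticality": 2},
    {"id": "brute-1", "signature": "ssh-brute-force", "asset_criticality": 3},
    {"id": "snmp-1", "signature": "snmp-unusual-oid", "asset_criticality": 4},
]
history = {
    "sql-injection": (1, 998),    # almost always a false positive in the past
    "ssh-brute-force": (4, 8),    # confirmed half the time
}

print(triage(alerts, history, top_n=2))
```

With this made-up data the rare, never-before-seen SNMP alert on a critical asset ranks first, ahead of the repetitive SQL injection alerts that a simple rule of thumb might have prioritized.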

Threat Hunting: Another inherent problem within cyber security is that it is asymmetric: a cyber attacker needs to be successful only once, by exploiting just one weakness, while we, the defenders, must be successful every time. To do so, we need to comb through all the data in our environment (not just security data) for threats at all times. AI is extremely useful here as it can look for patterns, anomalies, and outliers in all of this data without the need for fixed rules, and then present the output to human analysts for investigation. (In security language, this is called threat hunting: narrowing down threats by combining security analytics and machine intelligence with the advanced cognition of humans.)
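A minimal sketch of outlier detection without fixed rules is shown below. It uses a robust z-score based on the median and the median absolute deviation, which is one simple statistical technique among many and is chosen here only for illustration. The host names, the metric (DNS queries per hour) and the threshold are assumptions.

```python
from statistics import median


def find_outliers(values_by_host, threshold=3.5):
    """Return (host, robust z-score) pairs whose value deviates strongly from the norm."""
    values = list(values_by_host.values())
    med = median(values)
    # Median absolute deviation: a spread estimate that one extreme host cannot skew.
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # all hosts behave (almost) identically; nothing to rank
    outliers = []
    for host, value in values_by_host.items():
        score = 0.6745 * (value - med) / mad
        if abs(score) > threshold:
            outliers.append((host, round(score, 1)))
    return outliers


# Hypothetical DNS queries per hour for six workstations.
dns_queries = {
    "ws-01": 120, "ws-02": 135, "ws-03": 110,
    "ws-04": 128, "ws-05": 2400, "ws-06": 117,
}

for host, score in find_outliers(dns_queries):
    print(f"{host} looks unusual (robust z-score {score})")
```

No rule says "more than N queries is bad"; the baseline comes from the data itself. The flagged host is only a lead, and a human hunter still has to decide whether it is DNS tunneling or a misconfigured backup job.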

Several products already use AI techniques for these analytics. SIEM is evolving beyond log analysis and correlation to add the capability to analyze network data (netflow, proxy, DNS, packets) with machine learning techniques. User behavior analytics products are applying machine learning to user data. Endpoint threat analytics (EDR) products are doing the same to detect advanced malware in endpoint data. New categories of analytics are analyzing data from RASP agents to detect application attacks and fraud. These analytical products all identify and present anomalous behavior or recurrent patterns found in the data. And a few advanced SOCs today are staffing hunters to look at these outputs and investigate them further. (Again, here, AI does not replace people, but augments hunters in detecting threats.)

Incident Analysis/Investigation: Humans have a natural advantage when it comes to investigating a potential incident (a triaged alert or hunting output) to decipher the full attack chain. These investigations require a lot of reasoning skills that current AI methods lack. To investigate, you have to constantly ask new questions, form new hypotheses iteratively, and collect more evidence to confirm or reject those hypotheses. Machines can mine vast amounts of data to provide answers, but they can’t pose questions as effectively and iteratively as humans.

This is the classical exploitation versus exploration challenge. Machines can perform lots of data exploitation, but humans are needed to perform the exploration. To investigate a cyber alert or incident and form the attack’s story, you need to combine strong reasoning with the large-scale capability to collect and mine past data.

In this activity, AI models primarily answer what happened to the asset (impact), who the attackers are (with their attributes), what the past sequence of steps in the attack chain on that asset was, what the blast radius is (i.e. which other assets are part of the attack), and who “patient zero” is (where the attack originated). Doing this involves two parts. First, the AI must mine external (global) threat data and present it to an investigator concisely, including any data on the files, IOCs, attacker information, and similar breaches that is available on the Internet. Second, the AI must mine internal data including past alerts, network and asset information, security logs, and the like to find clusters, associations, and patterns that can recreate the event’s blast radius and attack progression, and determine the patient zero.
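The second part, recreating the blast radius and patient zero from internal data, can be sketched as a graph search. The example below assumes the internal data has already been reduced to suspicious connection events of the form (timestamp, source asset, target asset); that event shape, the asset names, and the "earliest source" heuristic for patient zero are simplifying assumptions for illustration.

```python
from collections import defaultdict, deque


def blast_radius(events, known_asset):
    """Return every asset linked to known_asset through suspicious events."""
    neighbors = defaultdict(set)
    for _, src, dst in events:
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    seen = {known_asset}
    queue = deque([known_asset])
    while queue:  # breadth-first search over the event graph
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def patient_zero(events, radius):
    """Heuristic: the source of the earliest suspicious event inside the blast radius."""
    linked = [e for e in events if e[1] in radius]
    if not linked:
        return None
    return min(linked, key=lambda e: e[0])[1]


# Hypothetical suspicious events: (timestamp, source asset, target asset).
events = [
    (100, "laptop-7", "file-srv"),
    (130, "file-srv", "db-01"),
    (150, "file-srv", "hr-app"),
    (90, "printer-2", "kiosk-1"),  # unrelated activity elsewhere on the network
]

radius = blast_radius(events, "db-01")
print(sorted(radius))
print(patient_zero(events, radius))
```

Starting from the one asset the alert fired on (db-01), the search pulls in the file server, the HR application and the laptop where the chain began, while leaving the unrelated printer activity out. An investigator would still need to confirm that the earliest event really is the point of origin.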

Threat Anticipation: AI can also augment human capabilities during threat anticipation. Threat anticipation lets you anticipate what could hit you next, based on what is happening elsewhere in the world. When a breach happens at another company, it ensures you quickly learn about it, extract the relevant threat intel, and apply that information in your environment.

Today the first step in threat anticipation, automating the collection of machine-readable threat intel data, is already being done at large scale. But AI techniques can also be used to increase accuracy and fidelity when applying this data to each organization’s unique context. When it comes to mining human-readable threat data, such as blogs, forums, social media, and dark web sources, AI techniques such as text analytics and natural language processing can help identify the most relevant data that a human threat analyst should read. AI techniques can group and categorize this unstructured data automatically by topic and semantics. Human threat analysts then avoid wasting time reading through a large daily volume of unstructured data and can focus on applying the relevant actions in each organization’s context.
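A very small text-analytics sketch of this relevance filtering follows. It scores each post by how often it mentions terms from an organization-specific watchlist, weighted so that terms appearing in fewer posts count for more (a TF-IDF-style weighting). Real natural language processing pipelines go far beyond this; the posts, the watchlist and the function names are invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_posts(posts, watchlist):
    """Return post ids ordered from most to least relevant to the watchlist."""
    docs = {pid: Counter(tokenize(text)) for pid, text in posts.items()}
    n = len(docs)
    scores = {}
    for pid, counts in docs.items():
        score = 0.0
        for term in watchlist:
            df = sum(1 for c in docs.values() if term in c)  # posts containing the term
            if df == 0:
                continue
            idf = math.log((1 + n) / (1 + df)) + 1  # rarer terms weigh more
            score += counts[term] * idf
        scores[pid] = score
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical daily intake of human-readable threat data.
posts = {
    "p1": "New exploit kit targets Apache Struts servers, Struts patch bypass observed",
    "p2": "Selling gift cards, cheap, trusted vendor",
    "p3": "Phishing campaign abuses Office macros against banks",
}
# Hypothetical watchlist derived from this organization's technology stack.
watchlist = {"struts", "macros"}

print(rank_posts(posts, watchlist))
```

The analyst reads the Struts post first and can skip the gift card spam entirely, which is the time saving the paragraph above describes.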

Incident Response: AI also assists in incident response. Once an alert has been confirmed as an incident, an effective response requires four major steps: containing the spread, recovering the affected systems, mitigating the root causes of the attack, and improving your security posture for the future. At each of these stages, incident responders need to know what to do and how to automate that step. AI techniques such as knowledge engineering and case-based reasoning can be used to create playbooks to guide incident responders in this “what to do” phase. These playbooks are built by machines based on previous incidents, and also incorporate codified knowledge from human experts. The AI thus learns with each new incident, and continuously modifies or creates branches of the main playbook. Incident responders then use these playbooks to act faster, while using their own deeper knowledge of organizational context to ensure the right response.
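The retrieval step of case-based reasoning can be sketched as follows: describe each past incident by a set of features, find the past case most similar to the new incident, and offer its playbook as a starting point. Jaccard similarity is used here as one simple similarity measure; the feature tags, the playbook steps and the case base are hypothetical.

```python
def jaccard(a, b):
    """Similarity of two feature sets: shared features divided by all features."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_playbook(new_incident, case_base):
    """Return (playbook, similarity) of the most similar past incident."""
    best = max(case_base, key=lambda case: jaccard(new_incident, case["features"]))
    return best["playbook"], jaccard(new_incident, best["features"])


# Hypothetical case base built from previous incidents and expert knowledge.
case_base = [
    {
        "features": {"ransomware", "smb", "windows"},
        "playbook": ["isolate host", "disable SMBv1", "restore from backup"],
    },
    {
        "features": {"phishing", "credential-theft", "o365"},
        "playbook": ["reset credentials", "revoke sessions", "purge email"],
    },
]

# A new incident that resembles an old one but arrived through a different route.
new_incident = {"ransomware", "windows", "rdp"}

playbook, similarity = suggest_playbook(new_incident, case_base)
print(similarity, playbook)
```

The match is only partial (the new incident involves RDP rather than SMB), which is where the responder's knowledge of the organization comes in: the suggested steps are adapted, and the adapted version can be stored as a new branch of the playbook.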

AI in Cyber Security: Necessary, But Not Enough on Its Own

The above scenarios demonstrate AI’s key areas of benefit to cyber security, but also AI’s limitations. In the future, perhaps AI will develop to the point where it can replace human cyber security experts, reduce the burden of our skills shortage in this area, and ultimately simplify cyber security. But for the moment, and for the medium term, AI can only augment human capabilities; it will not replace people yet.

In fact, given the expansion of data, users, networks and IT systems in every organization, we will end up with more threats and more alerts in the future, and actually require more human analysts—augmented with AI—to investigate, hunt and respond to these threats.

That’s our vision at Paladion. That is why we fuse AI with human intelligence to effectively deliver our MDR service.