Use Cases of Artificial Intelligence in Cybersecurity

In this tech blog, we explain selected use cases of Artificial Intelligence (AI) in cybersecurity. Before we start, we would like to mention two prerequisites and their impact on security architectures.

First, Machine Learning (ML) needs large amounts of data, so AI-based cybersecurity analytics can only work if that data is made available. Compared with legacy security architectures, future ones must collect considerably more data and metadata to feed the AI engine. Simple firewall and network logs therefore need to be extended and enriched with additional internal and external information.

From a network perspective, we need sensors that extract metadata from raw traffic at scale. These sensors should not be tied to legacy signatures or to any particular detection method, but they do need to extract metadata at the application level. The AI engine itself requires a flexible importer that is not limited to specific protocols and understands common formats such as syslog. Such a flexible interface also makes it possible to add further data sources later.
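A flexible importer of this kind could be sketched as follows. This is a minimal illustration, not a production parser: it assumes the BSD-style syslog layout of RFC 3164 (`<PRI>TIMESTAMP HOST TAG: MESSAGE`), and the names `parse_syslog` and `import_events` are hypothetical.

```python
import re

# Assumed BSD-style (RFC 3164) layout: <PRI>TIMESTAMP HOST TAG[PID]: MESSAGE
SYSLOG_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<timestamp>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_syslog(line):
    """Return a metadata dict for one syslog line, or None if it does not match."""
    match = SYSLOG_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    pri = int(record.pop("pri"))
    # The PRI value encodes facility * 8 + severity.
    record["facility"], record["severity"] = divmod(pri, 8)
    return record


def import_events(lines, parsers=(parse_syslog,)):
    """Try each registered parser in turn and yield the first successful result.

    Lines that no parser understands are skipped.
    """
    for line in lines:
        for parser in parsers:
            record = parser(line.strip())
            if record is not None:
                yield record
                break
```

Further data sources would be supported by passing additional parser functions in `parsers`, each returning a dict or `None`.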

Second, we need a common, integrated security architecture that connects the security tools with the network, client, server and cloud environments. This architecture is required to make use of AI and to automate detection and response.

Let us now turn to the selected use cases of AI in cybersecurity.

Machine Learning with Supervised Learning

Malware

A well-known example of ML with supervised learning, where the engine is trained with predefined, labelled data, is a sandboxing solution that identifies malware or malicious domains. During training, the algorithm extracts relevant features from the labelled samples and assigns an importance to each. The table lists example features with a percentage value for good and for malicious documents.

Extracted feature	Good documents	Malicious documents
Document name	0.9%	84.4%
No title	6.6%	50.2%
Obfuscated function calls	0.1%	39.6%
Accesses host file	21.8%	49.5%
DNS resolution	27.4%	50.4%
Excessive sleep calls	43.6%	67.1%

Once the model has been trained, it can predict whether an unknown input file is malware or benign and express that prediction as a percentage value.
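To illustrate how such per-feature percentages can turn into a single prediction, here is a minimal sketch using a naive Bayes classifier. The article does not say which algorithm the sandbox uses, so naive Bayes is a stand-in. The sketch also assumes that the table values are the share of documents in each class that show the feature, that features are independent, and that both classes are equally likely beforehand. The feature keys and the function name are hypothetical.

```python
import math

# Values from the table above, read as (share of good docs, share of malicious docs).
# Interpreting them as per-class feature frequencies is an assumption.
FEATURE_RATES = {
    "document_name": (0.009, 0.844),
    "no_title": (0.066, 0.502),
    "obfuscated_function_calls": (0.001, 0.396),
    "accesses_host_file": (0.218, 0.495),
    "dns_resolution": (0.274, 0.504),
    "excessive_sleep_calls": (0.436, 0.671),
}


def malware_probability(observed, prior_malicious=0.5):
    """Naive Bayes posterior that a document is malicious.

    observed: set of feature names seen in the document.
    prior_malicious: assumed share of malicious documents (hypothetical default).
    """
    log_good = math.log(1.0 - prior_malicious)
    log_malicious = math.log(prior_malicious)
    for name, (p_good, p_malicious) in FEATURE_RATES.items():
        if name in observed:
            log_good += math.log(p_good)
            log_malicious += math.log(p_malicious)
        else:
            # An absent feature is evidence too.
            log_good += math.log(1.0 - p_good)
            log_malicious += math.log(1.0 - p_malicious)
    # Normalise the two joint log-likelihoods into a probability.
    peak = max(log_good, log_malicious)
    good = math.exp(log_good - peak)
    malicious = math.exp(log_malicious - peak)
    return malicious / (good + malicious)
```

Under these assumptions, a document with obfuscated function calls and no title scores close to 1, while a document showing none of the features scores close to 0.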

Identify suspicious HTTP traffic

Another example is the analysis of HTTP header characteristics to identify patterns of command-and-control behaviour that do not usually occur in normal data traffic. As a prerequisite, data scientists and security engineers need to analyse a wide range of command-and-control traffic and focus on characteristics that are common across many types of malware. This information is fed into the learning algorithm, which generates a model that can predict whether HTTP-based command-and-control communication is taking place. Instead of reactively trying to keep up with attackers as they change domains and IP addresses, this model detects command-and-control communication without relying on signatures.
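Before a learning algorithm can use HTTP headers, they have to be turned into numbers. The following sketch shows one way this step might look. The chosen characteristics (header count, missing User-Agent, and so on) are hypothetical examples and are not taken from any particular product.

```python
def header_features(headers):
    """Map a dict of HTTP request headers to a numeric feature vector.

    The selected characteristics are illustrative assumptions only.
    """
    names = {name.lower() for name in headers}
    user_agent = next(
        (value for name, value in headers.items() if name.lower() == "user-agent"),
        "",
    )
    return [
        len(names),                           # number of distinct headers sent
        int("user-agent" not in names),       # 1 if User-Agent is missing
        int("accept-language" not in names),  # 1 if Accept-Language is missing
        int("referer" not in names),          # 1 if Referer is missing
        len(user_agent),                      # length of the User-Agent string
    ]
```

Vectors like these, labelled as command-and-control or normal traffic, would then serve as training data for the supervised model.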

Machine Learning with Unsupervised Learning

Identify outliers – “Normal” and “Abnormal” behaviour

The classic use cases of ML with unsupervised learning, where the algorithm is not fed with labelled data, are based on finding logical groupings in order to identify outliers from local norms. First, after a period of baselining, abnormal network traffic from a host can be an indicator of malicious activity. Second, access to resources that a user or host does not typically use can also reveal an outlier from local norms. A third example is a behaviour pattern that is too regular for a human being. It is essential to note that an outlier does not necessarily equate to a security incident. Instead, it indicates the need to investigate and, where appropriate, update the baseline.
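Two of these checks can be sketched with simple statistics. This is a deliberately simplified stand-in for a real unsupervised model: it assumes a single numeric metric per host (for example bytes sent per hour) and uses hypothetical thresholds.

```python
from statistics import mean, stdev


def is_outlier(baseline, observation, threshold=3.0):
    """Flag an observation that lies more than `threshold` standard
    deviations from the baseline mean (needs at least two baseline values)."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return observation != mu
    return abs(observation - mu) / sigma > threshold


def too_regular(timestamps, max_cv=0.1):
    """Flag activity whose gaps between events vary so little that a human
    is unlikely to have produced it (low coefficient of variation)."""
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(gaps) < 2:
        return False  # not enough events to judge
    mu = mean(gaps)
    if mu == 0:
        return True  # all events at the same instant
    return stdev(gaps) / mu < max_cv
```

For example, a host whose hourly traffic has hovered around 100 units would be flagged at 500 units but not at 104, and events arriving exactly every 60 seconds would be flagged as too regular.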

Deep Learning

Malicious domain activity

Deep Learning (DL) is well known for its success in language translation. Most advanced online translators use DL, and Google Translate is a prominent example. The same ability to model sequences can be applied to malicious domain activity, where DL is used to detect algorithmically generated domains. These domains look like random strings of characters and are used by attackers for command-and-control communication. They give attackers a way to obfuscate the command-and-control infrastructure when malware tries to connect back home. A DL model inspects a given domain name and classifies whether it looks machine-generated. To do so, it draws on its ability to learn and predict what comes next in a sequence of characters.
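The idea of predicting the next character can be illustrated without a neural network. The sketch below uses a character bigram model, which is a much simpler technique than DL and stands in here only to show the principle: names whose character transitions are poorly predicted by a model trained on legitimate names look machine-generated. The training list is a hypothetical toy example and far too small for real use.

```python
import math
from collections import Counter, defaultdict

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"
START, END = "^", "$"


def train(names):
    """Count how often each character follows another in known-good names."""
    counts = defaultdict(Counter)
    for name in names:
        chars = [START, *name.lower(), END]
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def avg_log_prob(counts, name):
    """Average log-probability per transition; lower means less predictable."""
    chars = [START, *name.lower(), END]
    vocab = len(ALPHABET) + 1  # every character plus the end marker
    total = 0.0
    for prev, nxt in zip(chars, chars[1:]):
        seen = counts.get(prev, Counter())
        # Add-one smoothing keeps unseen transitions at a small non-zero probability.
        total += math.log((seen[nxt] + 1) / (sum(seen.values()) + vocab))
    return total / (len(chars) - 1)


# Hypothetical toy training set of legitimate domain labels.
model = train(["google", "facebook", "wikipedia", "amazon",
               "youtube", "microsoft", "github", "netflix"])
```

With this model, `avg_log_prob(model, "google")` is higher than `avg_log_prob(model, "xqzkvjwp")`. A DL classifier would learn far longer-range patterns from far more data, but a threshold on a score of this kind is the same basic decision.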

Threat detection model in SOC

The most significant use case, which is expected to become a fundamental approach in cybersecurity, is the use of DL for threat detection and investigation in the Security Operations Center (SOC). Today’s security engineers are overwhelmed by incoming security events that lack context. This problem is exacerbated by manual processes and by the skills shortage in the security market. In addition, detection models need long-term memory to retain the context of related activities over time so that slow-moving attacks can be identified.

DL offers a way to address these problems and to increase the value of SOCs. Using statistical linkage methods, a DL model can learn the relationship between multiple events and provide automatic threat scoring for compromised hosts. DL models can also assign importance to behaviours that reflect strategic phases of the kill chain. This considerably relieves the security team: the number of relevant events is reduced, related events are correlated, and each event is assigned to the corresponding step of the kill chain.
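The outcome of such a model, one score per host built from events in different kill chain phases, can be sketched with fixed weights. This is not how a DL model arrives at its scores: the weights, phase names and function name below are hypothetical and stand in for what a trained model would learn.

```python
from collections import defaultdict

# Hypothetical weights: later kill chain phases count more.
PHASE_WEIGHTS = {
    "reconnaissance": 1.0,
    "delivery": 2.0,
    "command_and_control": 3.0,
    "actions_on_objectives": 5.0,
}


def score_hosts(events):
    """Aggregate detections into one threat score per host.

    events: iterable of (host, phase, confidence) tuples, confidence in [0, 1].
    """
    best = defaultdict(dict)
    for host, phase, confidence in events:
        # Keep only the strongest detection per host and phase, so that
        # many repeated low-value events do not inflate the score.
        best[host][phase] = max(confidence, best[host].get(phase, 0.0))
    return {
        host: sum(PHASE_WEIGHTS[phase] * conf for phase, conf in phases.items())
        for host, phases in best.items()
    }
```

A host with detections in several phases ends up ranked above a host with a single reconnaissance event, which is the kind of prioritisation that reduces the number of events an analyst has to look at.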

A radical thesis is that security models in which network sensors extract metadata, supplemented by other system logs and by enrichments such as indicators of compromise (IOCs) from threat intelligence, will replace legacy approaches based on deep packet inspection (DPI). DL engines will then identify malicious activity on the basis of this metadata. This has many advantages, especially in large-scale environments, because less data needs to be analysed. These new security models are more cost-effective than scaling DPI solutions, and they also address the problem of identifying threats in the growing volume of encrypted communication.