AI in your Network

- Posted in Uncategorized by

AI in your network

There's a scene in "Mars Attacks" where Pierce Brosnan, playing the scientist, dissects one of the dead Martians. He pulls some red jelly from the brain and says "Curious." It captures one of the problems with what we're calling AI today because like the components of an AI, though the jelly can do amazing things, you really can't look at it and say why.

The term Artificial Intelligence has changed its meaning many times over the last 50 years. It currently refers to systems that can do feature correlation and extraction from training data sets--often very large ones. For example, training a system to recognize a face (like your phone does) is an exercise in presenting exemplar data to the system along with reinforcing feedback when a face is dsiplayed. This is called supervised training. After seeing enough faces and getting the green light for each one, the system can learn a correlation and provide the green light on its own when a new face is presented.

enter image description here

The systems and theory for this kind of AI have been around since the 1960s--even the 1950s. A well-known example is called a multi-layer perceptron, or neural network. The original objective was to imitate the way neurons interconnect and to reinforce pathways in the presence of specific stimuli. The neural network would be made of two or often three layers--an input layer, a hidden layer and an output layer. Inputs to a given layer would add or subtract from one another in accordance with weights (multipliers). The weights would be "learned" during the training process. They were the jelly in the Martian's brain.

The neural network is an analog model of how neurons might work, and some analog implementations have been created. On a digital computer, however, it is represented by matrix arithmetic. Matrix arithmetic can often be parallelized so that multiple operations occur at the same time. This makes it fast—very fast--given suitable hardware.

In the last ten years, supercomputers have been teased up, not from liquid-cooled Crays or hypercubes, but from very small-featured computing devices like field-programmable gate arrays or graphics cards. In fact, Nvidia--the company that makes the some of the best graphics hardware--is also the world's leader in supercomputers. That's because Nvidia graphics cards are hyper-parallel, and they can be programmed to do AI just as well as to perform pixel-based ray tracing. Nvidia is currently building what will be the world's largest supercomputer in Britain.

In the network business, you see "AI" being applied to network analytics, intrusion detection and diagnostics. The systems are the product of supervised and unsupervised training that look to correlate events and to recognize unique patterns—such as a network intruder. Part of the reason why vendors are pushing the cloud so vigorously is in addition to being the customer, you are part of the product; your network experiences contribute the training sets for "AI" analytics. They need you to participate in the cloud, too.

Given copious compute power, the quest to make AIs more capable is correlated with making them deeper--adding more layers, more sophisticated back-propagation and weight adjustment. "Deep learning" makes the AI more powerful, but it also makes it subject to pitfalls that are endemic to higher order curve-fitting. That is, when the input is similar to training data, the results can be excellent. In the face of unfamiliar input, deep AI can be wildly unpredictable. "Curious," as Pierce Brosnan would say. And it’s coming to your network.

-Kevin Dowd

Why is Data Security Information so Noisy?

- Posted in Uncategorized by

Why is DATA SECURITY so noisy?

We’re always hoping for an easy score. But network traffic is the manifestation of intents; the traffic is there because someone or something has a goal. It might be exchanging email, sharing data or hacking you. In almost all cases, determining the goal by looking at the traffic requires a priori knowledge or assumptions about what the traffic means; it requires information that isn’t found in the traffic itself.

The object of a Security Event Manager (SEM) or an IDS/IPS is to derive knowledge from traffic data, and then reduce it to a score. The less knowledge in the process, the less valuable the score will be, which is the reason that administrators have to investigate false positives from network intelligence devices. Heuristics and signatures are two approaches to draw knowledge from data. Anomaly detection pulls patterns from data with little increase in knowledge.

Heuristics

Let’s say that you own a restaurant, and it has a security system. You get notified that the front door is open. What can you reasonably infer?

enter image description here

To make the simple observation that there might be a break-in in progress, we apply heuristics to data; we have derived knowledge. In the end, we can say that someone is probably breaking it.

Signatures

Signatures simplify event recognition for known patterns. They condense multivariate input—essentially the meat of heuristics—into a decision that creates the score. For the restaurant, a signature that said:

“the door is open and it is after hours”

would provide the same result as heuristics. Heuristics are more flexible, but signatures are efficient to process; they’re quick. Of all the possible ways to apply knowledge to complex events, signature recognition probably has the most going for it. But if a signature isn’t sufficiently specific, it can generate noise. Too specific and it might not fire.

Anomalies

Anomaly detectors watch patterns in traffic to see if they look different than training data. The more complex the input, the more complex the model, and the less sanguine its approximation will be when presented with a novel situation; higher order curve-fitting is prone to false positives by its nature.

enter image description here

The Semantic Gap

Semantic derivation is the process of increasing knowledge about the event. Semantic reduction is how we produce the score. When we combine anomaly detection, signatures and heuristics and semantically reduce them, the worst of the uncertainty in each comes out. This suggests that the more semantic reduction system that takes place in a SEM, the noisier the results will be.

What does provide reliable results? Simple metrics such as checksums on files and recognition of unplanned reboots unambiguously tell you something significant has happened, albeit late. A SEM will highlight activity you would have otherwise missed. But one can never eliminate the noise; there are semantic gaps between what is happening on the network, what the SEM understands, and the indication you receive on back-end; you’re much smarter than your network intelligence tools can ever be.

Reducing the Semantic Gap

Newer, AI-based anomaly detection systems (trained and untrained) improve the opportunities for nuanced event detection, and also for more false positives. Improvements come from coupling the output to intent, thereby reducing the semantic gap. The Mitre ATT&CK knowledge base provides a working framework for this approach. If one views the events that a SEM collects as the manifestations of intent, one can understand a breach for what it is. For instance:

  1. There is anomalous traffic from an internal computer
  2. The computer makes an outbound connection (command and control)
  3. The computer is probing the internal network; the outbound connection remains active

The numbered events, taken together, show a pattern of activity that predicts this machine has been compromised. Each one of them would be reason enough to wake the security guy in the middle of the night. Recognizing what they mean in combination reduces the semantic gap, and gives the security guy (or an IDS/IPS) a higher quality assessment of the circumstances.

Vectra

Vectra, a company out of California, provides a product that combines AI with higher-level intelligence to reduce the semantic gap. It rides on top of network taps, within cloud deployments and even inside Office365. Vectra’s Network Detection and Response platform, Cognito, has been demonstrating superior results in red team testing. Recognizing the importance of reducing the semantic gap, Atlantic is helping bring Vectra to its customers. Visit vectra.ai for more information on the products or contact Atlantic Computing (www.atlantic.com).

Copyright © 2021, Atlantic Computing Technology Corporation