The most important question to come from this article...
At what point does a boffin become an egghead?
Eggheads at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) claim they have trained a machine-learning system to detect 85 per cent of network attacks. To reach that level, the software, dubbed AI2 [PDF], parsed billions of lines of log files, looking for behaviors that indicate either a malware infection …
Uhhhgg, yet another "AI" system demoed on the same data that it was trained on.
Pretty much like training on historical stock data, then demoing that the system would have had an astonishing return. All you have done is prove that the model fitted its training data.
Even if you get past that issue and can accurately predict trends that the model was not trained on, you still haven't proven you have something useful. When you deploy a financial model to actually trade on its predictions, it changes the market enough that the model is no longer predictive.
In this case I expect that the system is little more than a virus pattern scanner. It can accurately detect all of the obsolete attacks. And it will be able to detect new attacks as soon as they are old enough to be added to the database... at which point they are obsolete.
I'm not sure that testing on training data is always that bad; it depends what's underlying the thing. Ideally, though, the test data is a separate set rather than the original.
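A minimal sketch of the gap, assuming nothing about the AI2 feature set (synthetic data, scikit-learn, and a deliberately overfitting decision tree):

```python
# Minimal sketch of the train-vs-test gap: a deliberately overfitting tree
# on synthetic data with purely random labels (no relation to the AI2 features).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))      # 1000 "log line" feature vectors
y = rng.integers(0, 2, size=1000)    # random labels: there is no signal at all

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier().fit(X_tr, y_tr)   # an unpruned tree memorises

print("accuracy on training data:", clf.score(X_tr, y_tr))  # ~1.0, looks great
print("accuracy on held-out data:", clf.score(X_te, y_te))  # ~0.5, coin flip
```

With purely random labels the tree still scores near 100 per cent on the data it memorised and near coin-flip on anything held out, which is exactly the stock-backtest trap above.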
However: "it alerted a human analyst, who identified whether the software got it right or wrong"
I spend a fair amount of quality time with log files and HIDS/NIDS, and it can be bloody hard to spot the signal in the noise. So this probably means either that the logs have been marked up already and are a bit artificial, or that the human gets it wrong as well. In the latter case the 85 per cent hit rate had better grow some error bars; ±10% would be a good start, I think.
Ho hum, time to actually read the links. I'm a commentard: comment first, get clued up later.
Right, I've skimmed the PDF. It is the real deal, sort of: they do not use the same data in the testing phase as they used for training. They do use a multi-layer neural network thingie (it's been years since I messed around with perceptrons and the like, and it's all been shuffled out of my head).
They only consider web server logs and only three threats (see 8.1).
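For anyone whose perceptrons have likewise been shuffled out of their head, here is a toy version of that sort of supervised classifier; the features and labels are invented, and the paper's actual pipeline is considerably more elaborate:

```python
# Toy multi-layer perceptron on invented "web log" features; the paper's
# actual pipeline is considerably more elaborate than this.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(1)
X = rng.normal(size=(2000, 10))                # e.g. request rate, error ratio...
y = (X[:, 0] + 0.5 * X[:, 3] > 1).astype(int)  # planted rule standing in for "attack"

# The important bit: train on one set, score on another.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=1)

net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500, random_state=1)
net.fit(X_tr, y_tr)
print("held-out accuracy:", net.score(X_te, y_te))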
AI? My arse! However, I think it is a good start in the field. I suspect that once trained and running these things will be quite cheap, computationally speaking, and they can learn from the human feedback whilst operating on operational data.
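That feedback loop is the interesting bit. A back-of-an-envelope version of "rank events by strangeness, ask the analyst about the oddest ones, fold the labels back into a supervised model" might look like the following; the detector, thresholds, and "analyst" are all stand-ins, and the paper's ensemble is far richer:

```python
# Back-of-an-envelope analyst-in-the-loop: an unsupervised detector ranks
# events by strangeness, a simulated "analyst" labels a small batch each day,
# and a supervised model is retrained on the growing label set.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(5000, 8))                     # invented event features
truth = (np.abs(X).max(axis=1) > 3.2).astype(int)  # stand-in ground truth (~1% attacks)

scores = IsolationForest(random_state=0).fit(X).score_samples(X)  # lower = odder
order = np.argsort(scores)                         # oddest events first

seen_X, seen_y = [], []
for day in range(5):
    # Analyst reviews the oddest unseen events plus a small random sample.
    batch = np.concatenate([order[day * 10:(day + 1) * 10],
                            rng.integers(0, len(X), 10)])
    seen_X.extend(X[batch])
    seen_y.extend(truth[batch])                    # the human labelling step
    if len(set(seen_y)) > 1:                       # need both classes to fit
        clf = LogisticRegression().fit(seen_X, seen_y)
        print(f"day {day}: {len(seen_y)} labels, model flags {clf.predict(X).sum()} events")
```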
Remember kids, in this field there is no magic appliance which will simply make NIDS easy and hands off. It's fecking hard.
From the conclusion:
"as time progresses and feedback is collected, the detection rate shows an increasing trend, improving by 3.41× with respect to a state-of-the-art unsupervised anomaly detector, and reducing
false positives by more than 5×."
I believe this is after 28 days of operation.
As for usefulness: it sounds like a useful improvement for a V1 product, assuming the sample data is representative of "typical" traffic hitting a variety of common web servers, but it's not going to fundamentally alter the security landscape.
I guess that every activity is an attack.
Bingo - 100 per cent detection rate.
The false positive rate is not so good, though!
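To put numbers on the joke, here is the "flag everything" detector scored on invented traffic in which one event in a thousand is genuinely malicious:

```python
# The "flag everything" detector, scored on invented traffic in which one
# event in a thousand is genuinely malicious.
import numpy as np

rng = np.random.default_rng(3)
y_true = (rng.random(100_000) < 0.001).astype(int)  # ~0.1% real attacks
y_pred = np.ones_like(y_true)                       # every activity is an attack

print("detection rate:", (y_pred[y_true == 1] == 1).mean())      # 1.0, bingo
print("false positive rate:", (y_pred[y_true == 0] == 1).mean()) # also 1.0
print("precision:", y_true[y_pred == 1].mean())     # ~0.001: ~999 dud alerts per hit
```

Perfect detection, a 100 per cent false positive rate, and roughly 999 dud alerts for every real one, which is why a detection rate on its own tells you very little.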
As mentioned, the real problem is generating useful training datasets and labelling them accurately. For serious, targeted attacks there is little prior experience to go on, as each attack is hand-crafted for the victim. In my view, once the training data is good, the AI is easy.
Having looked at this, it's incredibly heavy on human intervention. It has pre-set ideas of what a 'hacker invasion' should comprise (much like SIEM with its 'if this and this and this' alarming: brilliant for detecting the last war).
And yes, even with separate test and training data you can still tune the system to the test set.
I'd like an explanation of what it is classifying and why it is missing 15 per cent. Is it to do with the PCA? If you do PCA, why do you need a neural network (apart from it being sexy)? Why not use a linear combination, so you can at least see which dimensions are contributing?
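For what it's worth, the linear version is easy enough to sketch: PCA for the dimensionality reduction, then a plain logistic regression whose coefficients tell you which components carry the weight. Everything here is synthetic; whether it would match the paper's numbers is another matter.

```python
# PCA followed by a plain linear classifier, so the contribution of each
# component is visible in the coefficients. Synthetic features throughout.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(4)
X = rng.normal(size=(2000, 30))
X[:, :3] *= 4.0                          # give a few dimensions real variance
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # planted linear "attack" rule

model = make_pipeline(PCA(n_components=5), LogisticRegression())
model.fit(X, y)
print("training accuracy:", model.score(X, y))

# Unlike a neural net, the weights say outright which components matter.
for i, c in enumerate(model.named_steps["logisticregression"].coef_.ravel()):
    print(f"component {i}: weight {c:+.2f}")
```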
My suspicion is that the way they 'fuzz' the features and the classifier means the performance can't get much better. You always need fine structure to get really good pattern matching, and they get shot of it.
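A crude illustration of the point, on entirely made-up numbers: two classes that a fine-grained feature separates cleanly become indistinguishable once the feature is binned coarsely.

```python
# Crude illustration: two classes a fine-grained feature separates cleanly
# become indistinguishable once the feature is binned coarsely. The numbers
# are entirely made up.
import numpy as np

rng = np.random.default_rng(5)
benign = rng.normal(0.40, 0.01, 10_000)    # benign events cluster at 0.40
attack = rng.normal(0.45, 0.01, 10_000)    # attacks cluster nearby at 0.45

def best_binned_accuracy(bins):
    # Best any classifier can do once the feature is reduced to bin membership:
    # take the majority class in each bin.
    edges = np.linspace(0.0, 1.0, bins + 1)
    b_hist, _ = np.histogram(benign, edges)
    a_hist, _ = np.histogram(attack, edges)
    return np.maximum(b_hist, a_hist).sum() / (len(benign) + len(attack))

print("100 bins:", best_binned_accuracy(100))  # ~1.0: fine structure separates them
print("  2 bins:", best_binned_accuracy(2))    # 0.5: both clusters share one bin
```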