Introduction to machine learning for security analysts
Slides: https://www.slideshare.net/GTKlondike/machine-learning-for-security-analysts-149291369
This workshop is intended to be interactive. Check out the Google Colab links below to work with the code for this workshop:
- Spam filter using Scikit-Learn Workbook - https://colab.research.google.com/drive/1CA82qL46XIGhkw0eOi3c0whNTvwaXwZy
  - Workbook Answers - https://colab.research.google.com/drive/17ABiBU43E9RVIN2pN98W6U2Fieq6P24C
- Malicious URL predictor Workbook - https://colab.research.google.com/drive/1FMWMdHsj8UPXtcb7rOmGK5VnnMUndEJV
  - Workbook Answers - https://colab.research.google.com/drive/1ghSk9F-Cz_A0B5M0LqUNi2atSZUp-c2q
- Spam filter using Naive Bayes Workbook - https://colab.research.google.com/drive/1Lo50HKGLSNDoJWITDGJtPSrGosRqTi_3
  - Workbook Answers - https://colab.research.google.com/drive/1DuNHY65n9v3Mi11A57N5HaJTzMAFIq1i
The first two demos build and evaluate machine learning models using the techniques described in the presentation. By building a spam filter, we will demonstrate how abstraction libraries like Scikit-Learn make building and training models easier, and show the plug-and-play nature of the library.
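As a rough illustration of the plug-and-play idea (this is not the workbook's code), the sketch below trains the same Scikit-Learn pipeline with two different classifiers. The toy messages, the labels, and the `train` helper are assumptions made up for this example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, not the dataset used in the workshop.
messages = [
    "win a free prize now",
    "free cash click now",
    "claim your free prize",
    "meeting moved to noon",
    "lunch at noon tomorrow",
    "notes from the meeting",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]


def train(classifier):
    """Fit a bag-of-words pipeline around whichever classifier is passed in."""
    model = make_pipeline(CountVectorizer(), classifier)
    model.fit(messages, labels)
    return model


# Swapping the algorithm is a one-line change; the rest of the code is untouched.
models = {
    "naive_bayes": train(MultinomialNB()),
    "logistic_regression": train(LogisticRegression()),
}

for name, model in models.items():
    print(name, model.predict(["free prize now", "meeting at noon"]))
```

Because every Scikit-Learn estimator exposes the same `fit` and `predict` methods, the feature extraction step and the surrounding code stay the same when the classifier changes.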
Then, we will use the exact same techniques to build a malicious URL predictor.
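One way the same workflow might carry over to URLs is sketched below, this time with a held-out test split for evaluation. The URLs and labels are invented placeholders on reserved `.example` domains, and the character n-gram features are an assumption for this sketch rather than the workbook's chosen feature set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical placeholder URLs, not real threat data.
urls = [
    "http://secure-login.bank.example/verify.php?acct=1",
    "http://free-gift.prizes.example/claim.exe",
    "http://update-account.mail.example/confirm.php?id=77",
    "http://192.0.2.10/login/verify.php",
    "https://www.shop.example/products/shoes",
    "https://docs.project.example/guide/install",
    "https://news.site.example/2019/06/article",
    "https://www.university.example/about",
]
labels = ["bad", "bad", "bad", "bad", "good", "good", "good", "good"]

# Hold out a quarter of the data so the model is scored on URLs it has not seen.
train_urls, test_urls, train_labels, test_labels = train_test_split(
    urls, labels, test_size=0.25, stratify=labels, random_state=0
)

# URLs have no natural word boundaries, so this sketch uses character n-grams.
model = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    LogisticRegression(),
)
model.fit(train_urls, train_labels)

predictions = model.predict(test_urls)
accuracy = accuracy_score(test_labels, predictions)
print("held-out accuracy:", accuracy)
```

With only eight samples the score itself is meaningless; the point is that the build, split, fit, and score steps are the same as for the spam filter.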
Finally, we will use the equations from the slides to rebuild a Multinomial Naive Bayes spam filter, but this time without the help of an abstraction library.
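For orientation, here is a minimal sketch of Multinomial Naive Bayes in plain Python. It assumes the standard textbook formulation, where a class score is log P(class) plus the sum of log P(word | class), and P(word | class) = (count of word in class + alpha) / (total words in class + alpha * vocabulary size). The slides may present the equations differently, and the toy data and function names here are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, not the dataset used in the workshop.
messages = [
    "win a free prize now",
    "free cash click now",
    "claim your free prize",
    "meeting moved to noon",
    "lunch at noon tomorrow",
    "notes from the meeting",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]


def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def train(messages, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(text, class_counts, word_counts, vocabulary, alpha=1.0):
    """Return the class with the highest log posterior score."""
    total_docs = sum(class_counts.values())
    scores = {}
    for label, doc_count in class_counts.items():
        # Log prior: the fraction of training documents in this class.
        score = math.log(doc_count / total_docs)
        denominator = sum(word_counts[label].values()) + alpha * len(vocabulary)
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # words never seen in training are ignored
            # Log likelihood with additive (Laplace) smoothing.
            score += math.log((word_counts[label][token] + alpha) / denominator)
        scores[label] = score
    return max(scores, key=scores.get)


class_counts, word_counts, vocabulary = train(messages, labels)
print(predict("free prize now", class_counts, word_counts, vocabulary))
print(predict("meeting at noon", class_counts, word_counts, vocabulary))
```

Working in log space avoids numerical underflow from multiplying many small probabilities, and the smoothing term `alpha` keeps a word that never appeared in a class from driving that class's probability to zero.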
An added benefit of hosting these demos on Google Colab is that people can take the code home and see what it's doing in an interactive browser session. Alternatively, this GitHub repository may be used with https://mybinder.org to interact with the notebooks in a live environment.