Skip to content

Using Machine Learning for Malware Traffic Classification

Notifications You must be signed in to change notification settings

skumailraza/tls_malware_detection

Repository files navigation

Python code to automate the execution of the clustering process.

Step 1: Parse pcaps/bro logs into a JSON file

  • When parsing pcaps, run the pcap_processing/parse_bro_logs_v2.py script.
  • When parsing bro logs, run the pcap_processing/parse_logs.py script.
  • The output consist in the JSON file and a folder with the certificates.

Step 2: Extract features

  • Run the clustering/prep/extract_features.py script, specifying the corresponding avclass and virustotal files, if any, and the folder with the certs. The TLS fingerprints can be located at clustering/data/tls_fprints/
  • The output is a TSV file with the feature vectors.

Step 3: Run the clustering

  • When using the script clustering/run_clustering.py, a dataset_name must be specified. The script will look for two files into the data/groundtruth folder by default, the feature vectors (dataset_name.tsv) and the related avclass file (dataset_name.labels)
  • The output will be generated inside a folder with the same name as the dataset_name.

About

Using Machine Learning for Malware Traffic Classification

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published