Isconna.Python


Python port of the Isconna algorithm.

For pip users:
For researchers:
  • Please consider using the C++ version as the baseline; this port may not receive timely updates.

Demo

  1. Open a terminal

  2. cd to the project root Isconna.Python

  3. If you already have a copy of the datasets (e.g., from Isconna), set the environment variable DATASET_DIR to its data folder

    1. Otherwise, curl -OL https://github.com/liurui39660/Isconna/raw/master/data/data.zip

    2. mkdir data && tar -xf data.zip -C data (Windows)

      • Or unzip data.zip -d data (Linux/macOS)

      • Or 7z x data.zip -odata

      • You should then see a directory like data/CIC-IDS2018/processed

  4. pip install -r requirements.txt

    • Or conda install --file requirements.txt -y

  5. set PYTHONPATH=src (Windows) or export PYTHONPATH=src (Linux/macOS)

  6. python example/Demo.py

This runs Isconna-EO on CIC-IDS2018 ($DATASET_DIR/CIC-IDS2018/processed/Data.csv) and prints ROC-AUC.
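
The dataset lookup described in step 3 can be sketched as follows. This is a minimal illustration, not the code in example/Demo.py: the helper name resolve_data_csv and the fallback to a data folder under the project root are assumptions based on the steps above.

```python
import os
from pathlib import Path


def resolve_data_csv(project_root="."):
    """Return the expected path of the CIC-IDS2018 Data.csv file.

    Hypothetical helper: uses DATASET_DIR when it is set, otherwise falls
    back to <project_root>/data (an assumption based on the unzip step).
    """
    dataset_dir = os.environ.get("DATASET_DIR")
    base = Path(dataset_dir) if dataset_dir else Path(project_root) / "data"
    return base / "CIC-IDS2018" / "processed" / "Data.csv"


if __name__ == "__main__":
    # Print where the demo data would be expected under these assumptions.
    print(resolve_data_csv())
```

Printing the resolved path before running the demo is a quick way to check that DATASET_DIR points at the folder that contains CIC-IDS2018.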

Requirement

All required packages are listed in requirements.txt.

Python 3.6+ should be fine.

Core
  • numba: JIT, i.e., acceleration

  • numpy: Makes the code concise, but has no effect on speed

    • Because what actually runs is the jitted (translated) code

Demo
  • pyprojroot: Detect project root path

  • scikit-learn: Metric

  • tqdm: Progress bar

Customization

Export Raw Scores

Uncomment the section "Export raw scores" of example/Demo.py.

out/Score.txt has 1 column: the final anomaly score.
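
Exported scores can also be evaluated outside the demo. The sketch below reads a one-column file and computes ROC-AUC with the rank-sum formula, using only the standard library. The demo itself lists scikit-learn for the metric; the function names here, and the assumption that the label file holds one 0/1 integer per line in the same order as the scores, are illustrative.

```python
def read_column(path, cast):
    """Read a one-column text file, skipping blank lines."""
    with open(path) as f:
        return [cast(line) for line in f if line.strip()]


def roc_auc(labels, scores):
    """ROC-AUC via the rank-sum (Mann-Whitney U) formula with tie handling."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of equal scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC-AUC needs both normal and anomalous records")
    rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Tiny in-memory example; real input would come from read_column().
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With real files, the inputs would be read_column("out/Score.txt", float) for the scores and read_column on the label file with int for the labels.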

Switch Cores

Cores are declared in the section "Do the magic" of example/Demo.py. Uncomment the desired core.

Different Parameters / Datasets

Parameters and dataset paths are specified in the section "Parameter" of example/Demo.py.

External Dataset + Demo.py

You need to prepare three files:

  • Meta file

    • Contains only an integer n, the number of records in the dataset

    • Assign its path to pathMeta

    • E.g., data/CIC-IDS2018/processed/Meta.txt

  • Data file

    • A header-less csv file with shape [n,3]

    • Each row includes 3 integers: source, destination and timestamp

    • Timestamps should start from 1 and be continuous

    • Assign its path to pathData

    • E.g., data/CIC-IDS2018/processed/Data.csv

  • Label file

    • A header-less text file with shape [n,1]

    • Each row includes 1 integer: 0 if normal, 1 if anomalous

    • Assign its path to pathLabel

    • E.g., data/CIC-IDS2018/processed/Label.csv
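
Before pointing Demo.py at a new dataset, it can help to check the three files against the rules above. The following is a minimal sketch, not part of the project: load_dataset is a hypothetical helper, and it reads "continuous" as "every integer from 1 to the largest timestamp occurs", which is an assumption.

```python
import csv


def load_dataset(path_meta, path_data, path_label):
    """Load the meta, data, and label files and check the documented format."""
    with open(path_meta) as f:
        n = int(f.read().strip())
    with open(path_data, newline="") as f:
        records = [tuple(int(x) for x in row) for row in csv.reader(f) if row]
    with open(path_label) as f:
        labels = [int(line) for line in f if line.strip()]

    if len(records) != n or len(labels) != n:
        raise ValueError("record or label count does not match the meta file")
    if any(len(r) != 3 for r in records):
        raise ValueError("each data row needs source, destination, timestamp")
    if any(y not in (0, 1) for y in labels):
        raise ValueError("labels must be 0 (normal) or 1 (anomalous)")

    # Assumed meaning of "continuous": no integer is skipped, starting at 1.
    timestamps = {t for _, _, t in records}
    if n > 0 and timestamps != set(range(1, max(timestamps) + 1)):
        raise ValueError("timestamps must start from 1 without gaps")
    return records, labels
```

The returned records are (source, destination, timestamp) tuples in file order, alongside the matching list of labels.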

External Dataset + Custom Runner

  1. Copy the directory src/Isconna to where you need

    • Pip users can skip this step; the package is already installed

  2. Import Isconna in the code

  3. Instantiate cores with required parameters

    • Number of CMS rows

    • Number of CMS columns

    • Decay factor (default is 0, i.e., keep nothing)

  4. Call Call() on each record; its signature includes:

    1. Source (categorical)

    2. Destination (categorical)

    3. Timestamp

    4. Weight for the frequency score

    5. Weight for the width score

    6. Weight for the gap score

    7. Return value is the anomaly score
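
The steps above suggest a runner loop shaped like the sketch below. PlaceholderCore is a hypothetical stand-in, not a class from the Isconna package: it only mirrors the constructor parameters and Call() argument order listed above, and returns a constant. The parameter names and the default weights of 1.0 are also assumptions; check src/Isconna for the real core classes and names.

```python
class PlaceholderCore:
    """Hypothetical stand-in with the constructor and Call() shape above.

    NOT the Isconna algorithm: it stores its parameters and always returns
    0.0. Swap in a real core imported from the Isconna package.
    """

    def __init__(self, n_rows, n_cols, decay=0.0):
        self.n_rows = n_rows    # number of CMS rows
        self.n_cols = n_cols    # number of CMS columns
        self.decay = decay      # decay factor

    def Call(self, src, dst, ts, w_freq, w_width, w_gap):
        return 0.0              # a real core returns the anomaly score


def score_records(core, records, w_freq=1.0, w_width=1.0, w_gap=1.0):
    """Feed records to the core one by one, in order, and collect scores."""
    return [core.Call(src, dst, ts, w_freq, w_width, w_gap)
            for src, dst, ts in records]


if __name__ == "__main__":
    core = PlaceholderCore(2, 3000)
    print(score_records(core, [(1, 2, 1), (2, 3, 1), (1, 3, 2)]))
```

Keeping the loop separate from the core makes it easy to switch cores without touching the code that reads records.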

Feedback

If you have any suggestions, can’t understand the algorithm, don’t know how to use the experiment code, etc., please feel free to open an issue.