isp1tze/datamining_malware_source

Data-trace-competition

Scripts in the analyse-tools directory

  1. dealShortUrl.py

Resolves the most common short URLs back to their original URLs.

  1. shortUrl.py

Resolves the remaining short URLs back to their original URLs.
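
Resolving a short URL usually means following redirects until they stop. The sketch below shows only that loop and is not taken from dealShortUrl.py or shortUrl.py. The names `expand_short_url`, `next_hop`, and `max_hops` are hypothetical, and the redirect lookup is passed in as a function so the example runs without network access.

```python
from typing import Callable, Optional


def expand_short_url(url: str,
                     next_hop: Callable[[str], Optional[str]],
                     max_hops: int = 10) -> str:
    """Follow redirects from `url` and return the last URL reached.

    `next_hop` is a hypothetical caller-supplied lookup that returns the
    redirect target of a URL, or None when the URL does not redirect.
    """
    seen = {url}
    for _ in range(max_hops):
        target = next_hop(url)
        # Stop when there is no further redirect or the chain loops back.
        if target is None or target in seen:
            break
        seen.add(target)
        url = target
    return url


# Usage with an in-memory redirect table standing in for real HTTP lookups.
redirects = {
    "http://t.cn/a": "http://bit.ly/b",
    "http://bit.ly/b": "http://example.com/page",
}
print(expand_short_url("http://t.cn/a", redirects.get))
```

In real use, `next_hop` could issue an HTTP request with redirects disabled and read the `Location` header; that part is left out here.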

  1. getDomain.py

Uses web scraping to retrieve the domain.

  1. getDNSInfo.py

Uses web scraping to retrieve DNS information.

  1. getWhoisInfo.py

Uses web scraping to retrieve WHOIS information.

  1. simHash.py

Computes the SimHash of URLs.
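
As a rough illustration of what a SimHash of a URL looks like, here is a minimal sketch. It is not the code in simHash.py. The tokenization (splitting on non-alphanumeric characters), the equal token weights, the 64-bit width, and the use of MD5 as the per-token hash are all assumptions made for the example.

```python
import hashlib
import re

BITS = 64  # assumed fingerprint width


def simhash(text: str) -> int:
    """Return a 64-bit SimHash fingerprint of `text`."""
    # Assumed tokenization: lowercase, split on anything non-alphanumeric.
    tokens = [t for t in re.split(r"[^0-9A-Za-z]+", text.lower()) if t]
    weights = [0] * BITS
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # first 64 bits of the digest
        for i in range(BITS):
            # Each token votes +1 for a set bit and -1 for a clear bit.
            weights[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, w in enumerate(weights):
        if w > 0:
            fingerprint |= 1 << i
    return fingerprint


print(hex(simhash("http://example.com/login?user=alice")))
```

URLs that share most of their tokens tend to produce fingerprints that differ in only a few bits, which is what makes the fingerprint useful for similarity comparison.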

  1. textDistance.py

Calculates the text distance between URLs based on their SimHash values.
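
The usual distance between two SimHash fingerprints is the Hamming distance, that is, the number of bit positions in which they differ. The sketch below assumes that metric; the README does not say which distance textDistance.py uses. The function name is hypothetical and the inputs are assumed to be non-negative integers.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two non-negative integers differ."""
    # XOR leaves a 1 exactly where the two fingerprints disagree.
    return bin(a ^ b).count("1")


# 0b1011 and 0b0010 differ in the highest and lowest of the four bits.
print(hamming_distance(0b1011, 0b0010))  # prints 2
```

A small distance relative to the fingerprint width suggests the two URLs are near-duplicates; the threshold to use depends on the data.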

  1. shang.py

Calculates the entropy of text.
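
For reference, a minimal sketch of character-level Shannon entropy, H = -sum(p * log2(p)) over the character frequencies p. Whether shang.py works on characters or on some other unit is not stated here, so the character-level choice and the function name are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    n = len(text)
    # Sum -p * log2(p) over the relative frequency p of each character.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


print(shannon_entropy("abcd"))  # prints 2.0
```

Strings made of random-looking characters score higher than ordinary words, which is why entropy is a common feature for spotting generated domains or paths.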

  1. countKey.py

Counts the keys that occur in URLs.
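
The README does not define "keys". One plausible reading is the parameter names in the query string, and the sketch below counts those. That reading and the function name `count_query_keys` are assumptions, not a description of countKey.py.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def count_query_keys(urls):
    """Count how often each query-string parameter name appears in `urls`."""
    counts = Counter()
    for url in urls:
        query = urlsplit(url).query
        # keep_blank_values=True so that keys with empty values still count.
        for key, _value in parse_qsl(query, keep_blank_values=True):
            counts[key] += 1
    return counts


sample = [
    "http://a.example/p?id=1&page=2",
    "http://b.example/q?id=7",
    "http://c.example/",
]
print(count_query_keys(sample))
```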

Scripts in the root directory

---for trace1---

Uses XGBoost to train a model and make predictions on the data.

  1. trace1.py

---for trace2---

Uses data mining techniques to cluster the data.

  1. ultis.py
  2. trace2-1-step.py
  3. trace2-2-step.py
  4. trace2-3-step.py
  5. trace2-4-step.py
  6. trace2-5-step.py

About

Code for the data trace competition at GeekPwn 2018
