Cleanup and prepare for v1.0 release
ex0dus-0x committed Jan 8, 2022
1 parent d8c7f47 commit 07c87a1
Showing 3 changed files with 26 additions and 8 deletions.
21 changes: 19 additions & 2 deletions README.md
@@ -14,7 +14,7 @@ In the past, __Fork Sentry__ has already found and taken down instances of:

(TODO: include writeups, and links to paper releases)

-## Actions Usage
+## Usage

__Fork Sentry__ operates out of a separate cloud infrastructure, which you can self-host with our open-sourced code, or you can reach out for an API token (WIP) to use the existing one. This way we're able to scale analysis to large volumes of forks, while outsourcing scheduling to the Actions CI/CD runner.

@@ -44,6 +44,10 @@ jobs:
## Architecture
![infrastructure](infrastructure.png)
For more information about self-hosting, check out the spec here.
### Dispatcher
The Golang dispatcher ingests authenticated requests for analysis of a target parent repository. The request can
@@ -53,10 +57,23 @@ be invoked adhoc similarly like so:
$ curl -X POST -d '{"owner":"OWNER", "name": "NAME", "github_token": "ghp_TOKEN", "api_token": "API_TOKEN"}' -H 'Content-Type: application/json' https://endpoint.example/dispatch
```
-or preferably through the Actions runner itself, which can be put on a schedule. The __dispatcher__ extracts all forks and enqueues them for the analyzer.
+or preferably through the Actions runner itself, which can be put on a schedule. The __dispatcher__ extracts all forks and publishes each for analyzers to subscribe and
+consume.
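The same ad hoc request can be built from Python. This is a minimal sketch that mirrors the curl example above; the function names are hypothetical, the endpoint is the placeholder from that example, and the response format is not documented here, so only the HTTP status is returned.

```python
import json
import urllib.request

# Placeholder endpoint copied from the curl example; replace with a real deployment.
DISPATCH_URL = "https://endpoint.example/dispatch"


def build_dispatch_request(owner, name, github_token, api_token, url=DISPATCH_URL):
    """Build (but do not send) a POST request with the same JSON body as the curl call."""
    payload = {
        "owner": owner,
        "name": name,
        "github_token": github_token,
        "api_token": api_token,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def dispatch(request, timeout=30):
    """Send the request and return the HTTP status code (response body is ignored)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.status
```

Splitting construction from sending keeps the payload easy to inspect before any tokens leave the machine.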
### Analyzer
For an individual fork, we check the following:
* Name typosquatting
* Known malware signatures
* Suspicious capabilities
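
The README does not say how name typosquatting is detected. As an illustration only, one plausible approach is to flag fork names that are close to, but not identical to, the parent name. The function name and the `0.8` threshold below are assumptions, not the analyzer's actual logic.

```python
from difflib import SequenceMatcher


def looks_typosquatted(parent_name: str, fork_name: str, threshold: float = 0.8) -> bool:
    """Return True when fork_name is a near miss of parent_name.

    Identical names are expected for ordinary forks, so they are not flagged.
    """
    a, b = parent_name.lower(), fork_name.lower()
    if a == b:
        return False
    # ratio() is 1.0 for identical strings and approaches 0.0 for unrelated ones.
    return SequenceMatcher(None, a, b).ratio() >= threshold
```

A real check would likely also consider the owner name and common character substitutions.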
Previously detected samples are also checked using their _locality-sensitive hashes_ against a database with [this technique](https://www.virusbulletin.com/virusbulletin/2015/11/optimizing-ssdeep-use-scale).
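
The linked technique avoids comparing a new hash against every stored one: two ssdeep hashes can only score above zero when their chunk sizes are equal or differ by a factor of two, and the compared chunks share a 7-character substring. The sketch below indexes on exactly that; the class and function names are hypothetical, an in-memory dict stands in for the database, and the final `ssdeep.compare` scoring of candidates is left out.

```python
import re
from collections import defaultdict

NGRAM = 7  # length of the common substring ssdeep requires for a non-zero score


def parse(fhash):
    """Split an ssdeep hash into (chunksize, chunk, double_chunk)."""
    chunksize, chunk, double_chunk = fhash.split(":")
    return int(chunksize), chunk, double_chunk


def ngrams(chunk):
    """Return every 7-character window after collapsing long character runs."""
    # Runs of more than three identical characters are reduced to three,
    # matching the normalization ssdeep applies before comparing.
    chunk = re.sub(r"(.)\1{3,}", r"\1\1\1", chunk)
    return {chunk[i:i + NGRAM] for i in range(len(chunk) - NGRAM + 1)}


class FuzzyIndex:
    """In-memory stand-in for the sample database (an assumption)."""

    def __init__(self):
        self._index = defaultdict(set)  # (chunksize, ngram) -> sample ids

    def _keys(self, fhash):
        chunksize, chunk, double_chunk = parse(fhash)
        for gram in ngrams(chunk):
            yield (chunksize, gram)
        # double_chunk is computed at twice the chunk size.
        for gram in ngrams(double_chunk):
            yield (chunksize * 2, gram)

    def add(self, sample_id, fhash):
        for key in self._keys(fhash):
            self._index[key].add(sample_id)

    def candidates(self, fhash):
        """Return ids of stored samples that could match fhash at all."""
        found = set()
        for key in self._keys(fhash):
            found |= self._index.get(key, set())
        return found
```

Only the returned candidates need a full fuzzy-hash comparison, which is what makes the lookup practical at scale.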
### Alert Function
Potentially malicious forks are written back to the issue tracker in this step.
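
A minimal sketch of this step, assuming the alert is filed as a GitHub issue on the parent repository. The function names and issue wording are hypothetical; `parent_repo` is any object exposing `create_issue(title=..., body=...)`, such as a PyGithub `Repository`, and the indicator strings follow the `clamav:<name>` form the analyzer emits.

```python
def format_alert(fork_full_name, iocs):
    """Return (title, body) for an issue describing one suspicious fork."""
    title = f"Fork Sentry: suspicious fork detected: {fork_full_name}"
    lines = [f"The fork `{fork_full_name}` triggered the following indicators:", ""]
    lines += [f"* `{ioc}`" for ioc in iocs]
    return title, "\n".join(lines)


def raise_alert(parent_repo, fork_full_name, iocs):
    """File an issue on the parent repository; do nothing when there are no indicators."""
    if not iocs:
        return None
    title, body = format_alert(fork_full_name, iocs)
    return parent_repo.create_issue(title=title, body=body)
```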
## License
Fork Sentry is released under the Apache License 2.0.
13 changes: 7 additions & 6 deletions analyzer/repo_analysis.py
@@ -56,7 +56,10 @@

class RepoAnalysis:
"""
-Implements an interface for conducting fork integrity analysis across a single repository.
+Implements an interface for conducting fork integrity analysis across a single fork repository.
+During analysis, recovers all interesting filetypes from both repository tree and branches, and
+applies malware detection techniques.
"""

def __init__(
@@ -67,9 +70,9 @@ def __init__(
vt_token: t.Optional[str] = None,
):
self.gh = Github(token)
self.token = token

# fork attributes
self.token = token
self.repo = self.gh.get_repo(repo_name)
self.uuid = "".join(
random.choice(string.ascii_uppercase + string.digits) for _ in range(6)
@@ -143,7 +146,7 @@ def _analyze_artifact(self, path) -> t.Optional[str]:
# trigger ClamAV scan first
results = scanner.instream(iobuf)
for path, tags in results.items():
-found, name = tags[0], tags[1]
+found, name = tags[0], tags[1]
if found == "FOUND":
iocs += [f"clamav:{name}"]

@@ -174,11 +177,9 @@ def _detect_sims(self, path: str):
fhash = ssdeep.hash(fd.read())

# recover attributes from fuzzy hash
-chunksize, chunk, double_chunk = fhash.split(':')
+chunksize, chunk, double_chunk = fhash.split(":")
chunksize = int(chunksize)



def detect_suspicious(self):
"""
Analyze an individual fork repository and detect suspicious artifacts and releases.
Binary file added infrastructure.png