Skip to content

antrikshdhand/gfuzztools

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation


Logo

gfuzztools

Tools for working with grammars written in C.
Developer Guide · Issues · PRs

gfuzztools is a collection of tools designed to facilitate working with BNF grammars in the C programming language. It is the low-level implementation of a Python library written by Dr Rahul Gopinath.

Key Features

  • Optimized Grammar Representation: Define grammars in C using an 8-bit representation, offering a performance boost over traditional string-based approaches.
  • JSON Converter: Seamlessly convert existing JSON representations into our efficient C representation.
  • Fuzzing Functions: Generate strings from defined grammars for comprehensive testing and validation.
  • String Extraction: Extract all strings from a grammar or retrieve a specific string at a given index.
  • Speed: Preliminary testing indicates a 20x speedup compared to the Python implementation.
  • UAR sampling: Sample a string of a certain length from a grammar uniformly at random.

C Python

Table of Contents

Getting started

This program adheres to the C99 standard, utilizing gcc for compilation. Clone the repository and run any of the examples using:

make [example, counts, strings, at]

Usage

Refer to our Developer Guide for an in-depth explanation of the codebase and how to utilise the gfuzztools library effectively.

Roadmap

Motivation

Formal grammars are often overlooked in the initial stages of programming language development, resulting in a disconnect between the intended grammar and the actual code. Grammar inference, both white-box and black-box, plays a crucial role in understanding a language's grammar. State-of-the-art grammar inference algorithms include Angluin's L* algorithm and the TTT algorithm.

The original motivation of gfuzztools was to set up a toolchain to compare the effectiveness of grammar inference algorithms. This comparison relies on two empirical values:

  • Precision: Measures how close the output of the grammar inference algorithm is to running a parser on the original grammar.
  • Recall: Measures how close the output of running a generating fuzzer on the original grammar is to running a parser on the inferred grammar.

We can then combine these two values using the $\text{F1}$ score: $$\text{F1 score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

However, it is key that the generating fuzzer is sampling strings uniformly at random (UAR), otherwise the measure will not be accurate. Hence, the first milestone in gfuzztools was to produce a UAR sampler.

Future work

gfuzztools is a work-in-progress. There are several improvements and additions yet to come, including:

  • Work in the field of grammar inference, including:
    • Implementing Angluin's L* algorithm in C

See the open issues for a full list of proposed features and known issues.

Contributions

Contributors Issues PRs

All contributions are greatly appreciated! If you have a suggestion that would make this project better, please fork the repo and create a pull request.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/branch-FeatureName)
  3. Commit your Changes (git commit -m 'Add some FeatureName')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

Licence

Distributed under the MIT License. See LICENSE.txt for more information.

Contact

This project was created as part of a summer research program at the University of Sydney.

See below for contact details.

Supervisor Dr. Rahul Gopinath [email protected]
Student Antriksh Dhand [email protected]

About

Tools for working with grammars in C

Resources

License

Stars

Watchers

Forks

Releases

No releases published