gfuzztools
is a collection of tools designed to facilitate working with BNF grammars in the C programming language. It is the low-level implementation of a Python library written by Dr Rahul Gopinath.
- Optimized Grammar Representation: Define grammars in C using an 8-bit representation, offering a performance boost over traditional string-based approaches.
- JSON Converter: Seamlessly convert existing JSON representations into our efficient C representation.
- Fuzzing Functions: Generate strings from defined grammars for comprehensive testing and validation.
- String Extraction: Extract all strings from a grammar or retrieve a specific string at a given index.
- Speed: Preliminary testing indicates a 20x speedup compared to the Python implementation.
- UAR sampling: Sample a string of a certain length from a grammar uniformly at random.
This program adheres to the C99 standard, utilizing gcc
for compilation. Clone the repository and run any of the examples using:
make [example, counts, strings, at]
Refer to our Developer Guide for an in-depth explanation of the codebase and how to utilise the gfuzztools
library effectively.
Formal grammars are often overlooked in the initial stages of programming language development, resulting in a disconnect between the intended grammar and the actual code. Grammar inference, both white-box and black-box, plays a crucial role in understanding a language's grammar. State-of-the-art grammar inference algorithms include Angluin's L* algorithm and the TTT algorithm.
The original motivation of gfuzztools
was to set up a toolchain to compare the effectiveness of grammar inference algorithms. This comparison relies on two empirical values:
- Precision: Measures how close the output of the grammar inference algorithm is to running a parser on the original grammar.
- Recall: Measures how close the output of running a generating fuzzer on the original grammar is to running a parser on the inferred grammar.
We can then combine these two values using the
However, it is key that the generating fuzzer is sampling strings uniformly at random (UAR), otherwise the measure will not be accurate. Hence, the first milestone in gfuzztools
was to produce a UAR sampler.
gfuzztools
is a work-in-progress. There are several improvements and additions yet to come, including:
- Work in the field of grammar inference, including:
- Implementing Angluin's L* algorithm in C
See the open issues for a full list of proposed features and known issues.
All contributions are greatly appreciated! If you have a suggestion that would make this project better, please fork the repo and create a pull request.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/branch-FeatureName)
- Commit your Changes (git commit -m 'Add some FeatureName')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE.txt
for more information.
This project was created as part of a summer research program at the University of Sydney.
See below for contact details.
Supervisor | Dr. Rahul Gopinath | [email protected] |
---|---|---|
Student | Antriksh Dhand | [email protected] |