This is the artifact of Derivative-Guided Symbolic Execution under submission to POPL 2025. The derivative-guided symbolic execution engine HATch, along with its derivative-free variant, are provided. Please follow the instructions below for installation and reproduction of experiment results.
We recommend the use of machines with at least 8 GB memory – HATch
raises errors when memory usage exceeds 8GB. All experiments were
performed on AMD Ryzen 7 PRO 6850U with 32 GB RAM. The artifact is
provided as a Docker image. Before proceeding, try docker run
hello-world
to ensure that Docker works as intended.
You may either load the pre-built docker image:
docker load < ../hatch.tar.gz
or build the docker image on your own:
docker build . --tag hatch
Then start a shell environment in the docker image:
docker run -i --rm -t hatch
Lastly, compile HATch and find the executable at `_build/default/bin/main.exe`:
dune build --profile release
Directory data/ri
stores various benchmarks in its subfolders, each
corresponding to an abstract data type (ADT) and its underlying
stateful representation type. Consider the implementation of Stack
using LinkedList
, the sub-directory data/ri/Stack_LinkedList
contains:
- the auxiliary definitions of common symbolic regular expressions in
automata_preds.ml
; - the specification of
LinkedList
’s methods inlib_rty.ml
; - the safety property represented by a symbolic regular expression in
ri.ml
, and; - the definition of
Stack
’s methods, e.g.,cons.ml
andconcat_aux.ml
, and their respective buggy variant (_buggy
postfix), with accompanying specification that manifest the safety property.
Now we have HATch falsify the buggy implementation of
data/ri/Stack_LinkedList/cons
(in place of ${bench}
below):
timeout -v 1m _build/default/bin/main.exe symb-exec meta-config.json data/ri/${bench}_buggy.ml -empty-aware -exec-bound 20 -eff-append-bound 5
HATch falsifies the implementation by finding an execution path with
Phi
: the path condition collected along the execution pathTrace
: the trace of symbolic events that witness the execution pathR_cont
: a symbolic regular expression accepting traces that the execution can produce without violating the safety propertyexpr
: the term under execution
In this case, Phi
and Trace
together are satisfiable – the associated
state is reachable – and R_cont
is empty. Therefore, HATch
determines that the execution path is falsified.
The derivative-free variant of HATch
can also falsify the buggy
implementation, though it takes longer:
timeout -v 1m _build/default/bin/main.exe symb-exec meta-config.json data/ri/${bench}_buggy.ml -no-deriv -exec-bound 20
We regard prior work on type checking automata-augmented refinement type as the baseline verifier:
timeout -v 1m _build/default/bin/main.exe ri-type-check meta-config.json data/ri/${bench}_buggy.ml
Compared to HATch, the verifier also takes longer time to falsify the same buggy implementation:
See Section Experiments for the complete list of methods associated with each pair of ADT and its representation type.
Given the output stored in out
, we use the following scripts (in Emacs
Lisp) to extract relevant information.
(let ((last-line (car (last (split-string out "\n" t)))))
(cond
((string-match "timeout" last-line) (princ "T/O"))
((string-match "Out of memory" last-line) (princ "O/M"))
((string-match str last-line)
(let ((start (string-match "exec time" out))
(end (string-match "(s)" out)))
(princ (format "%.2f" (string-to-number (substring out (+ start 10) end))))))
(t (princ "N/A")))
(terpri))
We use the following script to compute the speedup of HATch.
(if (string-match-p "^[0-9]+\\.?[0-9]*$" x) (format "\\times%.1f" (/ (string-to-number x) (string-to-number y))) x)
The following table lists all methods whose buggy implementation are falsified by HATch (first column). The next three columns are also illustrated in Table 1 of the paper while the last two columns are omitted.
<l> | <r> | <r> | <r> | <r> | <r> |
ADT_ReprType/method | HATch | Naive | SpeedUp/ | Verifier | SpeedUp/ |
time(s) | time(s) | Naive | time(s) | Verifier | |
---|---|---|---|---|---|
Stack_LinkedList/cons | 0.29 | 0.91 | ×3.1 | 1.03 | ×3.6 |
Stack_LinkedList/concat_aux | 0.17 | 7.40 | ×43.5 | 2.08 | ×12.2 |
Stack_KVStore/cons | 0.76 | 10.87 | ×14.3 | 3.20 | ×4.2 |
Stack_KVStore/concat_aux | 0.58 | T/O | T/O | 3.85 | ×6.6 |
Queue_LinkedList/append | 0.49 | 1.21 | ×2.5 | 1.25 | ×2.6 |
Queue_Graph/append | 1.11 | 14.45 | ×13.0 | 8.15 | ×7.3 |
Set_KVStore/insert | 0.56 | 24.97 | ×44.6 | 0.79 | ×1.4 |
Set_Tree/insert_aux | 0.70 | 3.33 | ×4.8 | 7.75 | ×11.1 |
Heap_LinkedList/insert_aux | 0.07 | 0.90 | ×12.9 | 0.92 | ×13.1 |
Heap_Tree/insert_aux | 0.64 | 1.52 | ×2.4 | 1.58 | ×2.5 |
MinSet_Set/minset_singleton | 0.72 | 0.91 | ×1.3 | 0.92 | ×1.3 |
MinSet_Set/minset_insert | 0.82 | 8.09 | ×9.9 | 7.89 | ×9.6 |
MinSet_KVStore/minset_singleton | 0.42 | 7.53 | ×17.9 | 1.78 | ×4.2 |
MinSet_KVStore/minset_insert | 1.17 | 13.23 | ×11.3 | 18.22 | ×15.6 |
LazySet_Tree/insert_aux | 0.70 | 3.32 | ×4.7 | 7.82 | ×11.2 |
LazySet_Set/lazy_insert | 0.31 | 0.40 | ×1.3 | 0.42 | ×1.4 |
LazySet_KVStore/insert_aux | 0.56 | 25.37 | ×45.3 | 0.79 | ×1.4 |
DFA_KVStore/add_transition | 0.41 | 12.59 | ×30.7 | 12.59 | ×30.7 |
DFA_KVStore/del_transition | 0.66 | 9.95 | ×15.1 | 9.93 | ×15.0 |
DFA_Graph/add_transition | 0.64 | 8.00 | ×12.5 | 8.75 | ×13.7 |
DFA_Graph/del_transition | 1.19 | 11.63 | ×9.8 | 10.28 | ×8.6 |
ConnectedGraph_Set/singleton | 0.18 | 54.51 | ×302.8 | 2.82 | ×15.7 |
ConnectedGraph_Set/add_node | 0.83 | O/M | O/M | 8.31 | ×10.0 |
ConnectedGraph_Set/add_transition | 0.91 | O/M | O/M | 10.29 | ×11.3 |
ConnectedGraph_Graph/singleton | 0.72 | 1.28 | ×1.8 | 1.25 | ×1.7 |
ConnectedGraph_Graph/add_node | 1.25 | 7.65 | ×6.1 | 7.61 | ×6.1 |
ConnectedGraph_Graph/add_transition | 1.50 | 23.93 | ×16.0 | 31.09 | ×20.7 |
ColoredGraph_Graph/add_edge | 2.31 | T/O | T/O | T/O | T/O |
ColoredGraph_KVStore/add_edge | 4.88 | T/O | T/O | T/O | T/O |
LinkedList_KVStore/remove_aux | 4.98 | T/O | T/O | N/A | N/A |
The data in the table is produced using org-babel shipped with Emacs. With a recent installation of Emacs (for scripting) and pandoc (for conversion to markdown), one may rerun all experiments (takes about 20 minutes on AMD Ryzen 7 PRO 6850U) and regenerate this file, including the above table, using the following shell commands.
emacs --batch -l ob -l ob-shell --eval "
(let ((org-confirm-babel-evaluate nil))
(dolist (file command-line-args-left)
(with-current-buffer (find-file-noselect file)
(org-table-recalculate-buffer-tables)
(save-buffer))))
" README.org
pandoc -s README.org --to=gfm -o README.md