This is the artifact of Derivative-Guided Symbolic Execution under submission to POPL 2025. The derivative-guided symbolic execution engine HATch, along with its derivative-free variant, are provided. Please follow the instructions below for installation and reproduction of experiment results.

Getting Started

We recommend the use of machines with at least 8 GB memory – HATch raises errors when memory usage exceeds 8GB. All experiments were performed on AMD Ryzen 7 PRO 6850U with 32 GB RAM. The artifact is provided as a Docker image. Before proceeding, try docker run hello-world to ensure that Docker works as intended.

You may either load the pre-built docker image:

docker load < ../hatch.tar.gz

or build the docker image on your own:

docker build . --tag hatch

Then start a shell environment in the docker image:

docker run -i --rm -t hatch

Lastly, compile HATch and find the executable at `_build/default/bin/main.exe`:

dune build --profile release

Falsify Individual ADT implementations

Directory data/ri stores various benchmarks in its subfolders, each corresponding to an abstract data type (ADT) and its underlying stateful representation type. Consider the implementation of Stack using LinkedList, the sub-directory data/ri/Stack_LinkedList contains:

the auxiliary definitions of common symbolic regular expressions in automata_preds.ml;
the specification of LinkedList’s methods in lib_rty.ml;
the safety property represented by a symbolic regular expression in ri.ml, and;
the definition of Stack’s methods, e.g., cons.ml and concat_aux.ml, and their respective buggy variant (_buggy postfix), with accompanying specification that manifest the safety property.

Now we have HATch falsify the buggy implementation of data/ri/Stack_LinkedList/cons (in place of ${bench} below):

timeout -v 1m _build/default/bin/main.exe symb-exec meta-config.json data/ri/${bench}_buggy.ml -empty-aware -exec-bound 20 -eff-append-bound 5

HATch falsifies the implementation by finding an execution path with

Phi : the path condition collected along the execution path
Trace : the trace of symbolic events that witness the execution path
R_cont : a symbolic regular expression accepting traces that the execution can produce without violating the safety property
expr : the term under execution

In this case, Phi and Trace together are satisfiable – the associated state is reachable – and R_cont is empty. Therefore, HATch determines that the execution path is falsified.

The derivative-free variant of HATch can also falsify the buggy implementation, though it takes longer:

timeout -v 1m _build/default/bin/main.exe symb-exec meta-config.json data/ri/${bench}_buggy.ml -no-deriv -exec-bound 20

We regard prior work on type checking automata-augmented refinement type as the baseline verifier:

timeout -v 1m _build/default/bin/main.exe ri-type-check meta-config.json data/ri/${bench}_buggy.ml

Compared to HATch, the verifier also takes longer time to falsify the same buggy implementation:

See Section Experiments for the complete list of methods associated with each pair of ADT and its representation type.

Post-Processing Scripts

Given the output stored in out, we use the following scripts (in Emacs Lisp) to extract relevant information.

(let ((last-line (car (last (split-string out "\n" t)))))
  (cond
   ((string-match "timeout" last-line) (princ "T/O"))
   ((string-match "Out of memory" last-line) (princ "O/M"))
   ((string-match str last-line)
    (let ((start (string-match "exec time" out))
          (end (string-match "(s)" out)))
      (princ (format "%.2f" (string-to-number (substring out (+ start 10) end))))))
   (t (princ "N/A")))
  (terpri))

We use the following script to compute the speedup of HATch.

(if (string-match-p "^[0-9]+\\.?[0-9]*$" x) (format "\\times%.1f" (/ (string-to-number x) (string-to-number y))) x)

Experiments

The following table lists all methods whose buggy implementation are falsified by HATch (first column). The next three columns are also illustrated in Table 1 of the paper while the last two columns are omitted.

	time(s)	time(s)	Naive	time(s)	Verifier
<l>	<r>	<r>	<r>	<r>	<r>
ADT_ReprType/method	HATch	Naive	SpeedUp/	Verifier	SpeedUp/
Stack_LinkedList/cons	0.29	0.91	×3.1	1.03	×3.6
Stack_LinkedList/concat_aux	0.17	7.40	×43.5	2.08	×12.2
Stack_KVStore/cons	0.76	10.87	×14.3	3.20	×4.2
Stack_KVStore/concat_aux	0.58	T/O	T/O	3.85	×6.6
Queue_LinkedList/append	0.49	1.21	×2.5	1.25	×2.6
Queue_Graph/append	1.11	14.45	×13.0	8.15	×7.3
Set_KVStore/insert	0.56	24.97	×44.6	0.79	×1.4
Set_Tree/insert_aux	0.70	3.33	×4.8	7.75	×11.1
Heap_LinkedList/insert_aux	0.07	0.90	×12.9	0.92	×13.1
Heap_Tree/insert_aux	0.64	1.52	×2.4	1.58	×2.5
MinSet_Set/minset_singleton	0.72	0.91	×1.3	0.92	×1.3
MinSet_Set/minset_insert	0.82	8.09	×9.9	7.89	×9.6
MinSet_KVStore/minset_singleton	0.42	7.53	×17.9	1.78	×4.2
MinSet_KVStore/minset_insert	1.17	13.23	×11.3	18.22	×15.6
LazySet_Tree/insert_aux	0.70	3.32	×4.7	7.82	×11.2
LazySet_Set/lazy_insert	0.31	0.40	×1.3	0.42	×1.4
LazySet_KVStore/insert_aux	0.56	25.37	×45.3	0.79	×1.4
DFA_KVStore/add_transition	0.41	12.59	×30.7	12.59	×30.7
DFA_KVStore/del_transition	0.66	9.95	×15.1	9.93	×15.0
DFA_Graph/add_transition	0.64	8.00	×12.5	8.75	×13.7
DFA_Graph/del_transition	1.19	11.63	×9.8	10.28	×8.6
ConnectedGraph_Set/singleton	0.18	54.51	×302.8	2.82	×15.7
ConnectedGraph_Set/add_node	0.83	O/M	O/M	8.31	×10.0
ConnectedGraph_Set/add_transition	0.91	O/M	O/M	10.29	×11.3
ConnectedGraph_Graph/singleton	0.72	1.28	×1.8	1.25	×1.7
ConnectedGraph_Graph/add_node	1.25	7.65	×6.1	7.61	×6.1
ConnectedGraph_Graph/add_transition	1.50	23.93	×16.0	31.09	×20.7
ColoredGraph_Graph/add_edge	2.31	T/O	T/O	T/O	T/O
ColoredGraph_KVStore/add_edge	4.88	T/O	T/O	T/O	T/O
LinkedList_KVStore/remove_aux	4.98	T/O	T/O	N/A	N/A

The data in the table is produced using org-babel shipped with Emacs. With a recent installation of Emacs (for scripting) and pandoc (for conversion to markdown), one may rerun all experiments (takes about 20 minutes on AMD Ryzen 7 PRO 6850U) and regenerate this file, including the above table, using the following shell commands.

 emacs --batch -l ob -l ob-shell --eval "
   (let ((org-confirm-babel-evaluate nil))
     (dolist (file command-line-args-left)
	(with-current-buffer (find-file-noselect file)
	  (org-table-recalculate-buffer-tables)
	  (save-buffer))))
 " README.org
 pandoc -s README.org --to=gfm -o README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.org

README.org

Getting Started

Falsify Individual ADT implementations

Post-Processing Scripts

Experiments

Files

README.org

Latest commit

History

README.org

File metadata and controls

Getting Started

Falsify Individual ADT implementations

Post-Processing Scripts

Experiments