Getting Started

This is the artifact of Derivative-Guided Symbolic Execution under submission to POPL 2025. The derivative-guided symbolic execution engine HATch, along with its derivative-free variant, are provided. Please follow the instructions below for installation and reproduction of experiment results.

Getting Started

We recommend the use of machines with at least 8 GB memory – HATch raises errors when memory usage exceeds 8GB. All experiments were performed on AMD Ryzen 7 PRO 6850U with 32 GB RAM. The artifact is provided as a Docker image. Before proceeding, try docker run hello-world to ensure that Docker works as intended.

You may either load the pre-built docker image:

docker load < ../hatch.tar.gz

or build the docker image on your own:

docker build . --tag hatch

Then start a shell environment in the docker image:

docker run -i --rm -t hatch

Lastly, compile HATch and find the executable at `_build/default/bin/main.exe`:

dune build --profile release

Falsify Individual ADT implementations

Directory data/ri stores various benchmarks in its subfolders, each corresponding to an abstract data type (ADT) and its underlying stateful representation type. Consider the implementation of Stack using LinkedList, the sub-directory data/ri/Stack_LinkedList contains:

the auxiliary definitions of common symbolic regular expressions in automata_preds.ml;
the specification of LinkedList's methods in lib_rty.ml;
the safety property represented by a symbolic regular expression in ri.ml, and;
the definition of Stack's methods, e.g., cons.ml and concat_aux.ml, and their respective buggy variant (_buggy postfix), with accompanying specification that manifest the safety property.

Now we have HATch falsify the buggy implementation of data/ri/Stack_LinkedList/cons (in place of ${bench} below):

timeout -v 1m _build/default/bin/main.exe symb-exec meta-config.json data/ri/${bench}_buggy.ml -empty-aware -exec-bound 20 -eff-append-bound 5

HATch falsifies the implementation by finding an execution path with

Phi : the path condition collected along the execution path
Trace : the trace of symbolic events that witness the execution path
R_cont : a symbolic regular expression accepting traces that the execution can produce without violating the safety property
expr : the term under execution

Preemptive Hatch
Phi    = x:{v:Elem.t | true}, s:{v:Cell.t | true}, x_30_2:{v:bool | ¬v}, _x_5:{v:Cell.t | true}, c_2:{v:Cell.t | v == _x_5}, x_31_2:{v:bool | v <=> c_2 == s}, x_32_2:{v:bool | (v <=> ¬x_31_2) ∧ v}, x_30_11:{v:bool | ¬v}, _x_14:{v:Cell.t | true}, c_11:{v:Cell.t | v == _x_14}, x_31_11:{v:bool | v <=> c_11 == s}, x_32_11:{v:bool | (v <=> ¬x_31_11) ∧ ¬v}
Trace  = ⟨hasPrev x_0 = vret | ¬vret⟩; ⟨newCell x_0 = vret | vret == _x_5 ∧ ¬vret == c_11⟩; ⟨hasPrev x_0 = vret | ¬vret⟩; ⟨newCell x_0 = vret | vret == _x_14 ∧ vret == c_11⟩; ⟨putCellContent x_0 x_1 = vret | x_0 == c_11 ∧ x_1 == x⟩; ⟨setNext x_0 x_1 = vret | x_0 == c_11 ∧ x_1 == s ∧ x_0 == x_1⟩
R_cont = ∅
expr   =
(let (f_1 : Cell.t) =
   (let (c'_2 : Cell.t) = (c_11 : Cell.t) in (c'_2 : Cell.t) : Cell.t) in
 (f_1 : Cell.t) : Cell.t)
DT(Stack)  Task 1(cons): exec time 0.519420(s), check failed

In this case, Phi and Trace together are satisfiable – the associated state is reachable – and R_cont is empty. Therefore, HATch determines that the execution path is falsified.

The derivative-free variant of HATch can also falsify the buggy implementation, though it takes longer:

timeout -v 1m _build/default/bin/main.exe symb-exec meta-config.json data/ri/${bench}_buggy.ml -no-deriv -exec-bound 20

Terminated Hatch
Phi    = x:{v:Elem.t | true}, s:{v:Cell.t | true}, x_30_2:{v:bool | ¬v}, _x_5:{v:Cell.t | true}, c_2:{v:Cell.t | v == _x_5}, x_31_2:{v:bool | v <=> c_2 == s}, x_32_2:{v:bool | (v <=> ¬x_31_2) ∧ ¬v}
R_curr = ((((((((.\⟨setNext x_0 x_1 = vret | x_0 == x_1⟩)* ∧ (.*;(⟨setNext x_0 x_1 = vret | x_1 == s⟩;.*))ᶜ);⟨hasPrev x_0 = vret | ¬vret⟩) ∧ .*);⟨newCell x_0 = vret | vret == _x_5⟩) ∧ (.*;(⟨newCell x_0 = vret | vret == c_2⟩;.*)));⟨putCellContent x_0 x_1 = vret | x_0 == c_2 ∧ x_1 == x⟩) ∧ .*);⟨setNext x_0 x_1 = vret | x_0 == c_2 ∧ x_1 == s⟩
expr   =
(c_2 : Cell.t)
DT(Stack)  Task 1(cons): exec time 1.592408(s), check failed

We regard prior work on type checking automata-augmented refinement type as the baseline verifier:

timeout -v 1m _build/default/bin/main.exe ri-type-check meta-config.json data/ri/${bench}_buggy.ml

Compared to HATch, the verifier also takes longer time to falsify the same buggy implementation:

DT(Stack)  Task 1(cons): exec time 1.556991(s), check failed

See Section Experiments for the complete list of methods associated with each pair of ADT and its representation type.

Experiments

The following table lists all methods whose buggy implementation are falsified by HATch (first column). The next three columns are also illustrated in Table 1 of the paper while the last two columns are omitted.


ADT_ReprType/method	HATch	Naive	SpeedUp/	Verifier	SpeedUp/
	time(s)	time(s)	Naive	time(s)	Verifier
Stack_LinkedList/cons	0.29	0.91	× 3.1	1.03	× 3.6
Stack_LinkedList/concat_aux	0.17	7.40	× 43.5	2.08	× 12.2
Stack_KVStore/cons	0.76	10.87	× 14.3	3.20	× 4.2
Stack_KVStore/concat_aux	0.58	T/O	T/O	3.85	× 6.6
Queue_LinkedList/append	0.49	1.21	× 2.5	1.25	× 2.6
Queue_Graph/append	1.11	14.45	× 13.0	8.15	× 7.3
Set_KVStore/insert	0.56	24.97	× 44.6	0.79	× 1.4
Set_Tree/insert_aux	0.70	3.33	× 4.8	7.75	× 11.1
Heap_LinkedList/insert_aux	0.07	0.90	× 12.9	0.92	× 13.1
Heap_Tree/insert_aux	0.64	1.52	× 2.4	1.58	× 2.5
MinSet_Set/minset_singleton	0.72	0.91	× 1.3	0.92	× 1.3
MinSet_Set/minset_insert	0.82	8.09	× 9.9	7.89	× 9.6
MinSet_KVStore/minset_singleton	0.42	7.53	× 17.9	1.78	× 4.2
MinSet_KVStore/minset_insert	1.17	13.23	× 11.3	18.22	× 15.6
LazySet_Tree/insert_aux	0.70	3.32	× 4.7	7.82	× 11.2
LazySet_Set/lazy_insert	0.31	0.40	× 1.3	0.42	× 1.4
LazySet_KVStore/insert_aux	0.56	25.37	× 45.3	0.79	× 1.4
DFA_KVStore/add_transition	0.41	12.59	× 30.7	12.59	× 30.7
DFA_KVStore/del_transition	0.66	9.95	× 15.1	9.93	× 15.0
DFA_Graph/add_transition	0.64	8.00	× 12.5	8.75	× 13.7
DFA_Graph/del_transition	1.19	11.63	× 9.8	10.28	× 8.6
ConnectedGraph_Set/singleton	0.18	54.51	× 302.8	2.82	× 15.7
ConnectedGraph_Set/add_node	0.83	O/M	O/M	8.31	× 10.0
ConnectedGraph_Set/add_transition	0.91	O/M	O/M	10.29	× 11.3
ConnectedGraph_Graph/singleton	0.72	1.28	× 1.8	1.25	× 1.7
ConnectedGraph_Graph/add_node	1.25	7.65	× 6.1	7.61	× 6.1
ConnectedGraph_Graph/add_transition	1.50	23.93	× 16.0	31.09	× 20.7
ColoredGraph_Graph/add_edge	2.31	T/O	T/O	T/O	T/O
ColoredGraph_KVStore/add_edge	4.88	T/O	T/O	T/O	T/O
LinkedList_KVStore/remove_aux	4.98	T/O	T/O	N/A	N/A

The data in the table is produced using org-babel shipped with Emacs. With a recent installation of Emacs (for scripting) and pandoc (for conversion to markdown), one may rerun all experiments (takes about 20 minutes on AMD Ryzen 7 PRO 6850U) and regenerate this file, including the above table, using the following shell commands.

emacs --batch -l ob -l ob-shell --eval "
  (let ((org-confirm-babel-evaluate nil))
    (dolist (file command-line-args-left)
  (with-current-buffer (find-file-noselect file)
    (org-table-recalculate-buffer-tables)
    (save-buffer))))
" README.org
pandoc -s README.org --to=gfm -o README.md

Name		Name	Last commit message	Last commit date
Latest commit History 354 Commits
bin		bin
coersion		coersion
data		data
desymbolic		desymbolic
dtree		dtree
env		env
example		example
formalization		formalization
frontend		frontend
language		language
normalization		normalization
ntypecheck		ntypecheck
scripts		scripts
smtquery		smtquery
srlexec		srlexec
subtyping		subtyping
syntax		syntax
typecheck		typecheck
.envrc		.envrc
.gitignore		.gitignore
README.md		README.md
README.org		README.org
coreeff.opam		coreeff.opam
dune		dune
dune-project		dune-project
meta-config.json		meta-config.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Getting Started

Falsify Individual ADT implementations

Experiments

About

Releases

Packages

Contributors 2

Languages

yongweiy/HATch

Folders and files

Latest commit

History

Repository files navigation

Getting Started

Falsify Individual ADT implementations

Experiments

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages