Stateful SMT Memory Model #1021
Sorry for the delay @mrphrazer
As @serpilliere remarks, we haven't given you feedback so far. Here are my current remarks / questions:
Code remarks: thanks for commenting it out, it is so much easier to review :)
    return ExprAssign(mem, op)

def zero_padding(v, arch_size):
It sounds like Expr.zeroExtend(arch_size)
You might be interested in the Core Theory from the Binary Analysis Platform.
Hi!
This PR introduces a stateful SMT memory model.
What does this mean?
Similar to SSA, we can make memory reads/writes stateful and use it for SMT-based reasoning.
For this, we have a memory variable M as well as two operations:
mem_read(memory_state, address, size_to_read)
mem_write(memory_state, address, value, size_to_write)
Common memory access patterns can be rewritten as follows:
Afterwards, SSA can be applied:
Afterwards, we add every IR instruction to the SMT solver instead of only jump conditions. This allows the SMT solver to analyze and reason about the whole program space (often in a much faster manner, see above).
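The rewrite and the SSA step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the tuple-based mini-IR, the function `rewrite_stateful`, and the `M0`, `M1`, ... naming are hypothetical and are not the code from this PR; only the `mem_read` / `mem_write` operation names come from the description.

```python
# Hypothetical mini-IR (not miasm's IR):
#   ("load",  dst,  addr, size)   reads `size` from memory at `addr` into `dst`
#   ("store", addr, value, size)  writes `value` of `size` to memory at `addr`

def rewrite_stateful(instrs):
    """Rewrite loads/stores into mem_read/mem_write over SSA-versioned memory."""
    version = 0  # index of the current memory state M<version>
    out = []
    for ins in instrs:
        if ins[0] == "load":
            _, dst, addr, size = ins
            # A read uses the current memory state and does not create a new one.
            out.append(f"{dst} = mem_read(M{version}, {addr}, {size})")
        elif ins[0] == "store":
            _, addr, value, size = ins
            # A write consumes the current state and defines the next one.
            out.append(
                f"M{version + 1} = mem_write(M{version}, {addr}, {value}, {size})"
            )
            version += 1
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return out


# The size value 64 is illustrative; its unit is an assumption.
program = [
    ("store", "rax", "0x33", 64),
    ("store", "rbx", "0x44", 64),
    ("load", "rcx", "rax", 64),
]
lines = rewrite_stateful(program)
for line in lines:
    print(line)
```

Each write yields a fresh memory variable, so every later read names exactly the state it depends on, which is what lets each statement be handed to a solver as a separate constraint.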
Why is this useful?
The current symbolic execution/z3 memory model does not contain any form of state and is prone to aliasing. Basically, this PR enables us to constrain memory values for any arbitrary state. This can be useful in different contexts. I will discuss a few of them:
Memory Aliasing
Given the following code and the initial symbolic map:
{rbx: rbx, rax: rax}
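As a hypothetical illustration of the aliasing question (not the listing from this PR), assume a write of 0x33 to `[rax]`, a write of 0x44 to `[rbx]`, and then a read of `[rax]` into `rcx`. The sketch below models memory states as immutable snapshots and enumerates a tiny address space in place of an SMT solver; the size arguments of `mem_read` / `mem_write` are dropped for brevity, and each address is treated as one whole cell.

```python
from itertools import product

def mem_write(memory_state, address, value):
    """Return a NEW memory state; the old one stays untouched (stateful model)."""
    new_state = dict(memory_state)
    new_state[address] = value
    return new_state

def mem_read(memory_state, address):
    """Read one cell; unwritten cells default to 0 in this toy model."""
    return memory_state.get(address, 0)

def run(rax, rbx):
    """Hypothetical sequence: [rax] = 0x33; [rbx] = 0x44; rcx = [rax]."""
    m0 = {}
    m1 = mem_write(m0, rax, 0x33)
    m2 = mem_write(m1, rbx, 0x44)
    return mem_read(m2, rax)

# Brute-force stand-in for the solver: group address pairs by the rcx they yield.
results = {}
for rax, rbx in product(range(4), repeat=2):
    results.setdefault(run(rax, rbx), []).append((rax, rbx))

for rcx, pairs in sorted(results.items()):
    print(hex(rcx), pairs)
```

In this toy model, `rcx` is 0x33 exactly when `rax != rbx` and 0x44 exactly when the two registers alias, so both outcomes remain reachable once the memory state is tracked explicitly.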
Then, rcx will always contain 0x33, since the SE and the SMT solver are not aware of potential aliasing issues. However, in the stateful memory model, the SMT solver is able to find a solution for both cases and decides on its own which is more likely the case.
Symbolic Execution
Large pieces of code can slow down symbolic execution enormously. Take for instance the following example. For this, symbolic execution takes ~90 seconds and the SMT solving itself ~1 second. In the SMT-based memory model, by contrast, the whole process takes about 1 second. What is happening? We avoid symbolic execution. Instead, the SMT solver takes the whole basic block slice as input (instead of only the jump condition) and decides on its own which parts are required to find a solution and which parts can be omitted. As a result, a satisfying model amounts to a slice over all register and memory states that are responsible for the solver's decision.
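The idea that a model covers every intermediate register and memory state, not just the jump condition, can be illustrated with a toy. The sketch below is an assumption-laden stand-in: the three-statement slice, the names `x`, `y0`, `z0`, `M0`, `M1`, and the 8-bit word size are all hypothetical, and a brute-force loop replaces the SMT solver.

```python
MASK = 0xFF  # toy 8-bit words (assumption for this sketch)

def mem_write(state, address, value):
    """Return a new memory state with one cell updated."""
    new_state = dict(state)
    new_state[address] = value & MASK
    return new_state

def mem_read(state, address):
    """Read one cell; unwritten cells default to 0."""
    return state.get(address, 0)

def evaluate_slice(x):
    """Evaluate every SSA definition of the toy slice for a concrete input x."""
    env = {"x": x, "M0": {}}
    env["M1"] = mem_write(env["M0"], 0x10, env["x"])  # M1 = mem_write(M0, 0x10, x)
    env["y0"] = mem_read(env["M1"], 0x10)             # y0 = mem_read(M1, 0x10)
    env["z0"] = (env["y0"] + 1) & MASK                # z0 = y0 + 1
    return env

def find_model(target):
    """Brute-force stand-in for the solver: find x so that z0 == target."""
    for x in range(MASK + 1):
        env = evaluate_slice(x)
        if env["z0"] == target:
            return env  # the model assigns every variable, memory states included
    return None

model = find_model(0x42)
print(model)
```

The returned model fixes the input `x` together with `M1`, `y0`, and `z0`, which mirrors how a satisfying assignment over the whole slice describes all states that lead to the condition.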
Other
Other use cases include exploit development (given a path to an exploitable vulnerability, can we craft an input such that our final memory contains the following shell code <...>), checks for integer overflows, ...
Open Problems
As stated in #1004, the question remains how the memory rewrite pass can be implemented in a clean manner. The current solution patches the simplify function in AssignBlock and IRBlock and shouldn't stay as it is.