This documentation is still undergoing construction, if you find any issues or have any suggestions, please feel free to open an issue or reach out to me.
SWAT (Symbolic Web Application Testing)
SWAT is a loosely coupled dynamic symbolic execution engine for Java applications. The engine is based on CATG but its design and implementation has grown beyond the initial functionality. Unlike many existing engines, SWAT does not actively drive symbolic execution. Using instrumentation during load-time, a _Javaagent_ adds symbolic observation capabilities to the _System under Test_ (SuT). After instrumentation, during normal execution a symbolic shadow-state is maintained that records the executed path and all symbolic constraints and modifications encountered during execution.
To achieve this, the system is split into two major components:
- Symbolic Executor: This component is responsible for the symbolic constraint generation inside the JVM running the SuT.
- Symbolic Explorer: This component is responsible for recording visited paths and symbolic constraints in a dynamic execution tree. In addition, the explorer finds unexplored branches and interacts with an SMT Solver to generate concrete values that satisfy the constraints.
The conceptual idea of both components and their interaction with the SuT is explained below alongside a high-level example.
System under Test
The SuT can be a simple Java application or a complex web application. The system needs to be executable (no missing dependencies or missing components such as databases). As the SuT is natively executed the symbolic execution still works when the SuT (needs to) communicate with external components. However, if symbolic values leave the JVM, the value can no longer be symbolically tracked.
As the SuT is instrumented during load-time SWAT does not need source-code access! It directly instruments the .class
files. Consequently the instrumentation happens at the JVM Bytecode level. While the intricacies concerning what exactly is instrumented and what is symbolically tracked is explained later in further detail, here lets assume that the following instructions would be instrumented and symbolically tracked:
Given the small bytecode snippet, a classical branching point can be seen in line 18
, where a value is checked against zero (IFNE
). This causes one trace (pink) to continue and end at the return statement (IRETURN
), while the other trace (green) continues to the division (IDIV
) and so on. While this is a simplified use-case with some missing details it should give a good idea of what is happening at the bytecode level.
Symbolic Executor
Given the SuT the symbolic executor maintains a shadow state of the actual execution to manage symbolic handling. This is done by inserting additional (non-interfering) instructions into the bytecode. This is explained in more detail in the page on Instrumentation.
Each execution of the SuT leads to exactly one trace. The two traces resulting from the above bytecode snippet are shown in the figure below.
For each trace SWAT records a unique instruction identifier (IID
), the symbolic branching condition and the branching decision (Branched
). These traces are recorded and sent to the Symbolic Explorer after the execution left the symbolic scope.
Symbolic Explorer
The symbolic explorer is written in Python and spawns a REST API
server using FastAPI
. The server exposes endpoints for receiving (trace) information from the Symbolic Executor and for interacting with the explorer. More information on these endpoints can be found in the API Documentation. When traces are received, they are inserted into an execution tree that is used to explore the symbolic execution space. Using a depth-first search, the explorer traverses the tree and identifies all branching points that contain symbolic conditions and where one child is still unexplored (missing). By collection all constraints from the root to the current node, the explorer can generate an SMT problem that is solved using the an SMT solver (currently Z3) but other solvers are generally also supported as SWAT utilizes the SMT-lib v2
format.
SWAT is developed by the Institute for IT Security at the University of Lübeck.