OTAWA
2.0
Framework to perform machine analysis and compute WCET.
|
OTAWA provides an abstraction layer between the actual binary program and static analyses. This abstraction is composed of generic information about the processed instruction: type of instruction, target for branch instruction, read and writtent registers, and so on. More...
Classes | |
class | otawa::sem::inst |
class | otawa::sem::PathIter |
class | otawa::sem::Printer |
Functions | |
cond_t | otawa::sem::invert (cond_t cond) |
Invert the given condition. More... | |
bool | otawa::sem::isSigned (cond_t cond) |
Test if a condition concerns signed numbers or not. More... | |
bool | otawa::sem::isUnsigned (cond_t cond) |
Test if a condition concerns unsigned numbers or not. More... | |
int | otawa::sem::size (type_t type) |
Get the size of the given type. More... | |
OTAWA provides an abstraction layer between the actual binary program and static analyses. This abstraction is composed of generic information about the processed instruction: type of instruction, target for branch instruction, read and writtent registers, and so on.
To perform value or interval analysis, we need to understand the processing of instructions. To maintain this abstraction layer, OTAWA provides with some architecture plugins a way to translate instruction into a list of architecture-independent instructions to let value analyses to be independent of the actual architecture.
Designing a language to describe any instruction set is may an impossible task. In the opposite, our semantics language has for goal to allow to perform analyses while retaining only important things for such a task: basic arithmetics and comparison for integer and address and integer computation, minimal flow of control to allow fast analysis.
Basically, a machine instruction I is translated into a block of n semantic instructions [i0, i1, i2, ..., in-1]. To interpret a semantic block, one has to consider a semantic instruction counter, p. At beginning of the interpretation, p = 0. According to the interpreted instructions, p is simply incremented (+1) to perform sequential execution, or changed by adding a positive quantity n to avoid executing some instructions. Notice that n >= 0 induces that no loop can appear in the interpretation of semantic block and hence no fix point is required. The execution stops as soon as p >= n.
A interpretation of a semantic i, |[i]|, can be viewed as a function taking as input (p, s) with s the state of the program and producing a new (p', s'):
|[i]|: (p, s) -> (p', s')
The state s is a an instance of possible states, that is, of functions mapping registers R, temporaries T and memory addresses A to a value in domain V:
S: R U T U A -> V
Set of registers depends on the underlying architecture but is mapped to IN. Temporaries are a subset of negative integers but A currently represents address on 32-bits. For the remaining of the document, s[i] represents the value of V in the state s that matches i, that may be a register, a temporary or an address. s[x / y] represents a new state where value of y is x, that is, s' = s[x / y] = lambda i. if i = y then x else s[i].
The algorithm to interpret a semantic block follows (s0 is the initial state):
The following sections describes the semantics of each instruction.
Basically, our semantic instructions have three operands and works with registers. The first operand d is the target registers to store the result and next ones represents source operands (a and b). The following arithmetics operations exists:
The following instructions represents unary operations applied on the a register and stores the result on the d register.
Below is given the semantics of each instruction where:
There are four set instructions:
set
, seti
, scratch
, setp
(very rarely used).Their semantics is described below (i is a 32-bits value):
In semantic instructions, there are two types of control flow: machine flow corresponds to PC assignment in the underlying instruction set and is supported by CFG; semantic flow that controls the execution flow of semantic instruction. The only instruction concerning the machine control flow is branch
that gives the new address stored in the PC but it doesn't change the semantic instruction counter p.
The semantic control flow is handled by three instructions, cont
that stops the execution, fork
that creates two executions paths and assume
that assumes a condition true. The condition used in assume
is generated from a comparison result, a ~ b (a compared to b), generated by either a cmp
instruction, or @cmpu instruction. The semantics of the condition are:
Below is the semantics of the flow instructions (c is a condition and c[a] returns true if condition c is satisfied in register a):
In OTAWA 2, if
is considered deprecated as it the composition of fork
and assume
. However, some instruction sets may keep to use it so its semantics is given below:
There are only two instructions used to perform memory access:
load
allows to get a value from memory using its address and its type, store
allows to store a register value to a memory address using the given type.Notice that the types are required to realize the bytes modified in memory and to generate correct value in register (including sign extension). The types may be one of:
The semantics of these instructions follows (t is the type, s(t) represents access to state with type t):
There is also a @spec semantics instruction kind that must be used by instruction effects not tractable with the current semantics instruction set. One using the SPEC instructions must be aware that standard usual analyses will not cope with such instructions: they will need to be customized. Therefore, the semantics of spec
is undefined.
Operators used in the instruction (a, b and d) represents unique representation of registers (as returned by Register::platformNumber()) for positive number or temporaries when negative numbers are used.
Temporary values are useful when the semantics expression of an instruction is complex and requires several temporary results. To alleviate the management of temporaries, they are easy identified as they are represented as negative numbers and their maximum number is provided by the Process::tempMax(). Please, notice that the liveness of a temporary must not expand out of the semantics block of an instruction !
Static analyses using instruction semantics are usually called data-flow analysis. Using Abstact Interpretation, the interpretation domain is usually an abstraction of the program variables state. The state includes usually the value of the registers (simply identified by the register platform number) and the addresses of used locations in memory. Register numbers and memory location addresses grouped together forms the address set.
So, the state becomes usually maps from addresses to value. OTAWA provides already several representations for these maps.The next step is to define the abstraction of the values: values are stored in registers and in memory and must be specialized according to the performed analysis. For example, the CLP (Cycle Linear Progression) analysis, the values are represented as triplets (b, d, n) representing a set of integers (and adresses) from the set {b + i d / 0 <= i < n}. But, it may be any value you want, adapted to your analysis. Whatever, one must remark that addresses are usually storable in registers and in memory, the value set must provides a way to represent them.
The usual map representations provide already functions to perform abstract interpretation (bottom value, initial value, update, join, etc). In the case of the value, you have also to provide function to perform abstract interpretation but also functions to interpret the different semantic instructions.
To help the developer supports the multiple execution path of a semantic bloc, one can use the class otawa::sem::PathIter that works like a usual iterator but provides also indications of the executed paths and instructions:
if
, meaning that two paths will start from this point.A common way to use otawa::sem::PathIter is to maintain a stack of states where the current state is pushed when an if
is found and a state is popped when an execution path ends. The different states obtained at each end of an execution path can be joined:
A specific processing is devoted to SPEC instructions. A convenient analysis must let its user to specialize it in order to support these instructions. To achieve this goal, it must provide in the analysis a virtual function that is called each time the SPEC instruction is interpreted. It would be useful if this function takes as parameter the real instruction, the semantics instruction and the current abstract state. In the initial analysis, this function simply do nothing but it lets customizer to overload it in order to customize the interpretation.
Invert the given condition.
cond | Condition to invert. |
References otawa::sem::ANY_COND, otawa::se::EQ, otawa::se::GE, otawa::se::GT, otawa::se::LE, otawa::se::LT, otawa::sem::MAX_COND, otawa::se::NE, otawa::sem::NO_COND, otawa::se::UGE, otawa::se::UGT, otawa::se::ULE, and otawa::se::ULT.
Referenced by otawa::sem::Block::fork(), and otawa::se::FilterBuilder::prepareSemBlockPaths().
Test if a condition concerns signed numbers or not.
EQ and NE are considered to apply equally on signed and unsigned numbers.
cond | Type to test. |
References otawa::se::EQ, and otawa::se::NE.
Test if a condition concerns unsigned numbers or not.
EQ and NE are considered to apply equally on signed and unsigned numbers.
cond | Type to test. |
References otawa::sem::ANY_COND, and otawa::se::EQ.
int otawa::sem::size | ( | type_t | type | ) |
Get the size of the given type.
type | Type to get size for. |
References otawa::type().