Commit 75355cb

Merge pull request #58 from ftsrg/bitvectors
Basic bitvector support
2 parents ccd1970 + ff73098 commit 75355cb

65 files changed

+4833
-152
lines changed

build.gradle.kts

Lines changed: 1 addition & 1 deletion
@@ -10,7 +10,7 @@ buildscript {
 
 allprojects {
     group = "hu.bme.mit.inf.theta"
-    version = "1.3.0"
+    version = "1.4.0"
 
     apply(from = rootDir.resolve("gradle/shared-with-buildSrc/mirrors.gradle.kts"))
 }

subprojects/cfa/README.md

Lines changed: 3 additions & 0 deletions
@@ -39,16 +39,19 @@ Variables of the CFA can have the following types.
 - `int`: Mathematical, unbounded SMT integers.
 - `rat`: Rational numbers (implemented as SMT reals).
 - `[K] -> V`: SMT arrays (associative maps) from a given key type `K` to a value type `V`.
+- `bv[L]`, `bitvec[L]`, `ubv[L]`, `ubitvec[L]`, `sbv[L]`, `sbitvec[L]`: Signed or unsigned bitvector of given length `L`. _This is an experimental feature with currently limited algorithmic support. See the [details](doc/bitvectors.md) for more information._
 
 Expressions of the CFA include the following.
 - Identifiers (variables).
 - Literals, e.g., `true`, `false` (Bool), `0`, `123` (integer), `4 % 5` (rational).
 - Array literals can be given by listing the key-value pairs and the (mandatory) default element, e.g., `[0 <- 182, 1 <- 41, default <- 75]`. If there are no elements, the key type has to be given before the default element, e.g., `[<int>default <- 75]`.
+- Bitvector literals can be given by stating the length, information about the signedness, and the exact value of the bitvector in binary, decimal or hexadecimal form. (E.g. `4'd5` is a 4-bit-long unsigned bitvector with the decimal value 5.) _This is an experimental feature with currently limited algorithmic support. See the [details](doc/bitvectors.md) for more information._
 - Comparison, e.g., `=`, `/=`, `<`, `>`, `<=`, `>=`.
 - Boolean operators, e.g., `and`, `or`, `xor`, `not`, `imply`, `iff`.
 - Arithmetic, e.g., `+`, `-`, `/`, `*`, `mod`, `rem`.
 - Conditional: `if . then . else .`
 - Array read (`a[i]`) and write (`a[i <- v]`).
+- Bitvector specific operators, e.g., `&`, `|`, `^`, `<<`, `>>`, `~`. _This is an experimental feature with currently limited algorithmic support. See the [details](doc/bitvectors.md) for more information._
 
 ### Textual representation (DSL)
 

subprojects/cfa/doc/bitvectors.md

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
+# Bitvector support in Theta
+
+As of now, Theta has basic bitvector support for the CFA formalism.
+
+## Overview
+
+In Theta, every bitvector has a length and is either signed or unsigned. It follows that the range of every bitvector of length n has a size of 2<sup>n</sup>. Different operations are available for bitvectors. Note that only operations between bitvectors of the same size and signedness are valid.
+
+A bitvector of length n has n bits. If the bitvector is unsigned, the bits are the binary representation of the underlying number. If the bitvector is signed, the bits are the two's complement representation of the underlying number.
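The two representations described above can be illustrated with a short Python sketch. It is not part of Theta; the helper name `to_bits` is hypothetical, and the sketch assumes the value already lies in the range of the bitvector.

```python
def to_bits(value: int, length: int) -> str:
    """Return the `length`-bit pattern of `value` as a string of 0s and 1s.

    Assumes `value` is in range for the bitvector. Masking with
    2^length - 1 leaves non-negative values unchanged and maps negative
    values to their two's complement pattern.
    """
    return format(value & ((1 << length) - 1), f"0{length}b")


print(to_bits(5, 4))   # 0101 (unsigned 5, or signed +5)
print(to_bits(-5, 4))  # 1011 (signed -5 in two's complement)
```

Note that the pattern `1011` is also the unsigned value 11, which is why the signedness has to be part of the type.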
+
+## Declaring bitvector variables
+
+To declare a bitvector variable, one has to specify the size and the signedness.
+
+```
+var x1 : bv[4] // Unsigned 4-bit-long bitvector
+var x2 : bitvec[5] // Unsigned 5-bit-long bitvector
+
+var u1 : ubv[4] // Unsigned 4-bit-long bitvector
+var u2 : ubitvec[6] // Unsigned 6-bit-long bitvector
+
+var s1 : sbv[4] // Signed 4-bit-long bitvector
+var s2 : sbitvec[7] // Signed 7-bit-long bitvector
+
+```
+
+## Bitvector literals
+
+There is a new kind of literal, the bitvector literal, that can be used to create bitvectors. Each literal defines its own size and signedness. Moreover, each literal can be written in one of three formats:
+
+- The bitwise-precise binary form
+- The bitwise-precise hexadecimal form
+- The non-bitwise-precise but user-friendly decimal form
+
+The two bitwise-precise forms specify the value of every bit directly. They are useful for unsigned bitvectors where there is no two's complement behavior (e.g. bitfields).
+
+```
+4'b0010 // Unsigned 4-bit-long bitvector literal (binary form)
+8'xAF // Unsigned 8-bit-long bitvector literal (hexadecimal form)
+
+4'bu0010 // Unsigned 4-bit-long bitvector literal (binary form)
+8'xuAF // Unsigned 8-bit-long bitvector literal (hexadecimal form)
+
+4'bs0010 // Signed 4-bit-long bitvector literal (binary form, not recommended)
+8'xsAF // Signed 8-bit-long bitvector literal (hexadecimal form, not recommended)
+```
+
+The non-bitwise-precise decimal form can be used to create bitvectors that are based on numbers. This form is recommended for signed bitvectors, or unsigned bitvectors that are not bitfields.
+
+```
+4'd10 // Unsigned 4-bit-long bitvector literal (decimal form)
+
+4'du10 // Unsigned 4-bit-long bitvector literal (decimal form)
+
+4'ds5 // Signed 4-bit-long bitvector literal (decimal form, positive value)
+4'ds-5 // Signed 4-bit-long bitvector literal (decimal form, negative value)
+```
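To make the literal syntax concrete, here is a minimal Python sketch that splits a literal into its parts. It follows the shape of the `BV` token in `CfaDsl.g4` (size, apostrophe, radix letter, optional `s`/`u`, digits), but it is an illustration rather than Theta's parser: the function name `parse_bv_literal` is hypothetical, and it assumes the size is a plain decimal number and that only the decimal form may carry a leading `+` or `-`.

```python
import re

# Allowed digit strings per radix letter (assumption based on the BV token).
_DIGITS = {"b": r"[01]+", "d": r"[+-]?[0-9]+", "x": r"[0-9a-fA-F]+"}
_BASE = {"b": 2, "d": 10, "x": 16}


def parse_bv_literal(text):
    """Split a literal such as 4'ds-5 into (size, signed, base, digits)."""
    m = re.fullmatch(r"([0-9]+)'([bdx])([su]?)(.+)", text)
    if m is None or re.fullmatch(_DIGITS[m.group(2)], m.group(4)) is None:
        raise ValueError(f"not a bitvector literal: {text!r}")
    size, radix, sign, digits = m.groups()
    # No marker or 'u' means unsigned; 's' means signed.
    return int(size), sign == "s", _BASE[radix], digits


print(parse_bv_literal("4'ds-5"))  # (4, True, 10, '-5')
print(parse_bv_literal("8'xAF"))   # (8, False, 16, 'AF')
```

The sketch deliberately returns the digits uninterpreted, since how a signed binary or hexadecimal pattern maps to a number depends on the two's complement reading described in the overview.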
+
+## Operations on bitvectors
+
+The following operations are defined on bitvectors. As a general rule, every binary operation requires the bitvector on the left-hand side and the bitvector on the right-hand side to have the same size and signedness.
+
+The operators and their precedence are based on the [operators in the C language](https://en.cppreference.com/w/c/language/operator_precedence).
+
+### Basic arithmetic operations
+
+These operations perform basic arithmetic on bitvectors. They require that the bitvector on the left-hand side and the bitvector on the right-hand side have the same size and signedness. These operations overflow!
+
+- **Addition:** Adds two bitvectors; `a + b`
+- **Subtraction:** Subtracts b from a; `a - b`
+- **Multiplication:** Multiplies two bitvectors; `a * b`
+- **Integer division:** Divides two bitvectors, and takes the integer part of the result; `a / b`
+- **Modulo:** Calculates (a mod b); `a mod b`
+- **Remainder:** Calculates the remainder of a / b; `a rem b`
+- **Negate:** Negates the value of a (only valid for signed bitvectors); `-a`
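The remark that these operations overflow can be pictured with a small Python sketch. It assumes the usual wrap-around behavior of fixed-size bitvectors (results are reduced modulo 2<sup>n</sup>); the helper `wrap` is hypothetical and is not Theta code.

```python
def wrap(value: int, length: int, signed: bool) -> int:
    """Reduce an integer result to the range of a `length`-bit bitvector."""
    value &= (1 << length) - 1  # keep only the low `length` bits
    if signed and value >= 1 << (length - 1):
        value -= 1 << length    # reinterpret the pattern as two's complement
    return value


print(wrap(10 + 7, 4, signed=False))  # 1
print(wrap(5 + 5, 4, signed=True))    # -6
print(wrap(3 - 5, 4, signed=False))   # 14
```

Under this assumption, adding two in-range values can produce a result that looks unrelated to the mathematical sum, which is worth keeping in mind when writing guards and assertions.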
+
+### Bitvector specific operations
+
+These operations are specific to bitvectors. The binary ones require that the bitvector on the left-hand side and the bitvector on the right-hand side have the same size and signedness. For the exact semantics, see the respective operators in C. These operations overflow!
+
+- **Bitwise and:** Ands two bitvectors; `a & b`
+- **Bitwise or:** Ors two bitvectors; `a | b`
+- **Bitwise xor:** XORs two bitvectors; `a ^ b`
+- **Bitwise shift left:** Shifts a to the left by b; `a << b`
+- **Bitwise shift right:** Shifts a to the right by b; `a >> b`
+- **Bitwise not:** Negates all the bits of a bitvector; `~a`
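As a rough illustration of two of the operators above, the following Python sketch works on unsigned bit patterns only. The helper names are hypothetical, and the sketch assumes that bits shifted past the most significant position are dropped; the behavior of right shifts on signed bitvectors is not stated in this document and is left out.

```python
def bv_not(a: int, length: int) -> int:
    """Flip every bit of a `length`-bit unsigned pattern."""
    return a ^ ((1 << length) - 1)


def bv_shl(a: int, b: int, length: int) -> int:
    """Shift left by b, discarding bits that leave the `length`-bit window."""
    return (a << b) & ((1 << length) - 1)


print(format(bv_not(0b0010, 4), "04b"))     # 1101
print(format(bv_shl(0b1010, 1, 4), "04b"))  # 0100
```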
+
+### Relational operations
+
+These operations encode relational operations between bitvectors. They require that the bitvector on the left-hand side and the bitvector on the right-hand side have the same size and signedness.
+
+- **Equality:** Checks if a equals b; `a = b`
+- **Non-equality:** Checks if a does not equal b; `a /= b`
+- **Greater than or equal to:** Checks if a is greater than or equal to b; `a >= b`
+- **Greater than:** Checks if a is greater than b; `a > b`
+- **Less than or equal to:** Checks if a is less than or equal to b; `a <= b`
+- **Less than:** Checks if a is less than b; `a < b`
+
+### Conversion "operations"
+
+Bitvectors can be converted to integers and vice versa. However, since integers can be arbitrarily large, an exception is thrown if the value cannot be represented in the bitvector's range. So proceed with caution!
+
+There is no explicit conversion operator; however, the assignment statement converts between the two types.
+
+```
+var bv1 : bv[4]
+var i1 : int
+
+// ...
+
+L0 -> L1 {
+    bv1 := i1
+    i1 := bv1
+}
+```
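The range check behind the integer-to-bitvector conversion can be sketched in Python as follows. This is only an illustration of the rule stated above: the function names are hypothetical, and `OverflowError` is a stand-in for whatever exception Theta actually throws.

```python
def bv_range(length: int, signed: bool):
    """Smallest and largest value representable in a `length`-bit bitvector."""
    lo = -(1 << (length - 1)) if signed else 0
    return lo, lo + (1 << length) - 1


def int_to_bv(value: int, length: int, signed: bool) -> int:
    """Accept `value` only if it fits the bitvector's range, else raise."""
    lo, hi = bv_range(length, signed)
    if not lo <= value <= hi:
        raise OverflowError(f"{value} is outside [{lo}, {hi}]")
    return value


print(bv_range(4, signed=False))  # (0, 15)
print(bv_range(4, signed=True))   # (-8, 7)
```

For the example above, assigning `i1` to `bv1` is only safe while `i1` stays between 0 and 15, since `bv[4]` is an unsigned 4-bit type.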
+
+## Algorithmic support for verification with bitvectors
+
+As of now, bitvectors are only partially supported because the underlying SMT solver, Z3, does not support interpolation when bitvectors are involved.
+
+As a result, CEGAR with predicate abstraction is not supported at all. However, CEGAR with explicit value abstraction and the UNSAT core refinement strategy works properly, as it does not rely on the interpolation capabilities of the SMT solver.

subprojects/cfa/src/main/antlr/CfaDsl.g4

Lines changed: 83 additions & 2 deletions
@@ -80,6 +80,7 @@ type: boolType
     | ratType
     | funcType
     | arrayType
+    | bvType
     ;
 
 typeList
@@ -106,6 +107,19 @@ arrayType
     : LBRACK indexType=type RBRACK RARROW elemType=type
     ;
 
+bvType
+    : ubvType
+    | sbvType
+    ;
+
+ubvType
+    : UBVTYPE LBRACK size=INT RBRACK
+    ;
+
+sbvType
+    : SBVTYPE LBRACK size=INT RBRACK
+    ;
+
 BOOLTYPE
     : 'bool'
     ;
@@ -118,6 +132,18 @@ RATTYPE
     : 'rat'
     ;
 
+UBVTYPE
+    : 'bv'
+    | 'bitvec'
+    | 'ubv'
+    | 'ubitvec'
+    ;
+
+SBVTYPE
+    : 'sbv'
+    | 'sbitvec'
+    ;
+
 // E X P R E S S I O N S
 
 expr: funcLitExpr
@@ -181,9 +207,25 @@ equalityExpr
     ;
 
 relationExpr
-    : leftOp=additiveExpr (oper=(LT | LEQ | GT | GEQ) rightOp=additiveExpr)?
+    : leftOp=bitwiseOrExpr (oper=(LT | LEQ | GT | GEQ) rightOp=bitwiseOrExpr)?
     ;
 
+bitwiseOrExpr
+    : leftOp=bitwiseXorExpr (oper=BITWISE_OR rightOp=bitwiseXorExpr)?
+    ;
+
+bitwiseXorExpr
+    : leftOp=bitwiseAndExpr (oper=BITWISE_XOR rightOp=bitwiseAndExpr)?
+    ;
+
+bitwiseAndExpr
+    : leftOp=bitwiseShiftExpr (oper=BITWISE_AND rightOp=bitwiseShiftExpr)?
+    ;
+
+bitwiseShiftExpr
+    : leftOp=additiveExpr (oper=(BITWISE_SHIFT_LEFT | BITWISE_SHIFT_RIGHT) rightOp=additiveExpr)?
+    ;
+
 additiveExpr
     : ops+=multiplicativeExpr (opers+=(PLUS | MINUS) ops+=multiplicativeExpr)*
     ;
@@ -193,10 +235,15 @@ multiplicativeExpr
     ;
 
 negExpr
-    : accessorExpr
+    : bitwiseNotExpr
     | MINUS op=negExpr
     ;
 
+bitwiseNotExpr
+    : accessorExpr
+    | BITWISE_NOT op=bitwiseNotExpr
+    ;
+
 accessorExpr
     : op=primaryExpr (accesses+=access)*
     ;
@@ -230,6 +277,7 @@ primaryExpr
     | intLitExpr
     | ratLitExpr
     | arrLitExpr
+    | bvLitExpr
     | idExpr
     | parenExpr
     ;
@@ -255,6 +303,10 @@ arrLitExpr
     | LBRACK LT indexType=type GT DEFAULT LARROW elseExpr=expr RBRACK
     ;
 
+bvLitExpr
+    : bv=BV
+    ;
+
 idExpr
     : id=ID
     ;
@@ -342,6 +394,30 @@ PERCENT
     : '%'
     ;
 
+BITWISE_OR
+    : '|'
+    ;
+
+BITWISE_XOR
+    : '^'
+    ;
+
+BITWISE_AND
+    : '&'
+    ;
+
+BITWISE_SHIFT_LEFT
+    : LT LT
+    ;
+
+BITWISE_SHIFT_RIGHT
+    : GT GT
+    ;
+
+BITWISE_NOT
+    : '~'
+    ;
+
 TRUE: 'true'
     ;
 
@@ -401,6 +477,11 @@ RETURN
 
 // B A S I C T O K E N S
 
+BV : NAT '\'b' ('s'|'u')? [0-1]+
+    | NAT '\'d' ('s'|'u')? INT
+    | NAT '\'x' ('s'|'u')? [0-9a-fA-F]+
+    ;
+
 INT : SIGN? NAT
     ;
 