Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma,...

48
Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

Transcript of Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma,...

Page 1: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken

PLDI, Santa Barbara, June 16, 2016

Page 2: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

2

Automatically Reason about

Programs

Symbolic Execution

Program Verification

Program Equivalence

…𝜙 ≡

Page 3: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

Automatically reasoning about programs requires

3

Page 4: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

testq %rdi, %rdi

je .L1

xorq %rax, %rax

.L0:

movq %rdi, %rdx

andq $0x1, %rdx

addq %rdx, %rax

shrq $0x1, %rdi

jne .L0

cltq

retq

.L1:

xorq %rax, %rax

retq

4

Page 5: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

5

addq $0x1, %rax rax ← rax +64 164

64-bit bit-vector addition

64-bit constant

previous value of rax

Page 6: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

6

addq $0x1, %rax rax ← rax +64 164

al ← al +8 18addb $0x1, %al

Page 7: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

7

addq $0x1, %rax rax ← rax +64 164

addb $0x1, %al

eax 32 bits

ax 16 bits

alah

rax

al ← al +8 18

8 bits

64 bits

8 bits

Page 8: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

8

addq $0x1, %rax rax ← rax +64 164

addb $0x1, %al

eax 32 bits

ax 16 bits

alah

rax

al ← al +8 18

8 bits

64 bits

8 bits

rax ← rax 63: 8 ∘ rax 7: 0 +8 18

Page 9: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

9

rax ← rax 63: 8 ∘ rax 7: 0 +8 18

rax ← rax[63: 32] ∘ (rax[31: 0] +32 132)

addw $0x1, %ax rax ← rax 63: 16 ∘ rax 15: 0 +16 116

addl $0x1, %eax

addq $0x1, %rax rax ← rax +64 164

addb $0x1, %al

Page 10: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

10

rax ← rax 63: 8 ∘ rax 7: 0 +8 18

rax ← 032 ∘ (rax[31: 0] +32 132)

addw $0x1, %ax rax ← rax 63: 16 ∘ rax 15: 0 +16 116

addl $0x1, %eax

addq $0x1, %rax rax ← rax +64 164

addb $0x1, %al

Page 11: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

11

rax ← rax 63: 8 ∘ rax 7: 0 +8 18

rax ← 032 ∘ (rax[31: 0] +32 132)

addw $0x1, %ax rax ← rax 63: 16 ∘ rax 15: 0 +16 116

addl $0x1, %eax

addq $0x1, %rax rax ← rax +64 164

addb $0x1, %al

zf ← 032 = (eax +32 132)

cf ← 01 ∘ eax +33 133 [32,32]

sf ← eax +32 132 [31,31]

of ← ¬eax 31,31 ∧ (eax +32 132)[31,31]

pf ← (eax +32 132)[0,0] ⊕ (eax +32 132)[1,1] ⊕

(eax +32 132)[2,2] ⊕ (eax +32 132)[3,3] ⊕

(eax +32 132)[4,4] ⊕ (eax +32 132)[5,5] ⊕

(eax +32 132)[6,6] ⊕ (eax +32 132)[7,7]

Page 12: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Manual partial specifications

– CompCert [CACM’09], BAP [CAV’11], BitBlaze [ICISS’08], Codesurfer/x86 [ETAPS’05], McVeto [CAV’10], STOKE [ASPLOS’13], Jakstab [CAV’08], many others

• Taly/Godefroid [PLDI’12]

– Automatically synthesize specification from templates

– Only 534 instructions

13

Page 13: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

14

Bit-vector formulas of input-output behavior

Page 14: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

15

Base set

Specify manually

Remaining Instructions

Learn specification automatically

All instructions

Page 15: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

16

Instruction 𝑖 Program 𝑝synthesize

combine base formulas

Formula 𝜙

Formal guarantee?

𝑖 ≡ 𝜙

How do we synthesize

programs?

Page 16: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

17

How do we synthesize

programs?

Randomized search

Guided by cost function

Based on test-cases

Using STOKE [ASPLOS’13]

Instruction 𝑖 Program 𝑝synthesize

combine base formulas

Formula 𝜙

Page 17: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

18

Instruction 𝑖 Program 𝑝synthesize

combine base formulas

Formula 𝜙

Formal guarantee?

𝑖 ≡ 𝜙

𝑝 ≡ 𝜙

Page 18: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

19

Instruction 𝑖 Program 𝑝synthesize

combine base formulas

Formula 𝜙

Formal guarantee?

𝑖 ≡ 𝜙

𝑖 ≡ 𝑝 ≡ 𝜙

Page 19: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

20

Instruction 𝑖 Program 𝑝synthesize

combine base formulas Candidate

formula 𝜙

Formal guarantee?

𝑖 ≡ 𝜙

𝑖 ≡ 𝑝 ≡ 𝜙

Page 20: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

21

Instruction 𝑖 Program 𝑝synthesize

combine base formulas

Program 𝑝′

𝜙 ֞?𝜙′

yes

no

✔ increase confidence

Add counter example, remove wrong program(s)

Candidateformula 𝜙

Candidateformula 𝜙′

Candidateformula 𝜙′′

Page 21: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

22

𝜙֞?𝜙′

Increase confidence

Remove incorrect program(s)

No information about equivalence

Page 22: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

23

𝜙֞?𝜙′

Increase confidence

Remove incorrect program(s)

No information about equivalence

Page 23: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

24

𝜙֞?𝜙′

Increase confidence

Remove incorrect program(s)

No information about equivalence

Equivalence class 1

Equivalence class 2

Page 24: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

25

• Prefer programs whose formulas are

– Precise (fewest uninterpreted functions)

– Fast (fewest non-linear arithmetic operations)

– Simple (fewest nodes)

Equivalence class 1

Equivalence class 2

Equivalence class 3

Page 25: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

26

• Prefer programs whose formulas are

– Precise (fewest uninterpreted functions)

– Fast (fewest non-linear arithmetic operations)

– Simple (fewest nodes)

Equivalence class 1

Equivalence class 2

Equivalence class 3

Page 26: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

27

synthesize

Page 27: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

28

Page 28: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

29

addw %ax, %dx dx ← dx +16 ax

addw %cx, %bx

Learn

bx ← bx +16 cx

Rename

addw (%rsp), %dx dx ← dx +16 M rsp

addw $0x5, %dx dx ← dx +16 516

Page 29: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

1. Learn formula for register-only instructions

2. Generalize formulas

‐ To other types of operands

3. Check on test inputs

30

Page 30: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

31

shufps $0xb3, %xmm0, %xmm1

Solution: Brute force a formula for every constant

Problem: No corresponding register-only variant

Page 31: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Base set (51 instructions)

– Integer, bitwise and float operations

– Data movement (including conditional move)

– Conversion operations

• Pseudo instructions (11 templates)

– Split and combine registers

– Changing status flags

32

Page 32: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Total instructions 3,684

• Out-of-scope

– System instructions 302

– Crypto instructions 35

– Deprecated instructions 332

– String instructions 97

• Goal instructions 2,918

33

invpcid, jle

aeskeygenassist

fadd

scasq

Page 33: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Base set 51

• Pseudo instructions 11

• Register-only instructions learned 692

• Generalized 984

• 8-bit constant instructions learned 119.42

• Total formulas learned 1,795.42

34

Page 34: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

35

Compare with handwritten formulas (from STOKE)

Available for comparison 1,431.91

Automatically proven equivalent

Equivalent with additional lemma

1,377.91

4

Page 35: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

1,431.91

1,377.91

4

36

Compare with handwritten formulas (from STOKE)

Available for comparison

Automatically proven equivalent

Equivalent with additional lemma

fadd 𝑎, 𝑏 = fadd 𝑏, 𝑎

Page 36: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

37

Compare with handwritten formulas (from STOKE)

Available for comparison

Automatically proven equivalent

Equivalent with additional lemma

Semantically different

Handwritten formula correct

Learned formula correct

50

0

50

1,431.91

1,377.91

4

Page 37: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

38

stratum 𝑖 = ൝0 if 𝑖 ∈ baseset1 + max

𝑖′∈𝑀(𝑖)stratum i′ otherwise

Stratum 0 Stratum 1 Stratum 2 Stratum 3

base set

Page 38: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

39

stratum 𝑖 = ൝0 if 𝑖 ∈ baseset1 + max

𝑖′∈𝑀(𝑖)stratum i′ otherwise

Page 39: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

40

0

100

200

300

400

500

600

700

800

0 50 100 150 200 250

Nu

mb

er

of

form

ula

s le

arn

ed

Wall-clock time elapsed [hours]

Stratification Without stratification

Page 40: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

41

number of nodes in learned formula

number of nodes in handwritten formula

Fully inlined: 3526 instructions

Page 41: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

1. Automatically learned 1,795 formulas

2. Stratification key to scale program synthesis

3. Compare to hand-written specification

‐ More correct, equally precise, same size

Source code, formulas, experimental results

42

https://github.com/StanfordPL/strata/

Page 42: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

43

Page 43: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

1. Missing base instructionsSome integer and floating point operations are missing

2. Program synthesis limitsShortest known program is long and outside of reach

e.g., byte-vectorized operation

3. Cost function limitationFor one bit of output, the cost function does not give enough signal

4. Crazy instructions

44

Page 44: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Total decisions 7,075

• Equivalent 6,669 (94.26%)

• New equivalence class 356 (5.03%)

• Counter-examples 50 (0.71%)

• Timeouts (45 seconds): 3

45

Page 45: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Intel Xeon E5-2697 (28 cores) at 2.6 GHz

– 268.86 hours (register-only)

– 159.12 hours (8-bit constants)

• Total of 11,983.37 core hours

46

Page 46: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Random inputs (random machine state)

• “Interesting” bit-patterns

0, 1, −1, 2𝑛, NaN, Infinity

• Test cases learned from counter-examples

47

Page 47: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Formulas are simplified

– Constant propagation

– Move bit-selection over concatenation

264 ∗64 464≡ 864

064 ∘ rax 63,0 ≡ rax

48

Page 48: Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken · Stefan Heule, Eric Schkufza, Rahul Sharma, Alex Aiken PLDI, Santa Barbara, June 16, 2016

• Formula precision (number of uninterpreted functions)

– Learned formulas equally precise in all but 4 cases

• Formula quality (number of non-linear operations)

– Learned formulas contain same number of non-linear operations, except for 11 cases

49