联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-21:00
  • 微信:codinghelp

您当前位置:首页 >> Python编程Python编程

日期:2022-04-24 10:56

Secure Software Engineering

Topic #20: Symbolic Execution

Static analysis

Can analyze all possible runs of a program

An explosion of interesting ideas and tools

Commercial companies sell, use static analysis

Great potential to improve software quality

But: Can it find deep, difficult bugs?

Yes, but not often

Commercial viability implies you must deal with developer

confusion, false positives, error management,..

This means that companies specifically aim to keep the

false positive rate down

- They often do this by purposely missing bugs, to keep

the analysis simpler

One issue: Abstraction

Abstraction lets us model all possible runs

But abstraction introduces conservatism

*-sensitivities add precision, to deal with this

* = flow-, context-, path-, etc.

But more precise abstractions are more expensive

- Challenges scalability

- Still have false alarms or missed bugs

Static analysis abstraction ≠ developer abstraction

Because the developer didn’t have them in mind

Symbolic Execution

Testing works: reported bugs are real bugs

But, each test only explores one possible execution

- assert(f(3) == 5)

We hope test cases generalize, but no guarantees

Symbolic execution generalizes testing

“Allows unknown symbolic variables α in evaluation

- y = α; assert(f(y) == 2*y-1);

If execution path depends on unknown, conceptually

fork symbolic executor

- int f(int x) { if (x > 0) then return 2*x - 1; else return 10; }

Symbolic Execution Example

1. int a = α, b = β, c = γ;

2. // symbolic

3. int x = 0, y = 0, z = 0;

4. if (a) {

5. x = -2;

6. }

7. if (b < 5) {

8. if (!a && c) { y = 1; }

9. z = 2;

10. }

11. assert(x+y+z != 3)

x=0, y=0, z=0

α

x=-2

z=2

?

β<5 ?α∧γ

y=1

β<5

z=2

z=2

t f

t f t f

t f

α∧(β<5)

path condition

α∧(β≥5) ?α∧(β≥5)

α∧(β<5)∧?γ

α∧(β<5)∧γ

Insight

Each symbolic execution path stands for many

actual program runs

In fact, exactly the set of runs whose concrete values

satisfy the path condition

Thus, we can cover a lot more of the program’s

execution space than testing

Viewed as a static analysis, symbolic execution

is

Complete, but not sound (usually doesn’t terminate)

Path, flow, and context sensitive

Symbolic Analysis

Static analysis considers all paths are feasible

Dynamic analysis considers one path or a number

of paths

Symbolic analysis reasons about path feasibility

Much more precise

Scalability is an issue

A lot of security applications

Exploit generation

Vulnerability detection

Expose hidden behavior

Verification

Equivalence checking for analyzing obfuscated code

Basic Idea

Explore individual paths in the program; models the

conditions and the symbolic values along a path to a

symbolic constraint; a path is feasible if the

corresponding constraint is satisfiable (SAT)

Similar to the per-path static analysis, a worklist is used

to maintain the paths being explored

Upon a function invocation, the current worklist is pushed

to a stack and a new worklist is initialized for path

exploration within the callee

Upon a return, the symbolic value of the return variable is

passed back to the caller

The idea is an old one

Robert S. Boyer, Bernard Elspas, and Karl N. Levitt.

SELECT–a formal system for testing and debugging

programs by symbolic execution. In ICRS, pages 234–

245, 1975.

James C. King. Symbolic execution and program testing.

CACM, 19(7):385–394, 1976. (most cited)

Leon J. Osterweil and Lloyd D. Fosdick. Program testing

techniques using simulated execution. In ANSS, pages

171–177, 1976.

William E. Howden. Symbolic testing and the DISSECT

symbolic evaluation system. IEEE Transactions on

Software Engineering, 3(4):266–278, 1977.

Why didn’t it take off?

Symbolic execution can be compute-intensive

Lots of possible program paths

Need to query solver a lot to decide which paths are

feasible, which assertions could be false

Program state has many bits

Computers were slow (not much processing

power) and small (not much memory)

Recent Apple iPads are as fast as Cray-2’s from the

80’s

Today

Computers are much faster, bigger

Better algorithms too: powerful SMT/SAT solvers

SMT = Satisfiability Modulo Theories = SAT++

Can solve very large instances, very quickly

Lets us check assertions, prune infeasible paths

2005-2006 reinterest in symbolic execution

Area of success: (security) bug finding

Heuristic search in space of possible executions

Find interesting bugs

Basic symbolic execution

12

Symbolic variables

Extend the language’s support for expressions e to

include symbolic variables, representing unknowns

Symbolic variables are introduced when reading input

Using mmap, read, write, fgets, etc.

So if a bug is found, we can recover an input that

reproduces the bug when the program is run normally

e ::= n | X | e0 + e1 | e0 ≤ e1 | e0 && e1 | …

n ∈ N = integers, X ∈ Var = variables, α ∈ SymVarα |

Symbolic Expressions

We make (or modify) a language interpreter to be

able to compute symbolically

Normally, a program’s variables contain values

Now they can also contain symbolic expressions

- Which are expressions containing symbolic variables

Example normal values:

5, “hello”

Example symbolic expressions:


版权所有:编程辅导网 2021 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp