313
edits
Changes
no edit summary
}
==Control Flow Graphs==
Abstract interpretation operates on a flowchart representation of the function called a '''control flow graph''' or '''CFG'''. Below is the CFG for the example function. This CFG is similar in flavor to the real GCC CFG, just with a few unimportant details removed.
* A '''program point''' is the (notional) point just before or after each instruction. The function has a well-defined state at program points, so our analysis facts will always refer to program points. E.g., at the program point just before 't1 := k + m', k is constant with value 2.
==Concrete Interpretation==
To explain abstract interpretation, we'll start with concrete interpretation, which is just the normal kind you are already familiar with. We'll use it to build up to abstract interpretation.
So k = 2 just before its use in 't1 := k + m'. We can run with many other inputs and get the same result. But testing can never tell us that k = 2 for all inputs. (Well, actually it can, but only if you can run <math>2^64</math> test cases.)
==Approaching Abstract Interpretation==
But if you look at the function, you can see that for k, there are really only two important cases: a == 0, and a != 0, and b doesn't matter. So let's (on paper) run two tests: one with input "a = 0, b = ?" and the other with "a = NZ, b = ?", where '''NZ''' means "any nonzero value" and '''?''' means "any value". First, a = 0:
Another problem is that we are analyzing each path through the function individually. But a function with <math>k</math> if statements can have up to <math>2^k</math> paths, and a function with loops has an infinite number of paths. So on real programs, we'll experience path explosion and we won't get complete coverage.
==Abstract Interpretation by Example==
One problem with our first approach to abstract interpretation was that we don't know how to pick what sets abstract values to use for input test cases. So let's not. Instead, let's just do one test with any input at all: "a = ?, b = ?":
EXIT
And we get the answer we wanted, k = 2. The key ideas were: * "Run" the function using abstract values as inputs. * An abstract value represents a set of (concrete) values. * At a control-flow branch, go both ways. * At a control-flow merge, merge the state.
Notice that this time, we didn't have to divide up states at the start, we just started with no knowledge, which is perfectly general. Also, because we rejoin execution streams, in a loop-free method, we only have to interpret each instruction once, so we have completely solved the path explosion problem. (We haven't tried loops yet, and they are a bit harder--they require us to analyze statements more than once, but only finitely many times.)