SLIDE 1
Smart programming languages, smart program analysis
Varmo Vene
Institute of Cybernetics at TUT & University of Tartu
Introduction
A quote from classics
Everyone knows that debugging is twice as hard as writing a program in the first
SLIDE 2
SLIDE 3
Introduction
Possible reasons
Human imperfection
  To err is human, to forgive divine. (Alexander Pope, 1688–1744)
Laws of nature
  Program testing can be used to show the presence of bugs, but never to show their absence! (Edsger Dijkstra, 1970)
Imperfection of tools
  The most effective debugging tool is still careful thought, coupled with judiciously placed print statements. (Brian Kernighan, 1979)
SLIDE 4
Introduction
A goal of Semantics (among others)
To develop programming tools that give strong guarantees about properties of programs.
– E.g., guarantee the absence of certain kinds of errors.
Proactive tools
– E.g., program extraction.
Preventive tools
– E.g., programming languages with powerful type systems.
Retroactive tools
– E.g., static program analyzers.
SLIDE 5
Outline
Total Functional Programming
– Inductive types
– Comonadic recursion
– Recursive coalgebra
– Mendler-style recursion
Goblint
– Path-sensitivity
– Concurrent analysis
Working Group and Plans
SLIDE 6
Total Functional Programming
In the total functional programming paradigm, all programs terminate. In particular, there is no general recursion. Instead, only restricted forms of recursion that are guaranteed to terminate are allowed. Usually, these are simple iteration or primitive recursion over inductive types. Sometimes corecursive definitions of values of coinductive types are also allowed.
Although such a language is not Turing complete, most interesting programs are in principle expressible in it.
SLIDE 7
Total Functional Programming
Inductive Types and Iteration
Categorically, inductive types (such as natural numbers, lists, trees, etc.) are initial algebras of endofunctors. The most basic form of recursion (known as iteration or fold) corresponds to the unique homomorphism property of initial algebras.
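$$
\begin{array}{ccc}
F\mu F & \xrightarrow{\;Ff\;} & FA \\
{\scriptstyle \mathsf{in}}\downarrow & & \downarrow{\scriptstyle \forall \varphi} \\
\mu F & \xrightarrow{\;\exists!\, f = \mathsf{fold}(\varphi)\;} & A
\end{array}
\qquad\text{i.e.}\qquad f \circ \mathsf{in} = \varphi \circ Ff
$$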
By duality, coinductive types (streams and various other infinite and potentially infinite structures) are terminal coalgebras, and the basic form of corecursion (known as coiteration) arises from their unique homomorphism property.
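As a worked instance (a standard textbook example, not taken from the slides), assume the functor $FX = 1 + X$, whose initial algebra is the natural numbers with $\mathsf{in} = [\mathsf{zero}, \mathsf{succ}]$. An algebra $\varphi = [z, s] : 1 + A \to A$ consists of a value $z : A$ and a step function $s : A \to A$, and the equation $f \circ \mathsf{in} = \varphi \circ Ff$ unfolds to the familiar iteration equations:

```latex
\begin{align*}
f(\mathsf{zero})      &= z \\
f(\mathsf{succ}\; n)  &= s(f(n))
\end{align*}
```

For example, choosing $A = \mathbb{N}$, $z = m$ and $s = \mathsf{succ}$ gives $f(n) = m + n$. Termination is immediate because each recursive call is on the predecessor.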
SLIDE 8
Total Functional Programming
Comonadic Recursion
In a series of papers (Uustalu & Vene, 1996–98) we introduced several new (co)recursion schemes capturing primitive corecursion, course-of-value (co)recursion, etc.
All of them shared strong similarities, but differed in concrete details.
In (Uustalu & Vene & Pardo, 2000) we established a generic many-in-one recursion scheme parametrized by a recursive call pattern, represented by a comonad with a distributive law.
The new scheme covers most of the previously known recursion schemes as instances.
SLIDE 9
Total Functional Programming
Recursive Coalgebras
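The algebra structure $\mathsf{in}_F : F\mu F \to \mu F$ is an isomorphism. In fact, the essential properties of a recursion scheme depend more on its inverse, a coalgebra!
In (Capretta & Uustalu & Vene, 2004) we defined the notion of recursive coalgebras.
$$
\begin{array}{ccc}
FA & \xrightarrow{\;Ff\;} & FB \\
{\scriptstyle \alpha}\uparrow & & \downarrow{\scriptstyle \forall \varphi} \\
A & \xrightarrow{\;\exists!\, f\;} & B
\end{array}
\qquad\text{i.e.}\qquad f = \varphi \circ Ff \circ \alpha
$$
The notion generalizes well-founded recursion and has its origin in (Osius, 1970).
We identified a number of ways of constructing recursive coalgebras and generalized comonadic recursion to this setting.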
SLIDE 10
Total Functional Programming
Mendler-style recursion
Programming with recursors defined by properties such as initiality, comonadic recursion, etc. is cumbersome.
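E.g., functions defined by iteration must have the following form:
$$
\begin{array}{ccc}
F\mu F & \xrightarrow{\;Ff\;} & FA \\
{\scriptstyle \mathsf{in}}\downarrow & & \downarrow{\scriptstyle \forall \varphi} \\
\mu F & \xrightarrow{\;\exists!\, f\;} & A
\end{array}
$$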
SLIDE 11
Total Functional Programming
Mendler-style recursion
In (Uustalu & Vene, 1996, 2000, 2002) we considered an alternative form:
$$
\begin{array}{ccc}
F\mu F & \xrightarrow{\;\mathsf{in}\;} & \mu F \\
 & {\scriptstyle \forall \Phi(f)}\searrow & \downarrow{\scriptstyle \exists!\, f} \\
 & & A
\end{array}
\qquad\text{i.e.}\qquad f \circ \mathsf{in} = \Phi(f)
$$
where $\Phi : \forall X.\ (X \to A) \to (FX \to A)$.
The idea originates from (Mendler, 1987) and extends to other recursion schemes.
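To make the alternative form concrete, here is a worked instance for the functor $FX = 1 + X$ (natural numbers); the choice of functor and the names $z$, $s$, $g$ are assumptions made for illustration, not taken from the slides. A Mendler-style step function $\Phi$ receives the recursive call as an argument $g : X \to A$ over an abstract type $X$, so it can apply $g$ only to the immediate substructure it is handed. The first row defines $\Phi$; the second row is what the defining equation $f \circ \mathsf{in} = \Phi(f)$ then says about $f$:

```latex
\begin{align*}
\Phi(g)(\mathsf{inl}\,\ast) &= z, & \Phi(g)(\mathsf{inr}\,x) &= s(g(x)) \\
f(\mathsf{zero})            &= z, & f(\mathsf{succ}\,n)      &= s(f(n))
\end{align*}
```

The definition of $\Phi$ reads like an ordinary recursive definition with $g$ standing for the function being defined, yet the polymorphism in $X$ prevents $g$ from being applied to anything other than $x$.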
SLIDE 12
Total Functional Programming
Mendler-style recursion
The scheme looks quite similar to general recursion, hence is (hopefully) more intuitive. But termination is still guaranteed.
I.e., we have termination checking by type checking.
SLIDE 13
Total Functional Programming
Ongoing and further work
Corecursive algebras (with V. Capretta)
Mendler-style vs. circular proofs (with R. Cockett)
. . .
To make Total FP fly!
SLIDE 14
Where are we?
Total Functional Programming
– Inductive types
– Comonadic recursion
– Recursive coalgebra
– Mendler-style recursion
Goblint
– Path-sensitivity
– Concurrent analysis
Working Group and Plans
SLIDE 15
Goblint
What is Goblint?
Goblint is a static analyzer for POSIX-threaded C.
Focused on detecting multiple-access data races.
Integrates with the Eclipse C development environment.
Aims to be sound (i.e. must detect all errors, but may give false alarms).
Aims to be efficient enough to analyze medium-to-large scale programs (≥ 100 kLOC).
Aims to be precise enough to analyze medium-to-large scale programs (≥ 100 kLOC).
(Vojdani & Vene, 2007)
SLIDE 16
Goblint
Main conflicts
Soundness vs. C
Efficiency vs. Precision
SLIDE 17
Goblint
Soundness vs. C
Restrict to the "safe" subset of C:
– no setjmp and longjmp;
– no dynamic data structures;
– no recursion;
– . . .
SLIDE 18
Goblint
Soundness vs. C
The restriction to the "safe" subset of C is not as bad as it looks: we can still handle the excluded constructs, but do not guarantee soundness for them.
SLIDE 19
Goblint
Efficiency vs. Precision
We adopt normal data flow analysis techniques, but
– use a functional approach to distinguish calling contexts;
– use dynamically adjustable path-sensitive analysis;
– use global invariant based concurrent analysis.
SLIDE 20
Goblint: Path-sensitivity
man gcc on “-Wuninitialized”
These warnings are made optional because GCC is not smart enough to see all the reasons why the code might be correct despite appearing to have an error . . .
Here is another common case:
    int save_y;
    if (change_y)
        save_y = y, y = new_y;
    . . .
    if (change_y)
        y = save_y;
This has no bug because "save_y" is used only if it is set.
SLIDE 21
Goblint: Path-sensitivity
Example
    int save_y;
    if (change_y)
        save_y = y, y = new_y;
    . . .
    if (change_y)
        y = save_y;
What is the problem?
There are 4 potential execution paths.
Only 2 are logically possible.
We need to distinguish execution paths.
In general, there are an infinite number of paths!
SLIDE 22
Goblint: Path-sensitivity
Our solution
We only track the paths that are relevant to the analysis result.
In this example, paths are relevant when their sets of uninitialized variables differ.
In general, relevance depends on the user analysis . . .
SLIDE 23
Goblint: Concurrent Analysis
State explosion
Precise concurrent analysis leads to state explosion.
E.g., if there are two threads with 10 instructions each, then there are 184756 possible interleavings!
Global invariant based concurrent analysis
Separate shared (i.e. global) and local variables.
Compute a single invariant for global state.
– Essentially, join all possible values in all program points.
Now all threads can be analyzed sequentially.
Very imprecise for the base domain, but works well with user domains like lock-sets.
(Seidl & Vene & Müller-Olm, 2003)
SLIDE 24
Goblint
Ongoing and further work
Equality analysis of addresses (with H. Seidl);
Scalability improvements;
Adding new analyses (e.g. variable initialization, open-use-close analysis, etc.);
Better handling of external functions;
. . .
Additional information
Goblint has an Open Source license.
You can download it from the web: http://goblin.at.mt.ut.ee/goblint/tracker/
SLIDE 25
Working Group and Plans
Programming Languages and Systems at EXCS
Senior staff
– Keiko Nakata (IOC)
– Jaan Penjam (IOC)
– Härmel Nestra (UT)
– Tarmo Uustalu (IOC)
– Hellis Tamm (IOC)
– Varmo Vene (IOC/UT)
PhD students
– Ando Saabas (IOC)
– Vesal Vojdani (UT)
– Jevgeni Kabanov (UT)
– Andres Toom (IOC)
– Aivar Annamaa (UT)
– Martin Pettai (UT)
Best friend
– Peeter Laud (CybAS)
SLIDE 26