
Pointer analysis

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Siddharthist (talk | contribs) at 21:31, 14 March 2021. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In computer science, pointer analysis, or points-to analysis, is a static code analysis technique that establishes which pointers, or heap references, can point to which variables, or storage locations. It is often a component of more complex analyses such as escape analysis. A closely related technique is shape analysis.

(This is the most common colloquial use of the term. A secondary use has pointer analysis be the collective name for both points-to analysis, defined as above, and alias analysis. Points-to and alias analysis are closely related but not always equivalent problems.)

Example

For the following example program, a points-to analysis would compute that the points-to set of p is {x, y}.

int x;
int y;
int* p = unknown() ? &x : &y;

Introduction

As a form of static analysis, fully precise pointer analysis can be shown to be undecidable.[1] Most approaches are sound, but range widely in performance and precision. For large programs, some tradeoffs may be necessary to make the analysis finish in reasonable time and space. Two examples of these tradeoffs are:[2]

  • Treating all references from a structured object as being from the object as a whole. This is known as field insensitivity or structure insensitivity.
  • Ignoring flow of control when analysing which objects are assigned to pointers. This is known as context-insensitive pointer analysis (when ignoring the context in which function calls are made) or flow-insensitive pointer analysis (when ignoring the control flow within a procedure).

The disadvantage of these simplifications is that the calculated set of objects pointed to may become less precise.

Context-Insensitive, Flow-Insensitive Algorithms

Pointer analysis algorithms are used to convert collected raw pointer usages (assignments of one pointer to another, or assignments that make a pointer point to another one) into a useful graph of what each pointer can point to.[3]

Steensgaard's algorithm and Andersen's algorithm are common context-insensitive, flow-insensitive algorithms for pointer analysis. They are often used in compilers, and have implementations in the LLVM codebase.
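An added example (not from the article or from LLVM) of an inclusion-based, Andersen-style solver run over constraints taken from a constructed C fragment that extends the article's `p` example. The constraint encoding and the names `andersen` and `CONSTRAINTS` are assumptions of this sketch.

```python
from collections import defaultdict

def andersen(constraints):
    """Inclusion-based points-to analysis, flow- and context-insensitive.

    Each constraint is (kind, lhs, rhs):
      "addr"  lhs = &rhs      "copy"   lhs = rhs
      "load"  lhs = *rhs      "store"  *lhs = rhs
    """
    pts = defaultdict(set)
    changed = True

    def include(dst, new):
        nonlocal changed
        if not new <= pts[dst]:
            pts[dst] |= new
            changed = True

    while changed:                      # points-to sets only grow, so this terminates
        changed = False
        for kind, lhs, rhs in constraints:
            if kind == "addr":
                include(lhs, {rhs})
            elif kind == "copy":
                include(lhs, pts[rhs])
            elif kind == "load":        # lhs gets what every target of rhs may hold
                for loc in list(pts[rhs]):
                    include(lhs, pts[loc])
            elif kind == "store":       # every target of lhs may hold what rhs holds
                for loc in list(pts[lhs]):
                    include(loc, pts[rhs])
    return {var: set(s) for var, s in pts.items() if s}

# Constructed C fragment (statement order is ignored by the analysis):
#   int x, y, z;
#   int *p = unknown() ? &x : &y;   /* the article's example */
#   int *q = &z;   int **pp = &p;
#   int *r = *pp;                   /* load  */
#   *pp = q;                        /* store, textually after the load */
CONSTRAINTS = [
    ("addr", "p", "x"), ("addr", "p", "y"),
    ("addr", "q", "z"), ("addr", "pp", "p"),
    ("load", "r", "pp"), ("store", "pp", "q"),
]

if __name__ == "__main__":
    for var, targets in sorted(andersen(CONSTRAINTS).items()):
        print(var, "->", sorted(targets))
```

Because the analysis is flow-insensitive, `r` is reported as possibly pointing to `z` even though the store through `pp` comes after the load in the source.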

Flow-Insensitive Approaches

Many approaches to flow-insensitive pointer analysis can be understood as forms of abstract interpretation, where heap allocations are abstracted by their allocation site (i.e., a program location).[4]

Many flow-insensitive algorithms are specified in Datalog, including those in the Soot analysis framework for Java.[5]

Context-sensitive, flow-insensitive algorithms achieve higher precision, generally at the cost of some performance, by analyzing each procedure several times, once per context.[6] Most analyses use a "context-string" approach, where contexts consist of a list of entries (common choices of context entry include call sites, allocation sites, and types).[7] To ensure termination (and more generally, scalability), such analyses generally use a k-limiting approach, where the context has a fixed maximum size, and the least recently added elements are removed as needed.[8] Three common variants of context-sensitive, flow-insensitive analysis are:[9]

  • Call-site sensitivity
  • Object sensitivity
  • Type sensitivity
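An added example (not from the article or the cited papers) of call-site sensitivity with k-limited context strings, on a constructed three-function program. The program encoding and the names `PROGRAM`, `gen` and `solve` are assumptions, and only address-of and copy constraints are modelled.

```python
from collections import defaultdict

# Constructed program: function -> (parameter, returned variable, body).
#   int *id(int *a)   { return a; }
#   int *wrap(int *b) { int *t = id(b); /* site s0 */ return t; }
#   main: p = &x; q = &y; r1 = wrap(p); /* s1 */  r2 = wrap(q); /* s2 */
PROGRAM = {
    "id":   ("a", "a", []),
    "wrap": ("b", "t", [("call", "s0", "id", "b", "t")]),
    "main": (None, None, [("addr", "p", "x"), ("addr", "q", "y"),
                          ("call", "s1", "wrap", "p", "r1"),
                          ("call", "s2", "wrap", "q", "r2")]),
}

def gen(prog, k, fn="main", ctx=(), seen=None):
    """Emit addr/copy constraints over (variable, context) pairs."""
    seen = set() if seen is None else seen
    if (fn, ctx) in seen:       # contexts are finite under k-limiting
        return []
    seen.add((fn, ctx))
    out = []
    for stmt in prog[fn][2]:
        if stmt[0] == "call":
            _, site, callee, arg, dst = stmt
            # k-limiting: keep only the k most recently added call sites
            cctx = (ctx + (site,))[-k:] if k else ()
            param, ret, _ = prog[callee]
            out.append(("copy", (param, cctx), (arg, ctx)))   # argument passing
            out.append(("copy", (dst, ctx), (ret, cctx)))     # return value
            out += gen(prog, k, callee, cctx, seen)
        else:
            op, dst, src = stmt
            out.append((op, (dst, ctx), src if op == "addr" else (src, ctx)))
    return out

def solve(prog, k):
    """Propagate to a fixed point, then merge contexts per variable name."""
    cons, pts, changed = gen(prog, k), defaultdict(set), True
    while changed:
        changed = False
        for op, dst, src in cons:
            new = {src} if op == "addr" else pts[src]
            if not new <= pts[dst]:
                pts[dst] |= new
                changed = True
    merged = defaultdict(set)
    for (name, _ctx), targets in pts.items():
        merged[name] |= targets
    return dict(merged)
```

With k = 0 or k = 1 both `r1` and `r2` resolve to {x, y}, because every call to `id` shares the context of site `s0`; k = 2 keeps the calling site of `wrap` in the context and yields `r1` → {x} and `r2` → {y}.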

References

  1. ^ Reps, Thomas (2000-01-01). "Undecidability of context-sensitive data-dependence analysis". ACM Transactions on Programming Languages and Systems. 22 (1): 162–186. doi:10.1145/345099.345137. ISSN 0164-0925.
  2. ^ Barbara G. Ryder (2003). "Dimensions of Precision in Reference Analysis of Object-Oriented Programming Languages". Compiler Construction, 12th International Conference, CC 2003 Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2003 Warsaw, Poland, April 7–11, 2003 Proceedings. pp. 126–137. doi:10.1007/3-540-36579-6_10.
  3. ^ Zyrianov, Vlas; Newman, Christian D.; Guarnera, Drew T.; Collard, Michael L.; Maletic, Jonathan I. (2019). "srcPtr: A Framework for Implementing Static Pointer Analysis Approaches" (PDF). ICPC '19: Proceedings of the 27th IEEE International Conference on Program Comprehension. Montreal, Canada: IEEE.
  4. ^ Smaragdakis, Yannis; Bravenboer, Martin; Lhoták, Ondrej (2011-01-26). "Pick your contexts well: understanding object-sensitivity". Proceedings of the 38th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages. POPL '11. Austin, Texas, USA: Association for Computing Machinery: 17–30. doi:10.1145/1926385.1926390. ISBN 978-1-4503-0490-0.
  5. ^ Antoniadis, Tony; Triantafyllou, Konstantinos; Smaragdakis, Yannis (2017-06-18). "Porting doop to Soufflé: a tale of inter-engine portability for Datalog-based analyses". Proceedings of the 6th ACM SIGPLAN International Workshop on State Of the Art in Program Analysis. SOAP 2017. Barcelona, Spain: Association for Computing Machinery: 25–30. doi:10.1145/3088515.3088522. ISBN 978-1-4503-5072-3.
  6. ^ (Smaragdakis & Balatsouras, p. 29)
  7. ^ Thiessen, Rei; Lhoták, Ondřej (2017-06-14). "Context transformations for pointer analysis". ACM SIGPLAN Notices. 52 (6): 263–277. doi:10.1145/3140587.3062359. ISSN 0362-1340.
  8. ^ (Li et al., pp. 1:4)
  9. ^ (Smaragdakis & Balatsouras)

Bibliography