Incremental computing

Incremental computing, also known as incremental computation, is a software feature which, whenever a piece of data changes, attempts to save time by only recomputing those outputs which depend on the changed data.^[1]^[2]^[3] When IC is successful, it can be significantly faster than computing new outputs naively. For example, a spreadsheet software package might use incremental computation in its recalculation feature, to update only those cells containing formulas which depend (directly or indirectly) on the changed cells.

Static Versus Dynamic

IC techniques can be broadly separated into two approaches: Static approaches attempt to derive an incremental program ΔP from a conventional program P using, e.g., either automatic program transformations, or manual design. These transformations occur before any input or input changes.

Dynamic approaches record information about executing program P on a particular input (I1) and use this information when the input changes (to I2) in order to update the output (from O1 to O2). The figure shows the relationship between program P, the incremental version of this program ΔP, and a pair of inputs and outputs, I1, O1 and I2, O2.

Explicit versus Implicit

Explicit approaches to incremental computing require the programmer to explicitly specify the algorithms and data structures that will be used to preserve unchanged sub-calculations. Implicit schemes, on the other hand, enable a normal non-incremental program to be evaluated in a manner that preserves unchanged sub-calculations, or to be transformed into a program that does this explicitly.^[4]

Smallest unit of change

An incremental computing system typically has a predefined smallest unit of change that will be individually tracked. If a change is made that is smaller in scope than this smallest unit, the entire containing unit will be deemed to have changed. For example, if just one numerical digit of a seven-digit number in a cell in a spreadsheet is altered, the whole cell will be treated as changed.

For a spreadsheet, the smallest unit is a cell, whereas for make, it is typically an individual file. This means that in make, something can be a dependency of an entire file - but not of elements within files, such as individual functions.

Incremental compilers address the problem of having to recompile an entire compilation unit of a program if any of the source files the unit depends on have changed.

Implementation

A conservative (theoretically sub-optimal) implementation technique for incremental computing is for the software to build a dependency graph of all the data elements that may need to be recalculated, and their dependencies. The elements that need to be updated when a single element changes are given by the transitive closure of the dependency relation of the graph. In other words, if there is a path from the changed element to another element, the latter will be updated (even if the value does not actually change).

The dependency graph may need to be updated as dependencies change, or as elements are added to, or removed from, the system. It is used internally by the implementation, and does not typically need to be displayed to the user.

Partial evaluation can be seen as a method for automating the simplest possible case of incremental computing, in which an attempt is made to divide program data into two categories: that which can vary based on the program's input, and that which cannot (and the smallest unit of change is simply "all the data that can vary"). Of course, partial evaluation can be combined with other incremental computing techniques.

Other implementation techniques exist. For example, a topological evaluation order may be used to avoid the precomputation of elements that need to be reevaluated as in FrTime, which also avoids the problem of unnecessary updates. If a language permits cycles in the dependency graph, a single pass through the graph may not be sufficient to reach a fixed point. In many cases, complete reevaluation of a system is semantically equivalent to incremental evaluation, and may be more efficient in practice if not in theory.^[5]

Existing systems

Compiler and Language Support

Automatic Incrementalization (also called "Self-Adjusting Computation", and "Adaptive Functional Programming"): CEAL, Delta ML, Haskell Adaptive
Cornell Synthesizer Generator
IceDust - a custom domain-specific language.

Frameworks

One-way Dataflow Constraints (Reactive Computation in C++)
Differential Dataflow
Jane Street Incremental
Incremental Datalog (Logicblox)
Incremental Prolog (XSB)
Domain-Specific Approaches:
- Incremental Type Checking

References

^ Carlsson, Magnus (2002). "Monads for incremental computing". Proceedings of the seventh ACM SIGPLAN international conference on Functional programming. New York: ACM. pp. 26–35. doi:10.1145/581478.581482. ISBN 1-58113-487-8. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Umut A. Acar (2005). Self-Adjusting Computation (PDF) (Ph.D. thesis).
^ Camil Demetrescu; Irene Finocchi; Andrea Ribichini (2011). "Reactive Imperative Programming with Dataflow Constraints". Proceedings of the 26th ACM International Conference on Object-Oriented Programming Systems Languages and Applications (OOPSLA 2011). ACM. pp. 407–426. doi:10.1145/2048066.2048100. ISBN 978-1-4503-0940-0. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
^ Yan Chen; Joshua Dunfield; Matthew A. Hammer; Umut A. Acar. ICFP 2011 (PDF).
^ Kimberley Burchett; Gregory H. Cooper; Shriram Krishnamurthi (2007). "Lowering: A static optimization technique for transparent functional reactivity". In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation. pp. 71–80. ISBN 978-1-59593-620-2. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[monads-for-incremental-1] Carlsson, Magnus (2002). "Monads for incremental computing". Proceedings of the seventh ACM SIGPLAN international conference on Functional programming. New York: ACM. pp. 26–35. doi:10.1145/581478.581482. ISBN 1-58113-487-8. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[2] Umut A. Acar (2005). Self-Adjusting Computation (PDF) (Ph.D. thesis).

[3] Camil Demetrescu; Irene Finocchi; Andrea Ribichini (2011). "Reactive Imperative Programming with Dataflow Constraints". Proceedings of the 26th ACM International Conference on Object-Oriented Programming Systems Languages and Applications (OOPSLA 2011). ACM. pp. 407–426. doi:10.1145/2048066.2048100. ISBN 978-1-4503-0940-0. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[4] Yan Chen; Joshua Dunfield; Matthew A. Hammer; Umut A. Acar. ICFP 2011 (PDF).

[5] Kimberley Burchett; Gregory H. Cooper; Shriram Krishnamurthi (2007). "Lowering: A static optimization technique for transparent functional reactivity". In ACM SIGPLAN Symposium on Partial Evaluation and Semantics-Based Program Manipulation. pp. 71–80. ISBN 978-1-59593-620-2. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)

[1]

[2]

[3]

[4]

[5]