
Persistent array


In computer science, and more precisely regarding data structures, a persistent array is a persistent data structure with properties similar to a (non-persistent) array. That is, after a value's update in a persistent array, there exist two persistent arrays: one persistent array in which the update is taken into account, and one which is equal to the array before the update.

Difference between persistent arrays and arrays

An array is a data structure with a fixed number n of elements. It is expected that, given the array ar and an index i, the value ar[i] can be retrieved quickly. This operation is called a lookup. Furthermore, given the array ar, an index i and a new value v, a new array ar2 whose content is the same as ar except that ar2[i] = v can be created quickly. This operation is called an update. The main difference between persistent and non-persistent arrays is that, in non-persistent arrays, the array ar is destroyed during the creation of ar2.

For example, consider the following pseudocode.

array = [0, 0, 0]
updated_array = array.update(0, 8)
other_array = array.update(1, 3)
last_array = updated_array.update(2, 5)

At the end of execution, the value of array is still [0, 0, 0], the value of updated_array is [8, 0, 0], the value of other_array is [0, 3, 0], and the value of last_array is [8, 0, 5].
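
The semantics above can be reproduced with a naive copy-on-write implementation, in which every update copies the whole underlying array; an update then costs O(n) time, but every earlier version remains valid and unchanged. The following OCaml sketch illustrates this approach (the module and function names are chosen for this illustration and are not taken from the cited sources):

(* A naive persistent array: each update copies the whole array, O(n) per update. *)
module Naive_persistent_array = struct
  type 'a t = 'a array

  let make n v : 'a t = Array.make n v

  (* lookup ar i: return the element at index i, O(1). *)
  let lookup (ar : 'a t) (i : int) : 'a = ar.(i)

  (* update ar i v: return a fresh array equal to ar except at index i;
     ar itself is left untouched, so both versions remain usable. *)
  let update (ar : 'a t) (i : int) (v : 'a) : 'a t =
    let ar2 = Array.copy ar in
    ar2.(i) <- v;
    ar2
end

let () =
  let open Naive_persistent_array in
  let array = make 3 0 in
  let updated_array = update array 0 8 in
  let other_array = update array 1 3 in
  let last_array = update updated_array 2 5 in
  (* array is still [|0; 0; 0|], updated_array is [|8; 0; 0|],
     other_array is [|0; 3; 0|] and last_array is [|8; 0; 5|]. *)
  assert (lookup array 0 = 0);
  assert (lookup updated_array 0 = 8);
  assert (lookup other_array 1 = 3);
  assert (lookup last_array 2 = 5)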

There exist two kinds of persistent arrays: a persistent array may be either partially or fully persistent. A fully persistent array may be updated an arbitrary number of times, while a partially persistent array may be updated at most once. In our previous example, if array were only partially persistent, the creation of other_array would be forbidden; however, the creation of last_array would still be valid, because updated_array is an array distinct from array and had never been updated before the creation of last_array.

Lower bound on persistent array lookup time

Given that non-persistent arrays support both updates and lookups in constant time, it is natural to ask whether the same is possible with persistent arrays. The following theorem shows that, under mild assumptions about the space complexity of the array, lookups must take Ω(log log m) time in the worst case, where m is the number of updates, regardless of the update time.

Theorem[1]: Consider a partially persistent array with n elements and m modifications, where ε > 0 is a constant fulfilling m = Ω(n^(1+ε)). Assuming the space complexity of the array is O(m log^c m) for a constant c, the lower bound on the lookup complexity in this partially persistent array is Ω(log log m).

Implementations

Log-time

The simplest implementation of a fully persistent array of size n uses an arbitrary persistent map whose keys are the numbers from 0 to n − 1. A persistent map may be implemented using a persistent balanced tree, in which case both updates and lookups take O(log n) time.
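
One way to realise this bound, sketched below in OCaml, is to store the n elements in the leaves of a balanced binary tree over the index range and use path copying: an update copies only the O(log n) nodes on the path from the root to the modified leaf and shares every other subtree with the previous version. The type and function names are assumptions made for this sketch and do not come from the cited sources.

(* A log-time persistent array as a path-copying balanced binary tree.
   Each leaf holds one element; a node splits its index range in half. *)
type 'a tree =
  | Leaf of 'a
  | Node of 'a tree * 'a tree

type 'a parray = { size : int; root : 'a tree }

(* make n v: a persistent array holding n copies of v (assumes n >= 1), built in O(n). *)
let make n v =
  let rec build n =
    if n = 1 then Leaf v
    else
      let half = n / 2 in
      Node (build half, build (n - half))
  in
  { size = n; root = build n }

(* lookup ar i: follow a single root-to-leaf path, O(log n). *)
let lookup ar i =
  let rec go t n i =
    match t with
    | Leaf v -> v
    | Node (l, r) ->
        let half = n / 2 in
        if i < half then go l half i else go r (n - half) (i - half)
  in
  go ar.root ar.size i

(* update ar i v: rebuild only the nodes on the path to leaf i, O(log n);
   every untouched subtree is shared with the previous version. *)
let update ar i v =
  let rec go t n i =
    match t with
    | Leaf _ -> Leaf v
    | Node (l, r) ->
        let half = n / 2 in
        if i < half then Node (go l half i, r)
        else Node (l, go r (n - half) (i - half))
  in
  { ar with root = go ar.root ar.size i }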

Log-log-time

In 1989, Dietz[2] gave an implementation of fully persistent arrays such that lookups and updates can be done in O(log log m) expected amortized time, using O(m) space, where m is the number of updates. By the lower bound from the previous section, this is the best possible time complexity for lookup.

This implementation is far more complex than the one mentioned above, and thus won't be described in this article.

References

  1. Straka, Milan (2013). Functional Data Structures and Algorithms. Prague. pp. 67–69.
  2. Dietz, Paul F. (1989). "Fully persistent arrays". Algorithms and Data Structures (WADS 1989). Lecture Notes in Computer Science. pp. 67–74. doi:10.1007/3-540-51542-9_8.
