Version vector
A version vector is a mechanism for tracking changes to data in a distributed system, where multiple agents might update the data at different times. A version vector works by generating a preorder to keep track of update events that precede, and may therefore influence later updates. It enables causality tracking among data replicas and is a basic mechanism for optimistic replication. The state in a version vector is identical to the state in a vector clock but with different update rules:
- Initially all vector counters are zero.
- Each time a replica experiences an update event, it increments its own counter in the vector by one.
- Each time two replicas synchronize, both resulting vectors are identical and will depict the pairwise maximum between the two vectors across all counters.
Pairs of replicas, , , can be compared by inspecting their version vectors and determined to be either: identical (), concurrent (), or ordered ( or ).
Version vectors[1]or variants are used in many distributed file systems to track updates, see Coda (file system) and Ficus, and are the main data structure behind optimistic replication [2]. For example Internet explorer stores version vectors in the Windows registry.
Other Mechanisms
- Hash Histories [3] avoid the use of counters by keeping a set of hashes of each updated version and comparing those sets by set inclusion. However this mechanism can only give probabilistic guaranties.
- Concise Version Vectors [4] allow significant space savings when handling multiple replicated items, such as in directory structures in filesystems.
- Version Stamps [5] allow tracking of a variable number of replicas and do not resort to counters. This mechanism can depict scalability problems in some settings, but can be replaced by Interval Tree Clocks.
- Interval Tree Clocks[6] generalize version vectors and vector clocks and allows dynamic numbers of replicas/processes.
- Bounded Version Vectors [7] allow a bounded implementation, with bounded size counters, as long as replica pairs can be atomically synchronized.
References
- ^ Douglas Parker, Gerald Popek, Gerard Rudisin, Allen Stoughton, Bruce Walker, Evelyn Walton, Johanna Chow, David Edwards, Stephen Kiser, and Charles Kline. Detection of mutual inconsistency in distributed systems. Transactions on Software Engineering. 1983
- ^ David Ratner, Peter Reiher, and Gerald Popek. Dynamic version vector maintenance. Technical Report CSD-970022, Department of Computer Science, University of California, Los Angeles, 1997
- ^ ByungHoon Kang, Robert Wilensky, and John Kubiatowicz. The Hash History Approach for Reconciling Mutual Inconsistency. ICDCS, pp. 670-677, IEEE Computer Society, 2003.
- ^ Dalia Malkhi and Doug Terry. Concise Version Vectors in WinFS.Distributed Computing, Vol. 20, 2007.
- ^ Paulo Almeida, Carlos Baquero and Victor Fonte. Version Stamps: Decentralized Version Vectors. ICDCS, pp. 544-551, 2002.
- ^ Paulo Almeida, Carlos Baquero and Victor Fonte. Interval Tree Clocks. OPODIS, Lecture Notes in Computer Science, Vol. 5401, pp. 259-274, Springer, 2008.
- ^ José Almeida, Paulo Almeida and Carlos Baquero. Bounded Version Vectors. DISC: International Symposium on Distributed Computing, LNCS, 2004.