A Generic Undo Support for State-Based CRDTs

CRDTs (Conflict-free Replicated Data Types) have properties desirable for large-scale distributed systems with variable network latency or transient partitions. With CRDTs, data are always available for local updates and data states converge when the replicas have incorporated the same updates. Undo is useful for correcting human mistakes and for restoring system-wide invariants violated due to long delays or network partitions. There is currently no generally applicable undo support for CRDTs. There are at least two reasons for this. First, there is currently no abstraction that we can practically use to capture the relations between undo and normal operations with respect to concurrency and causality. Second, when inverse operations are used, as in the existing partial solutions, the CRDT designer has to hard-code certain rules and design a new CRDT for almost every operation that needs undo support. In this paper, we present an approach to generic support of undo for CRDTs. The approach consists of two major parts. We first work out an abstraction that captures the semantics of concurrent undo and redo operations through equivalence classes. The abstraction is a natural extension of undo and redo in sequential applications and is straightforward to implement in practice. By using this abstraction, we then devise a mechanism to augment existing CRDTs. The mechanism provides “out of the box” support for undo without the involvement of the CRDT designers. We also present a practical application of the approach in collaborative editing.


Introduction
The CAP theorem ([11, 14]) states that in a networked system, it is impossible to simultaneously ensure all three desirable properties, namely (C) consistency equivalent to a single up-to-date copy of data, (A) availability of that data for update and (P) tolerance to network partition. [7] revisited the theorem and clarified some common misunderstandings. Among these, the three properties are continuous rather than binary and partition is a function of reorder or duplicate messages, but it cannot corrupt messages. Through re-sending, messages will eventually be delivered. The implication is that there can be network partitions, but disconnected sites will eventually get connected.

Notations
N is the set of natural numbers. B is the set of Boolean values, B = {False, True}. P(S) denotes the power set of S. Most sets in this paper are partially ordered and have a least element ⊥ (also known as the bottom element).
Set comprehension is of the form {x ∈ S | pred(x)} or {f(x) | x ∈ S}, where f is a function and pred is a predicate.
We use m : K → V to denote a partial function where dom(m) ⊆ K. A partial function can be represented as a set of pairs {⟨k, m(k)⟩ | k ∈ dom(m)}. When k ∈ K ∧ k ∉ dom(m) and V has a bottom ⊥_V, we use m(k) = ⊥_V for convenience. For example, given a partial function p : N → N and dom(p) = ∅, we use p(n) = 0 for any n ∈ N, because ⊥_N = 0. Due to this convenience, we do not need an initialization p(n) = 0 as in the case of a total function.
The notation m{k ↦ v} represents an update of the function m for a new value v associated with the key k.
The notation f(x) is like a function or procedure in a conventional programming language. In this paper, it can be a query, a mutator (an operation) or a predicate. We may write f_y(x) for f(x, y) to make the signatures of functions look consistent in different contexts. For example, inc(x) increments a counter x, while inc(x, A), or better inc_A(x), increments a counter x at site A.

CRDT Background
A CRDT is a data type specifically designed for data replicated at different sites. A site queries and updates its local replica independently (i.e. without coordination with other sites). The data is always available for update, but the data states at different sites may diverge. From time to time, the sites send their updates asynchronously to other sites with an anti-entropy protocol. To incorporate the updates made at the other sites, a site merges the received updates with its local replica. A CRDT has the property that when all sites have incorporated the same set of updates, the replicas converge.
There are two families of CRDT approaches, namely operation-based and state-based. For an operation-based CRDT [23], a message for an update is an encoding of the operation that made the corresponding update. A site that receives the message runs a special procedure to incorporate the update. To enforce convergence, the operations of an operation-based CRDT should commute, i.e. the executions of the same set of operations in different orders should have the same effect. A CRDT is purely operation-based if the encoding and incorporation of operations are trivial, in the sense that they are independent of the state at which the operation is performed [5]. Pure operation-based CRDTs require reliable causal delivery of messages.
For a state-based CRDT, as originally presented in [23], a message for updates is the data state of the replica in its entirety. The site that receives the message incorporates the updates by merging the received state and its local state. When the possible states of the data form a join-semilattice (see §4.1 below), the merge is the join of the two states. Convergence is implied by the join-semilattice. As our work focuses on state-based CRDTs, in the following subsections, we present the main theory underlying this family of CRDTs, including delta-state CRDTs [3], which improve the original state-based CRDTs. We also discuss a typical design of inverse operations in state-based CRDTs and why this is usually not sufficient as a mechanism of undo.

State-based CRDTs
A state-based CRDT is a tuple ⟨S, ⊑, s_0, Q, M, ⊔⟩, where S is a poset of states under partial order ⊑, s_0 ∈ S is an initial state, Q is a set of queries on the states, M is a set of mutators for performing updates on the states, and ⊔ is a join operation on states. Furthermore, the state poset with ⊔ is a join-semilattice. In this paper, we use the term operation as a particular instance of state update defined by a mutator. For example, m ∈ M is a mutator, whereas m(s) is a state update, hence an operation. Consequently, for two different states s_1 and s_2, m(s_1) and m(s_2) are two different operations.
For a poset P under the partial order ⊑, a join operation x ⊔ y returns the least upper bound of elements x and y in P. The join operation is idempotent, commutative and associative. The poset P is a join-semilattice iff x ⊔ y exists for any x and y in P [13]. Some join-semilattices have a least element ⊥, also known as the bottom element. A power set, under the partial order of set inclusion ⊆ and with set union ∪ as join, is a classic example of a join-semilattice that has a bottom element ⊥ = ∅. For every CRDT discussed in this paper, we assume a bottom state ⊥.
For a state-based CRDT, every state update is an inflation. That is, for any mutator m ∈ M and state s ∈ S, s ⊑ m(s). When a local state s merges with a received remote state s′, the new local state becomes s ⊔ s′. Because local updates are inflations and merges are the results of joins, at each site, state updates are monotonic under ⊑. In other words, every new state s_{n+1} subsumes a previous state s_n, i.e. s_n ⊑ s_{n+1} for any n ≥ 0.
Figure 1 (left) shows GSet, a state-based CRDT for grow-only sets, where the states are subsets of a set E of possible elements, ordered by set inclusion ⊆ and joined by set union ∪. (The figure also shows a delta-mutator add^δ that will be explained in §4.2.) Obviously, an update through add(e) is an inflation, because s ⊆ {e} ∪ s. Figure 1 (right) shows the Hasse diagram of the states in a GSet. A Hasse diagram shows only the "direct links" between states (known as the cover relation ≺_c [13]).
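As an illustration of the definitions above, the following is a minimal Python sketch of a grow-only set. The function names `add`, `join` and `contains` are our own choices for this sketch (`contains` stands in for the query in, which is a reserved word in Python); Figure 1 remains the authoritative definition.

```python
from typing import FrozenSet, Hashable

State = FrozenSet[Hashable]
BOTTOM: State = frozenset()  # bottom state: the empty set


def add(e: Hashable, s: State) -> State:
    """Mutator: returns {e} ∪ s, which always includes s (an inflation)."""
    return s | {e}


def join(s1: State, s2: State) -> State:
    """Join of two states: set union."""
    return s1 | s2


def contains(e: Hashable, s: State) -> bool:
    """Query: is e in the set?"""
    return e in s


# Two sites update independently and then merge.
site_a = add("a", BOTTOM)
site_b = add("b", BOTTOM)
merged = join(site_a, site_b)
```

Because the join is set union, merging in either order, or merging the same state twice, yields the same result.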
GSet is an example of an anonymous CRDT. A CRDT is anonymous if its operations are not specific to the sites that perform the operations and hence do not refer to site identifiers. Two sites can concurrently perform the operations defined by the same mutator. We say that these two sites perform the same anonymous operations concurrently. For example, when site A performs operation add(a, s_1) and site B concurrently performs operation add(a, s_2), the sites perform the same anonymous operation add(a). On the other hand, a CRDT is named if a site can only update the part of the state that is specific to that site. Different sites cannot perform the same operation concurrently. Figure 2 (left) shows GCounter, a state-based CRDT for grow-only counters. It uses a partial function (or a key-value map) I → N to simulate a globally replicated counter. The sites update the key-value map similar to a version vector [18]. When site i increments the counter using operation inc_i, only the value mapped from the key i gets incremented. GCounter is named because operation inc_i is specific to site i and only site i can perform it. Figure 2 (right) shows the Hasse diagram of the states in a GCounter. In the figure, 2_A denotes the pair ⟨A, 2⟩, to expose the meaning "value 2 at site A".
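The GCounter description can be sketched in Python as follows. This is only an illustration under our own assumptions: the state is a dictionary standing in for the partial function I → N, the join is taken to be the pointwise maximum (as for version vectors), and the query `value` (the sum of all entries) is a hypothetical name, since Figure 2 is not reproduced here.

```python
from typing import Dict

State = Dict[str, int]  # partial function I -> N; a missing key reads as 0


def inc(i: str, s: State) -> State:
    """Named mutator inc_i: only the entry of site i is incremented."""
    t = dict(s)
    t[i] = t.get(i, 0) + 1
    return t


def join(s1: State, s2: State) -> State:
    """Assumed join: pointwise maximum over the keys of both states."""
    return {k: max(s1.get(k, 0), s2.get(k, 0)) for k in s1.keys() | s2.keys()}


def value(s: State) -> int:
    """Assumed query: the counter value is the sum of all site entries."""
    return sum(s.values())


a = inc("A", inc("A", {}))  # site A increments twice, reaching 2_A
b = inc("B", {})            # site B increments once, reaching 1_B
merged = join(a, b)
```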

Delta-state CRDTs
Using state-based CRDTs, as originally presented, is costly in practice, because states in their entirety are sent as messages. Delta-state CRDTs address this issue [3]. They are based on the concept of join-irreducible states.
An element x is join-irreducible in a poset P if it cannot be expressed as a join of other elements in P [13]. Formally, x is join-irreducible in P if ∀y, z ∈ P : x = y ⊔ z ⇒ x = y ∨ x = z. We use J(P) for the set of join-irreducible elements of P.
For a finite join-semilattice, join-irreducible elements are those that have only one link below in the Hasse diagram [13]. In Figure 1 and Figure 2, the states in boxes are join-irreducible. The set of join-irreducible states of GSet, J(P(E)), consists of singleton sets. The set of join-irreducible states of GCounter, J(I → N), consists of singleton pair sets.
An important property of join-irreducible elements is that every element in a finite poset can be represented as a join of some join-irreducible elements. More precisely, given a finite poset P, for any x ∈ P, x = ⊔{y ∈ J(P) | y ⊑ x}.
A delta-state CRDT has a delta-mutator m^δ for every mutator m of the corresponding state-based CRDT. Instead of returning the new updated state m(s), m^δ returns a delta representation consisting only of join-irreducible states. The delta representation has the property m(s) = s ⊔ m^δ(s). For example, in Figure 1, add^δ is the delta counterpart of add. While add(e, s) returns the whole new state {e} ∪ s, add^δ(e, s) returns only a singleton set {e} (when e was not in s and the mutation is effectively executed). Now, instead of sending the whole state m(s), a site only sends the delta representation m^δ(s), which is typically much smaller in size than m(s). If a remote site has already incorporated s, a merge with m^δ(s) gives the same result as a merge with m(s). We can thereby regard m^δ(s) as m(s) where redundancy in s is eliminated.
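The relation between a mutator and its delta-mutator can be checked on GSet with a small Python sketch. The name `add_delta` is our own rendering of add^δ, and returning the empty set when e is already present is our reading of "the mutation is effectively executed".

```python
def add(e, s):
    """Mutator of GSet: returns the whole new state {e} ∪ s."""
    return s | {e}


def add_delta(e, s):
    """Delta-mutator: only the singleton {e}, or the bottom if e is already in s."""
    return frozenset() if e in s else frozenset({e})


def join(s1, s2):
    """Join of GSet states: set union."""
    return s1 | s2


s = frozenset({"a"})
remote = frozenset({"a", "c"})  # a remote state that has already incorporated s
```

For both a new element and an existing one, `add(e, s) == join(s, add_delta(e, s))` holds, and the remote site obtains the same state whether it merges with the delta or with the full state.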
Because the delta representation is not an inflation, the anti-entropy protocol must do some extra work to achieve a certain degree of causality [3]. Otherwise, the replicas will still eventually converge, but the sites may observe states out of causal order. In this paper, we focus on the design aspect of CRDTs and their undo support, and will not discuss the anti-entropy protocols.

Inverse operation as undo
Sometimes we may want to perform an inverse of an earlier update, for example, to remove an element that was earlier added into a set. Because updates in state-based CRDTs must be inflationary (§4.1), it is relatively easy to design CRDTs for those applications where the data grow in nature, such as grow-only set and grow-only counter. To support operations that make data shrink, such as the inverse operation of inflationary operations, we have to design new CRDTs using some special techniques. For example, we can keep the removed data as a kind of tombstones and let the queries achieve the shrinking effect. [23] and [3] presented different set CRDTs that have both add and remove operations. Figure 3 (left) shows a set CRDT 2P B Set (two-phase set using Boolean flags) that is a variation of u-set in [23] and two-phase set 2PSet in [3]. We associate every element added to the set with a Boolean flag indicating whether the element has been removed. More precisely, the states are a partial function E → B. We use pair ⟨e, False⟩ when element e is added and ⟨e, True⟩ when element e is removed. We adopt the conventional order of Boolean values False ⊑ True. Hence, when an element is added and removed, the removal wins. (Note that in the definitions of remove and ⊔, s(e) = False when e ∉ dom(s).) Using operation remove as an inverse operation of add in 2P B Set has a problem. The remove operation itself does not have an inverse operation. Once an element has been removed, it cannot be added back again. Actually, this problem is common among many CRDTs that provide some kind of inverse operations.
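The following Python sketch illustrates the two-phase behaviour described above. It is a sketch under assumptions, not the definition in Figure 3: in particular, treating remove of a never-added element as a no-op and the name `contains` for the membership query are our own choices.

```python
from typing import Dict, Hashable

State = Dict[Hashable, bool]  # element -> flag "has been removed"


def add(e, s):
    """Add e with flag False; an element already present keeps its flag."""
    if e in s:
        return s
    return {**s, e: False}


def remove(e, s):
    """Set the flag of e to True (assumption: no-op if e was never added)."""
    if e not in s:
        return s
    return {**s, e: True}


def join(s1, s2):
    """Pointwise join; a missing key reads as False and True dominates False."""
    return {e: s1.get(e, False) or s2.get(e, False) for e in s1.keys() | s2.keys()}


def contains(e, s):
    """An element is in the set if it has been added and not removed."""
    return e in s and not s[e]


merged = join({"a": True}, {"a": False, "b": False})
removed = remove("a", add("a", {}))
```

Here `merged` keeps b but not a, because the removal wins, and `add("a", removed)` leaves the state unchanged: a removed element cannot be added back.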
Causal CRDTs [3] such as OR-Set (observed-remove set [6], [16]) address this problem by associating state elements with causal contexts. A causal context is a set of event identifiers (typically a pair of a site identifier and a site-specific sequence number). Using causal contexts, we are able to tell explicitly which additions of an element have been later removed. Because there is no upper bound on causal contexts, we can invert any given (undo or redo) operation by inflation of associated causal contexts. However, maintaining causal contexts for every element can be costly, even though it is possible to compress causal contexts into vector states, especially under causal consistency. In our first contribution (§5), we work out an abstraction that allows us to use a single number as the smallest context without upper bound.
In general, inverse operations must be specially designed for the given operations and the design is normally not directly applicable to other operations or CRDTs. In our second contribution (§6), we present how to support undo in any state-based CRDT through a generic state transformation in the join-semilattice space of the CRDT states.

Concurrent Undo and Redo Operations
This section presents our first main contribution. We formally characterize the concurrency and causality of undo and redo operations using equivalence classes. We can then represent the equivalence classes with single numbers called undo lengths. The abstraction presented in this section applies generally beyond the context of CRDTs.

Problem statement
The basic question is: when a site sees a set of undo and redo operations of an original normal operation op, should the site undo or redo op?
Example. Site S_1 inserts an element e into a set with operation add_1, undoes the addition with undo_1 and then redoes it with redo_1. Site S_2 receives add_1, undoes it with undo_2, and then receives and integrates undo_1 and redo_1. Is element e in the set at site S_2? The answer should be "yes", because the concurrent undo_1 and undo_2 operations have the same intention, and redo_1, whose intention is to redo the effect of add_1, supersedes both.
In a sequential system, such as a single-user editor, undo and redo of the same normal operation happen in turn. We could simply count the length of the undo-redo chain. If the length is an odd number, the original operation is undone, otherwise, it is redone. In the example, site S_1 alone is like a sequential system. The length of the undo-redo chain at site S_1 is two and the addition of e should be redone.
Undo in concurrent applications has been an active research topic for decades, particularly in the area of collaborative editing ([1, 10, 29, 21, 22, 24, 25, 27, 28]). However, most of the published work does not account for concurrent undo and redo operations correctly. Some of the latest work also counted the number of undo and redo operations to decide whether an original operation is finally undone or redone, but the result is unsatisfactory.
The approach presented in [25] counts the number of times an operation has been undone or redone. If it is an odd number, the original operation is undone. In the example, the number is 3 at site S_2, so the addition operation add_1 is incorrectly undone.
The approach reported in [27] counts the numbers of undo and redo operations separately. The undo or redo with the higher number wins. In the example, there are two undos and one redo at site S_2. Therefore undo wins and add_1 is incorrectly undone.
The root problem with these earlier approaches is that they do not define the semantics of undo and redo operations with respect to concurrency and causality of the operations. In the example, the two concurrent undo operations undo_1 and undo_2 are both meant to undo the same operation add_1. Therefore they should have the same effect as a single undo. On the other hand, redo_1 happens causally after undo_1 (which is effectively the same as undo_2) and hence should have the final effect at site S_2.

Capturing concurrency and causality of undo operations
An application performs operations to modify its data. For example, the add operation adds an element into a set. We call these normal operations. In a distributed system, different sites may perform the same normal operations concurrently (or more specifically, the same anonymous operations described in §4.1). In Figure 4, site A and site B perform the same add(a) operation concurrently.
When the application undoes a normal operation, it cancels the effect of the modification. It can even further undo the undo (to achieve a redo), etc. We use op for a normal operation and o for any operation, which can be either a normal operation or an undo operation. When the application applies an undo operation on an earlier performed operation o, denoted as o′ = undo(o), we say that o′ is an undo operation that directly undoes o. An application can only directly undo an operation when it has observed the effect of that operation.
In Figure 4, op_A ∼ op_B because they are the same normal operation add(a).

Lemma (∼ properties)
The ∼ relation is reflexive, symmetric and transitive. Consequently, the tie relation partitions the operations into equivalence groups.

One requirement on handling concurrent normal or undo operations is that the application should observe the same effect of tied operations.

Lemma (origin and undo relations). Undo operations have the same origin iff they are related with tie or undo-supersede relations.

When a site merges two concurrent undo operations o_1 and o_2 that have the same origin, there are two cases. If the two operations tie with each other, they should have the same effect and the result of the merge can be either of them. If one operation undo-supersedes the other, o_1 ≻_u o_2, then o_1 has already seen the effect of o_2 and is causally dependent on o_2. Therefore the result of the merge is o_1.

Now we define the undo length of an operation o, written ulen(o): ulen(o) = 0 if o is a normal operation, and ulen(o) = ulen(o′) + 1 if o = undo(o′). We could name the equivalence groups under ∼ in such a way that G^0_op contains original normal operations and every operation in G^{n+1}_op directly undoes an operation in G^n_op. Then for any operation o ∈ G^n_op, ulen(o) = n.

We can use the undo-redo theorem to answer the question in §5.1. We omit the proofs of the lemmas and theorem in this section as they are trivial, simply by permutation on the different cases or by induction on undo lengths.

Lemma (undo merge)
An application at a site always behaves according to the observation of its latest local state. An undo operation o is a latest undo of a normal operation op at a site, if orig(o) ∼ op and there does not exist o′ at the site such that o′ ≻_u o.
Locally, an application can only generate a normal operation, directly undo a normal operation if it has not been undone at the site, or directly undo a latest undo operation at the site.
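To make the abstraction concrete, the following Python sketch replays the example of §5.1. It rests on two assumptions that we state explicitly: the undo length of an operation is the length of its chain of direct undos, and an original operation is undone exactly when the largest undo length observed at a site is odd (the parity rule of sequential undo-redo chains from §5.1). The class `Op` and the helpers `ulen` and `is_undone` are hypothetical names introduced for this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Op:
    name: str
    undoes: Optional["Op"] = None  # None marks a normal operation


def ulen(o: Op) -> int:
    """Undo length: number of direct-undo steps back to the normal operation."""
    n = 0
    while o.undoes is not None:
        n, o = n + 1, o.undoes
    return n


def is_undone(observed: Iterable[Op]) -> bool:
    """Assumed rule: undone iff the largest observed undo length is odd.

    All operations in `observed` are assumed to share the same origin.
    """
    return max(ulen(o) for o in observed) % 2 == 1


add1 = Op("add_1")
undo1 = Op("undo_1", add1)   # performed at S_1
redo1 = Op("redo_1", undo1)  # performed at S_1, undoes undo_1
undo2 = Op("undo_2", add1)   # performed concurrently at S_2

at_s2 = [add1, undo2, undo1, redo1]
# Counting every undo and redo, as in the first approach discussed in §5.1.
naive_count = sum(1 for o in at_s2 if o.undoes is not None)
```

The naive count is 3 (odd, so add_1 would be undone), whereas undo_1 and undo_2 share undo length 1 and redo_1 has undo length 2, so under the parity assumption add_1 is redone at S_2.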

Generically Supporting Undo for CRDTs
This section presents our second main contribution, our approach to generically supporting undo for existing CRDTs using the abstraction presented earlier in §5.

State Deltas as Operations
Every state in a state-based CRDT can be generated from a set of join-irreducible states (see §4.2). In Figures 1-3, states in boxes are join-irreducible. Given a mutator m, the states before and after applying m are s and m(s). Let J_s and J_{m(s)} be the sets of join-irreducible states that generate s and m(s). The state delta caused by the execution of m on s is the set of join-irreducible states J_{m(s)} − J_s. For example, for the GSet CRDT (Figure 1), the state delta of the operation add(e, s) is (s ∪ {e}) − s = {e} when e ∉ s. When e is already in s, the state delta is an empty set and no operation is actually executed. It is a common and intuitive practice that a state-based CRDT is designed in such a way that every state delta consists of a single join-irreducible state. Or in the case of delta-state CRDTs, every delta-mutator returns a single join-irreducible state. For example, the state delta of add(e, s) of GSet is {e} and the state delta of inc_i(s) of GCounter is {⟨i, s(i) + 1⟩}. We observe that all delta-state CRDTs presented in [3] show this property. With such design, we can use join-irreducible states to represent operations of the CRDT.
In this paper, we assume that the state delta of a normal operation op consists of a single join-irreducible state, written as ↓ δ op. Due to space limits, we do not deal with composite operations consisting of multiple join-irreducible states.
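For GSet, the state delta J_{m(s)} − J_s can be computed directly, as in the following Python sketch. The helper names `join_irreducibles` and `state_delta` are our own; the sketch only covers GSet, whose join-irreducible states are the singleton sets.

```python
def join_irreducibles(s):
    """J_s for GSet: the singleton sets whose join generates s."""
    return {frozenset({e}) for e in s}


def state_delta(before, after):
    """Join-irreducible states that generate `after` but not `before`."""
    return join_irreducibles(after) - join_irreducibles(before)


def add(e, s):
    """Mutator of GSet."""
    return s | {e}


s = frozenset({"a"})
delta_new = state_delta(s, add("b", s))       # b is not yet in s
delta_existing = state_delta(s, add("a", s))  # a is already in s
```

Adding a new element yields a delta with the single join-irreducible state {b}, while adding an existing element yields an empty delta.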

Undo-State CRDT
We maintain the undo states of operations as meta-data using the undo-state CRDT UState (Figure 5). For an existing CRDT with possible join-irreducible states S, the undo state is a partial function u : S → N. For a normal operation op of that CRDT, whose state delta is the join-irreducible state s = ↓ δ op, s ∈ dom(u) means the operation op has been performed and u(s) is the undo length of a latest undo operation of op (see §5.2 for the respective definitions). Notice that the bottom of N is ⊥_N = 0. If an operation op has not been performed and thus ↓ δ op ∉ dom(u), then applying u(↓ δ op) (for example, when performing a join or a query) gives 0 (§3). For a normal operation op of the existing CRDT with ↓ δ op = s, the operation reg_u(s) of UState registers the new latest undo state of op. The normal operation itself is registered with the addition of a new pair ⟨s, 0⟩ into the undo state. A new direct undo of a latest undo operation of op is registered with an increment of u(s) by one.
Figure 5 CRDT for undo states
A join of two undo states u and u′ merges the undo lengths of all operations that are registered in either u or u′ (according to Lemma undo merge in §5.2).
Notice that UState is an anonymous CRDT. To see how this works, remember that we can partition the set of operations into equivalence groups under the tie relation ∼ (§5.2). Imagine that we register a new undo operation by adding it into the corresponding equivalence group. We can have an anonymous CRDT for the equivalence groups because they are grow-only sets. Using undo lengths in place of equivalence groups is just a way of compressing the undo states. The compression is possible because we are only interested in whether an equivalence group exists, rather than the specific elements in the groups. In addition, a site can only add an element in a new empty group, because it can only directly undo the latest undo operation of that site.
In Figure 4, after site C has incorporated the received operation o^2_B, it sees the operations in the existing equivalence groups. When it then performs o^2_C, it creates an empty group G^3_add(a) and adds o^2_C into it. Thereby u({a}) in the UState becomes 3.
Another way to look at the undo state is to regard it as a log of the operations that have been performed.For every operation in the log, the recorded information is compressed into a single number, the undo length.
The predicate undone_u(s) states that the normal operation whose state delta is s is currently undone (according to Theorem undo-redo in §5.2).
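The undo-state CRDT can be sketched in Python as follows. Since Figure 5 is not reproduced here, several details are assumptions: we split reg_u into two hypothetical helpers, `reg_do` and `reg_undo`, we take the join to be the pointwise maximum of undo lengths, and we take `undone` to hold when the registered undo length is odd.

```python
from typing import Dict, Hashable

UndoState = Dict[Hashable, int]  # join-irreducible state -> undo length


def reg_do(u: UndoState, s: Hashable) -> UndoState:
    """Register a normal operation with state delta s at undo length 0."""
    return u if s in u else {**u, s: 0}


def reg_undo(u: UndoState, s: Hashable) -> UndoState:
    """Register a direct undo of the latest undo of s (no-op if s is unknown)."""
    return {**u, s: u[s] + 1} if s in u else u


def join(u1: UndoState, u2: UndoState) -> UndoState:
    """Assumed join: pointwise maximum, a missing key reads as 0."""
    return {s: max(u1.get(s, 0), u2.get(s, 0)) for s in u1.keys() | u2.keys()}


def undone(u: UndoState, s: Hashable) -> bool:
    """Assumed predicate: undone iff the undo length is odd."""
    return u.get(s, 0) % 2 == 1


e = frozenset({"e"})  # state delta of add_1 in the example of §5.1
u1 = reg_undo(reg_undo(reg_do({}, e), e), e)  # S_1: add_1, undo_1, redo_1
u2 = reg_undo(reg_do({}, e), e)               # S_2: receives add_1, then undo_2
merged = join(u2, u1)                         # S_2 integrates the state of S_1
```

Before the merge the addition is undone at S_2; after the merge the undo length is 2 and the addition is redone, as required by the example.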
The predicate valid_u(s) states that a state s in the existing CRDT is valid in u (i.e. valid_u(s) evaluates to True) if the corresponding normal operation has been performed (i.e. s ∈ dom(u)) and is not currently undone (i.e. undone_u(s) evaluates to False).
The predicate valid^+_u takes the dependencies of join-irreducible states into account. When a join-irreducible state becomes invalid due to undo (i.e. valid_u evaluates to False), all states depending on it also become invalid (i.e. valid^+_u evaluates to False). For example, the state 3_A of GCounter (Figure 2) depends on state 2_A: valid^+_u({3_A}) = False when valid_u({2_A}) = False. Notice that the states in GSet form an anti-chain. That is, every join-irreducible state is independent of any other join-irreducible state. For such CRDTs, valid^+_u gives the same result as valid_u.
To compute the predicate valid^+_u, we need to find out the dependencies among join-irreducible states, using the links in the Hasse diagrams (i.e. the cover relation ≺_c). For some CRDTs, the dependencies can be derived. For example, for GCounter, n_i ≺_c (n + 1)_i where n ≥ 0. In case the dependencies cannot be derived, we have to materialize the dependencies, for instance, using a list or tree data structure.
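For GCounter, where the dependencies can be derived, the two validity predicates might be sketched in Python as below. The sketch assumes that a join-irreducible state n_i is represented as the pair `(i, n)`, that a state is valid when it is registered in the undo state and its undo length is even, and that n_i depends on all of 1_i, ..., (n − 1)_i; the function names are ours.

```python
def undone(u, s):
    """Assumed predicate: undone iff the registered undo length is odd."""
    return u.get(s, 0) % 2 == 1


def valid(u, s):
    """The operation with state delta s has been performed and is not undone."""
    return s in u and not undone(u, s)


def valid_plus(u, s):
    """valid for s and for every state below it in the chain of site i."""
    i, n = s
    return all(valid(u, (i, k)) for k in range(1, n + 1))


# Site A has incremented three times and the second increment is undone.
u = {("A", 1): 0, ("A", 2): 1, ("A", 3): 0}
```

Here 3_A is valid on its own, but it is not valid once dependencies are taken into account, because 2_A is undone.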

Augmenting Existing CRDTs with Undo
For an existing CRDT T with possible states in S_T, the CRDT augmented with undo support T_U is a composition of S_T and UState(J(S_T)). Figure 6 shows the T_U CRDT.
The operation do_{s,u}(op) performs a normal operation op in state s and registers op in undo state u. The operation undo_latest_{s,u}(op) directly undoes the latest undo operation of op in state s: it registers the new latest undo in undo state u and has no effect on s. A site can only perform an undo when the original normal operation op has been performed or incorporated (i.e. the state delta ↓ δ op is registered in u). Otherwise, performing an undo has no effect on undo state u.
To join two augmented states ⟨s, u⟩ and ⟨s′, u′⟩, we join independently the states s and s′ in S_T and the states u and u′ in UState(J(S_T)).
Queries in the original CRDT T are now performed on states transformed from augmented states. ν_u(s) defines a transformation that transforms a state s in the original CRDT using the undo state u. To see how it works, remember that the following holds for every state s in S_T (§4.2): s = ⊔{y ∈ J(S_T) | y ⊑ s}. The transformation ν_u first filters out the invalid join-irreducible states and then joins the valid join-irreducible states to bring back the up-to-date state that reflects the undone effects.
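Read literally, the description of ν_u suggests a formula of the following shape. This is our own formalization of the prose (filter the join-irreducible states below s by validity, then join the remainder), not a reproduction of the definition in Figure 6:

```latex
\nu_u(s) \;=\; \bigsqcup \,\bigl\{\, y \in \mathcal{J}(S_T) \;\bigm|\; y \sqsubseteq s \,\wedge\, \mathit{valid}^{+}_{u}(y) \,\bigr\}
```

When no operation is undone, every join-irreducible state below s is valid and the formula reduces to the decomposition of s from §4.2, so that ν_u(s) = s.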
The state transformation can be very costly if applied for every query. To address this, every site maintains a buffer of the transformed state. Every time the site updates the undo state, it also updates the buffered state. For example, when state {a} of GSet becomes invalid, we remove element a from the buffered state. Indeed, the buffered states do not form a join-semilattice. This, however, does not lead to inconsistencies, because the buffered states are only local to the sites and are not propagated to remote sites.

Now we use some examples to illustrate how the augmentation works. We first augment GSet (Figure 1) to GSet_U for undo support. add_u(e, s) performs do_{s,u}(add(e)). The query in_u in GSet_U is equivalent to applying in to the transformed state: in_u(e, ⟨s, u⟩) = in(e, ν_u(s)). The query now takes the undo effect into account.

Again, the results of queries on s^2_u are as expected, in(a, s^2_u) = False and in(b, s^1_u) = True. That is, b is in the set but a is not (as if a had never been added).
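The augmentation of GSet can be sketched in Python as a small class. This is an illustration under assumptions rather than the definition in Figure 6: the class name `GSetU` and its method names are ours, elements are used directly as keys of the undo state in place of the singleton states {e}, the join of undo states is taken to be the pointwise maximum, and an element is considered undone when its undo length is odd.

```python
class GSetU:
    """Sketch of a grow-only set augmented with an undo state."""

    def __init__(self):
        self.s = set()  # state of the original GSet
        self.u = {}     # undo state: element -> undo length

    def add(self, e):
        """do(add(e)): perform the addition and register it at undo length 0."""
        self.s.add(e)
        self.u.setdefault(e, 0)

    def undo_latest(self, e):
        """Directly undo the latest undo of add(e); no effect if unregistered."""
        if e in self.u:
            self.u[e] += 1

    def join(self, other):
        """Join the set states and the undo states independently."""
        self.s |= other.s
        for k, n in other.u.items():
            self.u[k] = max(self.u.get(k, 0), n)

    def contains(self, e):
        """Query on the transformed state: added and not currently undone."""
        return e in self.s and self.u.get(e, 0) % 2 == 0


# Replay of the example in §5.1.
s1, s2 = GSetU(), GSetU()
s1.add("e")
s2.join(s1)                # S_2 receives add_1
s1.undo_latest("e")        # undo_1
s1.undo_latest("e")        # redo_1
s2.undo_latest("e")        # undo_2, concurrent with undo_1
before = s2.contains("e")  # locally undone at S_2
s2.join(s1)                # S_2 integrates undo_1 and redo_1
after = s2.contains("e")
```

In this replay `before` is False and `after` is True: element e is in the set at S_2 once the redo from S_1 has been integrated.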
The last examples with 2P B Set indicate that the generic undo support works well with CRDTs that themselves support inverse operations.

Collaborative Editing with Undo Support
In this section, we show a practical application of the undo support for collaborative editing. The collaborative editing system is based on the CRDT reported in [29]. It consists of several peers, each of which has a replica of the shared document under editing. At each peer, a user edits the local copy of the document via the document view, which is simply a string of characters. Under the hood, there is a document model, which is a CRDT of characters. The view is the concatenation of visible characters in the model. We could regard the view as the buffer of transformed state discussed in §6.3.
OPODIS 2019
Figure 7 shows the (simplified) CRDT of the document model. The CRDT is a function from the set of characters C to the power set of site identifiers I.
The characters of the CRDT have globally unique and ordered identifiers that are specific to the sites that inserted the character ([26, 4]). Therefore the Doc CRDT is named: every character is unique and cannot be concurrently inserted at different peers. However, different peers can concurrently delete the same character.
When a character c is inserted, c maps to an empty set. When site i deletes c, i is added to the set that c maps to. A character c is visible in the document if it is inserted but not deleted, that is, when c is in the domain and maps to the empty set.
To support undo, we simply augment Doc to Doc_U. The designer of the Doc CRDT does not need to manually design anything in addition. In the augmented CRDT, visible_u has the following meaning: a character c is visible in the document if it is inserted and the insertion is not undone, and if it is deleted, all deletions are undone.
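The visibility rule of the augmented document can be sketched in Python as follows. The representation is an assumption made for this sketch: the document is a dictionary from character identifiers to sets of deleting sites, and the undo state is keyed by hypothetical tuples `("ins", c)` and `("del", c, i)` standing for the state deltas of an insertion and of a deletion by site i; an operation counts as undone when its undo length is odd.

```python
def undone(u, key):
    """Assumed predicate: undone iff the registered undo length is odd."""
    return u.get(key, 0) % 2 == 1


def visible(c, doc, u):
    """c is inserted, the insertion is not undone, and all deletions are undone."""
    if c not in doc or undone(u, ("ins", c)):
        return False
    return all(undone(u, ("del", c, i)) for i in doc[c])


# Site A deletes "x" and undoes the deletion; site B concurrently deletes "x".
doc = {"x": {"A", "B"}}
u = {("ins", "x"): 0, ("del", "x", "A"): 1, ("del", "x", "B"): 0}
hidden = visible("x", doc, u)

# If site B later undoes its deletion as well, "x" becomes visible again.
u_all_undone = {**u, ("del", "x", "B"): 1}
shown = visible("x", doc, u_all_undone)
```

With only the deletion of site A undone, the character stays hidden; it reappears once every deletion is undone.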
To see why a character should be visible only when all deletions are undone, consider the situation where site A deletes a character "x" and then undoes the deletion. Meanwhile, site B also deletes "x". The final effect should be as if site A had done nothing and site B performed a deletion. So character "x" should not appear in the document. Some researchers (for example [22]) regard concurrent deletions of the same character as the same operation (which may lead to some confusing semantics of the undo of string-wise operations [29]). We could achieve this by using partial function C → B (similar to the 2P B Set CRDT in Figure 3) rather than C → P(I). With this re-design, a character is visible when only one deletion is undone (because all deletions of the same character are regarded as the same anonymous operation).

Related Work
Supporting inverse operations was already a topic when CRDTs were first presented [23], such as a counter that can be both incremented and decremented, a set where elements can be both added and removed, etc. The CRDT designer has to design new customized CRDTs in order to support inverse operations. A common problem is that the designer has to decide a "winner" between an operation and its inverse counterpart, for example, a removal always wins (see §4.3 for an example). Furthermore, a "loser" never gets a chance to "win back".
A causal CRDT [3] associates causal contexts with every operation (or element) to achieve the effect such as adding a removed element back to a set. The CRDT designer has to write a new causal CRDT for a given CRDT to get this support. Furthermore, maintaining causal contexts for every operation could be costly when the number of replicas is large.
Our work provides a generic support of undo for any (to our knowledge) state-based CRDT. The CRDT designer does not need to write a new specialized CRDT to get the undo feature. Furthermore, the undo state for an operation is only a single number.
Undo has been a research topic in the area of collaborative editing for decades ([10, 21, 22, 24, 25, 27]). Most of the work was not able to define the semantics of undo and redo operations with respect to concurrency and causality, and therefore showed incorrect behavior as discussed in §5.1.
The abstraction we proposed, although seemingly simple, correctly captures the semantics of concurrent undo and redo operations and does not have the aforementioned issues.
The Doc CRDT (§7) is a simplification of the work presented in [29]. The model CRDT in [29] represents undo relations using equivalence classes (§5.2) rather than the more compact undo lengths. This allows the editor to support additional features such as displaying who performed a particular undo operation.
The system presented in [9] supports cascading undo of selected operations by explicitly defining dependencies among operations using a process specification language. In our work, operation dependencies are implied by the state order of join-semilattices.

Conclusion
In this work, we have presented how to provide undo features to existing CRDTs. Our work consists of two major parts.
The first part is an abstraction that captures the semantics of concurrent undo and redo operations using equivalence classes. The abstraction can be compacted into single numbers (undo lengths) that are straightforward to implement in practice. The abstraction is generally applicable (not restricted to CRDTs) to any system that demands concurrent undo and redo of earlier performed operations.
The second part is a generic approach to augmenting existing state-based CRDTs with the capability of undo. The augmentation transforms the states of an original CRDT into states that reflect the undo effects. Unmodified queries can be applied to the transformed states. The states of the augmented CRDTs converge eventually, because the state transformation is local to the replicas and does not propagate to the global system.
We have shown a practical application of our work in collaborative editing.
Operation-based CRDTs have also found their way into applications that demand undo support. Supporting undo features for operation-based CRDTs is an open research topic.

Figure 1 GSet CRDT and Hasse diagram of states

Figure 2 GCounter CRDT and Hasse diagram of states

Figure 3 2P B Set CRDT and Hasse diagram of states

Figure 3 (right) shows the Hasse diagram of the states in 2P B Set. For example, when state {⟨a, True⟩} (i.e. element a has been removed) merges with state {⟨a, False⟩, ⟨b, False⟩} (i.e. both elements a and b are in the set), the new state is {⟨a, True⟩, ⟨b, False⟩} (i.e. only element b is in the set).
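
The merge in this example can be illustrated with a minimal Python sketch. It assumes that a 2P B Set state is a partial function from elements to booleans, where True means "removed", and that the join takes the union of the elements and the logical OR of their flags. The names `merge` and `elements` are chosen here for illustration and are not taken from Figure 3.

```python
from typing import Dict, Set

# Assumed state representation: element -> removed flag.
# True means the element has been removed, False means it is in the set.
State = Dict[str, bool]


def merge(s1: State, s2: State) -> State:
    """Join two states: union of elements, logical OR of removal flags."""
    return {
        e: s1.get(e, False) or s2.get(e, False)
        for e in s1.keys() | s2.keys()
    }


def elements(s: State) -> Set[str]:
    """Query: the elements that have not been removed."""
    return {e for e, removed in s.items() if not removed}


# The example from the text: a is removed in one state, a and b are
# present in the other.
merged = merge({"a": True}, {"a": False, "b": False})
print(merged, elements(merged))
```

Under these assumptions a removal can never be reverted by a merge, which is why such a set needs a separate mechanism to support undo.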

Figure 4 A scenario of concurrent undo operations

The tie relation $\sim$ captures the concurrency of undo operations. The following undo-supersede relation captures the causality of undo operations. An operation $o$ undo-supersedes operation $o'$, denoted as $o \succ_u o'$, if one of the following holds: (i) $o = \mathrm{undo}(o')$, (ii) $o = \mathrm{undo}(o'')$ and $o'' \sim o'$, (iii) $o = \mathrm{undo}(o'')$ and $o'' \succ_u o'$.

B) and $o_B^2 \succ_u o_B^1$.

Lemma ($\succ_u$ properties) The $\succ_u$ relation is irreflexive, asymmetric and transitive.

For an operation $o$, its original operation, denoted as orig($o$), is a normal operation $op$, such that either (i) $o = op$, or (ii) $o = \mathrm{undo}(op)$, or (iii) $o = \mathrm{undo}(o')$ and $\mathrm{orig}(o') = op$.
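
The recursive definition of the original operation can be sketched in Python as follows. This is only an illustration: the `Op` class, its `target` field and the helper `undo` are hypothetical names, and an undo operation is assumed to hold a reference to the operation it directly undoes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Op:
    """A normal operation (target is None) or an undo of another operation."""
    name: str
    target: Optional["Op"] = None


def undo(o: Op) -> Op:
    """Build the operation that directly undoes o."""
    return Op(f"undo({o.name})", o)


def orig(o: Op) -> Op:
    """Follow the undo chain down to the normal operation.

    Case (i): o is normal, so orig(o) = o.
    Cases (ii) and (iii): o = undo(o'), so orig(o) = orig(o').
    """
    while o.target is not None:
        o = o.target
    return o


add_a = Op("add(a)")
redo_add_a = undo(undo(add_a))
print(orig(redo_add_a).name)
```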

and G 3 In Figure 4 , o 1 A , o 1 B , o 1 C and o 2 C 2 A and o 2 B
add(a) = $\{o_C^2\}$. In applications like editors, people often use the terms undo or redo with respect to the original normal operations. When $\mathrm{orig}(o) \sim op$, we say that $o$ undoes $op$ if either (i) $o = \mathrm{undo}(op)$, or (ii) $o = \mathrm{undo}(\mathrm{undo}(o'))$ and $o'$ undoes $op$; $o$ redoes $op$ if $o = \mathrm{undo}(o')$ and $o'$ undoes $op$. undo add(a) (either $op_A$ or $op_B$), whereas o redo add(a). Obviously, if $o$ undoes $op$, then undo($o$) redoes $op$. Similarly, if $o$ redoes $op$, then undo($o$) undoes $op$.

Lemma (undo-redo) Given $\mathrm{orig}(o) \sim op$, $o$ undoes $op$ iff ulen($o$) is a positive odd number; $o$ redoes $op$ if and only if ulen($o$) is a positive even number.

Figure 6 CRDT augmented with undo

Figure 7 CRDT for a collaborative text editor

We relate the (normal or undo) operations with the same intention through the tie relation. An operation $o_1$ ties with operation $o_2$, denoted as $o_1 \sim o_2$, if one of the following holds: (i) $o_1 = o_2$, (ii) $o_1$ and $o_2$ are the same normal (anonymous) operations, (iii)

$o_A^1$ directly undoes $op_A$, $o_B^1$ and $o_C^1$ directly undo $op_B$, $o^2$, when site B has received $o_A^1$, the latest undo operations of $op_A$ (or equally $op_B$) are $o_A^1$ and $o_B^1$. Thus site B can only directly undo $o_A^1$ or $o_B^1$. It does not matter which of them it undoes, because $o_A^1 \sim o_B^1$.

To incorporate the effect of a remote undo operation $o$, a site merges $o$ with a latest operation $o_l$ that has the same origin as $o$. If the remote operation $o$ undo-supersedes the local operation $o_l$, the result of the merge is $o$ and the site incorporates the effects of $o$; otherwise the result is $o_l$, whose effects the site has already incorporated.
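
The merge rule for remote undo operations can be sketched in Python under a simplifying assumption: each operation is summarized by the identity of its origin and its undo length, and an operation undo-supersedes another with the same origin exactly when its undo length is strictly greater. This compaction into undo lengths, the `UndoOp` type and the function names are illustrative assumptions, not definitions from the text above.

```python
from typing import NamedTuple


class UndoOp(NamedTuple):
    origin: str  # hypothetical identifier of the original normal operation
    ulen: int    # assumed undo length: 0 normal, 1 undo, 2 redo, ...


def supersedes(o: UndoOp, o_l: UndoOp) -> bool:
    """Assumed test: same origin and a strictly longer undo chain."""
    return o.origin == o_l.origin and o.ulen > o_l.ulen


def merge_undo(remote: UndoOp, local: UndoOp) -> UndoOp:
    """Keep the remote operation only if it undo-supersedes the local one."""
    return remote if supersedes(remote, local) else local


local = UndoOp("add(a)", 1)    # the site has already undone add(a)
remote = UndoOp("add(a)", 2)   # a remote site has redone it
print(merge_undo(remote, local))
```

With this encoding, tied operations (equal undo lengths) leave the local state unchanged, and the result does not depend on the order in which the two operations are merged.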