Group-Walking Automata

In the setting of symbolic dynamics on discrete finitely generated infinite groups, we define a model of multi-headed finite automata that walk on Cayley graphs, and use it to define subshifts. We characterize the torsion groups (also known as periodic groups) as those on which the group-walking automata are strictly weaker than Turing machines.


Introduction
One of the central objects in symbolic dynamics is the dynamical system $S^G$ (where $G$ is a discrete group and $S$ a finite alphabet), called the full shift, where $G$ acts by translations. In particular, one studies its subsystems, usually called subshifts, and classes of such subsystems. Some of the important classes studied are the SFTs (subshifts defined by a finite set of forbidden patterns), sofic shifts (the factors of SFTs) and the effective, or $\Pi^0_1$, subshifts (defined by a recursively enumerable set of forbidden patterns). SFTs and sofic shifts are natural objects to study on all groups, and a robust notion of effectiveness of subshifts on arbitrary groups is given in [2] (see also Section 5).
In this paper, continuing the work in [9], we define some new families of subshifts on an arbitrary (discrete finitely generated infinite) group $G$. Namely, we discuss the class of subshifts defined by certain multi-headed automata that walk on the Cayley graph of the group $G$. We have studied the case $G = \mathbb{Z}^d$ in [9], the main result being that three-headed finite-state automata define the same subshifts as general Turing machines. It turns out that, up to notational complications and a few simple tricks, the same result can be shown on all groups containing a copy of $\mathbb{Z}$. We show this in Theorem 1.
Most finitely generated groups of practical interest contain a copy of $\mathbb{Z}$. For example, in addition to infinite (finitely generated) abelian groups, this is true for free groups, Baumslag-Solitar groups, the Heisenberg group, the Thompson groups $F$, $T$ and $V$, and the general linear groups $GL(n, \mathbb{Z})$. In fact, infinite finitely generated groups without a copy of $\mathbb{Z}$, known as torsion groups, are quite rare and hard to construct. Nevertheless, many examples exist in the literature. The question is particularly hard in the case that the torsion is bounded, that is, there exists $n \in \mathbb{N}$ such that every element of the group generates a subgroup of order at most $n$. See [1] for a discussion of groups with bounded torsion. In the case of unbounded torsion, there are examples that are relatively simple to define, and simple to prove to be torsion. We mention in particular [5,6].
Given that such groups exist, an obvious question is whether we can extend Theorem 1 to this case. It turns out that we cannot: in Theorem 2 we show that a subshift on a torsion group accepted by a multi-headed automaton 'cannot be too sparse', and as a further result we obtain Theorem 6, which characterizes the torsion groups as those on which multi-headed automata are strictly weaker than Turing machines.

Subshifts
In this section, we define some basic notions of symbolic dynamics and computability. Some references on symbolic dynamics on general groups are [3,2], and a standard reference on $\mathbb{Z}$ is [8].
Let $G$ be a group with identity element $1_G \in G$. Our groups are always infinite (the finite case being trivial) and finitely generated (since the notions we consider are local). For convenience, if $G$ is finitely generated, we fix a symmetric finite set $s(G) \subset G$ of generators for it. The set $s(G)^*$ consists of all finite words over $s(G)$, and for $v, w \in s(G)^*$, we denote $v \sim w$ and $v \sim g$ if the words correspond to the same element $g \in G$. We denote by $B_G(n)$ the ball of radius $n$ with respect to the fixed set of generators:
$$B_G(n) = \{ g \in G \mid g \sim w \text{ for some } w \in s(G)^* \text{ with } |w| \leq n \}.$$
A torsion element of a group $G$ is an element $g \in G$ that satisfies $g^n = 1_G$ for some $n \geq 1$. If all elements of $G$ are torsion elements, then $G$ is a torsion group. To each torsion element $g \in G$ we associate its order $t_G(g) = \min\{n \geq 1 \mid g^n = 1_G\}$, and to each finitely generated torsion group we associate the torsion function
$$T_G : \mathbb{N} \to \mathbb{N}, \qquad T_G(n) = \max\{ t_G(g) \mid g \in B_G(n) \}.$$
A non-torsion group, conversely, is one containing an isomorphic copy of $\mathbb{Z}$.
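As a small illustration of the ball $B_G(n)$ and the order $t_G(g)$, the following sketch enumerates balls by breadth-first search and computes orders by repeated multiplication. The helper names `ball` and `order`, the way a group is passed in as a multiplication function, and the two example groups ($\mathbb{Z}^2$ and the finite cyclic group of order 6) are our own choices for illustration. The finite group is used only because its orders are easy to verify by hand; the groups studied in this paper are infinite.

```python
def ball(identity, generators, mul, n):
    """Return B_G(n): all elements reachable from the identity by words of length <= n."""
    seen = {identity}
    frontier = {identity}
    for _ in range(n):
        # Extend every frontier element by one generator and keep only new elements.
        frontier = {mul(g, s) for g in frontier for s in generators} - seen
        seen |= frontier
    return seen


def order(g, identity, mul, limit):
    """Return t_G(g) = min{n >= 1 : g^n = 1}, or None if no n <= limit works."""
    power = g
    for n in range(1, limit + 1):
        if power == identity:
            return n
        power = mul(power, g)
    return None


# Z^2 with the symmetric generating set {(+-1, 0), (0, +-1)}.
z2_gens = [(1, 0), (-1, 0), (0, 1), (0, -1)]
z2_mul = lambda a, b: (a[0] + b[0], a[1] + b[1])
z2_ball_3 = ball((0, 0), z2_gens, z2_mul, 3)

# The cyclic group of order 6, written additively.
z6_mul = lambda a, b: (a + b) % 6
z6_orders = {g: order(g, 0, z6_mul, 6) for g in range(6)}
```

In $\mathbb{Z}^2$ every nonidentity element has infinite order, so `order` returns `None` for any finite `limit`; in a torsion group it would return a number for every element once `limit` is large enough.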
Both alphabet and state set mean any finite set. The symbol $S$ always means an alphabet, and the set $S^G$ is the full $G$-shift over $S$. Its elements, usually denoted by $x, y, z$, are called configurations. We define both a left and a right action of $G$ on $S^G$, called the left and right shifts. The left action is given by $(g \cdot x)_h = x_{g^{-1}h}$, and the right shift by $g \in G$ is denoted $\sigma^R_g$. We give $S^G$ the product topology induced by the discrete topology on $S$, making it a compact metrizable space. It is easy to show that both actions are continuous in this topology. A subshift of $S^G$ is a topologically closed subset of $S^G$ closed under the left action of $G$. A cellular automaton on a subshift $X \subset S^G$ is a continuous map $f : X \to X$ that commutes with the left shifts in the sense that $g \cdot f(x) = f(g \cdot x)$ holds for all $x \in X$ and $g \in G$. We denote by $\mathrm{Aut}(X)$ the group of bijective cellular automata on $X$ under composition.
For us, the main importance of the left action is that it allows for nice definitions of subshifts and cellular automata. On the other hand, when the right action of an element $g \in G$ is well-defined on a subshift, it is a cellular automaton. In particular, the right actions show that $\mathrm{Aut}(S^G)$ contains a copy of the group $G$, by the injective group homomorphism $g \mapsto \sigma^R_g$.
Subshifts can be characterized as sets $X \subset S^G$ for which there exists a set of forbidden patterns $F$ such that $X$ is exactly the set of configurations in which no translate of a pattern in $F$ occurs. Each cellular automaton $f$ on $X$ has a radius $r \in \mathbb{N}$ and a local rule that determines the symbol $f(x)_g$ from the restriction of $x$ to the coordinates $gh$ with $h \in B_G(r)$; this holds for all $x \in X$ and $g \in G$.
Example 1. Let $G$ be the free group generated by $g, h \in G$, and $X \subset S^G$ the set
We show that $X$ is a subshift, and for that, let $x \in X$ and $g \in G$. We need to show $g \cdot x \in X$. Given $f \in G$ and $n \in \mathbb{Z}$, we have
by the definition of the left action, and the fact $x \in X$.
Definition 1. If $S \ni 0$ is a finite alphabet, then the one-$S$ subshift on a group $G$ is the subshift $X^G_S \subset S^G$ where a finite pattern $P \in S^D$ is forbidden if and only if there exist $d \neq e \in D$ with $P_e \neq 0 \neq P_d$. If $0 \notin S$, we write $X^G_S = X^G_{S \cup \{0\}}$. The group $G$ is usually clear from context, and we write $X_S$ for $X^G_S$.
The one-$S$ subshift $X_S$ is of course a 1-sparse subshift on any group. Note that in a sparse subshift, there is a global bound on the number of nonzero symbols. The sum $x + y$ of sparse configurations $x, y \in S^G$ with disjoint support (no $g \in G$ satisfies $x_g \neq 0$ and $y_g \neq 0$) is defined by $(x + y)_g = x_g$ if $x_g \neq 0$, and $(x + y)_g = y_g$ otherwise.
A finite pattern is represented computationally as a finite list of word-symbol pairs $(w, d) \in s(G)^* \times S$. Such a list is inconsistent if it contains two pairs $(v, d)$ and $(w, e)$ with $v \sim w$ and $d \neq e$ (in this case, it does not actually encode a pattern), and otherwise consistent.
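The consistency condition can be sketched as follows. This is only an illustration: the relation $v \sim w$ is supplied as a function `same_element`, standing in for whatever decision procedure for the word problem is available, and the example group $\mathbb{Z}^2$ with letters `x`, `X`, `y`, `Y` (capital letters for inverses) is our own choice.

```python
from itertools import combinations


def is_consistent(pairs, same_element):
    """A list of (word, symbol) pairs is inconsistent if two words name the
    same group element but carry different symbols."""
    for (v, d), (w, e) in combinations(pairs, 2):
        if d != e and same_element(v, w):
            return False
    return True


def z2_value(word):
    """Evaluate a word over x, X (= x^-1), y, Y (= y^-1) in Z^2."""
    return (word.count("x") - word.count("X"),
            word.count("y") - word.count("Y"))


def z2_same(v, w):
    """Decide v ~ w in Z^2 by comparing the elements the words evaluate to."""
    return z2_value(v) == z2_value(w)
```

For instance, in $\mathbb{Z}^2$ the words `xy` and `yx` name the same cell, so a list assigning them different symbols is inconsistent, while the same list would be consistent over a free group.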
The word problem of $G$ is the problem of deciding membership in the set $E \subset s(G)^*$ of words that represent the identity element of $G$. Whether the word problem is decidable is independent of the chosen generator set. We say $G$ is recursively presented if $G \cong \langle g_1, \ldots, g_k \mid w_1, w_2, \ldots \rangle$, where $(w_i)_{i \in \mathbb{N}}$ is a computable sequence of relations. This is equivalent to the set $E$ being recursively enumerable.
If $G$ has a decidable word problem, we say that a subshift $X \subset S^G$ is $\Pi^0_1$ if there exists a Turing machine that enumerates a list of consistent forbidden patterns defining it.
A subshift $X$ is $\Pi^0_1$ if and only if there exists an oracle Turing machine that, given an oracle for a configuration $x \in S^G$ (which returns the symbol $x_w \in S$ for a given word $w \in s(G)^*$), eventually halts if and only if $x \notin X$.

Automata
We now define group-walking automata and the subshifts they recognize. Here and henceforth, by $\pi_i$ we mean the projection to the $i$th coordinate of a finite Cartesian product.
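We denote by $S(G, k)$ the class of subshifts $X \subset S^G$ for which there exists a $k$-headed automaton $A$ as above such that $X$ consists of exactly those configurations on which no run of $A$, started from a translate of an initial state in $I$, ever reaches a translate of a final state in $F$. We also write $S(G) = \bigcup_{k \geq 1} S(G, k)$.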
The intuition for these definitions is the following. A configuration $y \in Y = \prod_{i=1}^k X_{Q_i}$ consists of $k$ layers $\pi_i(y)$, each of which contains at most one nonzero symbol $q_i \in Q_i$, representing the $i$th head of the automaton in state $q_i$. The cellular automaton $f$ is the update function of the heads: since $f$ has a finite radius, the heads can only move at a bounded speed, and interact over bounded distances. Also, the condition $\pi_1 \circ f = \pi_1$ ensures that the automaton cannot alter the configuration of $S^G$ that it runs on. The clopen sets $I, F \subset Y$ are the initial and final states of the automaton. Each of them is a finite union of cylinder sets $[P]$, and since they are also finite as sets, each of the patterns $P$ necessarily contains all $k$ heads of the automaton. Thus, an initial or final state specifies the position and internal state for each head, and we translate them by every element of $G$ in the definition of $S(G)$.
The definition is given in dynamical terms to make the connection with cellular automata clearer, and to facilitate the statement and proof of Lemma 3.
With some work, one can show that this model is equivalent to the one we gave in [9] in terms of the classes of subshifts defined.
Example 2. Let $G$ be again the free group generated by the elements $g, h \in G$, and let $S = \{0, 1\}$. We define a two-headed group-walking automaton
This means that the heads of the automaton are initialized at the same coordinate in states $q_g$ and $q_h$, and a configuration is rejected if they ever return to the same coordinate in states $q_{g^{-1}}$ and $q_{h^{-1}}$. The CA $f$ moves each head by the step indicated in its state, and if a head encounters a symbol 1 in state $q_g$ or $q_h$, it assumes the respective inverse state $q_{g^{-1}}$ or $q_{h^{-1}}$.
In a run of the automaton, the heads start moving in the directions $g$ and $h$ until they encounter symbols 1, and then turn back. If both of them turn at the same time, they will meet again where they started, in the states $q_{g^{-1}}$ and $q_{h^{-1}}$, so the configuration is rejected. If not, the configuration is not rejected. Thus the automaton $A$ defines the subshift $X \subset S^G$ with the forbidden patterns
Naturally, Turing machines are stronger than multi-headed finite automata.
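The run described in Example 2 can be simulated directly. The sketch below is an illustration under assumptions of our own: elements of the free group are stored as reduced tuples of letters (capital letters for inverses), a configuration is given by the set `ones` of cells holding the symbol 1, and within one time step a head first reads its cell (turning around on a 1) and then moves. The function name `rejects` and the cut-off `max_steps` are hypothetical; the automaton itself has no such bound.

```python
INVERSE = {"g": "G", "G": "g", "h": "H", "H": "h"}  # capital letter = inverse generator


def step(pos, letter):
    """Right-multiply a reduced word (a tuple of letters) by one generator."""
    if pos and pos[-1] == INVERSE[letter]:
        return pos[:-1]
    return pos + (letter,)


def rejects(ones, start=(), max_steps=100):
    """Simulate the two heads from `start`; `ones` is the set of cells holding 1.

    Assumed update order: a head first reads its cell and turns around on a 1,
    then moves one step in the direction stored in its state.
    """
    heads = [[start, "g"], [start, "h"]]
    for _ in range(max_steps):
        for head in heads:
            pos, direction = head
            if direction in ("g", "h") and pos in ones:
                direction = INVERSE[direction]
            head[0], head[1] = step(pos, direction), direction
        (p1, d1), (p2, d2) = heads
        # Reject when both heads share a cell in the inverse states.
        if p1 == p2 and d1 == "G" and d2 == "H":
            return True
    return False
```

With the first 1 at $g^3$ and at $h^3$ the heads turn simultaneously and meet again at the starting cell, so `rejects` returns `True`; with the 1s at $g^3$ and $h^2$ they pass the starting cell at different times and never meet.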
Lemma 1. If $G$ has a decidable word problem and $X \in S(G)$, then $X \in \Pi^0_1$.
Proof. Let $A$ be a group-walking automaton that defines $X$. We construct a Turing machine $T_A$ that outputs its forbidden patterns. The machine $T_A$ enumerates all consistent patterns over $G$ (using the fact that $G$ has a decidable word problem), and simulates a run of the automaton $A$ on each of them, from every initial state. If one of the heads exits the pattern during such a simulation, or every head enters an infinite loop, that simulation is simply discarded. If one of the runs enters a rejecting state on the pattern $P$ before exiting it (from any initial configuration and initial position on the domain $D(P)$), the machine $T_A$ outputs the pattern $P$. It is clear that $T_A$ defines the same subshift as $A$.

Non-torsion groups with a decidable word problem
On non-torsion groups, there are essentially no restrictions on the types of computation a multi-headed finite state automaton can do, apart from the inherent limits of computation. In fact, we will implement all $\Pi^0_1$-subshifts on such groups, using just three heads. The construction is similar to that in [4] and [9].
Theorem 1. If $G$ is finitely generated, infinite and non-torsion, and has a decidable word problem, then $S(G, 3)$ is exactly the class of $\Pi^0_1$-subshifts.
Proof. By Lemma 1, all $S(G, 3)$-subshifts are $\Pi^0_1$. To show that $S(G, 3)$ contains all $\Pi^0_1$-subshifts, we repeat the proof of Theorem 5 in [9], where the same problem was considered for $G = \mathbb{Z}^d$, with one additional detail in the non-abelian case. Since there are not many changes, we refer to [9] for some of the details.
Let $X \subset S^G$ be a $\Pi^0_1$-subshift, and let $h \in G$ be an element of infinite order. Given a Turing machine $T$ enumerating a list of forbidden patterns for $X$, we construct an automaton $A_T$ with three heads, the pointer head, the zig-zag head and the counter head. The relative positions of these heads store a number, which we increment, decrement, multiply and divide by suitable constants, and test for equivalence and divisibility by constants, in order to perform arbitrary computation: such a model is Turing-complete by the results of [10].
More precisely, all heads are initialized on the same element of $G$, which we may assume to be $1_G$. The run of the automaton proceeds in sweeps, each of which either corresponds to an arithmetical operation as described above, or moves the heads in some direction. Between these sweeps, the location of both the pointer head and the zig-zag head is some $g \in G$, and the position of the counter head is $gh^p$. The number $p \in \mathbb{N}$ is the counter value. Changes in the counter value are used to perform computation, and changes to the value $g$ allow us to read the contents of every cell in the configuration.
The operations are implemented as in the case $G = \mathbb{Z}^d$ (for example, see Proposition 3 in [9]). The only operations that are nontrivial to implement are multiplication and division, and they are dealt with by standard signaling techniques. The details of this are omitted in [9], so we outline the construction here: we explain how to multiply the counter value by a rational number $0 < \frac{m}{n} < 1$ assuming the counter value is divisible by $n$; to multiply by a rational number greater than 1, one essentially performs the same steps in reverse.
For this, let $g \in G$ be the position of the pointer head. The idea is that the zig-zag head moves to the counter head, which is at $gh^p$, along the progression $g, gh, gh^2, \ldots$. The two heads then perform a coordinated move along the path $g, gh, gh^2, \ldots, gh^c$, so that they meet exactly at $gh^{\frac{m}{n}p}$. The zig-zag head then returns to the pointer head, and computation continues. We have much freedom in performing these moves, but we fix a particular scheme that works. After the zig-zag head and the counter head meet, the counter head starts moving in steps of $h$ towards the pointer head (so that from the cell $gh^j$, it moves to the cell $gh^{j-1}$ in one step), until it meets the zig-zag head again. The zig-zag head moves towards the pointer head by $h^n$ every step, until it meets the pointer head. Note that $n$ divides $p$, so that the zig-zag head indeed reaches exactly the cell $g$. After this, the zig-zag head starts moving back towards the counter head at speed $\frac{m}{n-m-1}$. More precisely, the zig-zag head carries a modular counter, starting at 0, and at each step it increments this counter. When the modular counter reaches $n - m - 1$, the zig-zag head resets it to 0 and moves by $h^m$. When the zig-zag head reaches the counter head, it turns back, and returns to the pointer head. It is a simple calculation to check that the heads meet exactly at $gh^{\frac{m}{n}p}$, as required, so the counter value has been changed correctly.
Now that we can do arbitrary computation in the counter value, we give the algorithm we simulate in it. The algorithm is the same as in the proof of Theorem 5 of [9], and we reproduce it in Algorithm 1 with trivial modifications. In the algorithm, objects related to the group are stored as they are output by the Turing machine: group elements are finite words over $s(G)$, and patterns $P \in S^D$ are lists of pairs $(w, s) \in s(G)^* \times S$ meaning $P_w = s$. We assume the Turing machine $T$ outputs an infinite list of forbidden patterns, and enters the state $q_{\mathrm{out}}$ every time it outputs a new pattern.
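The return speed of the zig-zag head in the multiplication sweep can be checked by a short idealized calculation. The derivation below is our own sketch: it treats time as continuous, ignores rounding at the turning points, assumes $m < n - 1$, and measures positions by the exponent $j$ of the cell $gh^j$. The symbol $v$ (the unknown return speed), $t_0$ and $t$ are introduced only for this calculation.

```latex
\begin{align*}
  &\text{The zig-zag head reaches } g \text{ after } t_0 = \frac{p}{n} \text{ steps.} \\
  &\text{The counter head is then at exponent } p - t_0 = \frac{n-1}{n}\,p. \\
  &\text{The heads meet after a further } t \text{ steps, where } v t = \frac{n-1}{n}\,p - t,
    \text{ so } t = \frac{(n-1)\,p}{n\,(v+1)}. \\
  &\text{The meeting exponent is } v t = \frac{v}{v+1} \cdot \frac{n-1}{n}\,p. \\
  &\text{Requiring } v t = \frac{m}{n}\,p \text{ gives } v\,(n-1) = m\,(v+1),
    \text{ that is, } v = \frac{m}{n-m-1}.
\end{align*}
```

This agrees with the modular counter described above, which lets the zig-zag head advance by $h^m$ once every $n - m - 1$ steps.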
Algorithm 1 The algorithm that the three-headed automaton $A_T$ simulates.
    ...    (the position of the pointer head relative to the initial position)
    $P' : \emptyset \to S$    (a finite pattern at the initial position)
    loop
        repeat
            $c \leftarrow \mathrm{NextConf}_T(c)$    (simulate one step of $T$)
        until $\mathrm{State}(c) = q_{\mathrm{out}}$    ($T$ outputs something)
        $P \leftarrow \mathrm{OutputOf}(c)$    (a forbidden pattern)
        while $D(P) \not\subset D(P')$ do
            $w \leftarrow \mathrm{LexMin}(D(P) \setminus D(P'))$    (the lexicographically minimal element)
            ...
The function ReadSymbol gives the symbol currently under the pointer head. The procedure MoveBy($a$) causes the three heads to assume new positions: if the pointer head and zig-zag head are at $g$ and the counter head is at $gh^p$, they are moved to $ga$ and $gah^p$, respectively. This step is the main difference between the abelian and non-abelian cases, and we explain it below. We note that there are only finitely many different messages sent between the abstract computation and $A_T$, namely the exchange related to ReadSymbol and the commands MoveBy($a$) for finitely many $a \in G$. This information exchange can easily be performed by storing the state of the Turing machine $T$ directly in the finite state of the pointer head.
It is easy to see that this algorithm does what we want: whenever the Turing machine $T$ enumerates a forbidden pattern $P$, we expand the stored pattern $P'$ by reading the configuration until its domain contains that of $P$. If $P$ occurs in the configuration, it is eventually found by the algorithm from some starting position, and conversely, if the automaton halts, this is because it found a forbidden pattern.
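The search loop just described can be sketched at a high level, leaving aside how it is encoded in the counter value. In this illustration, which rests on our own assumptions, the configuration is accessed through a function `read_symbol` taking a word, forbidden patterns are dictionaries from words to symbols, and the cut-off `max_patterns` stands in for the fact that the real algorithm runs forever on configurations of $X$. All names are hypothetical.

```python
def finds_forbidden(read_symbol, forbidden_patterns, max_patterns):
    """Return True if one of the first `max_patterns` enumerated patterns
    occurs in the configuration at the initial position.

    read_symbol(w)      -- symbol of the configuration at the cell named by word w
    forbidden_patterns  -- iterable of dicts {word: symbol}, as enumerated by T
    """
    stored = {}  # the pattern read from the configuration so far
    for count, pattern in enumerate(forbidden_patterns):
        if count >= max_patterns:
            break
        # Expand the stored pattern, in lexicographic order, until its
        # domain contains the domain of the newly enumerated pattern.
        for w in sorted(set(pattern) - set(stored)):
            stored[w] = read_symbol(w)
        if all(stored[w] == s for w, s in pattern.items()):
            return True
    return False
```

As an example, take $G = \mathbb{Z}$ with generator `a` (inverse `A`) and the configuration with a 1 at every multiple of 3: the pattern with 1s at the cells named by the empty word and `aaa` is found, while the pattern with 1s at the empty word and `a` is not.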
To finish the proof, we explain how to perform MoveBy($a$). If $G$ is abelian, this can be done as in [9]: the zig-zag head moves to the counter head, informs it of the element of $G$ by which it should move, and returns. The counter head moves as instructed, and the pointer head does so as well. If the pointer head was previously at $g$ and the counter head at $gh^p$, and both move by $a \in G$, then after this sequence of moves, the pointer head will be at $ga$, and the counter head at $gh^p a = gah^p$, as required. More generally, this works if $h$ is in the center of $G$. Otherwise, we may have $gh^p a \neq gah^p$. Since we do not necessarily have $gah^p \in g\langle h \rangle a$, the counter head may not even encode a valid counter value.
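The failure of the naive move in the non-abelian case is easy to see in a free group. The following sketch, with $g = 1_G$ and a free group on the letters `a` and `h` chosen by us for illustration (capital letters denote inverses), compares the cell $h^p a$ reached by the counter head with the intended cell $a h^p$.

```python
INVERSE = {"a": "A", "A": "a", "h": "H", "H": "h"}


def reduce_word(word):
    """Freely reduce a word over a, A (= a^-1), h, H (= h^-1)."""
    out = []
    for letter in word:
        if out and out[-1] == INVERSE[letter]:
            out.pop()
        else:
            out.append(letter)
    return "".join(out)


p = 3
moved_counter = reduce_word("h" * p + "a")    # g h^p a with g = 1
wanted_counter = reduce_word("a" + "h" * p)   # g a h^p with g = 1
# Position of the moved counter head relative to the new pointer cell ga:
off_track = reduce_word("A" + moved_counter)  # a^-1 h^p a
```

Here `off_track` is not a power of `h`, so after the naive move the counter head is not on the progression $ga, gah, gah^2, \ldots$ at all.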
However, using the same trick we used to perform multiplications, we can perform the movement in general. First, the zig-zag head moves to the counter head. Then, both heads start moving toward the pointer head. The counter head moves in steps of $h^{-1}$, computing the parity of $p$ on the way, and the zig-zag head moves in steps of $h^{-2}$. If $p$ is even, then the zig-zag head reaches the pointer head exactly, moves to $ga$, and starts moving along the sequence $ga, gah, gah^2, \ldots$ in steps of $h$. If $p$ is odd, then the zig-zag head reaches the cell $gh^{-1}$ instead, moves to $gah$, and starts moving in steps of $h$ as before. The counter head performs the same task, but with the speeds reversed: after reaching the pointer head with speed $h^{-1}$, it starts moving from $ga$ in steps of $h^2$ if $p$ was even, and from $gah$ in steps of $h^2$ if $p$ was odd. When the counter head reaches the pointer head, the pointer head also moves to $ga$. It is easy to check that the counter head and the zig-zag head meet at the cell $gah^p$. The counter head stops, and the zig-zag head returns to the pointer head.

Walking on torsion groups
A torsion group is one where every element generates a finite subgroup. In this section, we show that on such groups, non-trivial sparse subshifts cannot be recognized by multi-headed automata. We also show two results about cellular automata and automorphism groups of sparse subshifts on torsion groups. These follow from a curious property, Lemma 3, of CA on sparse subshifts on torsion groups. In its proof, we use the following lemma about finite metric spaces.
Lemma 2. Let $X$ be a finite metric space with $|X| = k \geq 2$. For all $c < \mathrm{diam}(X)/(k - 1)$, there exists a nontrivial partition $X = Y \cup Z$ such that $d(y, z) > c$ for all $y \in Y$ and $z \in Z$.
Proof. For a set $E \subset X$, write $B_E(r)$ for the closed ball of radius $r \geq 0$ around $E$. Let $\mathrm{diam}(X) = d(y, z)$ for some $y, z \in X$. Let $X_1 = \{y\}$, and inductively define $X_{i+1} = B_{X_i}(c)$. For all $i \geq 1$ we have either $|X_{i+1}| > |X_i|$ or $X_{i+1} = X_i$, and in the latter case we have $X_j = X_i$ for all $j \geq i$. It follows that $X_i = X_{i+1}$ holds for some $i \leq k$.
If we have $X_i = X_{i+1} = X$, then $\mathrm{diam}(X) \leq (k - 1)c$, since every element of $X$, including $z$, is in the ball $B_y((i - 1)c) \subset B_y((k - 1)c)$. This is a contradiction, so it must be the case that $X_i = X_{i+1} \neq X$. Then $Y = X_i$ and $Z = X \setminus X_i$ give the desired partition.
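The construction in this proof is effective, and can be sketched as follows. The function name `separate` and the representation of the metric space as a list of points with a distance function are our own; we also take the separation property to be that points of $Y$ and $Z$ are more than $c$ apart, which is what the stabilization $X_{i+1} = X_i$ yields for closed balls.

```python
from itertools import combinations


def separate(points, dist, c):
    """Grow a cluster from one endpoint of a diameter in steps of radius c.

    Returns (Y, Z), where Y is the stabilized cluster X_i and Z its complement.
    """
    # Pick y as an endpoint of a pair realizing the diameter.
    y, _ = max(combinations(points, 2), key=lambda pair: dist(*pair))
    cluster = {y}
    while True:
        # X_{i+1} is the closed ball of radius c around X_i.
        grown = {p for p in points if any(dist(p, q) <= c for q in cluster)}
        if grown == cluster:
            break
        cluster = grown
    return cluster, set(points) - cluster


points = [0, 1, 2, 10, 11]
Y, Z = separate(points, lambda a, b: abs(a - b), 2)
```

In this example the diameter is 11 and $k = 5$, so $c = 2$ satisfies $c < \mathrm{diam}(X)/(k-1) = 2.75$, and the cluster grown from the point 0 stabilizes before reaching the points 10 and 11.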
Lemma 3. For all torsion groups $G$, there exists a function $d : \mathbb{N}^3 \to \mathbb{N}$ with the following property. For all $k$-sparse subshifts $X \subset S^G$ over all alphabets $S \ni 0$ with $|S| = q + 1$, all cellular automata $f : X \to X$ with radius $r \in \mathbb{N}$, and all $x \in X$, we have the following: if $f^n(x)_g \neq 0$ for some $n \in \mathbb{N}$ and $g \in G$, then $x_{gh} \neq 0$ for some $h \in B_G(d(k, q, r))$.
Proof. We prove the existence of such a function $d$ by induction. We define the function so that it is monotone in all the three parameters. Let $t_G$ be the order function and $T_G$ the torsion function of $G$.
First, let $k = 1$, and let $f : X \to X$ be a CA. It is easy to show that if $x_{gh} = 0$ for all $h \in B_G(r)$, then $f(x)_g = 0$. Intuitively, this means that nonzero symbols can 'spread' by at most $r$ per time step, and one cannot appear from nowhere. Since $X$ is a $k$-sparse subshift and $k = 1$, every point $x \in X$ contains at most one nonzero coordinate $x_g \neq 0$. Intuitively, we want to give an upper bound on how far the nonzero symbol can travel from its initial position $g$.
By shift-commutation, it is enough to analyze the case $x_{1_G} \neq 0$. Combining the previous observations and the fact $|S| = q + 1$, it follows from the pigeonhole principle that $f^{n+m}(x) = \sigma^R_h(f^n(x))$ for some $0 \leq n < n + m \leq q + 1$ and $h \in B_G((q + 1)r)$. Since $f$ commutes with the shift, we have
We have shown that $f^j(x)_h \neq 0$ for some $j \in \mathbb{N}$ implies $h \in B_G((q + 1)r(1 + t_G(h)))$. Since $h \in B_G((q + 1)r)$, we can define $d(1, q, r) = (q + 1)r(1 + T_G((q + 1)r))$.
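The base case of the bound is a closed formula in the torsion function, which the following minimal sketch evaluates. The function name `d1` and the sample torsion function used in the example are hypothetical; in particular, $n \mapsto 2^n$ is chosen only to have something to compute with and is not claimed to be the torsion function of any specific group.

```python
def d1(q, r, torsion_function):
    """Evaluate d(1, q, r) = (q + 1) r (1 + T_G((q + 1) r))."""
    reach = (q + 1) * r  # radius of the ball containing the shift element h
    return reach * (1 + torsion_function(reach))


# Purely illustrative torsion function, not tied to a concrete group.
example_bound = d1(1, 1, lambda n: 2 ** n)
```

The formula is monotone in $q$ and $r$ whenever the supplied torsion function is monotone, in line with the monotonicity required of $d$ in the proof.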
Next, consider the case $k > 1$. To each configuration $x \in X$, we associate the metric space $A(x)$ whose points are the nonzero coordinates of $x$, and whose distances are those induced by the natural (right) distance in $G$. We will split the analysis of the dynamics of $f$ on the point $x$ into two cases, depending on whether the diameter of $A(f^n(x))$ stays bounded (by an explicit constant) as $n$ grows.
Intuitively, the idea is that as long as the diameter stays small, we can shrink all the information in $x$ into a single symbol, reducing to the case $d(1, \cdot, \cdot)$, and if the configuration starts expanding, then it splits into two pieces that can never again communicate, and we apply induction to these smaller pieces.

More precisely, define
Define also $c = 2d(k - 1, q, r) + r$, and note that since $d$ is monotone, we in particular have $c \geq \max_{1 \leq \ell < k} d(\ell, q, r) + d(k - \ell, q, r) + r$.
We say that a configuration x ∈ X is clustered if diam(A(x)) ≤ (k − 1)c holds, and scattered otherwise.
Case 2: clustered configurations
First, suppose $x \in X$ and $N \in \mathbb{N}$ are such that $f^n(x)$ is clustered for all $n \leq N$. We will give an upper bound on how far nonzero symbols can travel from their original positions in these $N$ steps. Let $Z \subset X$ be the subshift generated by the configurations $f^n(x)$ for $n \leq N$. It is easy to see that every configuration of $Z$ is clustered. Note that the subshift $Z$ may not be closed under $f$. Let $Y' = X_{\{0\} \cup K}$, where
Clearly, $Y'$ is a 1-sparse subshift, and it should be thought of as a 'compressed' version of $Z$, where all the nonzero symbols have been encoded into a single coordinate. The idea is to simulate the CA $f$ on the compressed subshift, and reduce back to the $k = 1$ case. Let $\varphi : Y' \to S^G$ be the 'decompression function' defined by
Let $Y = \varphi^{-1}(Z)$, so that $\varphi : Y \to Z$ is a surjective block map. A visualization of $\varphi$ is shown in Figure 1.
Intuitively, the CA $f_\varphi$ simulates $f$ on the compressed configurations of $Y$, as long as their $\varphi$-images are clustered.

Proof (of claim). Observe that for each $z \in Z$ and $g \in G$ there is at most one configuration $y \in Y$ such that $y_g \neq 0$ and $\varphi(y) = z$. Let then $1_G = h_1 < h_2 < h_3 < \cdots$ be any total order on the group $G$, not necessarily in any way compatible with its algebraic structure. Then we can define a map $f_\varphi$ with the desired properties as follows.
First, for the all-0 configuration $0^G \in Y$, we define $f_\varphi(0^G) = 0^G$, and for all $y \in Y$ such that $f(\varphi(y)) \notin Z$, we also define $f_\varphi(y) = 0^G$. For all other $y \in Y$, let $g \in G$ be the unique element with $y_g \neq 0$, and let $W \subset Y$ be the set of configurations $y' \in Y$ with $\varphi(y') = f(\varphi(y))$. The set $W$ is nonempty since $\varphi : Y \to Z$ is surjective, and it is finite because the unique nonzero coordinate of each $y' \in W$ is among the coordinates $gh$ where $h \in B_G((k - 1)c + r)$, since we assumed $P_{1_G} \neq 0$ for each $P \in K$. Now, we choose $f_\varphi(y)$ to be the unique configuration $y' \in W$ with $y'_{gh} \neq 0$, where $h \in G$ is minimal in the ordering $h_1 < h_2 < \cdots$. It is easy to check that $f_\varphi$ is then continuous and shift-commuting. In fact, from the way we defined it, we see that its radius is at most $(k - 1)c + r$.
Recall the clustered configuration $x \in S^G$. We have $x \in Z$ by the definition of $Z$, so there exists a configuration $y \in Y$ such that $\varphi(y) = x$. By the above claim, we have $\varphi(f^n_\varphi(y)) = f^n(x)$ for all $n \leq N$. Since $Y$ is a 1-sparse subshift with alphabet of size $|K| + 1$ and $f_\varphi$ is a CA on it with radius at most $(k - 1)c + r$, we have
$$f^n_\varphi(y)_g \neq 0 \implies y_{gh} \neq 0 \text{ for some } h \in B_G(d(1, |K|, (k - 1)c + r)) \tag{1}$$
by Case 1 of this proof. We also remark that if we have $N > |K|$, then the configuration $f^n(x)$ is clustered for all $n \in \mathbb{N}$, since there exist $i < j \leq N$ such that $f^i_\varphi(\varphi(y))$ is a translated version of $f^j_\varphi(\varphi(y))$.
It remains to prove a variant of the above formula for $f$, and for that, let $f^n(x)_g \neq 0$ for some $g \in G$. Since the block map $\varphi$ has radius $(k - 1)c$, we have $\varphi(f^n(x))_{gh} = f^n_\varphi(y)_{gh} \neq 0$ for some $h \in B_G((k - 1)c)$. Equation (1) implies that $y_{ghh'} \neq 0$ for some $h' \in B_G(d(1, |K|, (k - 1)c + r))$, and from the definition of $\varphi$ it follows that $x_{ghh'} \neq 0$ as well, since $(y_{ghh'})_{1_G} \neq 0$. We have shown that if $f^n(x)$ contains a nonzero symbol in some coordinate, then there is a nonzero coordinate in $x$ at distance at most $d(1, |K|, (k - 1)c + r) + (k - 1)c$. Note that the cardinality of $K$ is at most exponential in $(k - 1)c$.

Case 3: scattered configurations
Suppose finally that the configuration $f^n(x)$ is scattered for some $n \in \mathbb{N}$, which we assume to be minimal. By the remark at the end of Case 2, we have $n \leq |K|$. We apply Lemma 2 to the metric space $A(f^n(x))$, and obtain a partition for it into sets $C, D \subset G$ with distance at least $c$.
Denote $y = f^n(x)$. We define a partition of the configuration $y$ by $y = y_C + y_D$, where $(y_C)_g = y_g$ when $g \in C$ and $(y_C)_g = 0$ otherwise, and $y_D$ is defined analogously. By the definition of $c$, we have $c \geq d(|C|, q, r) + d(|D|, q, r) + r$. It is then easy to see that $f^j(y) = f^j(y_C) + f^j(y_D)$ for all $j \in \mathbb{N}$. In particular, if we have $f^j(y)_g \neq 0$ for some $j \in \mathbb{N}$ and $g \in G$, then $y_{gh} \neq 0$ for some $h \in B_G(\max_{\ell < k} d(\ell, q, r)) \subset B_G(d(k - 1, q, r))$ by the induction hypothesis. Since we have $n \leq |K|$ and the CA $f$ has radius $r$, this implies that $x_{ghh'} \neq 0$ for some $h' \in B_G(r|K|)$, which implies $hh' \in B_G(r|K| + d(k - 1, q, r))$.
Putting all three cases together, we can define the function d recursively by for all k > 1.
The bounds we give are not very strong, but at least one can check that if the torsion function T G is primitive recursive, then so is the function d.
Theorem 2. If $G$ is finitely generated, infinite and torsion, and $X \subset S^G$ is sparse and nontrivial, then $X \notin S(G)$.
Proof. Let $A$ be a group-walking automaton and $Y$ its associated subshift, and let
Since $X \times Y$ is sparse, Lemma 3 implies that any head of $A$ can only travel a bounded distance on any configuration of $X \times Y$. Then, for all $x \in X$ and all but finitely many $g \in G$, the configuration $x + (g \cdot x)$ is rejected by $A$ if and only if $x$ is. If the support of $x$ is maximal, this configuration is not in $X$. Thus $A$ does not define $X$.
Lemma 3 also restricts the structure of the automorphism group of a sparse subshift on a torsion group.
Theorem 3. If $G$ is torsion and $X \subset S^G$ is sparse, then $\mathrm{Aut}(X)$ is also torsion.
The last theorem has an obvious converse: if $G$ is not torsion, then the shift along a copy of $\mathbb{Z}$ is a non-torsion element of $\mathrm{Aut}(X)$ whenever $X$ is sparse and nontrivial. One can construct such examples even in the quotient group $\mathrm{Aut}(G)/\langle \sigma^R_g \mid g \in G \rangle$.

Undecidable word problem
If the word problem for $G$ is not necessarily decidable, one can give multiple definitions of $\Pi^0_1$. We give two, both of which correspond to our previous definition of $\Pi^0_1$ when the word problem is decidable. Recall that finite patterns are represented computationally as lists of pairs drawn from $s(G)^* \times S$.
Definition 5. A subshift on $G$ is $\Pi^0_1$ if there exists a Turing machine enumerating a set of (possibly inconsistent) forbidden lists of word-symbol pairs for it.
Definition 6. A subshift $X$ on $G$ is intrinsically $\Pi^0_1$ if there exists an oracle Turing machine that, given an oracle for the word problem of $G$, enumerates a set of consistent forbidden lists of word-symbol pairs for $X$.
In [2], what we call intrinsically $\Pi^0_1$ is called $G$-effective, and this notion was first defined and studied there. Its actual definition in [2] uses 'group-walking Turing machines', but it is also shown to be equivalent to Definition 6. The following results, the first of which is a direct corollary of Theorem 1, relate these classes of subshifts to the hierarchy of group-walking automata.
Theorem 4. If $G$ is finitely generated, infinite and non-torsion, then $S(G, 3)$ contains the class of $\Pi^0_1$-subshifts.
Theorem 5. If $G$ is finitely generated, infinite and non-torsion, then $S(G, 4)$ is exactly the class of subshifts on $G$ which are intrinsically $\Pi^0_1$.
Proof. Clearly, all $S(G, 4)$ subshifts are intrinsically $\Pi^0_1$, since a Turing machine with an oracle for the word problem of $G$ can simulate a multi-headed finite state machine on the group. The proof that $S(G, 4)$ contains the intrinsically $\Pi^0_1$ subshifts is similar to that of Theorem 1, except that we must simulate a Turing machine with access to an oracle for $G$. Thus, we only need to describe how one can use four heads to check whether the identity $1 \sim w$ holds for an arbitrary $w \in s(G)^*$. For this, we use three heads to move by the letters of $w$, and leave the fourth head as a marker in the cell we started from. We return on top of the fourth head if and only if $1 \sim w$. We can then move back by $w^{-1}$ and pick up the fourth head.
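The marker trick in this proof amounts to deciding $1 \sim w$ by walking along $w$ and checking whether the walk closes up. The sketch below illustrates this on two groups where positions are easy to represent; the function names, the letter conventions (capital letters for inverses) and the choice of $\mathbb{Z}^2$ and the free group on two generators are ours. In the automaton the group itself plays the role of the `step` function, so no explicit representation of positions is needed.

```python
def returns_to_start(word, step, start):
    """Walk from `start` along the letters of `word`; report whether we end at `start`."""
    pos = start
    for letter in word:
        pos = step(pos, letter)
    return pos == start


Z2_MOVES = {"x": (1, 0), "X": (-1, 0), "y": (0, 1), "Y": (0, -1)}


def z2_step(pos, letter):
    """One step in the Cayley graph of Z^2."""
    dx, dy = Z2_MOVES[letter]
    return (pos[0] + dx, pos[1] + dy)


def free_step(pos, letter):
    """One step in the Cayley graph of the free group; positions are reduced tuples."""
    if pos and pos[-1] == letter.swapcase():
        return pos[:-1]
    return pos + (letter,)
```

The commutator word `xyXY` closes up in $\mathbb{Z}^2$ but not in the free group, so the same walk gives different answers depending on the group it is performed in.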
From these results, we obtain a characterization of torsion groups.
Lemma 4. The subshift $X_S$ is intrinsically $\Pi^0_1$ on every group.
Theorem 6. Let $G$ be a finitely generated infinite group. Then $G$ is torsion if and only if $S(G, 4)$ is not equal to the class of all intrinsically $\Pi^0_1$ subshifts.
Proof. This follows from Lemma 4, Theorem 5 and Theorem 2.
Finally, we note that Lemma 4 requires the intrinsic notion of computability, as shown by the following corollary of [2, Proposition 2.3] (also proved in [7]).
Proposition 1. Let $G$ be a recursively presented and finitely generated group, and let $S$ be a nontrivial finite alphabet. The subshift $X^G_S$ is $\Pi^0_1$ if and only if $G$ has a decidable word problem.

Future work and open questions
While we need four heads in the proof of Theorem 5, we are not able to separate the class $S(G, 3)$ from $S(G, 4)$ on any group $G$. We do have a general construction which separates these classes on all sufficiently complex torsion groups. Unfortunately, we do not know how to construct a group with the necessary properties, as the construction of torsion groups is quite complicated. Nevertheless, this leads us to believe that the classes are not always equal.
Conjecture 1. There exists an infinite finitely generated torsion group $G$ such that $S(G, 3) \subsetneq S(G, 4)$. In particular, $S(G, 3)$ is not always equal to the class of intrinsically $\Pi^0_1$ subshifts.
We know that if $G$ is not a torsion group, then the hierarchy $(S(G, k))_{k \geq 1}$ collapses to the fourth level (if not earlier), and $S(G, 4)$ is exactly the class of intrinsically $\Pi^0_1$ subshifts. On torsion groups, the hierarchy never reaches all intrinsically $\Pi^0_1$ subshifts, but we have not shown that it is infinite. We believe we have a general construction that proves exactly this, but it is relatively complicated, so for now we only state its conclusion as a conjecture.
Conjecture 2. If $G$ is an infinite finitely generated torsion group, then the hierarchy $(S(G, k))_{k \geq 1}$ is infinite.
Some very basic questions about the abelian cases were left open in [9]. We have no progress on these questions.
We note that in [4], a slightly different model of multi-headed group-walking automaton is studied on the group $\mathbb{Z}^2$, and it is shown that in this model, two-headed machines are strictly weaker than three-headed ones. It seems that the question is harder in our model. In [9], we only showed that $S(\mathbb{Z}^d, 2) \subsetneq S(\mathbb{Z}^d, 3)$ holds for $d \geq 3$.
A pattern (on $G$) is a function $P \in S^D$, where $D = D(P)$ is a finite subset of $G$, called the domain of $P$. Each pattern $P$ defines a cylinder set $[P] = \{x \in S^G \mid x|_D = P\}$. The clopen (topologically closed and open) sets in $S^G$ are precisely the finite unions of cylinders, and form a basis for the topology.

Definition 4. A $k$-headed group-walking automaton on the full shift $S^G$ is a tuple $A = (\prod_{i=1}^k Q_i, f, I, F)$, where $Q_1, Q_2, \ldots, Q_k$ are state sets not containing the symbol 0, $I$ and $F$ are finite clopen subsets of the product subshift $Y = \prod_{i=1}^k X_{Q_i}$, and $f$ :

Fig. 1. The decompression function $\varphi$ applied to a configuration $y \in Y$. We have chosen $G = \mathbb{Z}^2$ here for simplicity, even though it is not a torsion group. Note that the alphabet of $Y$ consists of certain patterns of $X$ and the symbol 0.