Canonical Form of Gray Codes in N-cubes

In previous works, the idea of walking in an N-cube from which a balanced Hamiltonian cycle has been removed was proposed as the basis of a chaotic PRNG whose chaotic behavior has been proven. However, the construction and selection of the most suitable balanced Hamiltonian cycles raise practical and theoretical issues. We propose in this paper a canonical form for representing isomorphic Gray codes. It drastically reduces the complexity of exploring all the Hamiltonian cycles, and we discuss some criteria for selecting the cycles most suited to our chaotic PRNG.


Introduction
The problem of designing Pseudo-Random Number Generators (PRNGs) that satisfy the probabilistic properties required to produce a uniform distribution is difficult. Moreover, the knowledge of the generation algorithm and of any sequence of previously generated bits should not constitute sufficient information to predict the next generated bits without knowing the initial conditions. In order to build such PRNGs, some studies have focused on the use of chaotic systems [7,6,2].
In a previous work [4], some of the authors have proposed a PRNG based on a random walk in an N-cube from which a balanced Hamiltonian cycle has been removed, and its chaotic nature has been proved. Moreover, it has been shown that the removed Hamiltonian cycle should be balanced in order to produce a more efficient PRNG. Balanced Hamiltonian cycles are cycles in which the numbers of occurrences of the traversed dimensions are equal or differ at most by 2. In [8], the authors have proposed an approach that provides a subset of all the Hamiltonian cycles. This approach is however non-deterministic, and the cardinality of the produced subset is dramatically small compared to that of the set of all the Hamiltonian cycles. In some sense, it is a partial solution to the problem of finding Hamiltonian cycles.
The non-deterministic aspect of this approach has been tackled in [3], where we have proposed a particularization of it. This new procedure succeeds in finding balanced Hamiltonian cycles for any dimension N and solves this issue. Nevertheless, pursuing our objective of refining the specification of the Hamiltonian cycles most suited to our PRNG, we have been confronted with the fact that the procedure detailed in [3] cannot produce all non-isomorphic balanced codes: it is indeed a particularization of a partial solution.
In [10], the author proposed an approach to produce all the cycles of a graph. This work may thus solve the problem of finding a large set of balanced Hamiltonian cycles for a given dimension N. However, the approach suffers from being too exhaustive and cannot be applied as soon as the dimension of the N-cube is larger than 5. One solution could be to study only cycles whose embedding into the PRNG gives distinct behaviors, i.e. which do not belong to the same class w.r.t. an equivalence relation. To that end, this work proposes a canonical form dedicated to cycles and its application to the generation of a large set of balanced Hamiltonian cycles.
This paper presents these two elements. The canonical form of Hamiltonian cycles is presented in the following section, followed in Section 3 by the description of our novel algorithm. The practical interest of the algorithm is discussed in Section 4.

Canonical form of Gray codes
Let $S_N = \{1, \dots, N\}^{2^N}$ be the set of sequences of length $2^N$ with values in $\{1, \dots, N\}$. Let $H_N \subset S_N$ be the set of sequences describing Hamiltonian cycles in an N-cube. Each of these sequences gives the succession of the dimensions followed by the path. Any Hamiltonian cycle of $H_N$ can be written as $h = (h_1, \dots, h_{2^N})$. Also, we remind the reader that a Hamiltonian cycle in an N-cube is a Gray code.
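As an illustration of this encoding, the following minimal Python sketch checks whether a sequence of traversed dimensions describes a Hamiltonian cycle. It assumes that vertices are encoded as bit masks and that traversing dimension d flips bit d-1; the function name `is_hamiltonian_cycle` is hypothetical and not taken from the paper.

```python
def is_hamiltonian_cycle(seq, n):
    """Return True if `seq` (dimensions in 1..n) describes a Hamiltonian
    cycle of the n-cube, i.e. a cyclic Gray code on n bits."""
    if len(seq) != 2 ** n:
        return False
    vertex = 0           # arbitrary starting vertex, encoded as a bit mask
    seen = set()
    for d in seq:
        if not 1 <= d <= n or vertex in seen:
            return False
        seen.add(vertex)
        vertex ^= 1 << (d - 1)   # traversing dimension d flips bit d-1
    return vertex == 0   # the path must close on its starting vertex


print(is_hamiltonian_cycle((1, 2, 3, 2, 1, 2, 3, 2), 3))  # True
print(is_hamiltonian_cycle((1, 2, 1, 2, 1, 2, 1, 2), 3))  # False
```

Since the starting vertex is arbitrary, the same check is valid for every choice of starting position in the N-cube.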
We call canonical form of a Hamiltonian cycle an equivalent description of the cycle that is obtained identically, through a specific computation process, for all its isomorphic cycles.
Before describing our computation process of the canonical forms, we provide below an overview of the different cases of isomorphism between cycles.

Isomorphic cycles
Intuitively, Hamiltonian cycles are isomorphic to each other when the paths they describe can be topologically superposed. Indeed, the same Hamiltonian cycle can be expressed by many sequences according to some simple (global) transformations of the N-cube, leading to a set of isomorphic cycles. We list below the different transformations that can be applied to a sequence to produce isomorphic cycles.
First of all, it can be noticed that describing a cycle by the sequence of the traversed dimensions in the N-cube does not specify any starting vertex. So, a sequence does not represent only a single cycle but the $2^N$ cycles that are isomorphic up to the starting position in the N-cube.
In a similar way, applying a cyclic shift to a sequence, in any direction, is equivalent to changing only its starting vertex, and this does not change the path topology. So, shifted sequences are also isomorphic cycles.
Moreover, as the N-cubes considered in the scope of this paper are not oriented, the direction of the cycle is not significant, and an isomorphic cycle is thus obtained by reversing the order of a sequence.
Finally, cycles can also be isomorphic up to rotations/symmetries, which are obtained by renumbering the dimensions of the N-cube. For example, exchanging dimensions 2 and 3 in a 3-cube is similar to performing a 90-degree rotation around dimension 1. In the following, that operation may also be referred to as the relabeling of a sequence since it only changes the dimension labels. It is worth noticing that some dimension relabelings are equivalent to the sequence inversion combined with a cyclic shift.
In order to define the canonical form of a Hamiltonian cycle, we need to introduce some functions over $H_N$.

Preliminary tools
Let $R : H_N \to H_N$ be the function that renumbers a Hamiltonian cycle $h$ into a sequence $R(h)$ by mapping the successive distinct values (dimensions) of $h$ to the ordered values from 1 to N. So, the first value $h_1$ of $h$ is necessarily mapped to 1, then the first distinct value in the remainder of $h$ (that is $(h_2, \dots, h_{2^N})$) is mapped to 2, and so on. As function $R$ applies a renumbering, it follows that $\forall i, j \in \{1, \dots, 2^N\}$, $R(h)_i = R(h)_j \Leftrightarrow h_i = h_j$.
The effect of function $R$ is to apply rotations/symmetries to a sequence, by relabeling the dimensions of the N-cube, in order to express it in a specific order of the traversed dimensions, without modifying the topology of the path. So, this function is an automorphism on $H_N$.
As an example, if we have N = 3 and the sequence $h = (2, 3, 1, 3, 2, 3, 1, 3)$, then $R(h) = (1, 2, 3, 2, 1, 2, 3, 2)$. So, the dimensions 1, 2 and 3 are respectively replaced by (relabeled) 3, 1 and 2 (as shown in Fig. 1), where the three dimension labels and the starting vertex are fixed. It can be seen that both sequences are isomorphic up to a rotation around dimension 1 and an orientation inversion.
As the lexicographic order over sequences of length $2^N$ provides a total order on $H_N$, the results of $R$ are totally ordered. So, for any non-empty subset $X$ of $H_N$ there exists a unique minimal value of the results of $R$ applied to any $h \in X$. This property is used in the computation of our canonical form.
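The ordered relabeling can be sketched in a few lines of Python. The function name `relabel` is illustrative, and the input below is chosen to be consistent with the relabeling stated in the example (1 to 3, 2 to 1, 3 to 2) and with the given value of R(h).

```python
def relabel(h):
    """Ordered relabeling R: map the dimensions of h to 1, 2, ...
    in the order of their first occurrence."""
    mapping = {}
    out = []
    for d in h:
        if d not in mapping:
            mapping[d] = len(mapping) + 1   # next unused label
        out.append(mapping[d])
    return tuple(out)


h = (2, 3, 1, 3, 2, 3, 1, 3)
print(relabel(h))                    # (1, 2, 3, 2, 1, 2, 3, 2)
print(relabel(relabel(h)) == relabel(h))   # True: R is idempotent
```

Python tuples compare lexicographically, so the total order mentioned above is directly available through `<` and `min` on the results of `relabel`.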
Let $D : H_N \times \{1, \dots, 2^N\} \to H_N$ be the function that associates to a sequence $h = (h_1, \dots, h_{2^N})$ and an integer $k$ the sequence $D(h, k) = (h_k, \dots, h_{2^N}, h_1, \dots, h_{k-1})$, which is $h$ after $k - 1$ successive left cyclic shifts, so that $h_k$ becomes the first value of the sequence. The effect of function $D$ is simply to change the starting point of the sequence, without modifying the cycle itself, as can be seen in Fig. 2. Like function $R$, this function is an automorphism on $H_N$ and it is also used to compute our canonical form of isomorphic cycles.
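A minimal Python sketch of this shift, using 1-based positions as in the text; the name `shift` is an illustrative choice.

```python
def shift(h, k):
    """D(h, k): h after k-1 left cyclic shifts, so that the value at
    1-based position k becomes the first value of the sequence."""
    if not 1 <= k <= len(h):
        raise ValueError("k must be in 1..len(h)")
    return tuple(h[k - 1:]) + tuple(h[:k - 1])


h = (1, 2, 3, 2, 1, 2, 3, 2)
print(shift(h, 1))   # (1, 2, 3, 2, 1, 2, 3, 2): no shift
print(shift(h, 3))   # (3, 2, 1, 2, 3, 2, 1, 2)
```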
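In the following, $W(h, k)$ denotes the length of the minimal sub-sequence of $h$ that starts at position $k$ and contains all the dimensions of the N-cube. In [1], Bykov uses this notion of minimal sub-sequence containing all the dimensions of the N-cube to define the window width of a sequence $h$. It corresponds to the maximal value of function $W$ over all the possible starting points in the sequence. It provides information about the local balance between the traversed dimensions along the cycle. This window width can be defined by $M(h)$, for $h \in H_N$, as: $M(h) = \max_{k \in \{1, \dots, 2^N\}} W(h, k)$.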
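The following Python sketch computes both quantities, assuming that W(h, k) is the length of the shortest run of h, read cyclically from the 1-based position k, that contains all N dimensions. The names `window` and `window_width` are illustrative.

```python
def window(h, k, n):
    """W(h, k): length of the shortest run of h, read cyclically from
    1-based position k, that contains all n dimensions (assumed reading)."""
    seen = set()
    for length in range(1, len(h) + 1):
        seen.add(h[(k - 1 + length - 1) % len(h)])
        if len(seen) == n:
            return length
    raise ValueError("h does not use all n dimensions")


def window_width(h, n):
    """M(h): maximum of W(h, k) over all starting positions k."""
    return max(window(h, k, n) for k in range(1, len(h) + 1))


f = (3, 1, 4, 1, 2, 1, 4, 1, 3, 1, 4, 1, 2, 1, 4, 1)
print(window_width(f, 4))                                        # 8
print([k for k in range(1, 17) if window(f, k, 4) == 8])         # [2, 6, 10, 14]
```

The sequence f used here is the 4-cube cycle discussed later in the paper, for which M(f) = 8 is reached at positions 2, 6, 10 and 14.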

Canonical form
The function $C : H_N \to H_N$, defined by $C(h) = \min \{ R(D(h, k)) \mid k \in \{1, \dots, 2^N\},\ W(h, k) = M(h) \}$, produces the canonical form of any sequence from $H_N$. Notice that this set is ordered according to the aforementioned lexicographic ordering, which is total.
The role of the $C$ function is to provide a unique representative for each class of Hamiltonian cycles. By class of Hamiltonian cycles, we mean the set of isomorphic Hamiltonian cycles according to translations (changing the starting point of the sequence) and rotations/symmetries (changing the dimension labels). So, we have the following theorem (Theorem 1).
Proof. As both functions $R$ and $D$ are automorphisms on $H_N$, the composite function $R \circ D$ is also an automorphism on $H_N$. Thus, for any integer $k \in \{1, \dots, 2^N\}$, sequence $R(D(h, k))$ is isomorphic to sequence $h$, and so is $C(h)$. Also, this implies that for any two non-isomorphic sequences $h$ and $g$ in $H_N$, there does not exist any pair of integers $i$ and $j$ in $\{1, \dots, 2^N\}$ such that $R(D(h, i)) = R(D(g, j))$. Thus, the results of $C(h)$ and $C(g)$ are necessarily different when $h$ and $g$ are not in the same class of isomorphic cycles.
However, there remains the question of the uniqueness of the result of $C$ for all sequences in the same class of $H_N$. By definition of $C$, for any two isomorphic sequences $h$ and $g$ in $H_N$, there exist two integers $i$ and $j$ in $\{1, \dots, 2^N\}$ such that $C(h) = R(D(h, i))$ and $C(g) = R(D(g, j))$, but we have to show that for some adequately chosen $i$ and $j$, they are identical sequences.
As a first step, let us consider two sequences $h = (h_1, \dots, h_l)$ and $g = (g_1, \dots, g_l)$ that are isomorphic only up to rotations/symmetries. As such transformations can be expressed by dimension relabelings, it follows that $g$ and $h$ are mutual relabelings of each other: there exists a permutation $\sigma$ of $\{1, \dots, n\}$ such that $\forall i \in \{1, \dots, l\}$, $g_i = \sigma(h_i)$, and $\forall i, j \in \{1, \dots, l\}$, $h_i = h_j \Leftrightarrow g_i = g_j$ (4). Moreover, $R(h)$ and $R(g)$ are also respective relabelings of $h$ and $g$. The fact that $R(h) = R(g)$ is ensured by the ordered relabeling over $\{1, \dots, n\}$. Indeed, as the relabeling follows the numerical order of integers, it produces the same sequence for $h$ and $g$ according to the total lexicographic order over $H_N$: the first occurrences of the distinct values lie at the same positions in $h$ and $g$ and receive the same labels $1, 2, \dots, n$, and due to (4), we have $\forall i \in \{1, \dots, l\}$, $R(h)_i = R(g)_i$. Thus, function $R$ produces the same result for sequences that are isomorphic up to rotations/symmetries.
The next step consists in taking into account cyclic shifts between sequences. Solving this problem is similar to finding a way to re-align all isomorphic cycles according to a common starting vertex. Fortunately, this is possible thanks to the notion of window width, previously introduced and expressed by functions $W$ and $M$. Indeed, the window width discriminates the positions in a sequence, by identifying the ones with the highest local balance, that is to say the ones from which starts the longest minimal sub-sequence containing all values in $\{1, \dots, n\}$. Obviously, the window width is the same for all isomorphic cycles, as they have the same sequence of local balances up to a cyclic shift, whatever the labels of the dimensions. For any class of cycles, there is at least one position corresponding to the window width and we use it as the reference starting position to force the alignment of all cycles in the class.
When there is exactly one such position in a class, there is no ambiguity and every cycle of the class is shifted to begin at this position. However, for some classes, there might exist several positions corresponding to the window width. Thus, an additional deterministic selection must be applied to those possibilities. This is where the total lexicographic order is exploited, by selecting the position whose ordered relabeling produces the smallest sequence relative to that order. This is what is expressed by the min operator in function $C$. As the result is a minimal value over a totally ordered space, it is unique and it ensures the common re-alignment of all the cycles in the same class.
So, function C re-aligns isomorphic cycles to a common starting position and relabels their dimensions in an ordered way that ensures a unique result for isomorphic cycles.
Finally, it is worth noticing that it is the use of the window width notion combined with cyclic shifts, the total lexicographic order over $H_N$ and the dimension relabeling that allows us to compute a unique class representative.
So, the binary relation $E$ induced by function $C$, defined for $a, b \in H_N$ by $a \, E \, b \Leftrightarrow C(a) = C(b)$, is an equivalence relation over $H_N$ since $C$ is a function.
As a last example, let us consider another cycle in $H_4$ that is not isomorphic to $g$ and $h$. This is the case for $f = (3, 1, 4, 1, 2, 1, 4, 1, 3, 1, 4, 1, 2, 1, 4, 1)$ because the numbers of occurrences of the dimensions are not equal in $f$, whereas they are for $g$ and $h$. For the computation of $C(f)$, we have $M(f) = 8$ and four corresponding starting positions: 2, 6, 10 and 14. All four positions produce the same result by $R \circ D$, shown in Fig. 3(c), and then $C(f) = (1, 2, 1, 3, 1, 2, 1, 4, 1, 2, 1, 3, 1, 2, 1, 4)$. This illustrates the class separation realized by function $C$ when there are several classes in the considered $H_N$ space, as non-isomorphic cycles lead to distinct results whereas isomorphic cycles lead to the same one.
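The computation of C(f) can be reproduced with the Python sketch below. It assumes that C(h) is the lexicographic minimum of R(D(h, k)) over the starting positions k whose window W(h, k) equals M(h), with W read as the shortest cyclic run containing all dimensions. The helper names are illustrative and sequence reversal is not handled in this sketch.

```python
def relabel(h):
    """R: relabel dimensions as 1, 2, ... in order of first occurrence."""
    mapping = {}
    return tuple(mapping.setdefault(d, len(mapping) + 1) for d in h)


def shift(h, k):
    """D(h, k): start the sequence at its 1-based position k."""
    return tuple(h[k - 1:]) + tuple(h[:k - 1])


def window(h, k, n):
    """W(h, k): shortest cyclic run from position k covering all n dimensions."""
    seen = set()
    for length in range(1, len(h) + 1):
        seen.add(h[(k - 1 + length - 1) % len(h)])
        if len(seen) == n:
            return length
    raise ValueError("h does not use all n dimensions")


def canonical(h, n):
    """C(h): smallest relabeled shift among the positions reaching M(h)."""
    widths = [window(h, k, n) for k in range(1, len(h) + 1)]
    m = max(widths)                      # window width M(h)
    return min(relabel(shift(h, k))
               for k in range(1, len(h) + 1) if widths[k - 1] == m)


f = (3, 1, 4, 1, 2, 1, 4, 1, 3, 1, 4, 1, 2, 1, 4, 1)
print(canonical(f, 4))
# (1, 2, 1, 3, 1, 2, 1, 4, 1, 2, 1, 3, 1, 2, 1, 4)
```

Shifting f or permuting its dimension labels before calling `canonical` yields the same tuple, which is the invariance expected from a class representative.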

Discussion over the interest of the canonical form
This work provides an efficient way to partition the $H_N$ space up to isomorphisms by computing unique representatives of the classes. Such partitions are very useful as soon as one wants to study properties of Gray codes in dimensions larger than 3, as it is possible to focus only on class representatives. This leads to more efficient algorithms, as the number of classes increases much more slowly than the number of instances. Moreover, the total order over the class representatives can also be exploited to implement efficient storage and classification algorithms when exploring a given $H_N$ space.

Balanced Gray codes generation algorithm
We remind the reader that in balanced Gray codes, the dimensions of the N-cube are used the same number of times, or with a difference of at most two occurrences. When all the dimensions are used exactly the same number of times, we speak of totally balanced Gray codes. This is only possible for N-cubes whose dimension is a power of 2.
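These two definitions translate directly into a check on the occurrence counts, as in the following Python sketch. The function names are illustrative, and the functions only inspect the counts: they do not verify that the sequence is a Hamiltonian cycle.

```python
from collections import Counter


def occurrences(h, n):
    """Number of times each dimension 1..n is traversed in h."""
    counts = Counter(h)
    return [counts.get(d, 0) for d in range(1, n + 1)]


def is_balanced(h, n):
    """Occurrence counts are equal or differ by at most 2."""
    occ = occurrences(h, n)
    return max(occ) - min(occ) <= 2


def is_totally_balanced(h, n):
    """All dimensions are traversed exactly the same number of times."""
    occ = occurrences(h, n)
    return max(occ) == min(occ)


print(is_balanced((1, 2, 3, 2, 1, 2, 3, 2), 3))           # True  (counts 2, 4, 2)
print(is_totally_balanced((1, 2, 3, 2, 1, 2, 3, 2), 3))   # False
print(is_totally_balanced((1, 2, 1, 2), 2))               # True
```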
In order to generate the complete subset of balanced Gray codes in a given $H_N$ space, we have adapted the (d, g)-algorithm proposed by Wild [10] to generate all the Hamiltonian cycles. As this algorithm produces more cycles than the ones we are interested in, we had to insert an additional selection phase during the generation process in order to discard branches that would lead to imbalanced Hamiltonian cycles.
That additional selection can be placed before the other treatments (coherency, small cycles elimination, ...) applied to each generation node (in the generation tree). In this way, it cuts any unproductive branch as soon as possible, thus avoiding useless computations.
That selection consists in checking that the occurrences of the dimensions already used in the partial construction of the cycle are compatible with a balanced cycle.When this is not the case, the candidate is discarded.To check this, we compute two values that are respectively, the maximal number of occurrences allowed per dimension in a balanced code (O), and the maximal number of dimensions (D) with that specific number of occurrences.
Those two numbers can be directly deduced from the dimension N of the N-cube. The imbalance detection algorithm is given in Alg. 1. The imbalance is detected as soon as the number of occurrences of one dimension exceeds O or the number of dimensions having reached O exceeds D.
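A possible Python sketch of this test is given below. The closed forms used for O and D are an assumption derived here from the definition of balance, not formulas quoted from the paper: since each dimension is traversed an even number of times along a cycle and the $2^N$ moves are spread over N dimensions with counts differing by at most 2, one consistent choice is $O = 2\lceil 2^{N-1}/N \rceil$ and $D = 2^{N-1} - N(O/2 - 1)$. The partial path is represented as a plain sequence of traversed dimensions, which is also an assumption.

```python
def balance_limits(n):
    """Return (O, D): the maximal number of occurrences per dimension in a
    balanced code and the number of dimensions allowed to reach it.
    Closed forms assumed in this sketch, not quoted from the paper."""
    half = 2 ** (n - 1)
    o = 2 * -(-half // n)            # 2 * ceil(2^(n-1) / n)
    d = half - n * (o // 2 - 1)      # solves D*O + (n - D)*(O - 2) = 2^n
    return o, d


def is_imbalanced(moves, n):
    """True as soon as the partial path `moves` cannot be completed into
    a balanced cycle according to the two limits O and D."""
    max_occ, max_dims = balance_limits(n)
    occ = [0] * (n + 1)              # occ[d]: occurrences of dimension d
    at_max = 0                       # dimensions having reached max_occ
    for d in moves:
        occ[d] += 1
        if occ[d] > max_occ:         # too many moves along dimension d
            return True
        if occ[d] == max_occ:
            if at_max == max_dims:   # too many dimensions at the maximum
                return True
            at_max += 1
    return False


print(balance_limits(5))                              # (8, 1)
print(is_imbalanced((1, 2, 3, 2, 1, 2, 3, 2), 3))     # False
print(is_imbalanced((1, 2, 1, 2, 1, 2, 1, 2), 3))     # True
```

For N = 5 these limits correspond to one dimension used 8 times and four dimensions used 6 times, which sums to the 32 moves of the cycle.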
Two other algorithmic enhancements may be added to the process. The former is a treatment of the nodes in the generation tree that aims at speeding up the descent towards the leaves, by jumping several levels in the tree in a single iteration. The latter is an extension of the former, as it consists in starting the generation process not at the root of the tree but several levels deeper. However, experiments show that such additions do not systematically reduce the cost of the algorithm. A deeper study is necessary to precisely determine the impact of those additions.
Finally, all the paths that are totally specified within the generation process (the leaves of the generation tree) are transformed into their canonical form.That form is added to the lexicographically ordered list of balanced Gray codes if not already present.
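The insertion into the lexicographically ordered list can be sketched with a binary search, as below. This is only one possible implementation, assuming canonical forms are stored as Python tuples; the name `add_canonical` is hypothetical.

```python
from bisect import bisect_left


def add_canonical(codes, form):
    """Insert the canonical form `form` into the sorted list `codes`
    unless it is already present. Return True if it was inserted."""
    i = bisect_left(codes, form)         # tuples compare lexicographically
    if i < len(codes) and codes[i] == form:
        return False                     # class already known
    codes.insert(i, form)
    return True


codes = []
print(add_canonical(codes, (1, 2, 3, 2, 1, 2, 3, 2)))   # True
print(add_canonical(codes, (1, 2, 1, 3, 1, 2, 1, 3)))   # True
print(add_canonical(codes, (1, 2, 3, 2, 1, 2, 3, 2)))   # False
print(codes[0])                                         # (1, 2, 1, 3, 1, 2, 1, 3)
```

The position of a form in `codes` then gives its rank in the totally ordered set of class representatives.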
So, we obtain an algorithm that generates all the non-isomorphic balanced Gray codes in a given $H_N$ space.

Application
The first series of experiments is dedicated to the validation of the canonical form previously presented.Then, the second part is dedicated to the balanced Gray code generation algorithm.

Validation of the canonical form and the generation algorithm
The first set of experiments consists in checking the completeness of the generation algorithm described in Section 3. So, this algorithm is used to experimentally retrieve all the classes in N-cubes up to dimension five. For larger dimensions, the number of distinct cycles is too large to be exhaustively computed (777739016577752714 for $H_6$).
For each set $H_N$, all Hamiltonian cycles are generated by the algorithm without activating the balance selection. Then, canonical forms of the cycles are computed according to $C$ in order to deduce the distinct classes in the space.
The numbers of resulting classes have been compared to the references provided in [1] and initially coming from [5]. Our algorithm has successfully found a unique class for dimensions 2 and 3. It found 9 classes for dimension 4 and 237675 classes for dimension 5. These results confirm the completeness of the generation algorithm.

Application of the balanced Gray code generation algorithm
In theory, the presented algorithm can generate all the balanced cycles for a given dimension of N-cube. However, this is not pertinent in practice due to the exponential increase of the number of cycles. In such a case, any algorithm would be confronted with two limitations: memory and execution time. For example, our algorithm can generate all the balanced Gray codes for dimensions up to 5 in a few seconds, whereas it would take an unreasonable amount of time to generate all the cycles for dimensions 6 and above.
Indeed, in our application context of PRNGs, we need only to generate some particular balanced cycles, according to the regarded properties.It is then possible to restrict the search to some particular cycles.So, it should be possible to obtain a fast algorithm for generating specific balanced Gray codes.
Moreover, compared to other methods to generate balanced Gray codes, like the extended Robinson-Cohn (further denoted as e-RB) algorithm [9] or Bykov's one [1], our approach presents the advantage of being more complete, and thus more flexible. It is able to find any balanced cycle that has some specific properties, namely one that is locally balanced and whose mixing time (time until the Markov chain is ε-close to the uniform distribution) is reduced.
As a first example, if we consider dimension 5, the e-RB method can only generate 2 balanced cycles (modulo cycle isomorphism), given in Table 2. The cycles are given in canonical form and the numbers in the left column correspond to their positions in the totally ordered set of all balanced cycles for dimension 5 (26155 cycles). Both cycles have a local balance of 12 and a mixing time of 31, where ε is $10^{-6}$. However, for this dimension, the minimal local balance is 7 (only one cycle) and the best mixing time is 29 (several cycles with different local balances). All those cycles are listed in Table 3, together with their local balance and mixing time. So, it is clear that our method is better suited to find cycles of interest for the construction of PRNGs.
Table 3: Excerpt of the 26155 non-isomorphic Hamiltonian cycles generated by our method with either the smallest local balance or the smallest mixing time with ε = $10^{-6}$ for dimension N = 5.
A second example is related to Bykov's construction of locally balanced cycles. The proposed algorithm builds a family of Hamiltonian cycles in an N-cube with a specific local balance of at most $n + 3 \log_2(n)$. However, Table 3 shows us two facts. The former is that only two cycles among the 7403 with this particular local balance (11 for dimension 5) obtain the minimal mixing time. The latter is that this minimum is also reached by some cycles with other local balances (10 and 12). Thus, a more exhaustive algorithm, like the one we propose, is useful to get all the cycles better suited to the inclusion in a PRNG and to provide a wider choice.
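The value 11 quoted for dimension 5 can be checked numerically with the short sketch below. Taking the integer part of the bound is an assumption made here, on the grounds that a local balance is an integer; the function name `bykov_bound` is illustrative.

```python
import math


def bykov_bound(n):
    """Largest integer not exceeding n + 3*log2(n) (assumed rounding)."""
    return math.floor(n + 3 * math.log2(n))


print(bykov_bound(5))   # 11, since 5 + 3*log2(5) is about 11.97
```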

Conclusion
A canonical form has been proposed to provide unique representations of Hamiltonian cycles in N-cubes. All the properties of an equivalence relation over the set $H_N$ have been proved. Based on this form and on Wild's algorithm that generates cycles in graphs, a new algorithm has been designed to generate all the balanced Hamiltonian cycles in any N-cube. Restrictions to specific cycles can be used to limit the generation and to avoid the combinatorial explosion of the number of cycles for dimensions greater than 6.
In the application context of PRNG construction, we have shown that our algorithm is better suited than other existing methods that generate only specific cycles, like the extended Robinson-Cohn and the Bykov ones.
Hence, our algorithm provides a useful tool to study the cycle properties that are relevant to the inclusion in a PRNG. This study is planned as our next work, together with performance optimization of our generation algorithm.

Theorem 1. For any cycles $a$ and $b$ in $H_N$, $C(a) = C(b)$ if and only if $a$ and $b$ are isomorphic cycles.

Fig. 3: Application of $R \circ D$ to cycles from $H_4$.

Table 2: The 2 balanced cycles generated by the e-RB method in dimension 5 and their corresponding mixing time when ε is $10^{-6}$ (columns: sequence of traversed dimensions of the N-cube, local balance).
Algorithm 1: Imbalance detection
Input: a partially built path p
Output: a boolean indicating True if the path is imbalanced and False otherwise

Initialize array od[] of size n with 0
nbD ← 0    // Number of dimensions with max occurrences
imb ← False    // We start with balanced path assumption

for each valid move in p do
    get the dimension d along which the move is done
    od[d] ← od[d] + 1    // move added to occurrences of d
    if od[d] > O then    // too many moves on dimension d
        imb ← True    // imbalance
    else
        if od[d] = O then    // dim d reaches max occurrences
            if nbD = D then    // too many dims with max occs
                imb ← True    // imbalance
            else
                nbD ← nbD + 1    // one more dim at max occurrences
return imb