Appendix

 
 
Algorithms

 
 The aim of these notes is not to cover in detail all of the algorithms used to iterate automata networks and study their dynamics, but rather to give the necessary background needed to begin to program on a microcomputer the examples discussed in the first four chapters. Numerical simulation is the favorite approach in automata networks; it plays somewhat the same role as experiments in other fields. The phenomena are first observed with the help of simulations, and are then interpreted, if possible, by the theory.

CODING

Configurations

Generally speaking, the decimal code $d(c)$ of a configuration $c$ of $k$ automata, indexed by the subscript $j$, is written:
 
 
\begin{displaymath}d(c)=\sum_{j=0}^{k-1}2^j e(j)\end{displaymath} (1)
where $e(j)$ is the state of the automaton subscripted by $j$. Conversely, the state configuration of the automata $e(j)$ is the binary code for $d$. The correspondence is bijective. The $e(j)$ are the remainders of the successive divisions of $d$ by 2. Table 2.2 gives the decimal codes and the corresponding binary configurations from 0 to 31.
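
As an illustration, here is a minimal sketch in Python (the choice of language is ours; any microcomputer language would do) of the two codings of equation (1); the names decimal_code and configuration are our own.

\begin{verbatim}
def decimal_code(e):
    """Decimal code d(c) of a configuration, equation (1):
    e[j] is the state (0 or 1) of the automaton with subscript j."""
    return sum(e[j] * 2**j for j in range(len(e)))

def configuration(d, k):
    """Inverse coding for k automata: the e(j) are the remainders
    of the successive divisions of d by 2."""
    e = []
    for _ in range(k):
        d, r = divmod(d, 2)
        e.append(r)
    return e

# For example, with k = 5, the code 11 corresponds to 1 1 0 1 0:
assert configuration(11, 5) == [1, 1, 0, 1, 0]
assert decimal_code([1, 1, 0, 1, 0]) == 11
\end{verbatim}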

Boolean functions

We start by coding the input configurations using the decimal code. In the general case, if the decimal code of each input configuration, denoted $d$, varies from 0 for the $000\ldots 0$ configuration to $2^k-1$ for the $111\ldots 1$ configuration, the code $f$ of the boolean function is given by:
 
 
\begin{displaymath}f = \sum_{d=0}^{m-1}2^d o(d)\end{displaymath} (2)
where $o(d)$ is the output which corresponds to the input configuration $d$, and $m=2^k$.

As an example, we code the 16 boolean automata with two inputs from 0 to 15. We choose as the most significant bit the one which corresponds to the input configuration 11 and as the least significant bit the one which corresponds to 00.

The AND function, for example, has a code of 8:

\begin{displaymath}code(AND)=0\times2^0+0\times2^1+0\times2^2+1\times2^3=0\times1+0\times2+0\times4+1\times8=8\end{displaymath} (3)
and the OR function, 14:
\begin{displaymath}code(OR)=0\times1+1\times2+1\times4+1\times8=14\end{displaymath} (4)
The exclusive OR function has a code of 6, and the NAND, 7. Table 2.1 gives the decimal codes and the truth tables of the 16 boolean functions with two inputs. Some examples of the coding of boolean functions with three inputs can be found in chapter 3, section 3.2. These codes allow the functions of the boolean automata to be written in a simple and standard way, as well as to be easily stored and manipulated in the computer.
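
As a check on these codes, here is a minimal sketch in Python implementing equation (2); the function name is ours, and the truth tables are listed in the order of the input configurations 00, 01, 10, 11, following the convention chosen above.

\begin{verbatim}
def function_code(outputs):
    """Code f of a boolean function, equation (2):
    outputs[d] is the output o(d) for the input configuration
    of decimal code d, for d = 0, ..., m-1 with m = 2**k."""
    return sum(outputs[d] * 2**d for d in range(len(outputs)))

# Truth tables listed for the inputs 00, 01, 10, 11:
assert function_code([0, 0, 0, 1]) == 8    # AND
assert function_code([0, 1, 1, 1]) == 14   # OR
assert function_code([0, 1, 1, 0]) == 6    # exclusive OR
assert function_code([1, 1, 1, 0]) == 7    # NAND
\end{verbatim}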

DEFINITION OF THE NETWORK

A network is defined by:

1. Knowing the inputs of each automaton.

In the general case, for an arbitrary connectivity, the inputs of each automaton are given by matrices: in the case of automata with two inputs, two matrices, called for example $e1$ and $e2$, give for each automaton the number of the automaton connected to its input #1 and to its input #2.

For the example of figure 2.4:

\begin{displaymath}e1(1)=2 \qquad e1(2)=3 \qquad e1(3)=4\end{displaymath} (5)
\begin{displaymath}e1(4)=5 \qquad e1(5)=6 \qquad e1(6)=1\end{displaymath} (6)
\begin{displaymath}e2(1)=2 \qquad e2(2)=3 \qquad e2(3)=4\end{displaymath} (7)
\begin{displaymath}e2(4)=5 \qquad e2(5)=6 \qquad e2(6)=1\end{displaymath} (8)

In general, it is not necessary to define the outputs of each automaton.

In the case of cellular connectivity, it can be simpler to recalculate the input automata each time rather than to store them in memory.
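
In Python, for instance, the connection matrices of equations (5)-(8) can be stored as vectors; this is a minimal sketch, in which the dummy entry at subscript 0 is our own convention so that automata are numbered from 1, as in the text:

\begin{verbatim}
# Input "matrices" for the 6-automaton example of figure 2.4,
# equations (5)-(8); entry 0 is unused (automata numbered 1..6).
e1 = [None, 2, 3, 4, 5, 6, 1]
e2 = [None, 2, 3, 4, 5, 6, 1]

# With cellular (here circular) connectivity, an input can
# instead be recomputed on the fly rather than stored:
def e1_cell(i, n=6):
    return i % n + 1    # the next automaton on a ring of n automata
\end{verbatim}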

2. The choice of the transition rules of the automata.

In the case of boolean automata, the result of the transition rule must be given for each automaton and for each input configuration. We then use a matrix with subscripts $i$ and $j$, where one subscript refers to the automaton and the other to the input configuration. This second subscript is the decimal code of the input configuration, incremented by 1. For two inputs:
 
 

\begin{displaymath}j = 1 + eta(e1(i)) + 2 \times eta(e2(i))\end{displaymath} (9)
where eta gives the states of the automata. We add 1 so that $j$ varies from 1 to $2^k$, where $k$ is the connectivity of the network.

3. The choice of the mode of iteration (see the corresponding programming principle below).

THE INITIAL CONDITIONS

1. Initially, the time is set to 0.

2. The initial configuration of the network must be chosen. Depending on the application, it can be:

$\bullet$ programmed, in the simple cases of a homogeneous configuration or of a random configuration. In the case of an exhaustive study to determine the iteration graph, the decimal code of the configuration is the index of the initial condition loop.

$\bullet$ input from the keyboard, using a text editor or a graphical program.

$\bullet$ or finally, it can be determined by external data gathered by a set of sensors, in the case of an application to image recognition or, more generally, to pattern recognition.

ITERATIONS

The central part of the program is the iteration of the transition functions. This is what takes the most execution time, even though the number of programming lines is often quite small. Consequently, this is the part of the program that it is most worthwhile to optimize. An external loop increments the time. An internal loop, subscripted by $i$, the number of each automaton, gives the new state of the automaton as a function of the states of the connected automata at the preceding time. After having evaluated $j$, the code of the input configuration of an automaton, as defined earlier, its new state, neta$(i)$, is calculated by the transition function $f(i,j)$:
 
 
\begin{displaymath}neta(i)= f(i,j)\end{displaymath} (10)
There are two cases:

$\bullet$ for iteration in series, the state of the automaton is updated directly and we proceed to evaluate the following automaton. The preceding equation can therefore be written:

\begin{displaymath}eta(i)= f(i,j)\end{displaymath} (11)
$\bullet$ for a parallel iteration, the new states of all of the automata must have been calculated before the input configurations can be modified. Two iterations on the automata must then be performed, the first to calculate all of the new states (matrix neta$(i)$), and the second to update the matrix of the states which will be used to calculate the input configurations for the next iteration:
\begin{displaymath}eta(i)= neta(i)\end{displaymath} (12)
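
The two modes of iteration can be sketched as follows in Python, using the input matrices $e1$ and $e2$ and the transition matrix $f$ of points 1 and 2 above, together with the state vector eta (all with entry 0 unused, following our earlier convention); this is one possible organization, not the only one.

\begin{verbatim}
def input_code(eta, e1, e2, i):
    """Code j of the input configuration of automaton i, equation (9)."""
    return 1 + eta[e1[i]] + 2 * eta[e2[i]]

def iterate_serial(eta, f, e1, e2, n):
    """Iteration in series: each state is updated immediately,
    equation (11), and is seen by the automata that follow."""
    for i in range(1, n + 1):
        eta[i] = f[i][input_code(eta, e1, e2, i)]

def iterate_parallel(eta, f, e1, e2, n):
    """Parallel iteration: all new states are computed from the old
    ones, equation (10), then copied back, equation (12)."""
    neta = [None] * (n + 1)
    for i in range(1, n + 1):
        neta[i] = f[i][input_code(eta, e1, e2, i)]
    for i in range(1, n + 1):
        eta[i] = neta[i]
\end{verbatim}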

Determining the iteration graph

The idea behind this method is to color the nodes of the iteration graph. We use a ``graph'' vector, whose subscript is the decimal transcription of each of the $2^N$ configurations. (The size of this vector is the limiting factor for this method.) To each element of this vector is assigned a whole number specific to its basin of attraction: the color of the node.

We initially set all the colors to 0. Starting from the first configuration, we iterate the state change functions, placing 1's in the elements of the graph vector which correspond to the nodes which have been reached. Of course, we look at the color of each node before changing it: if it is equal to 1, the limit cycle has been reached.

The algorithm continues, incrementing the color each time we look for a new basin, starting from the smallest configuration whose color is still 0. The iteration of the state change functions stops each time a color different from 0 is encountered: we have found a new basin of attraction if the color found is the same as that of the initial condition we started from. If the color is smaller, we have fallen into a basin that was already found. The preceding color must therefore be reassigned, starting from the last initial condition. The retrace operations are faster if we keep a successor vector which stores the successor of each configuration. Similarly, during the retrace, one can increment a counter at each step to measure the periods.

The algorithm is finished when all of the nodes are colored. The highest color gives the number of attractors.
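
A minimal sketch of this coloring algorithm in Python, assuming a function next_config(d) (our name) which returns the decimal code of the successor of the configuration of decimal code d under parallel iteration; the period counting during the retrace is omitted for brevity.

\begin{verbatim}
def color_graph(next_config, N):
    """Color the 2**N nodes of the iteration graph; the highest
    color reached is the number of attractors."""
    graph = [0] * 2**N        # colors of the nodes, initially 0
    succ = [0] * 2**N         # successor of each configuration
    color = 0
    for start in range(2**N):
        if graph[start] != 0:
            continue
        color += 1            # look for a new basin with a new color
        d = start
        while graph[d] == 0:
            graph[d] = color
            succ[d] = next_config(d)
            d = succ[d]
        if graph[d] < color:  # fell into a basin already found:
            old = graph[d]    # reassign the preceding color,
            color -= 1        # retracing from the initial condition
            d = start
            while graph[d] != old:
                graph[d] = old
                d = succ[d]
    return graph, color       # color is the number of attractors
\end{verbatim}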

THE STATISTICAL APPROACH

If the number of automata is greater than thirty, the preceding algorithm, which requires a memory size of the order of $2^N$ and a time of the order of $N2^N$, is inapplicable. The statistical method consists of starting from randomly chosen initial conditions and collecting dynamical data.

To find a limit cycle, a reference must be stored in memory: this is the configuration reached by the network after an arbitrary time which is assumed to be greater than the transient period. The algorithm consists of taking the reference at time $t_{ref}$ and comparing it to all of the configurations obtained between times $t_{ref}$ and $2\times t_{ref}$. A counter can be used to measure the period. It is possible that the reference was taken too soon: if the limit cycle has not been detected by $2\times t_{ref}$, we take a new reference at $2\times t_{ref}$ and we double the search time. The operation ``new reference--new search'' continues until the limit cycle is reached, unless we have decided a priori to interrupt the search at a maximum time.
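
A minimal sketch in Python of this reference-doubling search; step(c) (our name) advances the network by one time step and returns the new configuration, for example as a tuple, so that configurations can be compared directly.

\begin{verbatim}
def find_period(step, config, t_max=2**20):
    """Search for the period of the limit cycle: take a reference at
    time tref, compare up to 2*tref, and double tref until the cycle
    is found or t_max is exceeded (in which case None is returned)."""
    tref = 1
    config = step(config)          # reach the first reference time tref = 1
    ref = config
    while tref < t_max:
        for period in range(1, tref + 1):  # compare between tref and 2*tref
            config = step(config)
            if config == ref:
                return period
        ref = config               # reference taken too soon:
        tref *= 2                  # new reference at 2*tref, doubled search
    return None
\end{verbatim}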

To estimate the number of attractors, we must compare the configurations reached during a limit cycle to the configurations obtained for the previously determined limit cycles. If the number of attractors is high, this can be a long search. This is why certain decimal codings of the configurations are sometimes used.

CELLULAR AUTOMATA AND BOUNDARY PROBLEMS

The algorithms proposed in the general case can of course be applied to cellular automata. They can be simplified by taking advantage of the regularity of the connections and the uniformity of the state change functions.

It is not necessary to keep connection matrices representing the inputs of the automata, since each input can generally be obtained by a simple translation from the position of the automaton under consideration. A small difficulty comes up due to the finite character of the lattice used in simulations. Cellular automata are defined for infinite lattices, whereas the simulated lattices are limited by boundaries. There are several ways of dealing with boundary effects. One way, for example, is to fix the states of the automata on the boundaries to arbitrary values. The most frequent solution used in simulations consists of connecting opposite edges to each other, which has the advantage of being less rigid than the previous solution. A linear lattice then forms a circle, and a higher dimensional lattice forms a torus or a hypertorus.

To take into account the conditions at the edges, we add two ``surfaces'' to the lattice (in fact, an automaton at each edge in one dimension, two rows of automata in two dimensions, etc.). The iteration loops which calculate the new state, neta$(i)$, only apply to the internal lattice. The states of the edges are only updated during the second loop, which updates the vector eta$(i)$; they then take on the values of the new eta calculated for the ``real'' edge to which they are connected. For example, if a linear lattice of 10 automata is being simulated, these automata are numbered from 2 to 11. We add to them a left edge, numbered 1, and a right edge, numbered 12. At the beginning of any loop from 2 to 11 to update neta$(i)$, eta$(1)$ is equal to eta$(11)$ and eta$(12)$ is equal to eta$(2)$. In this way, the state of automaton 11, the farthest to the right, is set as a function of a right-hand signal sent by automaton 2, which is the farthest to the left. Of course, if the connections involve more than the nearest neighbors, the size of the edges is correspondingly increased.
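
A minimal sketch in Python of this bookkeeping for the linear lattice of 10 automata just described, assuming nearest-neighbor connections and some local function rule(left, center, right) (our name); here the edges are refreshed at the beginning of each step, which amounts to the same thing as updating them during the second loop.

\begin{verbatim}
n = 10                        # real automata, numbered 2 .. 11
eta  = [0] * (n + 3)          # entry 0 unused; 1 and 12 are the edges
neta = [0] * (n + 3)

def update(rule):
    """One parallel step with periodic boundaries."""
    eta[1] = eta[n + 1]       # left edge copies automaton 11
    eta[n + 2] = eta[2]       # right edge copies automaton 2
    for i in range(2, n + 2):             # internal lattice only
        neta[i] = rule(eta[i - 1], eta[i], eta[i + 1])
    for i in range(2, n + 2):
        eta[i] = neta[i]
\end{verbatim}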

Regardless of the dimension of the lattice, each automaton can be referenced by a single whole number. For a cube with dimensions $L$, $M$, and $N$, the subscript $i$ of the automaton in a vector eta, for example, is then given by:
 
 

\begin{displaymath}i = u + v \times L + w \times L \times M\end{displaymath} (13)
where $u$, $v$, and $w$ are the coordinates of the automaton in the cube. It follows that $i$ goes from 1 to $L\times M\times N$. Its six nearest neighbors have subscripts $i\pm 1$, $i\pm L$, and $i\pm L\times M$. If this representation is chosen, it is then convenient to interconnect the edges with a single loop. For the preceding cubic lattice, $L\times M$ automata are placed at the head of the vector; they constitute the left edge. Then, the automata whose subscripts go from $L\times M+1$ to $L\times M\times(N+1)$ constitute the real lattice. Finally, $L\times M$ automata are added as the right edge. It follows that $i$ goes from 1 to $L\times M\times (N+2)$. neta$(i)$ is iterated for $i$ going from $L\times M+1$ to $L\times M\times(N+1)$. The left edge is then updated according to:
 
 
\begin{displaymath}eta(i) = eta(i+ L \times M \times N)\end{displaymath} (14)
for $i$ going from 1 to $L\times M$. The equivalent procedure is used for the right edge. In fact, this method is equivalent to deforming the cube into a super-helix rather than into a torus. It has the advantage of being programmable in a single loop, regardless of the dimension of the lattice, and the size of the approximation compared to an infinite lattice is the same as for the torus.
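
A minimal sketch in Python of this helical update of the edges, for given sizes $L$, $M$, $N$ (the values below are illustrative); the iteration of neta$(i)$ over the real lattice proceeds as in the text.

\begin{verbatim}
L, M, N = 8, 8, 8
size = L * M * (N + 2)        # real lattice plus the two added "surfaces"
eta = [0] * (size + 1)        # entry 0 unused (automata numbered from 1)

def update_edges():
    """Helical boundary conditions in a single loop: equation (14)
    for the left edge, and its equivalent for the right edge."""
    for i in range(1, L * M + 1):
        eta[i] = eta[i + L * M * N]                 # left edge
        eta[L * M * (N + 1) + i] = eta[L * M + i]   # right edge
\end{verbatim}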

If the lattice is one-dimensional, its successive configurations can be displayed one after the other, according to Wolfram's representation. Rather than ``generating'' boring tables of 0's and 1's, it is more readable to code 0's by blanks or dots and 1's by plus signs (+) or asterisks (*). Word processing printers can sometimes cause problems if they print proportionally spaced characters: a fixed spacing must be set so as not to scramble the output table. Of course, we can also choose a graphical mode of representation, which has sometimes been done in this book, representing the states of the automata by white or black dots.
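
For example, a minimal sketch in Python of this representation, here iterating rule 90 (each cell becomes the exclusive OR of its two neighbors) on a small ring; the rule is only an illustrative choice:

\begin{verbatim}
def show(config):
    """Print one configuration: '.' for state 0, '*' for state 1."""
    print(''.join('*' if s else '.' for s in config))

cells = [0] * 31              # a ring of 31 cells, a single 1 in the middle
cells[15] = 1
for _ in range(16):
    show(cells)
    cells = [cells[i - 1] ^ cells[(i + 1) % len(cells)]
             for i in range(len(cells))]
\end{verbatim}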

If the lattice is two-dimensional, a powerful computer is needed to observe the output in ``real time,'' otherwise visually we can only see a sweep across the screen corresponding to the sequential update of the pixels. By using certain codes based on an alphanumeric representation, or on the color, we can integrate information about several time intervals within a single table (see figures 4.5 or 10.4).

THE GRADIENT METHOD

This is the classical method which is used to find the minima of a function of several variables.

Assume we want to find the minimum of $y$, a function of a single variable:

\begin{displaymath}y = f(x)\end{displaymath} (15)
The idea is to start from an arbitrary point $x_0$ (or a cleverly chosen point when it is possible) and to move in successive steps toward the minimum. The calculation of the displacement from a position $x_n$ to the following position $x_{n+1}$ depends on the slope of the function in the neighborhood of $x_n$. The steeper the slope, the larger the displacement; a shallow slope indicates that we are near the minimum, whereas a steep slope indicates that there is still a long way to go. The algorithm consists of moving at each step according to:
\begin{displaymath}x_{n+1} = x_{n} - \epsilon \times f'(x_{n})\end{displaymath} (16)
where $\epsilon$ is a carefully chosen multiplicative coefficient; the minus sign moves us down the slope, toward the minimum. At each step we test the derivative $f^\prime(x)$ of $f(x)$ and we stop the iteration when $f^\prime(x)$ is sufficiently close to 0. In fact, one of the difficulties of this method rests in the choice of $\epsilon$. If $\epsilon$ is too small, a large number of iterations is needed before a reasonable approximation to the minimum is obtained. If $\epsilon$ is too big, we are in danger of oscillating around the minimum or of finding ourselves in a different valley, and thus losing track of the minimum we were looking for.

In the case of a function of several variables, $x$, $y$, $z$, \ldots, the method can be generalized by using the gradient of $f(x,y,z,\ldots)$. The iteration formula then becomes:

\begin{displaymath}x_{n+1} = x_{n} - \epsilon \times f'_x(x_{n},y_{n},z_{n})\end{displaymath} (17)
\begin{displaymath}y_{n+1} = y_{n} - \epsilon \times f'_y(x_{n},y_{n},z_{n})\end{displaymath} (18)
\begin{displaymath}z_{n+1} = z_{n} - \epsilon \times f'_z(x_{n},y_{n},z_{n})\end{displaymath} (19)

The notation ${f^\prime}_x$ represents the partial derivative of $f(x,y,z,\ldots)$ with respect to $x$. The displacements in the space of the variables are then proportional to the gradient of the function $f$ and take place along the line of steepest slope. Of course, the difficulties concerning $\epsilon$ are the same as in the one-dimensional case.
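
As an illustration, a minimal sketch in Python of equation (16) in one variable (the generalization to equations (17)-(19) is immediate); the test function and the value of $\epsilon$ are illustrative choices.

\begin{verbatim}
def gradient_descent(fprime, x0, eps=0.1, tol=1e-6, max_steps=10000):
    """Move against the slope, x_{n+1} = x_n - eps * f'(x_n),
    and stop when f'(x) is sufficiently close to 0."""
    x = x0
    for _ in range(max_steps):
        slope = fprime(x)
        if abs(slope) < tol:
            break
        x -= eps * slope
    return x

# Example: f(x) = (x - 3)**2, hence f'(x) = 2*(x - 3); minimum at x = 3
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
\end{verbatim}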
