Cellular Automaton - CA
The cellular automaton (CA) simulation is a simple 9-point stencil calculation similar to the popular game of life by John Horton Conway. Although this application is simple, it represents a large class of parallel applications. The Picture below shows a screenshot of the cellular automaton after 500 iterations.
Image
The Stencil Calculation
The picture illustrates how the subsequent state of a cell (blue) is calculated. The state of all 8 surrounding cells and the cell itself is taken into account (U - Upper, L - Lower, C - Center, L - Left, R - Right).
Image
To ease the calculation of border cells the outer cells are virtually connected together and build a 2D-torus. This connection is implemented with ghost zones around the cell area to hold the corresponding state of the cell from the other border (see the Picture).
Image
Parallel Version
A parallelization is done via a 1-dimensional domain decomposition (horizontal cuts). This results in a horizontal slice for each compute node. Now, communication is required due to the stencil neighbour's state. The data communication pattern is depicted in the picture. Only neighbour nodes need to communicate.
Image
Implementations
We derived several versions from the original cellular automaton, provided by the book:
Parallele Programmierung mit MPI - ein Praktikum -
Peter Sanders and Thomas Worsch
ISBN: 3-931216-76-4
Logos Verlag Berlin GmbH, 1997
The parallel implementations focus on MPI. The following verions were implemented:
- MPI1 blocking/synchronous communication
- MPI1 blocking/synchronous using MPI_Sendrecv
- MPI1 non-blocking/asynchronous communication
- MPI2 one-sided asynchronous communication with post-start-complete-wait synchronization scheme
- NEON one-sided asynchronous communication