3. Traditional parallel implementation

The traditional distribution of data in electronic structure calculations is shown schematically in Fig. 1 for the case of a $4 \times 6 \times 8$ grid and 4 nodes. When applying the potential to a trial eigenvector, the data is initially represented in momentum-space (left-hand side of Fig. 1), and each node holds a number of "rods" of data running along one grid direction. In the first stage of the 3D-FFT, each node performs a 1D-FFT along that direction on each of its rods. The nodes then communicate to effect a transpose in which the data is redistributed into rods running along a second direction (middle of Fig. 1). Each node then performs a second 1D-FFT along this second direction on these rods. A second communication stage transposes the data into rods along the third direction (right of Fig. 1), and the final stage is to perform a 1D-FFT on these rods. The DFT from real- to momentum-space is performed similarly by reversing these operations.
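The three stages can be illustrated with a small single-process sketch in Python using NumPy. It is not the implementation discussed here: the nodes are only simulated by a loop, the communication stages are stood in for by in-memory axis reordering (a real code would use something like an MPI all-to-all), and the function names, the round-robin assignment of rods to nodes, and the axis order `(2, 1, 0)` are all assumptions made for illustration.

```python
import numpy as np

def rods_for_node(n_rods, n_nodes, node):
    """Rod indices owned by `node` under a round-robin split (assumed layout)."""
    return list(range(node, n_rods, n_nodes))

def fft_stage(data, axis, n_nodes):
    """Apply a 1D-FFT to every rod along `axis`, one simulated node at a time."""
    data = np.asarray(data, dtype=complex)
    # Bring the rod direction last; this reordering stands in for the transpose.
    moved = np.moveaxis(data, axis, -1)
    rods = moved.reshape(-1, moved.shape[-1])   # one row per rod
    out = np.empty_like(rods)
    for node in range(n_nodes):
        own = rods_for_node(rods.shape[0], n_nodes, node)
        out[own] = np.fft.fft(rods[own], axis=-1)  # this node's 1D-FFTs
    return np.moveaxis(out.reshape(moved.shape), -1, axis)

def staged_fft3d(data, n_nodes=4, order=(2, 1, 0)):
    """3D-FFT built from three 1D-FFT stages; `order` is an assumed axis order."""
    for axis in order:
        data = fft_stage(data, axis, n_nodes)
    return data

# The 4 x 6 x 8 grid and 4 nodes of Fig. 1, filled with random test data.
rng = np.random.default_rng(0)
grid = rng.standard_normal((4, 6, 8)) + 1j * rng.standard_normal((4, 6, 8))
result = staged_fft3d(grid, n_nodes=4)
```

Because the 3D transform is separable, `result` should agree with `np.fft.fftn(grid)` to rounding error whichever axis order is chosen; the order only determines which rods each node must hold at each stage, and hence what the two transposes have to move.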

**Figure 1:** Distribution of data for traditional implementation.

Peter Haynes