8. Conclusions
We have presented a new method for performing FFTs on parallel
computers that scales to a larger number of nodes than the
traditional method because of its reduced latency cost. This is achieved
by taking advantage of the inherent data distribution required by the
FFT algorithm. The method is applicable to electronic structure
calculations because of the small FFT grids used, and is most
effective on clusters of workstations, where communication costs
are high. The new method automatically satisfies the load-balancing
requirement and effectively blocks the Hamiltonian matrix, which may
allow new iterative diagonalisation algorithms for block matrices to
be applied.
Peter Haynes