Transpose Algorithm for FFT on APE/Quadrics
Tripiccione, R
1998
Abstract
We describe a novel, practical parallel FFT scheme designed for SIMD systems and/or data-parallel programming. A bit-exchange of elements between the processors is avoided by means of the 'Transpose Algorithm'. Our transposition is based on the assignment of the data field onto a 1-dimensional ring of systolic cells, which is subsequently mapped onto a ring of processors realized as a subset of the system's connectivity. We have implemented and benchmarked a 2-dimensional parallel FFT code on the APE100/Quadrics parallel computer, where, due to a rigid next-neighbour connectivity and the lack of local addressing, efficient FFT implementations had not been realized so far.
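As a rough illustration of the general idea behind a transpose-based 2-dimensional FFT, the following sketch simulates a ring of nodes in plain Python. It is not the authors' APE100/Quadrics implementation: the names `fft`, `ring_transpose` and `fft2_transpose` are hypothetical, one matrix row per ring node is assumed, and each node indexes its buffer with its own node number, so the sketch does not model the machine's lack of local addressing.

```python
import cmath


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ring_transpose(rows):
    """Transpose a P x P matrix whose row p is stored on ring node p.

    Assumption for illustration: the only communication is a cyclic
    shift of a travelling buffer from node p-1 to node p.
    """
    p_count = len(rows)
    travelling = [list(r) for r in rows]
    result = [[0j] * p_count for _ in range(p_count)]
    for step in range(p_count):
        for p in range(p_count):
            # After `step` shifts, node p holds the row that started on node src.
            src = (p - step) % p_count
            result[p][src] = travelling[p][p]
        # One systolic step: every buffer moves to the next node on the ring.
        travelling = [travelling[(p - 1) % p_count] for p in range(p_count)]
    return result


def fft2_transpose(matrix):
    """2-D FFT as: local row FFTs, transpose, local row FFTs, transpose."""
    rows_done = [fft(r) for r in matrix]        # transform along rows
    swapped = ring_transpose(rows_done)         # columns become local rows
    cols_done = [fft(r) for r in swapped]       # transform along former columns
    return ring_transpose(cols_done)            # restore the original layout
```

Calling `fft2_transpose` on a square matrix whose side is a power of two returns its 2-dimensional discrete Fourier transform. All butterfly operations stay local to a node, and data moves between nodes only inside `ring_transpose`, which is the property a transpose-style scheme relies on.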


