An Optimized D2Q37 Lattice Boltzmann code on GP-GPUs
MANTOVANI, Filippo;PIVANTI, Marcello;SCHIFANO, Sebastiano Fabio;TRIPICCIONE, Raffaele
2013
Abstract
We describe the implementation of a thermal compressible Lattice Boltzmann algorithm on an NVIDIA Tesla C2050 system based on the Fermi GP-GPU. We consider two versions of the algorithm, with and without reactive effects. We describe the overall organization of the algorithm and give details of its implementation. Efficiency ranges from 25% to 31% of the double-precision peak performance of the GP-GPU. We compare our results with a different implementation of the same algorithm, developed and optimized for many-core Intel Westmere CPUs.