Tag Archives: CUDA

Multi-GPU, multi-node SPH implementation with arbitrary domain decomposition

E. Rustico, J. Jankowski, A. Hérault, G. Bilotta and C. Del Negro

Abstract: We present a restructured version of GPUSPH, a CUDA-based implementation of SPH. The new version can run on multiple GPUs on one or more host nodes, making it possible to exploit hundreds of devices concurrently across a network and to simulate larger domains at higher resolutions. Partitioning of the computational domain is no longer limited to parallel planes and can follow arbitrary, user-defined shapes at the resolution of individual cells, where a cell is defined by the auxiliary grid used for fast neighbor search.
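The cell-level partitioning mentioned in this abstract can be illustrated with a minimal Python sketch. It is not GPUSPH code: the function names (`cell_index`, `assign_devices`, `quadrant_split`), the uniform cubic grid, and the four-device quadrant layout are all assumptions made for illustration. The idea shown is only that each particle is binned into a cell of the neighbor-search grid, and a user-supplied function then decides which device owns that cell, so the partition need not consist of parallel slabs.

```python
def cell_index(pos, origin, cell_size):
    """Map a 3D position to integer (i, j, k) coordinates on a uniform grid.

    Assumption: cubic cells of edge `cell_size`, grid anchored at `origin`.
    """
    return tuple(int((p - o) // cell_size) for p, o in zip(pos, origin))


def assign_devices(positions, origin, cell_size, cell_to_device):
    """Return, for each particle, the device that owns the cell it falls in."""
    return [cell_to_device(cell_index(p, origin, cell_size)) for p in positions]


def quadrant_split(cell, half=2):
    """Hypothetical user-defined partition: four x-y quadrants, one per device.

    This is deliberately not a set of parallel slabs; any function from a
    cell index to a device id could be substituted here.
    """
    i, j, _k = cell
    return int(i >= half) + 2 * int(j >= half)


# Four particles in a unit square, on a 4 x 4 grid of cells with edge 0.25.
particles = [(0.1, 0.1, 0.0), (0.9, 0.1, 0.0), (0.1, 0.9, 0.0), (0.9, 0.9, 0.0)]
owners = assign_devices(particles, (0.0, 0.0, 0.0), 0.25, quadrant_split)
```

Because ownership is decided per cell, changing the decomposition only means swapping the `cell_to_device` function; the binning step stays the same.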

Adaptation and extension of Casulli algorithms for parallel computers with hardware acceleration and for the application of conservative advection schemes

Jacek A. Jankowski

BAW internal R&D-project report, 2010-2012.

Abstract: The aim of the R&D project is the development and application of new programming paradigms in high-performance computing by adapting Casulli algorithms to emerging parallel computer architectures with hardware acceleration. In addition, the existing advection schemes are to be adapted for all flow regimes.

A hardware-accelerated parallel implementation of a two-dimensional scheme for free surface flows

Results of implementing a two-dimensional semi-implicit scheme for free surface flows using CUDA on an NVIDIA GPU.

A hardware-accelerated parallel implementation of a two-dimensional scheme for free surface flows

J.A. Jankowski

Abstract: This contribution concerns the verification and performance assessment of a hardware-accelerated parallel implementation of an algorithm for the semi-implicit finite difference method for solving the vertically integrated shallow water equations including a non-linear treatment of wetting and drying and conservative advection schemes.

Short info: GPU version of the 2Dxy scheme

Short info: GPU version of the 2Dxy scheme, 9th UnTRIM Workshop, Trento, 7-9 May 2012.

An update on the ongoing development of the GPU implementation of the 2Dxy semi-implicit scheme for general free surface flows. The work is moving towards a generic application by providing a user interface and embedding the CUDA computational core in C++ structures, and it also covers code verification and validation.