Multi-GPU, multi-node SPH implementation with arbitrary domain decomposition

E. Rustico, J. Jankowski, A. Hérault, G. Bilotta and C. Del Negro

Abstract: We present a restructured version of GPUSPH, a CUDA-based implementation of smoothed particle hydrodynamics (SPH). The new version can run on multiple GPUs on one or more host nodes, making it possible to exploit hundreds of devices concurrently across a network and thus to simulate larger domains at higher resolutions. Partitioning of the computational domain is no longer limited to parallel planes and can follow arbitrary, user-defined shapes at the resolution of individual cells, where a cell is defined by the auxiliary grid used for fast neighbor search.
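To make the idea of cell-level partitioning concrete, here is a minimal Python sketch. It is not GPUSPH code: the function names, the dictionary-based device map and the diagonal split are all hypothetical and chosen only for illustration. It assumes a uniform auxiliary grid and a user-supplied function that assigns each cell to a device, and it flags cells that border another device's cells, on the assumption that such cells are the ones whose data would have to be exchanged.

```python
import math

def cell_index(pos, origin, cell_size):
    """Integer (i, j, k) index of the neighbor-search cell containing pos."""
    return tuple(int(math.floor((p - o) / cell_size)) for p, o in zip(pos, origin))

def build_device_map(grid_dims, assign):
    """Map every cell to a device id via a user-supplied assign(i, j, k)."""
    nx, ny, nz = grid_dims
    return {(i, j, k): assign(i, j, k)
            for i in range(nx) for j in range(ny) for k in range(nz)}

def is_edge_cell(cell, device_map):
    """True if any of the (up to 26) neighboring cells belongs to another device."""
    i, j, k = cell
    own = device_map[cell]
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            for dk in (-1, 0, 1):
                other = device_map.get((i + di, j + dj, k + dk))
                if other is not None and other != own:
                    return True
    return False

# Hypothetical non-planar split: a diagonal, staircase-shaped boundary
# between device 0 and device 1 (illustrative only).
def diagonal_split(i, j, k):
    return 0 if i + j < 4 else 1

device_map = build_device_map((4, 4, 1), diagonal_split)
```

Because the assignment is an arbitrary function of the cell index, the boundary between devices does not have to be a plane; in this sketch, any shape expressible per cell works.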

Intel Many Integrated Core (MIC) architecture and the UnTRIM code

Intel Many Integrated Core (MIC) architecture and the UnTRIM code, 11th UnTRIM Workshop, Trento, 19-21 May 2014.

The presentation describes the results of an investigation into the potential of porting the UnTRIM2 code (with subgrids, see Casulli and Stelling 2010) to Intel MIC processors, marketed under the brand name Xeon Phi.

Accurate vertical profiles of turbulent flow in z-layer models

F.W. Platzek, G.S. Stelling, J.A. Jankowski and J.D. Pietrzak

Abstract. Three-dimensional hydrodynamic z-layer models, which are used for simulating the flow in rivers, estuaries, and oceans, suffer from an inaccurate and often discontinuous bottom shear stress representation, due to the staircase bottom. We analyze the governing equations and clearly show the cause of the inaccuracies. Based on the analysis, we present a new method that significantly reduces the errors and the grid dependency of the results.

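Potential der Intel Many Integrated Core Architektur für die Flussmodellierung — Codes UnTRIM und Telemac [Potential of the Intel Many Integrated Core architecture for river modelling: the UnTRIM and Telemac codes]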

Jacek A. Jankowski

This technical report (in German) assesses the feasibility of porting the UnTRIM2 and Telemac codes to the Intel Xeon Phi, i.e. the MIC (Many Integrated Core) architecture. A German abstract follows.

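Adaptierung und Erweiterung von Casulli-Algorithmen für Parallelrechner mit Hardware-Beschleunigung und zur Anwendung von konservativen Advektionsverfahren [Adaptation and extension of Casulli algorithms for parallel computers with hardware acceleration and for the application of conservative advection schemes]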

Jacek A. Jankowski

BAW internal R&D-project report, 2010-2012.

Abstract: The aim of the R&D project is the development and application of new programming paradigms in high-performance computing by adapting Casulli algorithms to emerging parallel computer architectures with hardware acceleration. Additionally, the existing advection schemes are to be adapted for all flow regimes.