Jacek A. Jankowski. Developments concerning parallel computing with the Telemac System, 10th Telemac User Club, Chamrousse, 9-10 October 2003.
BAW is one of the few Telemac users applying the code on high performance computers (HPC), at present all of them parallel machines: at its sites in Hamburg (SGI Origin 3000) and Karlsruhe (SGI Origin 2000), and also at DWD in Offenbach (IBM RS/6000 SP). The presentation addresses the specific methodology, issues and developments concerning HPC:
- dealing with larger models at higher resolution, including mesh partitioning with the programs partel or hansel and merging of the results with the gretel application,
- scalability and serial code performance on the above-mentioned machines,
- application-level checkpoint and restart without relying on the LSF batch system, tested so far on AIX, IRIX and Linux,
- returning exit codes from Telemac runs so that the code can be embedded properly in scripts, batch systems, distributed computing software, etc., as well as correct signalling and killing of the application and separation of the error and listing channels,
- placing the code's working directory on dedicated file systems,
- grid computing using Unicore, especially for accessing the DWD machines.
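The point about exit codes and separated channels can be illustrated with a minimal sketch. It is not taken from the Telemac scripts: the function name `run_with_channels`, the file names and the stand-in "solver" (a child Python process that deliberately fails with code 3) are all assumptions made for illustration. It only shows why a wrapper script needs a reliable exit code and distinct listing and error files to decide what to do next.

```python
import os
import subprocess
import sys
import tempfile


def run_with_channels(command, listing_path, error_path):
    """Run a command, write stdout (listing) and stderr (errors) to
    separate files, and return the exit code of the process."""
    with open(listing_path, "w") as listing, open(error_path, "w") as errors:
        completed = subprocess.run(command, stdout=listing, stderr=errors)
    return completed.returncode


if __name__ == "__main__":
    # Hypothetical stand-in for a solver run: writes to both channels
    # and exits with a nonzero code to simulate a failed run.
    fake_solver = [
        sys.executable, "-c",
        "import sys; print('time step 1 done'); "
        "print('mesh error', file=sys.stderr); sys.exit(3)",
    ]
    workdir = tempfile.mkdtemp()
    listing_file = os.path.join(workdir, "run.listing")
    error_file = os.path.join(workdir, "run.err")

    code = run_with_channels(fake_solver, listing_file, error_file)
    if code != 0:
        # A calling script or batch system can branch on the exit code
        # and inspect only the error channel.
        print(f"run failed with exit code {code}; see {error_file}")
```

A batch script or grid middleware wrapping the run in this way can stop a job chain on a nonzero code instead of parsing the listing output for error messages.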
In conclusion, it is requested that these developments, which at present appear specific only to parallel computing and HPC usage, be adopted into the standard Telemac system software.
Transparencies are available.