What are GPGPUs?

GPGPUs are general-purpose graphics processing units, that is, graphics cards used for general-purpose computing. Because of the demands of modern graphics-intensive programs, graphics cards have become very powerful computers in their own right.

GPGPUs are particularly good at matrix multiplication, random number generation, FFTs, and other numerically intensive and repetitive mathematical operations. With careful programming, they can deliver a 5–10 times speed-up for many codes.

For more examples of applications that are well suited to CUDA, a language that enables use of GPUs, see the NVIDIA CUDA pages.

Submitting Batch Jobs

To use a GPU, you must request one in your PBS script. To do so, add a node attribute to your #PBS -l line or to your qsub command.

Here is an example that requests one GPU. Note that with one GPU you are automatically assigned two CPUs.

#PBS -l nodes=1:gpus=1
#PBS -q fluxg

Note that you must use nodes=1 and not procs=1 or the job will not run.
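For context, a complete job script built around the request above might look like the following minimal sketch. The job name, walltime, and program name (my_gpu_program) are hypothetical placeholders, and any other directives your site requires are omitted.

```shell
#!/bin/bash
#PBS -N gpu_example
#PBS -l nodes=1:gpus=1
#PBS -l walltime=00:10:00
#PBS -q fluxg

# The job name, walltime, and program name are hypothetical; adjust them.

# Start in the directory the job was submitted from (set by PBS).
cd "$PBS_O_WORKDIR"

# Load the CUDA environment, then run the GPU program.
module load cuda
./my_gpu_program
```

Save the script to a file and submit it with qsub.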

For most people, a single GPU will be sufficient. If you need multiple GPUs, however, you can request them. The two most common cases are wanting more than one GPU on the same node (physical machine) and wanting to make sure that the GPUs are on different machines. To get two GPUs and guarantee that they are on the same machine, use

#PBS -l nodes=2:gpus=2,tpn=2
#PBS -q fluxg

In the preceding example, the number of nodes equals the number of GPUs, and the tpn=N value matches them both. The tpn value should not exceed 8, which is the maximum number of GPUs on a single machine. To request two GPUs on two different machines, use

#PBS -l nodes=2:gpus=2,tpn=1
#PBS -q fluxg

In this case, the number of nodes and the number of GPUs must still be the same, and tpn should be the number of GPUs per node.
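The pattern in the two multi-GPU requests can be sketched as a small shell function. Note that gpu_request is a hypothetical helper written for illustration, not part of PBS or Flux; it only prints the directive, using the rules stated above (nodes and gpus both equal the total GPU count, tpn is the number of GPUs per machine and may not exceed 8).

```shell
#!/bin/bash
# gpu_request is a hypothetical helper, not part of PBS or Flux.
# Usage: gpu_request TOTAL_GPUS GPUS_PER_NODE
gpu_request() {
    local total=$1
    local per_node=$2
    # A single machine holds at most 8 GPUs, so tpn must be 1 to 8.
    if [ "$per_node" -lt 1 ] || [ "$per_node" -gt 8 ]; then
        echo "tpn must be between 1 and 8" >&2
        return 1
    fi
    # nodes and gpus are both set to the total number of GPUs.
    echo "#PBS -l nodes=${total}:gpus=${total},tpn=${per_node}"
}

gpu_request 2 2   # both GPUs on the same machine
gpu_request 2 1   # one GPU on each of two machines
```

The two calls reproduce the resource lines shown in the examples above.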

Programming for GPGPUs

The GPGPUs on Flux are NVIDIA graphics processors and use NVIDIA’s CUDA programming language. This is a very C-like language, which can be linked with Fortran codes, and it makes programming for GPGPUs straightforward.

NVIDIA also makes special libraries available that make using the GPGPUs even easier. Two of these libraries are cuBLAS and cuFFT.

cuBLAS is a BLAS library that uses the GPGPU for matrix operations. For more information on the BLAS routines implemented by cuBLAS, see NVIDIA's cuBLAS documentation.

cuFFT is a set of FFT routines that use the GPGPU for their calculations. For more information on the FFT routines implemented by cuFFT, see NVIDIA's cuFFT documentation.

To use the CUDA compiler (nvcc), or to link your code against one of the CUDA-enabled libraries, load the cuda module (module av cuda will display the CUDA versions available):

$ module load cuda

This will give you access to the nvcc compiler and will set the environment variables that can be used to compile and link against cuBLAS, cuFFT, and other CUDA libraries in the ${CUDA_LINK} directory.
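As a rough illustration, compiling and linking against cuBLAS might look like the sketch below. It assumes that ${CUDA_LINK} holds the library directory as described above, and my_program.cu is a hypothetical source file that calls cuBLAS routines; check the module's own help text for the exact variables it sets.

```shell
# my_program.cu is a hypothetical CUDA source file that calls cuBLAS.
module load cuda

# -L adds ${CUDA_LINK} to the library search path (an assumption about
# how the variable is meant to be used); -lcublas links the cuBLAS library.
nvcc -o my_program my_program.cu -L"${CUDA_LINK}" -lcublas
```

To link against cuFFT instead, replace -lcublas with -lcufft.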

For more information on CUDA programming, see NVIDIA's CUDA Toolkit Documentation.

The PGI compilers also come with CUDA Fortran, which makes it easier to use the GPU from inside Fortran code. It is enabled with a command-line option, and it provides Fortran directives for writing CUDA code from within your Fortran. See Introduction to PGI CUDA Fortran on the PGI web site and the PGI CUDA Fortran page for more details.

CUDA-based applications can be compiled on the login nodes, but they cannot be run there because the login nodes do not have GPGPUs.