CUDA Multi-Process Service (MPS)
CUDA MPS [1] is an alternative, binary-compatible implementation of the CUDA application programming interface (CUDA API). The MPS runtime architecture is designed to transparently let CUDA applications that use multiple cooperating processes, typically MPI jobs, take advantage of Hyper-Q on NVIDIA graphics processing units (those based on the Kepler architecture). Hyper-Q allows CUDA kernels to be processed concurrently on the same GPU, which improves performance when a single process would leave the GPU underutilized.
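As a rough illustration of the idea above, the following Python sketch starts the MPS control daemon, launches several client processes that share the GPU, and then shuts the daemon down. It is a minimal sketch under stated assumptions, not a site-specific recipe: the directory paths, the helper names (mps_env, start_mps, stop_mps, run_clients, main) and the client program ./my_cuda_app are hypothetical. The nvidia-cuda-mps-control command and the CUDA_MPS_PIPE_DIRECTORY and CUDA_MPS_LOG_DIRECTORY variables come from NVIDIA's MPS tooling.

```python
import os
import subprocess
import sys


def mps_env(pipe_dir="/tmp/mps-pipe", log_dir="/tmp/mps-log", base_env=None):
    """Return a copy of the environment with the MPS directories set.

    The default paths are placeholders chosen for this sketch.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_MPS_PIPE_DIRECTORY"] = pipe_dir  # where clients find the daemon
    env["CUDA_MPS_LOG_DIRECTORY"] = log_dir    # where the daemon writes logs
    return env


def start_mps(env):
    """Start the MPS control daemon in the background (-d)."""
    os.makedirs(env["CUDA_MPS_PIPE_DIRECTORY"], exist_ok=True)
    os.makedirs(env["CUDA_MPS_LOG_DIRECTORY"], exist_ok=True)
    subprocess.run(["nvidia-cuda-mps-control", "-d"], env=env, check=True)


def stop_mps(env):
    """Ask the control daemon to exit by sending it the 'quit' command."""
    subprocess.run(["nvidia-cuda-mps-control"], input="quit\n",
                   text=True, env=env, check=True)


def run_clients(commands, env):
    """Launch all client commands at once and return their exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd in commands]
    return [p.wait() for p in procs]


def main():
    # Hypothetical client: replace with a real CUDA executable.
    client_cmd = sys.argv[1:] or ["./my_cuda_app"]
    env = mps_env()
    start_mps(env)
    try:
        # Four processes share one GPU through the MPS server.
        print(run_clients([client_cmd] * 4, env))
    finally:
        stop_mps(env)
```

Calling main() only makes sense on a node with an NVIDIA driver and a supported GPU. Clients must see the same CUDA_MPS_PIPE_DIRECTORY as the daemon, which is why the same environment dictionary is passed to both.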
Versions