GPGPU Programming
This page covers programming techniques and libraries that let users write general-purpose code that executes on the GPU.
By NVIDIA
CUDA
NVIDIA introduced CUDA in 2006.
It allows GPUs to be used for general-purpose computation, not only for rendering graphics.
CUDA is the lowest-level option covered on this page, so it gives the most direct control over the hardware and the most room for performance tuning.
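To make "general-purpose code on the GPU" concrete, here is a minimal sketch of a CUDA C++ program that adds two vectors. The names `vector_add` and `add_on_gpu` are hypothetical and chosen for illustration only. The sketch assumes both inputs are non-empty and of equal length, and it omits error checking on the CUDA runtime calls for brevity.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Device code (the kernel): each thread adds one pair of elements.
__global__ void vector_add(const float* a, const float* b, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n) {                                    // guard the last, partially filled block
        out[i] = a[i] + b[i];
    }
}

// Host code (hypothetical helper): copies the inputs to the GPU, launches the
// kernel, and copies the result back. Assumes a.size() == b.size() > 0.
std::vector<float> add_on_gpu(const std::vector<float>& a,
                              const std::vector<float>& b) {
    int n = static_cast<int>(a.size());
    size_t bytes = n * sizeof(float);

    float *d_a = nullptr, *d_b = nullptr, *d_out = nullptr;
    cudaMalloc(&d_a, bytes);
    cudaMalloc(&d_b, bytes);
    cudaMalloc(&d_out, bytes);

    cudaMemcpy(d_a, a.data(), bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_b, b.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;                          // threads per block
    int blocks = (n + threads - 1) / threads;   // enough blocks to cover n elements
    vector_add<<<blocks, threads>>>(d_a, d_b, d_out, n);

    std::vector<float> out(n);
    // This blocking copy on the default stream also waits for the kernel to finish.
    cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_a);
    cudaFree(d_b);
    cudaFree(d_out);
    return out;
}

int main() {
    std::vector<float> result = add_on_gpu({1.0f, 2.0f, 3.0f}, {10.0f, 20.0f, 30.0f});
    for (float v : result) {
        std::printf("%g ", v);
    }
    std::printf("\n");
    return 0;
}
```

Compiled with `nvcc` and run on a machine with a CUDA-capable GPU, this should print the element-wise sums. Most of the code is explicit memory management and launch configuration, which is the kind of detail the libraries below try to hide.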
CCCL
“CUDA Core Compute Libraries” is a collection of header-only C++ libraries, which (from README) “provide general-purpose, speed-of-light tools to CUDA C++ developers, allowing them to focus on solving the problems that matter”.
In short, it provides higher-level abstractions over CUDA.
It unifies three libraries that previously lived in separate repos:
- Thrust: Provides C++ functions for common operations (mathematical operations, algorithms such as sort). These functions are called from the host code of a C++ program.
- CUB: A set of lower-level, CUDA-specific C++ functions. Its block-level and warp-level primitives are called inside the device code of a C++ program (i.e. inside the kernel).
- libcudacxx: The CUDA C++ Standard Library. It provides an implementation of the C++ Standard Library that works in both host and device code.
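The host/device split between Thrust and CUB can be sketched as follows. This is a minimal illustration, not a recommended design: `block_sum`, `sum_with_cub`, and `kThreads` are hypothetical names, and the helper assumes its input holds exactly `kThreads` elements so that a single thread block covers it. Thrust is called from host code, while `cub::BlockReduce` is used inside the kernel.

```cuda
#include <cstdio>
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/sequence.h>
#include <thrust/sort.h>
#include <cub/cub.cuh>

constexpr int kThreads = 128;  // hypothetical block size used throughout this sketch

// Device code: CUB's block-level reduction, called from inside a kernel.
__global__ void block_sum(const int* in, int* out) {
    using BlockReduce = cub::BlockReduce<int, kThreads>;
    __shared__ typename BlockReduce::TempStorage temp_storage;

    int total = BlockReduce(temp_storage).Sum(in[threadIdx.x]);
    if (threadIdx.x == 0) {  // the reduced value is only valid in thread 0
        *out = total;
    }
}

// Host helper (hypothetical): assumes data.size() == kThreads.
int sum_with_cub(const thrust::device_vector<int>& data) {
    thrust::device_vector<int> result(1);
    block_sum<<<1, kThreads>>>(thrust::raw_pointer_cast(data.data()),
                               thrust::raw_pointer_cast(result.data()));
    return result[0];  // reading an element copies it back to the host
}

int main() {
    // Host code: Thrust containers and algorithms, no hand-written kernel needed.
    thrust::device_vector<int> data(kThreads);
    thrust::sequence(data.begin(), data.end());                      // 0, 1, ..., 127
    thrust::sort(data.begin(), data.end(), thrust::greater<int>());  // descending order

    int largest = data[0];
    int sum = sum_with_cub(data);
    std::printf("largest = %d, sum = %d\n", largest, sum);  // largest = 127, sum = 8128
    return 0;
}
```

Compared with a hand-written kernel, the Thrust calls need no explicit `cudaMalloc`, `cudaMemcpy`, or launch configuration, while the CUB call still leaves the kernel structure under the programmer's control.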
MatX
NVIDIA/MatX is a C++17 library for numerical computing.
It provides an API similar to NumPy or CuPy, but in C++.
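The following is a rough sketch of what the NumPy-like style looks like in MatX. It assumes the `matx::make_tensor`, `SetVals`, and `run()` calls behave as in the MatX quick-start material, and that tensors use managed memory by default so that elements can be read on the host. Exact names and defaults may differ between MatX releases, so treat this as an illustration rather than a reference.

```cuda
#include <cstdio>
#include <matx.h>

int main() {
    // Two 2x3 float tensors (assumed to be allocated in managed memory by default).
    auto a = matx::make_tensor<float>({2, 3});
    auto b = matx::make_tensor<float>({2, 3});

    a.SetVals({{1.0f, 2.0f, 3.0f},
               {4.0f, 5.0f, 6.0f}});

    // Element-wise expression in the style of NumPy's `b = a * 2 + 1`.
    // The expression is lazy; run() launches it on the GPU.
    (b = a * 2.0f + 1.0f).run();
    cudaDeviceSynchronize();  // wait for the GPU before reading on the host

    std::printf("b(1, 2) = %g\n", b(1, 2));
    return 0;
}
```

The point of the sketch is the shape of the code: the whole computation is written as a single tensor expression, with no explicit kernel or index arithmetic.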
Python
Allowing Tensor programming
Numerical Computations
More focused on ML applications
- TensorFlow
- PyTorch