100 days of CUDA
Installation instructions (how I do it)
- Create a mamba environment
mamba install python=3.12
(PyCUDA does not work with Python 3.13 yet)
mamba install cuda
pip install pycuda
Progress
- Day 0 playing with PyCUDA
- Day 1 playing with NVCC, vector addition
- Day 2 RGB to grayscale
- Day 3 RGB blur
- Day 4 Naive matmul+exercises
- Day 5 Matrix-vector multiplication
- Day 6 Tiled matmul
- Day 7 Tiled matmul - experiments
- Day 8 Tiled matmul - thread coarsening
- Day 9 Naive conv2d with arbitrary number of channels
- Day 10 faster conv2d
- Day 11 conv2d with shared memory
Some CUDA (or C) quirks to note:
uint32_t a = 1;
int32_t j = -1;
j >= a: True
j + a: 0
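This is how implicit type conversion works in C. :/ When an int32_t and a uint32_t meet in one expression (on platforms where int is 32 bits), the signed operand is converted to unsigned first, so j = -1 becomes 4294967295 (2^32 - 1). That makes j >= a true, and j + a wraps around modulo 2^32 to 0.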
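The quirk can be reproduced with a few small helpers. This is only a sketch: the function names (mixed_ge, mixed_add, as_unsigned, signed_ge) are made up for illustration, and it assumes a platform where int is 32 bits, which is the usual case for CUDA host and device code. The last helper shows one possible workaround, widening both operands to int64_t before comparing.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mixed comparison: j is implicitly converted to uint32_t before the
 * comparison, so a negative j turns into a very large unsigned value.
 * (Assumes int is 32 bits, so int32_t/uint32_t are int/unsigned int.) */
bool mixed_ge(int32_t j, uint32_t a) {
    return j >= a;
}

/* Mixed addition: the result has type uint32_t and wraps modulo 2^32. */
uint32_t mixed_add(int32_t j, uint32_t a) {
    return j + a;
}

/* Shows the value j takes after conversion to unsigned. */
uint32_t as_unsigned(int32_t j) {
    return (uint32_t)j;
}

/* One possible workaround: widen both operands to a signed 64-bit type,
 * which can represent every int32_t and every uint32_t value. */
bool signed_ge(int32_t j, uint32_t a) {
    return (int64_t)j >= (int64_t)a;
}
```

With a = 1 and j = -1, mixed_ge returns true and mixed_add returns 0, matching the lines above, while signed_ge returns false. Compiling with -Wsign-compare (part of -Wextra in GCC and Clang for C) flags the mixed comparison.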