I will provide general advice that applies here and elsewhere: start with a project, and implement it using CUDA. The key is identifying a problem that is SIMD in nature, i.e. data-parallel: the same operation applied to many data elements. Choose something you would normally write as a loop, but one with many iterations (e.g. tens of thousands or more), none of which depends on the output of any other iteration.
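To make the "loop with independent iterations" idea concrete, here is a minimal sketch, not a definitive implementation. The function names (scale_add_cpu, scale_add_gpu, max_difference) and the arithmetic are made up for illustration, and it assumes a GPU that supports managed memory (cudaMallocManaged). The point is only that the loop body becomes the kernel and the loop index becomes the thread index.

```cuda
#include <cmath>
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// CPU version: an ordinary loop. No iteration reads another iteration's result.
void scale_add_cpu(int n, float a, const float* x, float* y) {
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// GPU version: the loop body becomes the kernel, and the loop index is
// replaced by the global thread index.
__global__ void scale_add_gpu(int n, float a, const float* x, float* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the grid may contain a few more threads than elements
        y[i] = a * x[i] + y[i];
    }
}

// Hypothetical helper: runs both versions on the same input and returns the
// largest absolute difference, or -1 if a CUDA call fails.
float max_difference(int n) {
    float* x = nullptr;
    float* y = nullptr;
    if (cudaMallocManaged(&x, n * sizeof(float)) != cudaSuccess) return -1.0f;
    if (cudaMallocManaged(&y, n * sizeof(float)) != cudaSuccess) {
        cudaFree(x);
        return -1.0f;
    }
    std::vector<float> y_cpu(n, 1.0f);
    for (int i = 0; i < n; ++i) {
        x[i] = 0.5f * i;
        y[i] = 1.0f;
    }

    scale_add_cpu(n, 2.0f, x, y_cpu.data());

    int threads = 256;
    int blocks = (n + threads - 1) / threads;  // round up to cover all n
    scale_add_gpu<<<blocks, threads>>>(n, 2.0f, x, y);
    bool ok = cudaDeviceSynchronize() == cudaSuccess;  // wait before reading y

    float worst = ok ? 0.0f : -1.0f;
    for (int i = 0; ok && i < n; ++i) {
        worst = fmaxf(worst, fabsf(y_cpu[i] - y[i]));
    }
    cudaFree(x);
    cudaFree(y);
    return worst;
}

int main() {
    printf("max difference: %g\n", max_difference(100000));
    return 0;
}
```

A loop where iteration i needs the result of iteration i-1 (a running sum, for example) does not map onto this pattern directly, which is why picking the right problem matters.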
Some basic areas to focus on:
- Setting up the build (toolchain, target GPU architecture) and the kernel launch configuration
- Learning how to write kernels, and which kinds of work make sense to put in a kernel
- Learning how data transfer and synchronization between the CPU and GPU work
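As a rough illustration of how these areas fit together in one small program, here is a sketch under stated assumptions. The kernel square_kernel and the helper square_on_gpu are hypothetical names, and the block size of 256 is just a common starting point, not a tuned value. It only uses standard CUDA runtime calls (cudaMalloc, cudaMemcpy, cudaGetLastError, cudaDeviceSynchronize, cudaFree).

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel: each thread handles one independent element.
__global__ void square_kernel(const float* in, float* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        out[i] = in[i] * in[i];
    }
}

// Hypothetical helper: copies input to the GPU, squares it there, and copies
// the result back. Returns false if any CUDA call fails.
bool square_on_gpu(const std::vector<float>& host_in, std::vector<float>& host_out) {
    int n = static_cast<int>(host_in.size());
    size_t bytes = n * sizeof(float);
    host_out.resize(n);

    float* dev_in = nullptr;
    float* dev_out = nullptr;
    bool ok = cudaMalloc(&dev_in, bytes) == cudaSuccess &&
              cudaMalloc(&dev_out, bytes) == cudaSuccess;

    // IO: host (CPU) memory -> device (GPU) memory.
    ok = ok && cudaMemcpy(dev_in, host_in.data(), bytes,
                          cudaMemcpyHostToDevice) == cudaSuccess;

    if (ok) {
        // Config: enough blocks of 256 threads to cover all n elements.
        int threads_per_block = 256;
        int blocks = (n + threads_per_block - 1) / threads_per_block;
        square_kernel<<<blocks, threads_per_block>>>(dev_in, dev_out, n);

        // Synchronization: the launch returns immediately, so check for a
        // launch error and then wait for the kernel to finish.
        ok = cudaGetLastError() == cudaSuccess &&
             cudaDeviceSynchronize() == cudaSuccess;
    }

    // IO: device memory -> host memory.
    ok = ok && cudaMemcpy(host_out.data(), dev_out, bytes,
                          cudaMemcpyDeviceToHost) == cudaSuccess;

    cudaFree(dev_in);   // cudaFree on a null pointer is a no-op
    cudaFree(dev_out);
    return ok;
}

int main() {
    std::vector<float> in(1 << 16), out;
    for (size_t i = 0; i < in.size(); ++i) {
        in[i] = static_cast<float>(i % 100);
    }
    if (!square_on_gpu(in, out)) {
        printf("a CUDA call failed\n");
        return 1;
    }
    printf("out[7] = %.1f\n", out[7]);
    return 0;
}
```

Assuming the file is saved as square.cu, it should build with something like nvcc square.cu -o square. For a kernel this small the two copies will likely cost more than the computation itself, which is part of learning what makes sense for a kernel.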
This will be like learning any new programming skill.