
Introduction to CUDA programming for Python developers

https://www.pyspur.dev/blog/introduction_cuda_programming
Stupid question: is there any chance that I, as an engineer, can get away with not learning the math side of AI and still drill deeper into the lower levels of CUDA or even GPU architecture? If so, how do I start? I guess I should learn about optimization and why we choose GPUs for certain computations.

Parallel question: I work as a Data Engineer and always wonder if it's possible to get into MLE or AI Data Engineering without knowing AI/ML. I thought I only needed to know what the data looks like, but so far every MLE job description I see requires a background in AI.

Yes. They are largely unrelated. Just go to NVIDIA's site and find the docs. Or there are several books (look on Amazon).

A "background in AI" is a bit silly in most cases these days. Everyone is basically talking about LLMs or multimodal models which in practice haven't been around long. Sebastian Raschka has a good book about building an LLM from scratch, Simon Prince has a good book on deep learning, Chip Huyen has a good book on "AI engineering". Make a few toys. There you have a "background".

Now if you want to really move the needle... get really strong at all of it, including PTX (NVIDIA GPU assembly, sort of). Then you can blow people away like the DeepSeek people did...

The math isn't that difficult. The transformers paper (https://proceedings.neurips.cc/paper_files/paper/2017/file/3...) was remarkably readable for such a high-impact paper, beyond the AI/ML-specific terminology (attention) that gets thrown around.

Neural networks are basically just linear algebra (i.e. matrix multiplication) plus an activation function (ReLU, sigmoid, etc.) to generate non-linearities.

That's first-year undergrad material in most engineering programs; a fair number of people even took it in high school.
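
To make that concrete: a single dense layer's forward pass is just y = ReLU(W*x + b). A toy CUDA kernel for it (a rough illustrative sketch, one thread per output element, names made up) fits in a dozen lines:

  // y = ReLU(W * x + b); W is rows x cols, one thread per output row
  __global__ void dense_relu(const float* W, const float* x,
                             const float* b, float* y,
                             int rows, int cols) {
      int r = blockIdx.x * blockDim.x + threadIdx.x;
      if (r < rows) {
          float acc = b[r];
          for (int c = 0; c < cols; ++c)
              acc += W[r * cols + c] * x[c];   // the "linear algebra" part
          y[r] = acc > 0.0f ? acc : 0.0f;      // the activation (ReLU)
      }
  }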

I'd like to reinforce this viewpoint. The math is non-trivial, but if you're a software engineer, you have the skills required to learn _enough_ of it to be useful in the domain. It's a subject which demands an enormous amount of rote learning, exactly like software engineering.
Hot take: I don't think you even need to understand much linear algebra/calculus to understand what a transformer does. The math for that could probably be learned within a week of focused effort.
Yeah, to be honest it's mostly the matrix multiplication, which I got in second-year algebra (high school).

You don't even really need to know about determinants, inverting matrices, Gauss-Jordan elimination, eigenvalues, etc. that you'd get in a first-year undergrad linear algebra course.

May I plug ClojureCUDA, a high-level library that lets you write CUDA with almost no overhead, but from the interactive Clojure REPL.

https://github.com/uncomplicate/clojurecuda

There are also tons of free tutorials at https://dragan.rocks and a few books (not free) at https://aiprobook.com

Everything from scratch, interactive, line-by-line, and each line is executed in the live REPL.

Not a stupid question at all! Imo, you can definitely dive deep into CUDA and GPU architecture without needing to be a math whiz. Think of it like this: you can be a great car mechanic without being the engineer who designed the engine.

Start with understanding parallel computing concepts and how GPUs are structured for it. Optimization is key - learn about memory access patterns, thread management, and how to profile your code to find bottlenecks. There are tons of great resources online, and NVIDIA's own documentation is surprisingly good.
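
To get a feel for why memory access patterns matter, compare these two copy kernels (an illustrative sketch, not from the article): adjacent threads in the first hit adjacent addresses so the loads get coalesced, while the second scatters them across memory, and profiling both typically shows a big bandwidth gap.

  // Coalesced: thread i touches element i, so neighbouring threads
  // read neighbouring floats and the hardware merges the loads.
  __global__ void copy_coalesced(const float* in, float* out, int n) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n) out[i] = in[i];
  }

  // Strided: neighbouring threads touch addresses `stride` apart,
  // which breaks coalescing and wastes memory bandwidth.
  __global__ void copy_strided(const float* in, float* out, int n, int stride) {
      int i = (blockIdx.x * blockDim.x + threadIdx.x) * stride;
      if (i < n) out[i] = in[i];
  }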

As for the data engineering side, tbh, it's tougher to get into MLE without ML knowledge. However, focusing on the data pipeline, feature engineering, and data quality aspects for ML projects might be a good way in.

I suggest having a look at https://m.youtube.com/@GPUMODE

They have excellent resources to get you started with CUDA/Triton on top of torch. There's also a good community around it, so you get to listen to some amazing people :)

It's definitely possible to focus on the CUDA/GPU side without diving deep into the math. Understanding parallel computing principles and memory optimization is key. I've found that focusing on specific use cases, like optimizing inference, can be a good way to learn. On that note, you might find https://github.com/codelion/optillm useful – it optimizes LLM inference and could give you practical experience with GPU utilization. What kind of AI applications are you most interested in optimizing?
> Math side of AI but still drill deeper into the lower level of CUDA or even GPU architecture

CUDA requires a clear understanding of the mathematics related to graphics processing and linear algebra. Using CUDA the way you would use a traditional CPU would yield abysmal performance.
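
To caricature what "using CUDA like a traditional CPU" looks like (illustrative sketch only): a kernel launched with a single thread that loops over the data serializes the whole GPU, while the parallel version gives each element its own thread.

  // CPU-style: one thread does all the work, the GPU sits mostly idle.
  __global__ void scale_serial(float* data, int n, float s) {
      for (int i = 0; i < n; ++i) data[i] *= s;
  }
  // launched as scale_serial<<<1, 1>>>(d, n, 2.0f);

  // GPU-style: one thread per element, thousands run at once.
  __global__ void scale_parallel(float* data, int n, float s) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n) data[i] *= s;
  }
  // launched as scale_parallel<<<(n + 255) / 256, 256>>>(d, n, 2.0f);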

> MLE or AI Data Engineering without knowing AI/ML

It's impossible to do so, considering that you need to know exactly how the data is used in the models. At the very least you need to understand the basics of the systems that use your data.

Something like 90% of the time spent creating ML-based applications goes into preparing the data to be useful for a particular use case. And if you take Google's ML Crash Course, you'll understand why you need to know the what and the why.

I will provide general advice that applies here and elsewhere: start with a project and implement it using CUDA. The key will be identifying a problem that is SIMD in nature. Choose something you would normally use a loop for, but that has many (e.g. tens of thousands or more) iterations that do not depend on the output of the other iterations.

Some basic areas to focus on:

  - Setting up the architecture and config
  - Learning how to write the kernels, and what makes sense for a kernel
  - Learning how the IO and synchronization between CPU and GPU work.
This will be like learning any new programming skill.
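
For a rough idea of how those bullets fit together, here's a minimal end-to-end sketch (illustrative names, error checking omitted): allocate on the device, copy inputs over, launch the kernel, synchronize, copy the result back.

  #include <cuda_runtime.h>
  #include <cstdio>
  #include <cstdlib>

  // Kernel: each thread handles one independent "iteration" of the loop.
  __global__ void add(const float* a, const float* b, float* c, int n) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n) c[i] = a[i] + b[i];
  }

  int main() {
      const int n = 1 << 20;
      size_t bytes = n * sizeof(float);
      float *ha = (float*)malloc(bytes), *hb = (float*)malloc(bytes), *hc = (float*)malloc(bytes);
      for (int i = 0; i < n; ++i) { ha[i] = 1.0f; hb[i] = 2.0f; }

      // IO: move inputs from host (CPU) memory to device (GPU) memory.
      float *da, *db, *dc;
      cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
      cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
      cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

      // Launch config: enough 256-thread blocks to cover n elements.
      add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
      cudaDeviceSynchronize();  // wait for the GPU before reading results

      // IO: copy the result back to the host.
      cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost);
      printf("c[0] = %f\n", hc[0]);

      cudaFree(da); cudaFree(db); cudaFree(dc);
      free(ha); free(hb); free(hc);
      return 0;
  }
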
IMO absolutely yes. I would start with the linked introduction and then ask myself if I enjoyed it.

For a deeper dive, check out something like Georgia Tech's CS 8803 O21: GPU Hardware and Software.

To get into MLE/AI Data Engineering, I would start with a brief introductory ML course like Andrew Ng's on Coursera.

If you want to dive into CUDA specifically then I recommend following some of the graphics tutorials. Then mess around with it yourself, trying to implement any cool graphic/visualization ideas or remixes on the tutorial material.

You could also try to recreate or modify a shader you like from https://www.shadertoy.com/playlist/featured

You'll inevitably pick up some of the math along the way and probably have fun doing it.

Yes, but the problems that need GPU programming also tend to require you to have some understanding of maths. Not exclusively - but it needs to be a problem that's divisible into many small pieces that can be recombined at the end, and you need to have enough data to work through that the compute cost + data transfer cost is much lower than just doing it on CPU.
From an infrastructure perspective, if you have access to the hardware, a fun starting point is running NCCL tests across the infrastructure. Start with a single GPU, then 8 GPUs on a host, then 24 GPUs across multiple hosts over IB or RoCE. You will get a feel for MPI and plenty of knobs to turn on the Kubernetes side.
I mean yes, but without knowing the maths, knowing how to optimize the maths is a bit useless?

At the very least you should know enough linear algebra that you understand scalar, vector and matrix operations against each of the others. You don't need to be able to derive back prop from first principles, but you should know what happens when you multiply a matrix by a vector and apply a non-linear function to the result.

I found the GPU MODE lectures, videos, and code right on the money. Check them out.
You will probably have fewer job opportunities than the people working higher up, but be safer from AI automation for now :)
Try dipping your toes into graphics programming, you can still use GPUs for that as well.