May I plug ClojureCUDA? It's a high-level library that lets you write CUDA with almost no overhead, but from the interactive Clojure REPL.
https://github.com/uncomplicate/clojurecuda
There are also tons of free tutorials at https://dragan.rocks and a few books (not free) at https://aiprobook.com
Everything is built from scratch, interactively, line by line, and each line is executed in the live REPL.