Hacker News new | past | comments | ask | show | jobs | submit
> There are no details about training

my understanding was that they are not training at all, which would explain that. they are compiling an interpreter down to a VM that has the shape of a transformer.

ie they are calculating the transformer weights needed to execute the operations of the machine they are generating code for.

loading story #47367764