Story Detail of id 47420557 | Liveview Hacker News

andai 19 hours ago | on: Mistral AI Releases Forge

They mention pretraining too, which surprises me. I thought that was prohibitively expensive?

It's feasible for small models, but I thought small models were not reliable for factual information?

simsla 17 hours ago | parent

Typical stages of training for these models are:

Foundational:

- Pretraining
- Mid/post-training (SFT)
- RLHF or alignment post-training (RL)

And sometimes...

- Some more customer-specific fine-tuning.

Note that any supervised fine-tuning following the pretraining stage is just swapping the dataset and maybe tweaking some of the optimiser settings. Presumably they're talking about this kind of pre-RL fine-tuning rather than post-RL fine-tuning, and not about swapping out the pretraining stage entirely.
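To make the "just swapping the dataset" point concrete, here is a minimal toy sketch. It is not how Mistral or any real lab trains; it assumes a character-level bigram table as the "model", and the names `Stage`, `train`, `mean_loss`, the two tiny corpora, and the learning rates are all made up for illustration. The only thing it is meant to show is that both stages go through the same training loop, with only the data and one optimiser setting changing.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Stage:
    name: str
    corpus: str   # toy stand-in for the stage's dataset (hypothetical)
    lr: float     # the optimiser setting that changes between stages
    steps: int


def mean_loss(logits, vocab, corpus):
    """Average next-character cross-entropy of the model on a corpus."""
    total = 0.0
    for a, b in zip(corpus, corpus[1:]):
        row = logits[vocab[a]]
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[vocab[b]]
    return total / (len(corpus) - 1)


def train(logits, vocab, stage, rng):
    """One loop for every stage: plain SGD on next-character prediction."""
    pairs = [(vocab[a], vocab[b]) for a, b in zip(stage.corpus, stage.corpus[1:])]
    for _ in range(stage.steps):
        prev, nxt = rng.choice(pairs)
        row = logits[prev]
        m = max(row)
        exps = [math.exp(x - m) for x in row]
        z = sum(exps)
        for j in range(len(row)):
            grad = exps[j] / z - (1.0 if j == nxt else 0.0)  # softmax minus one-hot
            row[j] -= stage.lr * grad


# Two stages: same objective, different data and learning rate (values are arbitrary).
pretrain = Stage("pretraining", "the cat sat on the mat. the dog sat on the log. ", lr=0.2, steps=3000)
sft = Stage("sft", "Q: cat? A: mat. Q: dog? A: log. ", lr=0.05, steps=1000)

chars = sorted(set(pretrain.corpus + sft.corpus))
vocab = {c: i for i, c in enumerate(chars)}
logits = [[0.0] * len(chars) for _ in chars]  # bigram logit table: the whole "model"
rng = random.Random(0)

train(logits, vocab, pretrain, rng)
loss_before_sft = mean_loss(logits, vocab, sft.corpus)
train(logits, vocab, sft, rng)                # same loop, new dataset and lr
loss_after_sft = mean_loss(logits, vocab, sft.corpus)

print(f"SFT-corpus loss: {loss_before_sft:.3f} -> {loss_after_sft:.3f}")
```

The second `train` call starts from the weights the first one left behind, which is the sense in which this kind of fine-tuning continues training rather than replacing the pretraining stage.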
