Show HN: Value likelihoods for OpenAI structured output
https://arena-ai.github.io/structured-logprobs/

That said, it's concerning to see the reported probability for getting a 4 on a die roll is 65%.
Hopefully OpenAI isn't that biased at generating die rolls, so is that number actually giving us information about the accuracy of the probability assessments?
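If I understand the idea correctly, the number you get per field is just the joint probability of the tokens that spell out that field's value, recovered from the logprobs the API returns. A minimal sketch of that aggregation; the token/logprob pairs here are made up for illustration, not real API output:

```python
import math

# Made-up token/logprob pairs for the completion {"roll": 4};
# the shapes mimic what a chat completion returns with logprobs enabled.
token_logprobs = [
    ('{"', -0.0001),
    ("roll", -0.0002),
    ('":', -0.0001),
    ("4", -0.43),   # log(0.65) is roughly -0.43 for the value token itself
    ("}", -0.0001),
]

def field_probability(pairs, start, end):
    """Joint probability of the tokens spanning a field's value:
    sum the logprobs over that span, then exponentiate."""
    return math.exp(sum(lp for _, lp in pairs[start:end]))

p = field_probability(token_logprobs, 3, 4)  # just the "4" token
print(round(p, 2))  # 0.65
```

So the 65% isn't a claim about fair dice; it's just exp() of whatever logprob the model happened to put on that token.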
https://news.ycombinator.com/item?id=42684629
> the logits aren't telling you anything like 'what is the probability in a random sample of Internet text of the next token', but are closer to a Bellman value function, expressing the model's belief as to what would be the net reward from picking each possible BPE as an 'action' and then continuing to pick the optimal BPE after that (ie. following its policy until the episode terminates). Because there is usually 1 best action, it tries to put the largest value on that action, and assign very small values to the rest (no matter how plausible each of them might be if you were looking at random Internet text)
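One way to see the quote's point: even when several continuations are equally plausible as text, a policy-like scorer that bumps one "best" token's logit collapses the softmax onto it. A toy illustration with arbitrary numbers, six equally valid die faces but one boosted logit:

```python
import math

def softmax(logits):
    # Standard numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Six die faces, all equally plausible as text, but the model's
# value-function-like scoring boosts one of them as the "best action".
logits = [4.0, 1.0, 1.0, 1.0, 1.0, 1.0]
probs = softmax(logits)
print(round(probs[0], 2))  # roughly 0.80 on the favored face
```

A 3-point logit gap is enough to put ~80% on one face, which is exactly the kind of distortion that would explain a 65% "probability" of rolling a 4.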
Any interest in seeing this sort of thing being added to llama.cpp?
Any chance we can get Pydantic support?
Also, if you're "studying LLM based chess" and you don't use dynamic grammars to enforce that models can only make "valid" moves at each time step, your research is basically invalid.
And don't meme me with claims that structured/constrained generation harms creativity. The devs of outlines debunked that FUD already: https://blog.dottxt.co/say-what-you-mean.html
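For anyone who hasn't seen constrained generation up close: the usual trick is to mask grammar-invalid tokens to -inf before the softmax and renormalize over what's left, so the model's relative preferences among the valid moves are untouched. A toy sketch of that masking step, not outlines' actual code:

```python
import math

def constrained_probs(logits, valid):
    """Zero out grammar-invalid tokens (logit -> -inf), then softmax
    over what's left; valid tokens keep their relative odds."""
    masked = [x if ok else float("-inf") for x, ok in zip(logits, valid)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]  # exp(-inf) == 0.0
    total = sum(exps)
    return [e / total for e in exps]

# Toy five-token vocab; pretend the grammar only allows tokens 1 and 3
# (e.g. the only legal chess moves at this position).
logits = [2.0, 1.0, 3.0, 0.5, -1.0]
probs = constrained_probs(logits, [False, True, False, True, False])
```

Invalid tokens get exactly zero mass, and the ratio between the surviving tokens is the same as in the unconstrained distribution, which is why the "it harms creativity" claim doesn't hold up.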
Similarly, if you think that RLHF/DPO or LoRA or any of that harms creativity, you're really outing yourself as not having played with high-temperature sampling.