I tried an experiment along these lines, using a Prolog interpreter with GPT-4 to answer complex logic questions. It was really difficult: the model didn't seem to know Prolog well enough to write a problem description of any complexity.

It seems like you used an interpreter in the loop, which likely helps. I'd also be interested to see how o1 would do on a task like this, or whether it even makes sense to use something like Prolog if the models can backtrack during the "thinking" phase.
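
For anyone curious what "interpreter in the loop" can look like, here's a minimal sketch (my own, not the parent's setup): ask the model for a Prolog program, run it with SWI-Prolog via pyswip, and feed any interpreter error back for a repair attempt. ask_llm is a stand-in for whatever chat-completion call you use, and answer/1 is just a convention I picked for the sketch.

    import tempfile
    from pyswip import Prolog

    def solve(question, ask_llm, max_attempts=3):
        # Ask the model for a Prolog program, run it, and feed any
        # interpreter error back so the model can repair its output.
        prompt = ("Translate the question into an SWI-Prolog program.\n"
                  "Define answer(X) so that X binds to the answer.\n"
                  "Question: " + question)
        for _ in range(max_attempts):
            program = ask_llm(prompt)
            with tempfile.NamedTemporaryFile("w", suffix=".pl",
                                             delete=False) as f:
                f.write(program)
                path = f.name
            try:
                prolog = Prolog()  # note: pyswip state is global per process
                prolog.consult(path)
                results = list(prolog.query("answer(X)"))
                if results:        # some failures show up as empty results,
                    return results # not exceptions, so check both
            except Exception as err:
                prompt += f"\n\nThat program failed with: {err}. Fix it."
        return None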

I also wrote an LLM-to-Prolog interpreter for a hackathon, called "Logical". With a few hours' effort I'm sure it could be improved.

https://github.com/Hendler/logical

I think that even as LLMs approach completeness here, it's good to have an interpretable system so you can audit, verify, and reproduce results.
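
Concretely, the audit story is that the generated program is a plain Prolog file: every fact and rule can be checked by a human against the source text, and re-running the query reproduces the answer deterministically. A rough sketch, again with pyswip and the assumed answer/1 convention (not code from the linked repo):

    from pyswip import Prolog

    def verify(program_path, expected):
        # Re-run a saved, generated program and check the answers match.
        # pyswip keeps global state, so for real isolation run each
        # check in a fresh process.
        prolog = Prolog()
        prolog.consult(program_path)
        answers = sorted(str(r["X"]) for r in prolog.query("answer(X)"))
        return answers == sorted(expected)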

I bet one person could build a pretty good synthetic NL->Prolog dataset. The ROI on paying that person would be high if you were building a foundation model (i.e., the benefits extend beyond being able to output Prolog).
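
A cheap way to bootstrap such a dataset is templated generation: emit (English, Prolog) pairs from a schema, then paraphrase the English side with an LLM for variety. A toy sketch (the grandparent template and file name are illustrative):

    import json, random

    NAMES = ["alice", "bob", "carol", "dave", "erin"]

    def make_pair(rng):
        # One (NL question, Prolog program) pair from a fixed template.
        a, b, c = rng.sample(NAMES, 3)
        nl = (f"{a.title()} is the parent of {b.title()}, and {b.title()} "
              f"is the parent of {c.title()}. Who is {c.title()}'s grandparent?")
        pl = (f"parent({a}, {b}).\nparent({b}, {c}).\n"
              "grandparent(X, Z) :- parent(X, Y), parent(Y, Z).\n"
              f"answer(X) :- grandparent(X, {c}).")
        return {"nl": nl, "prolog": pl}

    rng = random.Random(0)
    with open("nl2prolog.jsonl", "w") as f:
        for _ in range(1000):
            f.write(json.dumps(make_pair(rng)) + "\n")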