Story Detail of id 47388007 | Liveview Hacker News

mtrovo 5 hours ago | on: The 100 hour gap between a vibecoded prototype and a working product

I think the main issue is treating the LLM as an unrestrained black box. There's a reason nobody outside tech trusts LLMs so blindly.

The only way to make LLMs useful for now is to restrain their hallucinations as much as possible with evals, and these evals need to be very clear about the goal you're optimizing for.

See Karpathy's work on the autoresearch agent and how it carries out experiments; it might be useful for what you're doing.

riffraff 5 hours ago | parent | next

> there's a reason nobody outside tech trusts LLMs so blindly.

Man, I wish this were true. I know a bunch of non-tech people who just trust random shit that ChatGPT made up.

I had an architect tell me "ask chatgpt" when I asked her the difference between two industrial standard measures :)

We've had politicians share LLM crap, researchers writing papers with hallucinated citations...

It's not just tech people.

withinboredom 3 hours ago | root | parent | next

We were working on translations for Arabic and in the spec it said to use "Arabic numerals" for numbers. Our PM said that "according to ChatGPT that means we need to use Arabic script numbers, not Arabic numerals".

It took a lot of back-and-forths with her to convince her that the numbers she uses every day are "Arabic numerals". Even the author of the spec could barely convince her -- it took a meeting with the Arabic translators (several different ones) to finally do it. Think about that for a minute. People won't believe subject matter experts over an LLM.

We're cooked.

tstenner 1 hour ago | root | parent | next

The architect should have required Hindu numbers. Same result, but even more confusion.

dvfjsdhgfv 1 hour ago | root | parent

Man this is maddening.

roncesvalles 2 hours ago | root | parent

And the worst part is, these people don't even use the flagship thinking models, they use the default fast ones.

closewith 2 hours ago | parent

In my experience, people outside of tech have nearly limitless faith in AI, to the point that when it clashes with traditional sources of truth, people start to question them rather than the LLM.
