Story Detail of id 48312684 | Liveview Hacker News

alansaber 22 hours ago | on: Claude Opus 4.8

"Our models are more honest." Honey, the quarterly marketing spin for an ML term has come. Forget "task alignment", now we're going for "truth index". I suppose this is the only way to generate hype when you're selling/releasing the same product over and over again.

TIPSIO 21 hours ago | parent | next

When doing some electrical work, Opus 4.7 essentially told me to wiggle a wire with my bare hand to see if it was hot or not.

I called it out.

It then gave me one of the most super heartfelt honest and sincere apologies I have ever received.

Glad the safety team was there for me and able to make such an honest model or I would have been very upset about it.

teaearlgraycold 20 hours ago | root | parent | next

Opus is so bad at electrical work it's really disappointing. And when it tries to draw schematics as SVGs it's a complete disaster. They should either focus on training their LLMs on this task specifically, or have it refuse.

tclancy 19 hours ago | root | parent | next

Hmm, what kind of electrical work? I had it "watch over my shoulder" as I swapped out the pressure switch on our home well and it was a big help. And in the run-up to that, when I explained that opening the 220 box and checking was "above my paygrade", it limited our investigation to just the less sparky parts.

teaearlgraycold 19 hours ago | root | parent

I mean introductory circuit stuff. Not electrician-lite work.

BoorishBears 5 hours ago | root | parent

Asking for SVG is like asking an electrician to give you a circuit diagram by painting a watercolor.

I'd try something like CircuiTikZ with instructions provided

krupan 19 hours ago | root | parent

I honestly cannot tell if you are being sarcastic or not

TIPSIO 19 hours ago | root | parent

It did try to lead me to touch a live hot wire once. Thanking the safety team for the honest and sincere apology it gave afterward was sarcasm.

krupan 18 hours ago | root | parent

It tried to get you to touch a live wire, then you called it honest and thanked the safety team. It really comes off as sarcastic.

doginasuit 19 hours ago | parent | next

Credit where it is due, Claude is fantastic at pointing out potential flaws in how I understand the problem based on my question. I asked for this in the system instructions, but it is the first model I've tried that does it regularly. It is also so tactful that I feel like I'm learning social skills from a language model. Half of the time it is a false positive due to insufficient context, but I still appreciate the additional check.

mrdependable 21 hours ago | parent

Gave me wrong information on my very first question. Wasn’t even complicated, and I wasn’t trying to trick it.