It is not great at decision-making or judgment calls that don't have a well-defined spec or plan in place yet; think of them as unofficial or unapproved tokens, if you will. A lot of this stuff has simply never had specs, because it has been internal to how companies work and part of their secret sauce.
The closest thing we have is governance and compliance policies, which exist because legal and business needs require them, so they're far better documented than the operational side of how we work. I guess what I'm saying is that this is more about the how than the what.
But yeah, this is why it does great when there are tests, design systems, evals, and other artifacts to mirror. It's far more reckless and unpredictable without these things, but still great for exploration and finding the data output you seek.
It's like when I see people feeding it a whole bunch of "best practices" and expecting it to follow them. It won't. But you could ask it questions about the best practices all day long.
Idk, calling it "just text prediction" seems unfairly dismissive of this capability.
At the end of the day, it presents a vector field and predicts the next vector. That’s literally the heart of intelligence, just like assembly is the heart of execution. When playing table tennis, your brain is literally predicting seconds into the future to get your body into the right position.
But we aren’t discussing intelligence here. We are discussing how best to utilize that intelligence.