Understanding is not consciousness.
Their training is all about understanding. There is nothing in their architecture or training that credibly optimizes for rich self-awareness.
Given non-persistent experience, non-continuous operation, no ability to build up generalizations and aggregate experience of their own self-awareness over time, they seem to be structurally designed to not have consciousness.
This is a case where acting is very credible. Understanding of others' consciousness, in a functional and third-party sense, isn't a substrate for personal experience.
In stark contrast, humans develop consciousness gradually over continuous time with persistent aggregation of experience. By the time we can recognize our own consciousness in the abstract, and reason about it, we have had it for some time.
My point is that the fact that AI can convincingly reproduce human sentence continuation does not imply that the AI had no choice but to end up using a mechanism that "understands". It could instead have just learned data patterns that are very effective at faking human sentence continuation but are meaningless in terms of understanding the concepts.
And I think that if the only way for AI to convincingly reproduce human sentence continuation were to end up in a configuration that uses the "understand" mechanism, the first LLMs would not have been so good at sounding human and yet so bad at basic "understanding" tests.
Taken as an absolute, without any additional context, you are right.
But we are not talking about abstractions but specific successful models. The number of parameters these models have may seem large, but it is very small relative to the training data that they have to summarize. They cannot do that without discovering the patterns that make sense of it.
And we can verify that. Simply discuss completely disparate topics, with some kind of intersection. Converge several highly unlikely topics. There are so many that it would take billions of years to exhaust the unlikely combinations.
If the model is only interpolating, it will produce gibberish.
But that isn't what happens.
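For a rough sense of scale on the "billions of years" point, here is a back-of-envelope sketch in Python. The topic count, the number of topics combined per prompt, and the prompt rate are all made-up assumptions chosen for illustration, not measurements of anything.

```python
import math

# All numbers below are illustrative assumptions, not measurements.
NUM_TOPICS = 10_000          # hypothetical count of distinct topics
TOPICS_PER_PROMPT = 5        # hypothetical number of topics combined per prompt
PROMPTS_PER_SECOND = 1.0     # hypothetical rate at which prompts are tried
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_exhaust(num_topics: int, per_prompt: int, rate: float) -> float:
    """Years needed to try every unordered combination of topics once."""
    combinations = math.comb(num_topics, per_prompt)
    return combinations / rate / SECONDS_PER_YEAR


years = years_to_exhaust(NUM_TOPICS, TOPICS_PER_PROMPT, PROMPTS_PER_SECOND)
print(f"{years:.2e} years")
```

With these assumed numbers the result is well over a billion years, and it grows quickly if you add topics or combine more of them per prompt. It only counts combinations; it says nothing about what a model does with any one of them.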
The fact that models can be near expert, and sometimes expert, across vast areas of human knowledge is a clue. If they don't understand that material, then the question is: why do we think people understand things? Does having an answer mean a human understands something, or is their intuition and stream of conscious reasoning also not understanding? We should be even-handed about what we mean by understanding.
Sometimes, a problem being hard means you only get bad solutions, or increasingly accurate ones.
The planet isn't big enough for the proverbial interpolative stochastic parrot, over the training set of global human communication.
You are basically arguing for a functional account of consciousness, but things like this have been debated for literally decades/centuries in philosophy.
The problem with the hallucination argument is (1) that it is much less of a problem with good current-generation AIs, and (2) that living, breathing, conscious human beings also have a disturbing tendency to make shit up. So a tendency to make stuff up doesn't really serve as a disqualifier for consciousness.
Also worth mentioning that the guiding rule of what's philosophical or not is whether it's actually useful. Actually useful philosophy usually becomes something else, usually some scientific discipline or another. And as it turns out, theories of mind are likely to become extremely useful in the near future. Expect huge advances!
I am far from convinced that the training and inference regimes of LLMs would qualify as “experience” in any sense of the word.
Now, if we hooked up a plethora of audiovisual and tactile sensors with live feedback directly to a neural network rich with transformers, that was always powered on and fully autonomous, we may be getting there. But we’d probably also be on the verge of manmade horrors beyond our comprehension.
Biological rodent neural networks in a Petri dish stimulated by electrical impulses - more or less conscious than LLMs?
Human on life support, unable to respond to any external stimuli, “braindead” - more or less conscious than LLMs?
And, yes, concerns about whether biological rodent neural networks are or are not conscious come up frequently in the biological neural network papers. I'm not sure I would want to be a researcher trying to get an experiment past an ethics committee if my biological neural network had 25B rat neurons. (I would hope that they could not).