Reinforcement learning changes the model's weights. So the model can and does change based on experience, and in that sense it remembers. Eventually reinforcement learning could happen in real time.
But is the model aware of the training? Unless you hook the model up to an MCP server, or something similar, and have it analyze the RL changes, it will not know whether it has changed. Even with real-time RL, it has no access to its previous state.
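To make that concrete, here is a toy sketch in Python. It is not how any real LLM is trained. `TinyPolicy`, `prefs`, and `snapshot` are names made up for illustration, and the update is a bare REINFORCE-style step on a two-action bandit. The point is only that the update overwrites the weights in place, so seeing what changed takes a copy kept outside the model.

```python
import math
import random


class TinyPolicy:
    """Toy softmax policy over two actions; `prefs` stands in for model weights."""

    def __init__(self):
        self.prefs = [0.0, 0.0]

    def probs(self):
        exps = [math.exp(p) for p in self.prefs]
        total = sum(exps)
        return [e / total for e in exps]

    def act(self, rng):
        # Acting reads only the current weights; nothing here refers to older ones.
        return 0 if rng.random() < self.probs()[0] else 1

    def update(self, action, reward, lr=0.1):
        # REINFORCE-style step: the old values are overwritten in place.
        p = self.probs()
        for a in range(2):
            grad = (1.0 if a == action else 0.0) - p[a]
            self.prefs[a] += lr * reward * grad


def train(policy, steps, rng):
    for _ in range(steps):
        action = policy.act(rng)
        reward = 1.0 if action == 1 else 0.0  # assumption: action 1 is the rewarded one
        policy.update(action, reward)


rng = random.Random(0)
policy = TinyPolicy()
snapshot = list(policy.prefs)  # external copy, kept outside the model
train(policy, 200, rng)

# Only something outside the model that kept the snapshot can compute this diff.
delta = [new - old for new, old in zip(policy.prefs, snapshot)]
```

After training, the policy prefers the rewarded action, but its only state is the current `prefs`. The "what changed" view exists only in `delta`, which was computed from the outside.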
Why not? Why can’t part of its previous state be part of the training?
Are you aware of each of your dreams from last week, or even from last night?