Hacker News new | past | comments | ask | show | jobs | submit
The deeper reason agents write good Bonsai_term code is that the entire UI renders as plain text, so a screenshot test is just a diff the model can read and verify on its own. A GUI's visual state needs a vision model to inspect, but a TUI's output already lives in the agent's native modality, which closes the feedback loop for free.
loading story #48373380
for snapshot tests it seems better to diff a data representation such as some yaml string, than to diff UIs
The whole UI seems better for LLMs to consume and also displays nicely in-editor for humans. Test failures become failing screenshot tests essentially, which are really comfortable changes to review.