Work duration is also not that valuable of a measure, you're usually better off defining the process yourself in code and having that delegate chunks of work to the models. The only real issue there is that it's harder to take advantage of the providers' subscription discounts, but on the other hand it's easier to do your own model routing, and there's no way I've seen for the normal chatbots to maintain coherence on streams of work measured in days and weeks.