> For us, searchable and especially semantically searchable traces / spans really make sense only in the context of tracing LLM apps.
I know LLM is the new shiny thing right now. Why is semantic search of traces only useful for LLMs?
I've been working in CI/CD, and at a large enough scale, searchability of logs was always an issue, especially as many tools produce a lot of output with warnings and errors that mislead you.
Does the search feature only work in an LLM context? If so, why?
Now that you mention it,
> warnings and errors that mislead you
it really makes sense. I guess what I was getting at is that when you have really rich text (in your case, error descriptions), searching over it semantically is a must-have feature.
But you are right, being the output of an LLM is not a requirement.
I think you might find that gets prohibitively expensive at scale. There are various definitions of "semantic", ranging from building indexes on OTel semantic conventions all the way to true semantic search over data in attributes. I'd be curious how you're thinking about this at the scale of several million traces per second.
Hey there, by semantic we mean embedding the text and storing it in a vector DB. Regarding scale, we thought a lot about it, and that's why we process spans in a background queue. The tradeoff is that indexing / embedding will not be real-time as scale grows. We will also use tiny embedding models, which keep getting better.
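
To make the flow concrete, here is a minimal sketch of "embed in a background queue, then store in a vector index", using only the Python standard library. Everything named here is an assumption for illustration: `toy_embed` is a hypothetical bag-of-words stand-in for a small embedding model, `InMemoryVectorStore` stands in for a real vector DB, and `index_in_background` is a made-up helper. It is not the actual implementation discussed in this thread.

```python
import math
import queue
import threading


def toy_embed(text):
    """Hypothetical stand-in for a small embedding model.

    Returns a sparse, L2-normalized bag-of-words vector as {token: weight}.
    """
    counts = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0.0) + 1.0
    norm = math.sqrt(sum(v * v for v in counts.values())) or 1.0
    return {token: v / norm for token, v in counts.items()}


class InMemoryVectorStore:
    """Hypothetical stand-in for a vector DB: brute-force cosine search."""

    def __init__(self):
        self._rows = []  # list of (span_id, vector)
        self._lock = threading.Lock()

    def add(self, span_id, vector):
        with self._lock:
            self._rows.append((span_id, vector))

    def search(self, query, k=3):
        q = toy_embed(query)
        with self._lock:
            scored = [
                (sum(w * vec.get(tok, 0.0) for tok, w in q.items()), span_id)
                for span_id, vec in self._rows
            ]
        scored.sort(reverse=True)  # highest cosine similarity first
        return [span_id for _, span_id in scored[:k]]


def index_in_background(spans, store):
    """Push (span_id, text) pairs through a queue drained by a worker thread.

    Ingestion only enqueues; embedding happens off the hot path, so the
    index lags behind ingestion instead of slowing it down.
    """
    pending = queue.Queue()

    def worker():
        while True:
            item = pending.get()
            if item is None:  # sentinel: no more spans
                break
            span_id, text = item
            store.add(span_id, toy_embed(text))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for span in spans:
        pending.put(span)
    pending.put(None)
    thread.join()  # only for the demo; a real service would keep running


if __name__ == "__main__":
    demo_store = InMemoryVectorStore()
    index_in_background(
        [
            ("span-1", "database connection timeout"),
            ("span-2", "user login succeeded"),
        ],
        demo_store,
    )
    print(demo_store.search("timeout database", k=1))
```

The point of the queue is the tradeoff mentioned above: spans become searchable some time after they arrive, and that delay grows if the embedding worker falls behind.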