Valid point. For us, searchable (and especially semantically searchable) traces and spans really make sense only in the context of tracing LLM apps. We view it as a powerful feature, but primarily within an AI/LLM-native observability platform. Our ultimate goal is to build a comprehensive platform with features that are extremely useful for the observability and development of LLM/GenAI apps.
>For us searchable and especially semantically searchable, traces / spans really make sense only in the context of tracing LLM apps.
I know LLM is the new shiny thing right now. Why is semantic search of traces only useful for LLMs?
I've been working in CI/CD and at a large enough scale, searchability of logs was always an issue. Especially as many tools produce a lot of output with warnings and errors that mislead you.
Is the search feature only working in an LLM context? If so why?
Now that you mentioned it,
> warnings and errors that mislead you
it really makes sense. I guess what I was getting at is that when you have really rich text (in your case, error descriptions), searching over it semantically is a must-have feature.
But you are right, being an output of LLM is not a requirement.
I think you might find that gets prohibitively expensive at scale. There are various definitions of "semantic", ranging from building indexes on OTel semantic conventions all the way to true semantic search over data in attributes. I'd be curious how you're thinking about this at the scale of several million traces per second.
Hey there, by semantic we mean embedding the text and storing it in a vector DB. Regarding scale, we thought a lot about it, and that's why we process spans in a background queue. The tradeoff is that indexing/embedding won't be real-time as scale grows. We will also use tiny embedding models, which keep getting better.
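To make the tradeoff concrete, here's a minimal sketch of that pipeline shape: spans are pushed onto a background queue, embedded asynchronously (so indexing lags ingestion rather than blocking it), and searched by cosine similarity. The bag-of-words "embedding", the `SpanIndex` class, and the span IDs are all hypothetical stand-ins; a real system would use an actual embedding model and a vector DB, not an in-memory list with brute-force search.

```python
import math
import queue
import threading
from collections import Counter

def embed(text: str) -> dict:
    # Toy stand-in for an embedding model: normalized term-frequency vector.
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()}

def cosine(a: dict, b: dict) -> float:
    # Both vectors are already L2-normalized, so the dot product is cosine similarity.
    return sum(v * b.get(w, 0.0) for w, v in a.items())

class SpanIndex:
    """Hypothetical background-indexing pipeline: ingest is non-blocking,
    embedding happens on a worker thread, search is eventually consistent."""

    def __init__(self):
        self.queue = queue.Queue()
        self.vectors = []  # list of (span_id, vector) pairs
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def ingest(self, span_id: str, text: str) -> None:
        # Hot path: just enqueue; embedding is deferred to the background.
        self.queue.put((span_id, text))

    def _drain(self) -> None:
        while True:
            span_id, text = self.queue.get()
            self.vectors.append((span_id, embed(text)))
            self.queue.task_done()

    def search(self, query: str, k: int = 3) -> list:
        q = embed(query)
        ranked = sorted(self.vectors, key=lambda sv: cosine(q, sv[1]), reverse=True)
        return [span_id for span_id, _ in ranked[:k]]

index = SpanIndex()
index.ingest("span-1", "timeout error calling payment service")
index.ingest("span-2", "cache hit for user profile lookup")
index.queue.join()  # in production you wouldn't wait; search would just lag
print(index.search("payment failure timeout", k=1))  # → ['span-1']
```

The `queue.join()` call makes the lag explicit: between `ingest` and the worker draining the queue, a search simply won't see the newest spans, which is exactly the non-real-time tradeoff described above.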