Show HN: Laminar – Open-Source DataDog + PostHog for LLM Apps, Built in Rust
https://github.com/lmnr-ai/lmnr

How can adding analytics to a system that is designed to act like a human produce anything good? What is the goal here? Could you clarify why someone would need to analyze LLMs, of all things?
> Rich text data makes LLM traces unique, so we let you track “semantic metrics” (like what your AI agent is actually saying) and connect those metrics to where they happen in the trace
But why does that matter? In their current state these are muted LLMs overseen by big companies. We have very little control over their behavior, and whatever we give them, the output will mostly be 'politically' correct.
> One thing missing from all LLM observability platforms right now is an adequate search over traces.
Again, why do we need to evaluate LLMs? Unless you are working in security, I see no purpose, because these models aren't as capable as they used to be. Everything is muted.
For context: I don't even need to prompt-engineer these days, because the default prompt gives similar results anyway. My prompts are now literally three words, because that gets more of the job done than an elaborate prompt with precise examples and context.
> Could you clarify why someone would need to analyze LLMs, of all things?
When you want to understand trends in the output of your agent / RAG at scale, without manually looking at each trace, you need another LLM to process the output. For instance, say you want to understand the most common topics discussed with your agent. You can prompt another LLM to extract this info; Laminar hosts everything and turns this data into metrics.
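As a rough sketch of the pattern (this is not our SDK, just the idea; the classifier model and topic list are made up):

```python
# Rough sketch of the pattern (not the Laminar SDK): classify each agent
# reply into a topic with a second LLM, then aggregate labels into a metric.
# The model name and topic list are made-up placeholders.
from collections import Counter
from openai import OpenAI

client = OpenAI()
TOPICS = ["billing", "shipping", "refunds", "other"]  # hypothetical taxonomy

def classify_topic(agent_output: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # any cheap classifier model works here
        messages=[
            {"role": "system",
             "content": f"Label the reply with exactly one topic from {TOPICS}. "
                        "Answer with the topic only."},
            {"role": "user", "content": agent_output},
        ],
    )
    label = resp.choices[0].message.content.strip().lower()
    return label if label in TOPICS else "other"

def topic_metrics(agent_outputs: list[str]) -> Counter:
    # The counts are the "semantic metric": most common topics over time.
    return Counter(classify_topic(o) for o in agent_outputs)
```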
> Why do we need to evaluate LLMs?
You're right: devs who want to evaluate the output of their LLM apps truly care about quality or some other metric. For these kinds of cases, evals are invaluable. Good examples would be AI drive-through agents or AI voice agents for mortgages (use cases we've seen on Laminar).
I see that you have chained prompts. Does that mean I can define agents and functions inside the platform without having them in the code?
* Ingestion of Otel traces (rough sketch below)
* Semantic event-based analytics
* Semantically searchable traces
* High performance, reliability and efficiency out of the box, thanks to our stack
* High-quality frontend (FE), which is fully open-source
* LLM pipeline manager, first of its kind, highly customizable and optimized for performance
* Ability to track the progression of locally run evals, combining the full flexibility of running code locally with no need to manage data infra
* Very generous free tier. Our infra is so efficient that we can accommodate a large number of free-tier users without scaling it much.
And many more to come in the coming weeks! One of our biggest next priorities is high-quality docs.
All of these features can be used as standalone products, similar to Supabase. So devs who prefer to keep things lightweight might just use our tracing solution and be very happy with it.
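To make the Otel point above concrete, here is a minimal sketch of sending a GenAI-annotated span over OTLP. The endpoint and auth header are placeholders, not our documented config, and the gen_ai.* attribute names follow a semantic convention that is still incubating:

```python
# Rough illustration only: export spans over OTLP and tag them with GenAI
# semantic-convention attributes. The endpoint and auth header below are
# placeholders, and the gen_ai.* attribute names track a spec that is still
# incubating, so check the current semconv before copying.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint="https://<your-laminar-endpoint>:8443",          # placeholder
            headers={"authorization": "Bearer <project-api-key>"},    # placeholder
        )
    )
)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("my-llm-app")

with tracer.start_as_current_span("chat gpt-4o-mini") as span:
    span.set_attribute("gen_ai.system", "openai")
    span.set_attribute("gen_ai.request.model", "gpt-4o-mini")
    # ... call the model here ...
    span.set_attribute("gen_ai.usage.input_tokens", 42)
    span.set_attribute("gen_ai.usage.output_tokens", 128)
```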
I really like the stack these folks have chosen.
Why did you decide to build a whole platform and include this feature on top, rather than adding search to (for example) Grafana Tempo?
I know LLMs are the new shiny thing right now. Why is semantic search of traces only useful for LLMs?
I've been working in CI/CD and at a large enough scale, searchability of logs was always an issue. Especially as many tools produce a lot of output with warnings and errors that mislead you.
Does the search feature only work in an LLM context? If so, why?
It really makes sense. I guess what I was pointing at is that when you have really rich text (in your case it would be error descriptions), searching over it semantically is a must-have feature.
But you are right, being the output of an LLM is not a requirement.
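As a bare-bones illustration of what semantic search over span or log text boils down to (the embed() helper here is a toy stand-in so the snippet runs; any real embedding model slots in):

```python
# Bare-bones semantic search over span/log text: embed each text once,
# embed the query, rank by cosine similarity. The embed() below just hashes
# tokens so the snippet runs end to end; swap in a real embedding model
# (OpenAI, sentence-transformers, ...) for meaningful results.
import numpy as np

def embed(texts: list[str], dim: int = 256) -> np.ndarray:
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for token in text.lower().split():
            vecs[i, hash(token) % dim] += 1.0
    return vecs

def semantic_search(query: str, span_texts: list[str], top_k: int = 5):
    doc = embed(span_texts)
    q = embed([query])[0]
    doc = doc / (np.linalg.norm(doc, axis=1, keepdims=True) + 1e-9)
    q = q / (np.linalg.norm(q) + 1e-9)
    scores = doc @ q  # cosine similarity per span text
    order = np.argsort(scores)[::-1][:top_k]
    return [(float(scores[i]), span_texts[i]) for i in order]

print(semantic_search("timeout connecting to database",
                      ["connection to db timed out after 30s",
                       "user asked about refund policy",
                       "deprecation warning in build step"]))
```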
We love it, because we tried putting things into the UI but found that to be much more limiting than letting users design evals and run them however they want.
We really like langfuse, the team and the product.
Compared to it:
* We send and ingest Otel traces with GenAI semconv
* Provide semantic event-based analytics - you can actually understand what's happening with your LLM app, instead of staring at logs all day.
* Laminar is built to be high-performance and reliable from day 0, easily ingesting and processing spikes of 500k+ tokens per second
* Much more flexible evals, because you execute everything locally and simply store the results on Laminar (rough sketch below)
* Go beyond simple prompt management and support Prompt Chain / LLM pipeline management. Extremely useful when you want to host something like Mixture of Agents as a scalable and trackable micro-service.
* Searchable trace / span data (not released yet)
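To make the evals point concrete, the local loop is roughly this shape; upload_results() is a stand-in here, not our actual SDK call:

```python
# Rough shape of a locally executed eval: you own the executor and the
# scoring logic, and only the resulting scores get shipped to a backend.
# upload_results() is a stand-in, not an actual Laminar SDK call.
import statistics

DATASET = [
    {"input": "What's the rate on a 30-year fixed mortgage?", "expected": "rates"},
    {"input": "Can I add fries to my order?", "expected": "order_change"},
]

def run_agent(user_input: str) -> str:
    # Your app code: call your agent / pipeline however you normally do.
    return "rates" if "rate" in user_input.lower() else "order_change"

def score(output: str, expected: str) -> float:
    # Any metric you care about; exact match keeps the sketch simple.
    return 1.0 if output == expected else 0.0

def upload_results(results: list[dict]) -> None:
    # Stand-in for sending results to your observability backend.
    print(f"avg score: {statistics.mean(r['score'] for r in results):.2f}")

results = []
for example in DATASET:
    output = run_agent(example["input"])
    results.append({
        "input": example["input"],
        "output": output,
        "score": score(output, example["expected"]),
    })
upload_results(results)
```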