Show HN: Omni – Open-source workplace search and chat, built on Postgres

141 | prvnsmpth | 13 hours ago | 39 | HN

* "Self-hosted: Runs entirely on your infrastructure. No data leaves your network."

* "Bring Your Own LLM: Anthropic, OpenAI, Gemini, or open-weight models via vLLM."

With so many newcomers wanting these kinds of services, it might be worth adjusting the first bullet to say: "No data leaves your network, at least as long as you don't use any Anthropic, OpenAI, or Gemini models over the network, of course."

zaphoyd 6 hours ago | parent | next

How are you managing multiplayer and permissions? I see in the docs that you can add multiple users and that queries are filtered by the requesting user such that the user only sees what they have access to. The docs aren't particularly clear on how this is being accomplished.

Does each user do their own auth and the ingest runs for each user using stored user creds, perhaps deduplicating the data in the index, but storing permissions metadata for query time filtering?

Or is there a single "team" level integration credential that indexes everything in the workspace and separately builds a permissions model based on the ACLs from the source system API?
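For readers weighing the second option, here is a minimal Python sketch of query-time filtering against ACL metadata stored alongside each indexed document. It is an illustration of the general pattern only, not how Omni does it; `Document`, `User`, `allowed_principals`, and `filter_visible` are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Document:
    doc_id: str
    source: str                    # hypothetical connector name, e.g. "slack" or "jira"
    allowed_principals: frozenset  # user and group IDs copied from the source system's ACL


@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset = field(default_factory=frozenset)

    def principals(self) -> frozenset:
        # A user matches an ACL entry either directly or through a group.
        return self.groups | {self.user_id}


def filter_visible(results, user):
    """Keep only the search hits whose ACL intersects the user's principals."""
    principals = user.principals()
    return [doc for doc in results if doc.allowed_principals & principals]
```

In a real deployment the same intersection would more likely be pushed into the search query itself (for example as a WHERE clause) so that hidden documents never leave the database; the in-memory filter above just makes the rule explicit.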


PhilippGille 8 hours ago | parent | next

How does it compare to Onyx (rebranded from Danswer, with more of a chat focus, whereas Danswer focused more on RAG over company docs/comms)?

- https://onyx.app/

- Their rebranded Onyx launch: https://news.ycombinator.com/item?id=46045987

- Their original Danswer launch: https://news.ycombinator.com/item?id=36667374

Doublon 12 hours ago | parent | next

Interesting!

I also started to build something similar for us, as a PoC/alternative to Glean. I'm curious how you handle data isolation, where each user has access to just the messages in their own Slack channels, or Jira tickets from only the workspaces they have access to. Managing user mapping was also super painful in AWS Q for Business.

swaminarayan 12 hours ago | parent | next

How well does the Postgres-only approach hold up as data grows — did you benchmark it against Elasticsearch or a dedicated vector DB?

keyle 11 hours ago | parent | next

I've done some RAG using Postgres and the vector DB extension. Look into it if you're doing that type of search; it's certainly simpler than bolting on another solution for it.

Lapalux 9 hours ago | parent | next

Can it connect to Teams?

andai 9 hours ago | parent | next

Nice! Could you elaborate on "not just a basic RAG"?


vladdoster 11 hours ago | parent | next

Multiple pages link to an `API Reference` page that returns a 404.

jFriedensreich 9 hours ago | parent | next

Can we please not change the meaning of chat to mean agent interface? It was painful to see crypto suddenly meaning token instead of cryptography. Plus I really don't want to “chat” with AI. It's a textual interface.


mickelsamuel 6 hours ago | parent

Postgres as a search backend is one of those decisions that looks wrong on paper but works really well in practice. tsvector handles full-text, pg_trgm does fuzzy matching, pgvector covers semantic — and you don't need to babysit an Elasticsearch cluster or worry about sync lag.

The part that's easy to overlook: your search index is transactionally consistent with everything else. No stale results because some background sync job fell over at 3am.

With 3000+ schemas, I'd keep an eye on GIN index bloat. The per-index overhead across that many schemas adds up, and autovacuum has trouble keeping pace.
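To make the three search modes named in this comment concrete, here is a rough Python sketch that builds one parameterized query for each: full-text via `to_tsvector`/`plainto_tsquery`, fuzzy matching via pg_trgm's `similarity()`, and semantic search via pgvector's `<=>` cosine-distance operator. The `documents` table and its `id`, `title`, `body`, and `embedding` columns are assumptions made up for illustration, as are the helper names; nothing here reflects Omni's actual schema.

```python
def to_vector_literal(embedding):
    """Format a list of numbers as pgvector's text form, e.g. '[1.0,2.5]'."""
    return "[" + ",".join(str(float(x)) for x in embedding) + "]"


# Full-text search: tsvector matched against a tsquery, ranked with ts_rank.
FULL_TEXT_SQL = """
SELECT id, ts_rank(to_tsvector('english', body), plainto_tsquery('english', %s)) AS score
FROM documents
WHERE to_tsvector('english', body) @@ plainto_tsquery('english', %s)
ORDER BY score DESC
LIMIT %s
"""

# Fuzzy matching: trigram similarity from the pg_trgm extension.
FUZZY_SQL = """
SELECT id, similarity(title, %s) AS score
FROM documents
WHERE similarity(title, %s) > %s
ORDER BY score DESC
LIMIT %s
"""

# Semantic search: cosine distance from the pgvector extension (smaller is closer).
SEMANTIC_SQL = """
SELECT id, embedding <=> %s::vector AS distance
FROM documents
ORDER BY distance
LIMIT %s
"""


def build_queries(term, embedding, limit=10, min_similarity=0.3):
    """Return {mode: (sql, params)} pairs ready for a DB-API style cursor.execute()."""
    vec = to_vector_literal(embedding)
    return {
        "full_text": (FULL_TEXT_SQL, (term, term, limit)),
        "fuzzy": (FUZZY_SQL, (term, term, min_similarity, limit)),
        "semantic": (SEMANTIC_SQL, (vec, limit)),
    }
```

Because all three queries hit the same table, they can run inside one transaction and see the same snapshot of the data, which is the consistency property described above. In practice the `to_tsvector` expression would usually be stored in an indexed column rather than computed per query.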
