Show HN: Omni – Open-source workplace search and chat, built on Postgres
https://github.com/getomnico/omni
* "Bring Your Own LLM: Anthropic, OpenAI, Gemini, or open-weight models via vLLM."
With so many newbies wanting these kinds of services, it might be worth adjusting the first bullet to say: "No data leaves your network, at least as long as you don't use any Anthropic, OpenAI, or Gemini models via the network, of course"
Does each user do their own auth and the ingest runs for each user using stored user creds, perhaps deduplicating the data in the index, but storing permissions metadata for query time filtering?
Or is there a single "team" level integration credential that indexes everything in the workspace and separately builds a permissions model based on the ACLs from the source system API?
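To make the first option concrete, here is a minimal sketch of query-time filtering over a deduplicated index. This is not how Omni does it (that is exactly the open question); `Doc`, `allowed`, and the `group:` principal names are all made up for illustration, and the substring match stands in for a real search backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    # Hypothetical: principals (user emails or group names) copied from
    # the source system's ACL at ingest time.
    allowed: frozenset[str]


def search(index: list[Doc], query: str, principals: set[str]) -> list[str]:
    """Return ids of docs that match the query AND that the caller may read."""
    q = query.lower()
    return [
        d.doc_id
        for d in index
        # Each doc is stored once; the ACL check happens per query.
        if q in d.text.lower() and not principals.isdisjoint(d.allowed)
    ]


index = [
    Doc("slack-1", "Q3 roadmap draft", frozenset({"group:eng"})),
    Doc("jira-7", "Roadmap ticket for billing", frozenset({"alice@example.com"})),
    Doc("drive-3", "Public roadmap", frozenset({"group:everyone"})),
]

alice = {"alice@example.com", "group:everyone"}
bob = {"bob@example.com", "group:eng", "group:everyone"}

print(search(index, "roadmap", alice))
print(search(index, "roadmap", bob))
```

Either ingestion model can feed this shape; what changes is who populates `allowed` (per-user crawls merged together, or one team-level credential reading ACLs from the source API).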
- Their rebranded Onyx launch: https://news.ycombinator.com/item?id=46045987
- Their original Danswer launch: https://news.ycombinator.com/item?id=36667374
I also started to build something similar for us, as a PoC/alternative to Glean. I'm curious how you handle data isolation, where each user has access to just the messages in their own Slack channels, or Jira tickets from only workspaces they have access to? Managing user mapping was also super painful in AWS Q for Business.
The part that's easy to overlook: your search index is transactionally consistent with everything else. No stale results because some background sync job fell over at 3am.
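A rough illustration of that point, using Python's built-in sqlite3 as a stand-in for Postgres so it runs anywhere. The `docs`/`terms` tables and the `fail_indexing` flag are invented for the demo; in Postgres the "index" would be a real index on the table rather than a hand-maintained one. The idea is the same though: the row and its index entries commit or roll back together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT NOT NULL)")
# Toy inverted index: one row per (term, doc) pair.
conn.execute("CREATE TABLE terms (term TEXT NOT NULL, doc_id INTEGER NOT NULL)")


def add_doc(conn, doc_id, body, fail_indexing=False):
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute("INSERT INTO docs (id, body) VALUES (?, ?)", (doc_id, body))
        if fail_indexing:
            # Simulates the indexing step dying halfway through.
            raise RuntimeError("indexer crashed")
        conn.executemany(
            "INSERT INTO terms (term, doc_id) VALUES (?, ?)",
            [(t, doc_id) for t in sorted(set(body.lower().split()))],
        )


def search(conn, term):
    rows = conn.execute(
        "SELECT doc_id FROM terms WHERE term = ? ORDER BY doc_id", (term.lower(),)
    )
    return [r[0] for r in rows]


add_doc(conn, 1, "deploy runbook")
try:
    add_doc(conn, 2, "incident runbook", fail_indexing=True)
except RuntimeError:
    pass  # the doc row was rolled back along with the failed indexing

doc_count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
print(doc_count, search(conn, "runbook"))
```

With a separate search service you would instead have a committed document and no index entry (or the reverse) until some sync job catches up.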
With 3000+ schemas I'd keep an eye on GIN index bloat. The per-index overhead across that many schemas adds up and autovac has trouble keeping pace.