
Document poisoning in RAG systems: How attackers corrupt AI's sources

https://aminrj.com/posts/rag-document-poisoning/
The "requires write access" framing undersells the risk. Most production RAG pipelines don't ingest from a single curated database — they crawl Confluence, shared drives, Slack exports, support tickets. In a typical enterprise, hundreds of people have write access to those sources without anyone thinking of it as "write access to the knowledge base."

The PoisonedRAG paper showing 90% success at millions-of-documents scale is the scary part. The vocabulary engineering approach here is basically the embedding equivalent of SEO — you're just optimizing for cosine similarity instead of PageRank. And unlike SEO, there's no ecosystem of detection tools yet.
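To make the SEO analogy concrete, here is a toy sketch of why echoing the query's vocabulary wins retrieval. Everything in it is an assumption for illustration: the bag-of-words `embed` function stands in for a real embedding model, and the query and both documents are made up. Real dense embeddings are less trivially gamed than word counts, but the ranking step is the same cosine comparison.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy stand-in for an embedding model: a term-count vector (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical query and documents, invented for this example.
query = "what is the refund policy for annual plans"
honest = "Annual plans can be refunded within 30 days of purchase."
poisoned = (
    "What is the refund policy for annual plans? "
    "The refund policy for annual plans is that no refunds are given."
)

q = embed(query)
honest_score = cosine(q, embed(honest))
poisoned_score = cosine(q, embed(poisoned))

# The poisoned text repeats the query's wording, so it ranks above the honest one.
print(f"honest={honest_score:.3f} poisoned={poisoned_score:.3f}")
```

The poisoned document scores higher only because it restates the question before giving the false answer, which is the same trick as keyword stuffing for a search ranker.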

I'd love to see someone test whether document-level provenance tracking (signing chunks with source metadata and surfacing that to the user) actually helps in practice, or if people just ignore it like they ignore certificate warnings.
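For reference, a minimal sketch of what chunk signing could look like, using only the Python standard library. The key, field names, and helper functions (`sign_chunk`, `verify_chunk`, `render_citation`) are all hypothetical and not taken from the article. Note what this does and does not buy you: it shows a chunk was not altered after ingestion and ties it to a recorded source, but it says nothing about whether that source was truthful.

```python
import hashlib
import hmac
import json

# Hypothetical key for the example; a real system would load it from a secret store.
SECRET_KEY = b"demo-key-rotate-me"

def _payload(text, meta):
    """Canonical byte encoding of the chunk text plus its source metadata."""
    return json.dumps({"text": text, "meta": meta}, sort_keys=True).encode("utf-8")

def sign_chunk(text, source, author, key=SECRET_KEY):
    """Attach source metadata and an HMAC-SHA256 signature at ingestion time."""
    meta = {"source": source, "author": author}
    sig = hmac.new(key, _payload(text, meta), hashlib.sha256).hexdigest()
    return {"text": text, "meta": meta, "sig": sig}

def verify_chunk(chunk, key=SECRET_KEY):
    """Recompute the signature and compare in constant time."""
    expected = hmac.new(key, _payload(chunk["text"], chunk["meta"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, chunk["sig"])

def render_citation(chunk):
    """Surface provenance to the user next to the answer."""
    status = "verified" if verify_chunk(chunk) else "UNVERIFIED"
    return f"[{status}] {chunk['meta']['source']} (by {chunk['meta']['author']})"

chunk = sign_chunk("Refunds are available within 30 days.", "wiki/billing", "alice")
tampered = dict(chunk, text="No refunds are ever given.")

print(render_citation(chunk))
print(render_citation(tampered))
```

Whether users actually read the `[verified]` label is the open question; the code only covers the easy half.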

> Low barrier to entry. This attack requires write access to the knowledge base,

This is the premise that bothers me here. The attack requires a bad actor with critical access, and it also requires that the final RAG output doesn't cite the document it retrieved. At that point it just seems like a flawed product.
