> I just wish we had "offline" dedupe, or even "lazy" dedupe...

This is the Windows dedupe methodology. I've used it pretty extensively and I'm generally happy with it when the underlying hardware is sufficient. It's very RAM- and I/O-hungry, but you can schedule and throttle the "groveler".
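
For anyone unsure what "offline" means here, a toy sketch of the general idea in Python: data is written normally, and a later background pass chunks it, hashes the chunks, and keeps one copy of each. This is not how Windows implements it. The function names, the tiny fixed chunk size, and the `pause_s` throttle are all assumptions made up for illustration.

```python
import hashlib
import time

# Hypothetical demo value; real systems use far larger chunks.
CHUNK_SIZE = 4


def offline_dedupe(files, chunk_size=CHUNK_SIZE, pause_s=0.0):
    """Post-process pass over already-written data.

    files: dict mapping a file name to its bytes.
    Returns (store, recipes): store maps a chunk hash to the chunk bytes,
    recipes maps a file name to the ordered list of chunk hashes.
    """
    store = {}
    recipes = {}
    for name, data in files.items():
        recipe = []
        for off in range(0, len(data), chunk_size):
            chunk = data[off:off + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            # Keep only the first copy of each distinct chunk.
            store.setdefault(digest, chunk)
            recipe.append(digest)
        recipes[name] = recipe
        if pause_s:
            # Crude throttle: yield between files to limit I/O pressure.
            time.sleep(pause_s)
    return store, recipes


def rehydrate(store, recipe):
    """Rebuild a file's bytes from its list of chunk hashes."""
    return b"".join(store[d] for d in recipe)


files = {"a": b"AAAABBBBCCCC", "b": b"AAAABBBBDDDD"}
store, recipes = offline_dedupe(files)
```

The two demo files share their first two chunks, so the store holds four chunks instead of six. The RAM cost comes from the hash index (`store` here), which is why this kind of job is usually run on a schedule.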

I have had some data-eating corruption from bugs in the Windows Server 2012 R2 timeframe.