Yes, that seems like a silly way to go about it if your goal is to store the whole web rather than a single scrape. Granted, anything that deduplicates data is more vulnerable to corruption (or at least corruption can have wider consequences), so it's not a trivial problem, but you'd think deduplicating identical resources would be something they'd add the first time they came close to their storage limits.
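
For illustration, here's a minimal sketch of what content-addressed deduplication could look like (the names store_resource, STORE, INDEX and the one-file-per-hash layout are hypothetical, not anyone's actual implementation): each unique blob is written once under its hash, and each capture only records a (url, timestamp) -> hash mapping, so a page fetched a million times unchanged costs one copy plus a million small index entries.

    import hashlib, os

    STORE = "store"   # one file per unique blob, named by its hash (hypothetical layout)
    INDEX = {}        # (url, timestamp) -> content hash; a real archive would persist this

    def store_resource(url, timestamp, body: bytes):
        digest = hashlib.sha256(body).hexdigest()
        path = os.path.join(STORE, digest)
        if not os.path.exists(path):          # only write content we haven't seen before
            os.makedirs(STORE, exist_ok=True)
            with open(path, "wb") as f:
                f.write(body)
        INDEX[(url, timestamp)] = digest      # every capture stays individually addressable

    def load_resource(url, timestamp) -> bytes:
        with open(os.path.join(STORE, INDEX[(url, timestamp)]), "rb") as f:
            return f.read()

The flip side is exactly the wider blast radius mentioned above: if that one stored copy rots, every capture pointing at it is gone, so you'd want replication and checksum auditing on the blob store to compensate.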