A case study in PDF forensics: The Epstein PDFs

https://pdfa.org/a-case-study-in-pdf-forensics-the-epstein-pdfs/
Any guesses why some of the newest files seem to have random "=" characters in the text? My first thought was OCR, but the stray characters don't seem tied to letters like "E" that an OCR tool could plausibly misread as "=". My second guess is that it's meant to make reliable text search harder, but probably 90% of HN readers could build a search tool that doesn't fall apart when it hits a "=" character (although supporting long search queries this way would make the search slower).
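To make the "search tool that survives a stray character" idea concrete, here is a minimal sketch in Python. It assumes the noise is only ever the single character "=" inserted between otherwise intact characters; the function names `tolerant_pattern` and `find_all` and the sample sentence are made up for illustration and do not come from the article or the files themselves.

```python
import re


def tolerant_pattern(query: str, noise: str = "=") -> re.Pattern:
    """Build a regex that matches `query` even when `noise` characters
    have been inserted between any two of its characters."""
    if not query:
        raise ValueError("query must not be empty")
    # Zero or more noise characters are allowed in every gap between
    # consecutive query characters.
    gap = f"(?:{re.escape(noise)})*"
    return re.compile(gap.join(re.escape(ch) for ch in query), re.IGNORECASE)


def find_all(text: str, query: str, noise: str = "=") -> list[tuple[int, str]]:
    """Return (offset, matched_text) for every noise-tolerant match."""
    pattern = tolerant_pattern(query, noise)
    return [(m.start(), m.group(0)) for m in pattern.finditer(text)]


# Hypothetical sample text with "=" scattered through it.
sample = "The quick bro=wn fox jum=ps over the la=zy dog."
print(find_all(sample, "brown fox"))  # [(10, 'bro=wn fox')]
```

The pattern grows by one optional group per query character, which is where the extra cost for long queries comes from. An alternative under the same assumption would be to strip every "=" from the text once up front and search the cleaned copy, at the price of also removing any "=" that genuinely belongs in the text.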