Show HN: A blocklist to remove spam and bad websites from search results
https://github.com/popcar2/BadWebsiteBlocklist

The problem seems worse on "alternative" search engines, e.g. DuckDuckGo and Kagi, which both use Bing. It's been driving me back to Google.
A blocklist seems like a losing proposition, unless, like adblock filter lists, it balloons to tens of thousands of entries and gets updated constantly.
Unfortunately, this kind of blocklist is highly subjective. This list blocks MSN.com! That's hardly what I would have chosen.
It ironically makes me think of the Yahoo Web Directory in the 90s.
Time is a flat circle.
Another great feature (not for this plugin) would be the option to "bundle" all search results from the same domain and stuff them under one collapsible entry. I hate going through lists and pages of apple/google/synology/sonos/crap URLs when I already know that I have to search somewhere else.
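Roughly something like this, as a hedged sketch (the `results` list, the URLs, and the grouping key are all made up here, not anything the plugin actually does):

```python
from collections import defaultdict
from urllib.parse import urlparse

def bundle_by_domain(results):
    """Group a flat list of result URLs under one entry per hostname.

    `results` is assumed to be a plain list of URL strings; a real
    extension would operate on the engine's DOM nodes instead.
    """
    bundles = defaultdict(list)
    for url in results:
        host = urlparse(url).netloc.removeprefix("www.")
        bundles[host].append(url)
    return bundles

hits = [
    "https://www.sonos.com/support/a",
    "https://www.sonos.com/support/b",
    "https://community.synology.com/t/1",
]
for host, urls in bundle_by_domain(hits).items():
    # One collapsible entry per host; the remaining hits fold underneath.
    print(f"{host} ({len(urls)} results)")
```

Grouping by bare hostname is naive (community.synology.com wouldn't fold into synology.com); a real version would want a public-suffix lookup.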
Not saying you should, just that you could...
I may do that.
It’s the same reason why social media blocklists can be problematic—everyone’s calculus is different.
My suggestion is that you promote it as a starter and suggest that users fork it for their own needs.
This list also works well with Pi-hole and other platforms:
https://github.com/spmedia/Crypto-Scam-and-Crypto-Phishing-T...
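If anyone wants to reuse a uBlacklist-style list in Pi-hole, stripping it down to bare domains is a few lines. This assumes entries look like `*://*.example.com/*`; adjust the pattern if the list you're converting differs:

```python
import re

# uBlacklist-style match pattern, e.g. *://*.example.com/*
# (assumed format; not guaranteed for every list).
PATTERN = re.compile(r"\*://(?:\*\.)?([^/]+)/")

def to_pihole(lines):
    """Yield bare domains, since Pi-hole wants one domain per line."""
    for line in lines:
        m = PATTERN.match(line.strip())
        if m:
            yield m.group(1)

# "blocklist.txt" is a placeholder path for the downloaded list.
with open("blocklist.txt") as f:
    for domain in to_pihole(f):
        print(domain)
```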
- For example, the Kaspersky blog doesn't look bad.
- The CCleaner blog is just a list of updates.
She talks at length about how pages of AI-generated nonsense text are cluttering search results on Google and all other search engines.
The scalability comes from the caching inherent in DNS: instead of millions of people having to download text files from a website over HTTP on a regular basis, the data is in effect lazy-uploaded into the cloud of caching DNS resolvers, at no administration cost to the DNSBL operator.
Reputation whitelists (or other scoring services) would be just as easy to implement.
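For the curious, a lookup against such a list is tiny; it's the same mechanism mail servers use for RBLs (the zone name below is made up):

```python
import socket

def is_listed(domain, zone="badsites.dnsbl.example"):
    """Check a domain against a hypothetical domain-based DNSBL.

    Listings are published as A records: if <domain>.<zone> resolves,
    the domain is listed; NXDOMAIN means it isn't. Every caching
    resolver between you and the operator absorbs repeat queries,
    which is the scalability win described above.
    """
    try:
        socket.gethostbyname(f"{domain}.{zone}")
        return True
    except socket.gaierror:
        return False

print(is_listed("spam-site.example"))
```

Scoring falls out of the same trick: existing DNSBLs already encode listing reasons in the returned address (127.0.0.2, 127.0.0.3, ...), so a reputation score is just another return value.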
Some sites are complete garbage and should be blocked, of course. Others (e.g., in my experience, Quora) are sometimes quite good and sometimes quite bad. Wouldn't be my first choice, but I've found them useful at times.
For a given search, maybe you try with the most aggressive blocking / filtering. If you fail to find what you're looking for, maybe soften the restriction a bit.
Maybe this is overwrought...
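Concretely, something like this fallback loop (the tier contents and the `search` callable are invented for the sketch):

```python
# Tiers from most to least aggressive; contents are made-up examples.
TIERS = [
    {"quora.com", "pinterest.com", "msn.com"},  # block borderline sites too
    {"pinterest.com"},                          # keep "sometimes useful" sites
    set(),                                      # no filtering at all
]

def search_with_fallback(query, search, min_hits=3):
    """Retry a query with progressively softer blocklists."""
    for blocked in TIERS:
        hits = [r for r in search(query) if r["domain"] not in blocked]
        if len(hits) >= min_hits:
            return hits
    return []

def fake_search(query):
    # Stand-in for a real search-engine API call.
    return [{"domain": "quora.com"}, {"domain": "example.com"}]

print(search_with_fallback("sonos clock drift", fake_search, min_hits=2))
```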
SEO spam and AI slop are easily spotted on the human level. Google has hundreds of thousands of employees. Just put ONE of them on this f**ing job!
It's criminal what these companies have let happen to the web.
I use a VM in other scenarios, but is even that properly separated?
Do you have a forum where you discuss prospective contributions, etc.?