Tuesday, January 27, 2026

Re: cvsweb redirects to theannoyingsite.com

On Sat, Jan 24, 2026 at 12:09:06PM +0000, requiem. wrote:
> This has come up on the list earlier and sadly no changes were made. The anti-scraping is too aggressive and there is no reason it HAS TO redirect to theannoyingsite. It does not annoy bots, only humans who accidentally get caught by it. I get "trolling botz lol" and fully agree with the need for self-defense but this is trolling the wrong targets.
>
> I would also like to ask once again in line with Strahinja's message to change the redirect to something else.

Sorry if this has been suggested before (but I found no trace).

I run an extremely obscure and insignificant personal gitweb instance
behind my residential connection. The LLM bots came for it regardless
and I tried out several different approaches (IP-based blocks among
other things) to regulate them. What proved the most effective was
implementing the simple trick described in this article:

https://her.esy.fun/posts/0031-how-i-protect-my-forgejo-instance-from-ai-web-crawlers/index.html

Basically, a JS + cookie gate: a request without the cookie gets a
tiny page whose script sets the cookie and reloads.

Most humans use browsers with JS; most bots (including all the LLM
scrapers) do not bother. So this is transparent for actual humans
browsing around. For my case, this approach solved the problem
completely (for now).
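
For illustration, here is a minimal sketch of such a gate in Python. It
is not the configuration from the article nor what I actually run; the
cookie name, its value, the challenge page and the gate() helper are all
made up for the example. It only shows the idea: no cookie means you get
the small script page, cookie present means you get the real content.

```python
from http.cookies import CookieError, SimpleCookie

# Hypothetical cookie name and value; any pair the challenge page sets will do.
COOKIE_NAME = "js_ok"
COOKIE_VALUE = "1"

# Page served to clients without the cookie: the script sets it and reloads.
CHALLENGE_PAGE = (
    "<!doctype html><html><body><script>"
    f'document.cookie = "{COOKIE_NAME}={COOKIE_VALUE}; path=/; max-age=86400";'
    "location.reload();"
    "</script><noscript>This site needs JavaScript and cookies.</noscript>"
    "</body></html>"
)


def has_gate_cookie(cookie_header):
    """Return True if the Cookie request header carries the expected cookie."""
    jar = SimpleCookie()
    try:
        jar.load(cookie_header or "")
    except CookieError:
        return False
    morsel = jar.get(COOKIE_NAME)
    return morsel is not None and morsel.value == COOKIE_VALUE


def gate(cookie_header, serve_real_page):
    """Return (status, headers, body): real content or the challenge page."""
    if has_gate_cookie(cookie_header):
        return serve_real_page()
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Cache-Control": "no-store",  # keep caches from storing the challenge
    }
    return 200, headers, CHALLENGE_PAGE
```

A browser runs the script, reloads with the cookie set, and never sees
the gate again until the cookie expires. A client that does not execute
JS keeps receiving the challenge page. Note that a fixed cookie value
like this is trivial to fake once a scraper bothers to look.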

/Tom
