Tuesday, March 10, 2026

Re: cvsweb news

On 3/10/26 11:37 AM, Constantine A. Murenin wrote:
[... blah ... blah ... blah ...]

You made arguments for why it is "easy".
You did not give any consideration to the CONSEQUENCES of your "easy"
solutions.

The old cvsweb app made it trivial to scrape not only every single file, but
every single version of every single file, every single incremental diff of
every file and /every single diff between every two versions of every file/.
So... for a file in CVS with N commits, there are N*(N-1) diffs possible.
And the old version of cvsweb exposed all of those...for every single file in
CVS. They already have a list of all these possible diffs, and they are
attempting to use them.

Every diff requested requires firing up external applications. That's a lot
of load, even for a significantly more efficient application.

Those requests are still coming in. Yesterday, well over 90% of the queries
we got were URLs from the old application. So by returning a 404 instead of
firing up cvs, co, and/or rcsdiff every time a bot query comes in, we save a
LOT of load on the system. That's a HUGE win.
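As a rough illustration of why the 404 is so cheap, here is a sketch of the short-circuit idea. It is not the code of the new application: the URL test (a path ending in ".diff", or a query string carrying both r1 and r2) is only an assumed stand-in for whatever the old URLs actually look like, and is_legacy_diff_url, respond, and render are hypothetical names.

```python
from urllib.parse import urlsplit, parse_qs


def is_legacy_diff_url(url: str) -> bool:
    """Guess whether a URL belongs to the old application.

    ASSUMPTION: old-style diff requests end in ".diff" or carry both
    "r1" and "r2" query parameters. The real pattern may differ.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return parts.path.endswith(".diff") or ("r1" in query and "r2" in query)


def respond(url: str, render):
    """Return (status, body); only call the expensive renderer for new URLs.

    `render` stands in for the costly path that would start external
    programs such as cvs, co, or rcsdiff.
    """
    if is_legacy_diff_url(url):
        # Cheap path: a string check, no external process is started.
        return 404, "Not Found"
    return 200, render(url)
```

The point of the design is that the rejection happens before any external program is started, so the cost of a bot request drops to a string comparison.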

This win has enabled me to remove the IP filters, and I've removed much of the
malicious request handling, all of which was justifiably highly unpopular and
(unfortunately) hurting some of the legitimate users. I hope to soon return
the systems this application runs on to a fully redundant CARP pair (due to
the load, I had to "split" the cvsweb off to its own machine, so lost the
redundancy). Lots of "win" here.

So...unless the OpenBSD developers request otherwise, I do think we will not
be worrying about -- and will in fact be actively discouraging -- the old URLs.
I get it, less than optimal. But this whole problem has been a gigantic "less
than optimal" that shouldn't be, but it is, and we deal with it as best we can
with the resources that we have available. As far as Ken and I are concerned at
this point, the discussion of supporting old URLs is over.

(And special thanks to my employer for laying me off at an opportune time, when
I could devote a fair chunk of time to working with Ken on getting his new
solution up and running!)

Nick.
