Friday, July 05, 2024

Re: smtpd warn: not enough disk space

On 05.07.24 13:46, Jeremy Mates wrote:
> On 2024-07-05 05:19:01 +0200, Christian Schulte wrote:
>> I have never seen an application performing such kind of checks.
>
> Sendmail had a knob to refuse mail at a certain CPU load, on the
> assumption that if a system was "too busy" it's in a bad state and
> accepting mail would only make things worse. Also rogue(6) had CPU
> checks, apparently they wanted to stop any rogue processes when the
> system load got too high--maybe real work was being done? Speaking of
> not checking the CPU, I have an amusing story of what a dual-processor
> Linux box did at a CPU load of 5,000, something something sending the
> wrong bytes to the wrong file descriptors, thus corrupting payment
> transactions for a small internet retailer.
>
> For application-specific filesystem checks I don't recall any off the
> top of my head, but I tend to have monitoring send dire alerts when the
> disk usage hits 90% (or less, if you want more breathing room to figure
> out what is wrong, who to notify, them to act, etc) so would never have
> seen anything that triggers at 5% free or whatever.
>
>> If there is not enough space, write would fail anyway.
>
> Or the filesystem gets somewhat looped after being run at 100% full for
> too long. Granted, that was Windows, and we did warn the user not to do
> that, but they did, and then when a nice Windows admin tried to back up
> the 500G disk, it filled up a 3T disk and wanted more. Who knows what
> breaks when a stray dd(1) or package update or whatever eats too much of
> the remaining space. Not breaking randomly is probably a design goal of
> an SMTP server that tries to guarantee delivery? You could fiddle with
> the percentage, or remove the code, but I'm going to guess that most
> everyone else keeps their mail partitions not so full.

I just commented out the function call locally on my laptop. The issue
has been discussed on the OpenSMTPD mailing list as well, and everyone
seems to agree that checking a fixed percentage value makes no sense. I
do not disagree that preserving some amount of free disk space makes
sense. For example, I ran out of disk space on a server where PostgreSQL
filled the disk to 100% and then crashed. It is a personal VPS that I do
not maintain professionally; a database admin would have monitored the
system and simply added storage when required. The bad part for me was
that I could not vacuum the database to reclaim space, because
PostgreSQL rewrites tables into new files before it can release the old
ones. So the database was unusable for about 12 hours, until I managed
to xz a backup and transfer it to another machine. For PostgreSQL to
guard against this by itself, it would need to keep around 50% of the
disk free so that a vacuum can always be performed, which would simply
be impractical. My objection is that something like this should not be
coded into the application itself; it is a task of general system
maintenance (e.g. setting up filesystem quotas and such). For what it's
worth, it's just my private laptop...
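
To illustrate why a fixed percentage is a poor fit, here is a minimal
Python sketch comparing a percentage threshold with an absolute one.
The function names and the 5% / 1 GiB thresholds are assumptions made up
for this example; this is not the actual OpenSMTPD check.

```python
import shutil


def has_enough_space(total_bytes, free_bytes,
                     min_free_percent=None, min_free_bytes=None):
    """Return True if free space satisfies every threshold that is set.

    Hypothetical helper for illustration only; thresholds left as None
    are not checked.
    """
    if total_bytes <= 0:
        return False
    # Percentage rule: scales with the disk, so a large disk must keep a
    # large absolute amount free.
    if min_free_percent is not None:
        if free_bytes * 100 < min_free_percent * total_bytes:
            return False
    # Absolute rule: the same number of bytes regardless of disk size.
    if min_free_bytes is not None and free_bytes < min_free_bytes:
        return False
    return True


def check_path(path, **thresholds):
    """Apply the same check to the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)
    return has_enough_space(usage.total, usage.free, **thresholds)


if __name__ == "__main__":
    # 2 TB disk with 60 GB free: 3% free, yet plenty of room for mail.
    print(has_enough_space(2 * 10**12, 60 * 10**9, min_free_percent=5))
    print(has_enough_space(2 * 10**12, 60 * 10**9, min_free_bytes=2**30))
    # 8 GB disk with 0.5 GB free: 6.25% free, but under 1 GiB.
    print(has_enough_space(8 * 10**9, 5 * 10**8, min_free_percent=5))
    print(has_enough_space(8 * 10**9, 5 * 10**8, min_free_bytes=2**30))
```

On the large disk the percentage rule refuses although 60 GB are free,
while on the small disk it accepts although less than 1 GiB is left.
Whichever rule is preferred, the threshold is arguably something for the
administrator to set rather than a constant in the application.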

Regards,
--
Christian
