Sunday, September 15, 2024

Re: checksums to detect/correct bit-rot

On Sun, Sep 15, 2024 at 12:22 AM Jonathan Thornburg
<dr.j.thornburg@gmail.com> wrote:
>
> Does OpenBSD support any file systems with built-in checksums to
> (try to) ensure metadata and/or data integrity in the face of "bit rot"
> disk (or memory/cpu/USB) errors? I'm not looking for ZFS-style storage
> pools or logical volume management, "just" checksums to catch silent
> metadata and/or data corruption.
>
> Softraid 1, 5, or 1C could in theory do this, but with a large space
> overhead (a factor of 2 to detect errors, or 3 to correct errors).
> And, the current (7.5) man pages don't mention any option to have each
> read read all the chunks and verify that they're identical.
>
> And a related question: I have a pool of ~10 external USB3 backup
> disks (all consumer-grade WD or Seagate 2.5" spinning rust, either
> 2TB or 4TB capacity each), all currently set up with FFS2 filesystems
> on top of softraid crypto (/bioctl -c C/). Each backup is to a single
> disk, written with (roughly speaking)
> rsync -aHESvv --delete /home/ /mnt/home/
> Each disk thus has slightly different contents depending on how
> recently I did a backup to that disk, but the vast majority of the
> files (those that haven't changed recently) should be identical
> across disks.
>
> [Before anyone asks: Yes, I regularly rotate some of the disks offsite.
> And yes, I regularly restore files "in anger".]
>
> Each backup disk holds somewhat more than 1e13 bits, so at an unrecoverable
> bit error rate of 1e-14 or 1e-15 for consumer disks there's a non-trivial
> chance of a bit error somewhere in my backup pool.
>
> Thinking about how to detect/correct bit-rot in these backups, it
> occurs to me that I could hack up some Perl to walk the filesystem
> tree on a mounted backup disk, /stat()/ and read each file, and build
> a database of (pathname, inode mtime, checksum) tuples. (I could either
> ignore symlinks, or checksum the result of /readlink()/.) Then given
> such databases for a bunch of disks, a bit more Perl could read all
> the databases, find all the files with matching pathname and inode
> mtime (so that the contents should be the same, given that my usage
> of /rsync/ preserves /mtime/), look for differing checksums, and for
> any differences, majority-vote the checksums to identify which copy
> or copies are in error.
>
> But before I reinvent the wheel, can anyone point me to software
> which already does this? Bonus points if the software is already
> in ports.

perhaps you would find mtree(8) helpful.
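
For illustration, the scan-and-vote scheme described in the quoted
message might be sketched roughly as follows. This is only a sketch under
stated assumptions, not an existing tool: it uses Python rather than Perl,
SHA-256 as the checksum, and the function names scan() and vote() are
hypothetical. It checksums the readlink() result for symlinks and skips
anything that is neither a regular file nor a symlink.

```python
import hashlib
import os
import stat
from collections import Counter, defaultdict


def scan(root):
    """Return {relative path: (mtime in ns, sha256 hex)} for files and symlinks under root."""
    db = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            h = hashlib.sha256()
            if stat.S_ISLNK(st.st_mode):
                # Checksum the link target rather than what it points to.
                h.update(os.fsencode(os.readlink(full)))
            elif stat.S_ISREG(st.st_mode):
                with open(full, "rb") as f:
                    for block in iter(lambda: f.read(1 << 20), b""):
                        h.update(block)
            else:
                continue  # directories, devices, FIFOs, sockets
            db[os.path.relpath(full, root)] = (st.st_mtime_ns, h.hexdigest())
    return db


def vote(dbs):
    """dbs maps disk name -> scan() result.

    Returns a sorted list of (path, mtime, suspect disks) for every
    (path, mtime) pair whose checksums disagree across disks.
    """
    groups = defaultdict(dict)  # (path, mtime) -> {disk: checksum}
    for disk, db in dbs.items():
        for path, (mtime, digest) in db.items():
            groups[(path, mtime)][disk] = digest

    suspects = []
    for (path, mtime), by_disk in sorted(groups.items()):
        counts = Counter(by_disk.values()).most_common()
        if len(counts) < 2:
            continue  # all copies agree, or only one copy exists
        winner, top = counts[0]
        if counts[1][1] == top:
            bad = sorted(by_disk)  # tie: no majority, flag every copy
        else:
            bad = sorted(d for d, c in by_disk.items() if c != winner)
        suspects.append((path, mtime, bad))
    return suspects
```

Since only one backup disk is usually mounted at a time, each scan()
result would in practice be saved (for example as JSON) and the saved
databases loaded later for vote(). With only two copies of a file, a
mismatch is a tie and both copies are flagged; three or more matching
copies are needed to say which one is wrong.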

>
> Thanks,
> --
> -- "Jonathan Thornburg [remove -color to reply]" <dr.j.thornburg@gmail-pink.com>
> on the west coast of Canada
> "The programmers outside looked from Web 2.0 firm to AI company, and from
> AI company to Web 2.0 firm, and from Web 2.0 firm to AI company again;
> but already it was impossible to say which was which."
> -- /Ars Technica/ comment by /ubercurmudgeon/, 2024-05-09
