Sunday, November 15, 2020

Re: Large Filesystem

On 15 Nov at 14:52, Otto Moerbeek <otto@drijf.net> wrote:
> On Sun, Nov 15, 2020 at 02:43:03PM +0100, Mischa wrote:
>
> > On 15 Nov at 14:25, Otto Moerbeek <otto@drijf.net> wrote:
> > > On Sun, Nov 15, 2020 at 02:14:47PM +0100, Mischa wrote:
> > >
> > > > On 15 Nov at 13:04, Otto Moerbeek <otto@drijf.net> wrote:
> > > > > On Sat, Nov 14, 2020 at 05:59:37PM +0100, Otto Moerbeek wrote:
> > > > >
> > > > > > On Sat, Nov 14, 2020 at 04:59:22PM +0100, Mischa wrote:
> > > > > >
> > > > > > > On 14 Nov at 15:54, Otto Moerbeek <otto@drijf.net> wrote:
> > > > > > > > On Sat, Nov 14, 2020 at 03:13:57PM +0100, Leo Unglaub wrote:
> > > > > > > >
> > > > > > > > > Hey,
> > > > > > > > > my largest filesystem with OpenBSD on it is 12TB and for the minimal use case
> > > > > > > > > I have it works fine. I did not lose any data or so. I have it mounted with
> > > > > > > > > the following flags:
> > > > > > > > >
> > > > > > > > > > local, noatime, nodev, noexec, nosuid, softdep
> > > > > > > > >
> > > > > > > > > The only thing I should mention is that one time the server crashed and I
> > > > > > > > > had to do a fsck during the next boot. It took around 10 hours for the 12TB.
> > > > > > > > > This might be something to keep in mind if you want to use this on a server.
> > > > > > > > > But if my memory serves me well, Otto made some changes to fsck on FFS2, so
> > > > > > > > > maybe that's a lot faster now.
> > > > > > > > >
> > > > > > > > > I hope this helps you a little bit!
> > > > > > > > > Greetings from Vienna
> > > > > > > > > Leo
> > > > > > > > >
> > > > > > > > > On 14.11.2020 at 13:50, Mischa wrote:
> > > > > > > > > > I am currently in the process of building a large filesystem with
> > > > > > > > > > 12 x 6TB 3.5" SAS in raid6, effectively ~55TB of storage, to serve as a
> > > > > > > > > > central, mostly download, platform with around 100 concurrent
> > > > > > > > > > connections.
> > > > > > > > > >
> > > > > > > > > > The current system is running FreeBSD with ZFS and I would like to
> > > > > > > > > > see if it's possible on OpenBSD, as it's one of the last two systems
> > > > > > > > > > on FreeBSD left. :)
> > > > > > > > > >
> > > > > > > > > > Has anybody built a large filesystem using FFS2? Is it a good idea?
> > > > > > > > > > How does it perform? What are good tests to run?
> > > > > > > > > >
> > > > > > > > > > Your help and suggestions are really appreciated!
> > > > > > > > >
> > > > > > > >
> > > > > > > > It doesn't always have to be that bad, on current:
> > > > > > > >
> > > > > > > > [otto@lou:22]$ dmesg | grep sd[123]
> > > > > > > > sd1 at scsibus1 targ 2 lun 0: <ATA, ST16000NE000-2RW, EN02> naa.5000c500c3ef0896
> > > > > > > > sd1: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > > > > > > sd2 at scsibus1 targ 3 lun 0: <ATA, ST16000NE000-2RW, EN02> naa.5000c500c40e8569
> > > > > > > > sd2: 15259648MB, 512 bytes/sector, 31251759104 sectors
> > > > > > > > sd3 at scsibus3 targ 1 lun 0: <OPENBSD, SR RAID 0, 006>
> > > > > > > > sd3: 30519295MB, 512 bytes/sector, 62503516672 sectors
> > > > > > > >
> > > > > > > > [otto@lou:20]$ df -h /mnt
> > > > > > > > Filesystem     Size    Used   Avail Capacity  Mounted on
> > > > > > > > /dev/sd3a     28.9T    5.1G   27.4T     0%    /mnt
> > > > > > > >
> > > > > > > > [otto@lou:20]$ time doas fsck -f /dev/rsd3a
> > > > > > > > ** /dev/rsd3a
> > > > > > > > ** File system is already clean
> > > > > > > > ** Last Mounted on /mnt
> > > > > > > > ** Phase 1 - Check Blocks and Sizes
> > > > > > > > ** Phase 2 - Check Pathnames
> > > > > > > > ** Phase 3 - Check Connectivity
> > > > > > > > ** Phase 4 - Check Reference Counts
> > > > > > > > ** Phase 5 - Check Cyl groups
> > > > > > > > 176037 files, 666345 used, 3875083616 free (120 frags, 484385437 blocks, 0.0% fragmentation)
> > > > > > > > 1m47.80s real 0m14.09s user 0m06.36s system
> > > > > > > >
> > > > > > > > But note that fsck for FFS2 will get slower once more inodes are in
> > > > > > > > use or have been in use.
> > > > > > > >
> > > > > > > > Also, creating the fs with both block size and fragment size of 64k
> > > > > > > > will make fsck faster (due to fewer inodes), but that should only be
> > > > > > > > done if the files you are going to store are relatively big (generally
> > > > > > > > much bigger than 64k).
> > > > > > >
> > > > > > > Good to know. This will be mostly large files indeed.
> > > > > > > That would be "newfs -i 64"?
> > > > > >
> > > > > > Nope, newfs -b 65536 -f 65536
> > > > >
> > > > > To clarify: the default block size for large filesystems is already
> > > > > 2^16, but this value is taken from the label, so if another fs was on
> > > > > that partition before, it might have changed. The default fragsize is
> > > > > blocksize/8. When not specified on the command line, it is also taken
> > > > > from the label.
> > > > >
> > > > > Inode density is derived from the number of fragments (normally 1
> > > > > inode per 4 fragments). If you increase the fragment size, the number of
> > > > > fragments drops and so does the number of inodes.
> > > > >
> > > > > A fragment is the minimal allocation unit. So if you have lots of
> > > > > small files you will waste a lot of space and potentially run out of
> > > > > inodes. You only want to increase the fragment size if you mostly store
> > > > > large files.
> > > >
> > > > This is for large files only.
> > > >
> > > > 16 partitions:
> > > > #                size           offset  fstype [fsize bsize   cpg]
> > > >   a:     117199339520                0  4.2BSD  65536 65536 52270 # /data
> > > >   c:     117199339520                0  unused
> > > >
> > > > The new FS now has:
> > > >
> > > > new# df -hi /data
> > > > Filesystem     Size    Used   Avail Capacity iused      ifree %iused  Mounted on
> > > > /dev/sd1a     54.5T   64.0K   51.8T     0%       1  229301757     0%  /data
> > > >
> > > > The server I am replacing has:
> > > >
> > > > old# df -hi /data
> > > > Filesystem     Size    Used   Avail Capacity iused      ifree %iused  Mounted on
> > > > data            35T     34T    539G    98%    104k       1.1G     0%  /data
> > > >
> > > > I guess we are good. :)
> > > >
> > > > Mischa
> > > >
> > >
> > > How quick (slow?) is an fsck of this fs? Just unmount and run time fsck -f
> >
> > # time fsck -f /data
> > ** /dev/sd1a (4a7a7b3a44fbd513.a)
> > ** File system is already clean
> > ** Last Mounted on /data
> > ** Phase 1 - Check Blocks and Sizes
> > ** Phase 2 - Check Pathnames
> > ** Phase 3 - Check Connectivity
> > ** Phase 4 - Check Reference Counts
> > ** Phase 5 - Check Cyl groups
> > 1 files, 1 used, 914719746 free (0 frags, 914719746 blocks, 0.0% fragmentation)
> > 1m02.29s real 0m06.09s user 0m02.61s system
> >
> > The raid6 is still being built, not sure if that matters.
> >
> > # bioctl -h sd1
> > Volume      Status               Size Device
> > mfii0 1     Scrubbing           54.6T sd1     RAID6 2% done 10216 seconds WB
> >
> > Mischa
> >
>
> The scrubbing will eat some bandwidth, but no idea how much.
>
> fsck will get slower once you start filling it, but since your original
> fs had about 104k files I expect it won't get too bad. If the speed
> for your use case is good as well I guess you should be fine.

Will see how it behaves and try to document as much as possible.
I can always install another BSD on it. ;)

Mischa
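
As a rough cross-check of the inode arithmetic discussed in the thread (about one inode per four fragments), the Python sketch below estimates the inode count from the disklabel values quoted above. The function name and the fixed ratio of 4 fragments per inode are illustrative assumptions, not something newfs exposes. The real count differs slightly, presumably because of how newfs lays out inodes per cylinder group.

```python
def estimate_inodes(sectors, sector_size=512, frag_size=65536, frags_per_inode=4):
    """Rough inode estimate, assuming one inode per `frags_per_inode` fragments.

    `sectors` is the partition size from the disklabel, in sectors.
    This is only an approximation of what newfs actually allocates.
    """
    total_bytes = sectors * sector_size
    fragments = total_bytes // frag_size      # number of minimal allocation units
    return fragments // frags_per_inode


if __name__ == "__main__":
    sectors = 117199339520                    # partition "a" from the disklabel in the thread
    print(estimate_inodes(sectors))                   # 64k fragments
    print(estimate_inodes(sectors, frag_size=8192))   # blocksize/8 fragments
```

For the 64k fragment size this gives roughly 228.9 million inodes, in line with the 229301758 (iused + ifree) that df -hi reports for /data. With a fragment size of blocksize/8 (8k) the same estimate comes out eight times larger, which is why fsck has more work to do in that case.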
