Wednesday, August 04, 2021

Re: Can't figure out what's taking up space on /

I'm at a loss. I booted in single user mode and ran fsck on /dev/sd0a, and it
shows clean. I still have a large discrepancy between df and du.
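
For anyone who wants to put a number on that discrepancy, here is a minimal sketch. It assumes a `df` that accepts `-kP` (POSIX output in 1K blocks) and a `du` that accepts `-x` (stay on one filesystem). The function names are hypothetical and do not come from this thread.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# KB that df reports as used on the filesystem holding path $1.
# -P forces one POSIX-format line per filesystem; column 3 is "Used".
df_used_kb() {
    df -kP "$1" | awk 'NR == 2 { print $3 }'
}

# KB that du can account for under path $1.
# -s prints a single total, -k uses 1K units, -x does not cross mountpoints.
du_used_kb() {
    du -skx "$1" 2>/dev/null | awk '{ print $1 }'
}

# Difference between the two, in KB. A large positive value on a
# mountpoint means space is in use that du cannot reach.
fs_gap_kb() {
    echo $(( $(df_used_kb "$1") - $(du_used_kb "$1") ))
}

# Example (run as root so du can read everything):
#   fs_gap_kb /
```

The result is only meaningful when the argument is the mountpoint itself, since df always reports on the whole filesystem while du counts only what lies below the given path.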



On Wed, Aug 4, 2021 at 2:45 AM Greg Thomas <get.misc.openbsd@gmail.com>
wrote:

> Will do, but I should add that I have done nothing on this box for a
> couple of months. The day before yesterday I realized that I really needed
> to back up my laptop. When I went to run my backup script I discovered that
> I couldn't reach this server. When I went to troubleshoot I couldn't log in,
> so I hard rebooted it.
>
> After running fsck in single user mode and letting it fix things I ended
> up being able to log in, which is when I discovered that / was full.
>
> Anyway, I'll boot into single user mode later today. I shouldn't be
> troubleshooting things in the middle of the night. I probably could have
> retained some more info about my situation if I had waited until the
> morning to troubleshoot the other night.
>
> On Wed, Aug 4, 2021 at 1:14 AM Paul de Weerd <weerd@weirdnet.nl> wrote:
>
>> On Wed, Aug 04, 2021 at 12:56:57AM -0700, Greg Thomas wrote:
>> | I take it I'm dealing with filesystem corruption as Ali mentioned
>> | earlier?
>>
>> Could be. Boot the system in single user mode or from the bsd.rd
>> installation kernel (at the boot prompt type either 'boot -s' or
>> 'boot bsd.rd'). Enter the shell and run `fsck /`.
>>
>> However, my next guess is that you have some data stored "under" a
>> mountpoint somewhere. Here's what I mean:
>>
>> # mkdir /mnt/test
>> # du -sh install69.iso
>> 544M install69.iso
>> # cp install69.iso /mnt/test
>> # du -xsh /mnt
>> 545M /mnt
>> # vnconfig vnd0 /mnt/test/install69.iso
>> # mount /dev/vnd0c /mnt/test/
>> # du -xsh /mnt
>> 8.0K /mnt
>>
>> Since du can't traverse the hierarchy that the install69.iso image has
>> been mounted over, it also cannot report on the disk space used by
>> files in that hierarchy.
>>
>> Again, boot into single user mode (or from bsd.rd) and figure this
>> out.
>>
>> Cheers,
>>
>> Paul 'WEiRD' de Weerd
>>
>> | On Tue, Aug 3, 2021 at 11:10 PM Otto Moerbeek <otto@drijf.net> wrote:
>> |
>> | > On Tue, Aug 03, 2021 at 10:57:42PM -0700, Greg Thomas wrote:
>> | >
>> | > > I thought Paul's advice only applies if I was trying to figure it
>> | > > out before rebooting? I'd already rebooted before sending my first
>> | > > email.
>> | >
>> | > OK, did the free space come back in df after reboot? If so, then it's
>> | > programs having open files that are unlinked for sure.
>> | >
>> | > -Otto
>> | >
>> | > >
>> | > >
>> | > >
>> | > > On Tue, Aug 3, 2021 at 10:40 PM Otto Moerbeek <otto@drijf.net>
>> | > > wrote:
>> | > >
>> | > > > On Tue, Aug 03, 2021 at 12:39:54PM -0700, Greg Thomas wrote:
>> | > > >
>> | > > > > I'm definitely suffering from filesystem corruption on root. I
>> | > > > > had rebooted last night with no change.
>> | > > > >
>> | > > > > I have no options for mounting root.
>> | > > > >
>> | > > > > grits# cat /etc/fstab
>> | > > > > 16a27b4b4549ce04.b none swap sw
>> | > > > > 16a27b4b4549ce04.a / ffs rw 1 1
>> | > > > > 16a27b4b4549ce04.k /home ffs rw,nodev,nosuid 1 2
>> | > > > > 16a27b4b4549ce04.d /tmp ffs rw,nodev,nosuid 1 2
>> | > > > > 16a27b4b4549ce04.f /usr ffs rw,nodev 1 2
>> | > > > > 16a27b4b4549ce04.g /usr/X11R6 ffs rw,nodev 1 2
>> | > > > > 16a27b4b4549ce04.h /usr/local ffs rw,wxallowed,nodev 1 2
>> | > > > > 16a27b4b4549ce04.j /usr/obj ffs rw,nodev,nosuid 1 2
>> | > > > > 16a27b4b4549ce04.i /usr/src ffs rw,nodev,nosuid 1 2
>> | > > > > 16a27b4b4549ce04.e /var ffs rw,nodev,nosuid 1 2
>> | > > > > /dev/sd1c /backup ffs rw,nodev,nosuid 1 2
>> | > > > >
>> | > > > > I need to upgrade so I can do that from scratch. This is my
>> | > > > > backup server so the configuration is pretty simple.
>> | > > > >
>> | > > > > Not sure fsck output helps here?
>> | > > > >
>> | > > > > grits# fsck /dev/sd0a
>> | > > > > ** /dev/rsd0a (NO WRITE)
>> | > > > > ** Last Mounted on /
>> | > > > > ** Root file system
>> | > > > > ** Phase 1 - Check Blocks and Sizes
>> | > > > > ** Phase 2 - Check Pathnames
>> | > > > > ** Phase 3 - Check Connectivity
>> | > > > > ** Phase 4 - Check Reference Counts
>> | > > > > ** Phase 5 - Check Cyl groups
>> | > > > > 12852 files, 469195 used, 35516 free (44 frags, 4434 blocks,
>> | > > > > 0.0% fragmentation)
>> | > > > >
>> | > > > > Anyway, I'll reinstall unless someone has more learning
>> | > > > > experiences for me.
>> | > > > >
>> | > > > > And thank you to Paul for giving a quick explanation of the
>> | > > > > difference between df and du.
>> | > > > >
>> | > > > > Thanks all!
>> | > > >
>> | > > > fsck looks normal for a mounted filesystem.
>> | > > >
>> | > > > But did you try following Paul's advice to find an open file that
>> | > > > has no directory entry? That is not corruption, but explains why
>> | > > > more storage is in use than du shows.
>> | > > >
>> | > > > -Otto
>> | > > >
>> | > > > >
>> | > > > >
>> | > > > >
>> | > > > > On Tue, Aug 3, 2021 at 11:39 AM Ali Farzanrad
>> | > > > > <ali_farzanrad@riseup.net> wrote:
>> | > > > >
>> | > > > > > I also suspected that it is filesystem corruption.
>> | > > > > > Do you have the `async` mount option on your root?
>> | > > > > >
>> | > > > > > Sebastien Marie <semarie@online.fr> wrote:
>> | > > > > > > On Tue, Aug 03, 2021 at 10:03:44AM +0200, Paul de Weerd
>> | > > > > > > wrote:
>> | > > > > > > > df shows you how much data you can write to an fs, while du
>> | > > > > > > > shows the disk usage of files it can find. If it can't find
>> | > > > > > > > a file (because it's been deleted), it won't account for it.
>> | > > > > > > > But if it's been deleted and is still held open by some
>> | > > > > > > > process, it still consumes disk space.
>> | > > > > > > >
>> | > > > > > > > So it looks like a process has a file open on the root
>> | > > > > > > > filesystem that has been deleted. You're looking for a
>> | > > > > > > > root-owned process that is (probably) long-running. My guess
>> | > > > > > > > is that the file is in /dev/ (that's my crystal ball talking
>> | > > > > > > > though).
>> | > > > > > > >
>> | > > > > > > > Easiest way out is generally to reboot - this stops all
>> | > > > > > > > processes (d0h), thus freeing up all the resources they had
>> | > > > > > > > tied up, including files that had been deleted from the
>> | > > > > > > > filesystem. But going through your process list to see if
>> | > > > > > > > you can spot something that may have done this can be a good
>> | > > > > > > > learning experience. In general, base OpenBSD daemons don't
>> | > > > > > > > behave this way.
>> | > > > > > >
>> | > > > > > > I agree with Paul: you likely have a running process that
>> | > > > > > > holds a descriptor on an unlinked file.
>> | > > > > > >
>> | > > > > > > fstat(1) can be used to list open files, and especially
>> | > > > > > > unlinked files:
>> | > > > > > >
>> | > > > > > > INUM    The inode number of the file. It will be followed by
>> | > > > > > >         an asterisk ('*') if the inode is unlinked from disk.
>> | > > > > > >
>> | > > > > > >
>> | > > > > > > $ fstat | grep -F '* -'
>> | > > > > > > [...]
>> | > > > > > > semarie chrome 537 25 /tmp 48* -rw------- rwp 279793
>> | > > > > > > [...]
>> | > > > > > >
>> | > > > > > > Here, chrome (pid 537) has descriptor 25 open on a file on
>> | > > > > > > /tmp, inode=48 (unlinked); the file size is 279793 bytes.
>> | > > > > > >
>> | > > > > > > --
>> | > > > > > > Sebastien Marie
>> | > > > > > >
>> | > > > > > >
>> | > > > > >
>> | > > > > >
>> | > > >
>> | >
>>
>> --
>> >++++++++[<++++++++++>-]<+++++++.>+++[<------>-]<.>+++[<+
>> +++++++++++>-]<.>++[<------------>-]<+.--------------.[-]
>> http://www.weirdnet.nl/
>>
>
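
As a small follow-up to Sebastien's fstat tip quoted above, the sketch below turns the matching lines into a shorter summary. It assumes the column order seen in his sample line (USER CMD PID FD MOUNT INUM MODE R/W SZ), that command names contain no spaces, and that input arrives on stdin. The function name `unlinked_open_files` is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads fstat(1)-style lines on stdin and prints one summary
# line for every entry whose INUM column ends in '*' (unlinked inode).
# Assumes whitespace-separated columns in the order:
#   USER CMD PID FD MOUNT INUM MODE R/W SZ
unlinked_open_files() {
    awk '$6 ~ /\*$/ {
        inum = $6
        sub(/\*$/, "", inum)   # drop the trailing asterisk
        printf "%s pid=%s fd=%s mount=%s inode=%s bytes=%s\n", $2, $3, $4, $5, inum, $9
    }'
}

# Example on a live system:
#   fstat | unlinked_open_files
```

Fed the sample line from the thread, this should report chrome, pid 537, descriptor 25, /tmp, inode 48 and 279793 bytes. Lines for sockets and pipes have a different layout and are simply not matched.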
