Tuesday, July 09, 2024

Re: Hard freeze during `pkg_add -u` on -current

Only thing I can really suggest at that point is uninstalling packages
and reinstalling them. The steps would be similar to those in faq 15
"Duplicating Installed Packages on Another Machine" but rather than
transferring "list" to another machine, pkg_delete /var/db/pkg/* and
install them locally.
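
As a rough, untested sketch of those steps: the "list" file name follows
the FAQ, while the run helper and the DRY_RUN and LIST variables are just
illustrative names, not part of the pkg tools. It defaults to printing the
commands, because deleting every package is destructive.

```shell
#!/bin/sh
# Sketch only: reinstall every package from a saved list on OpenBSD.
# DRY_RUN=1 (the default here, chosen for safety) prints the commands
# instead of running them; run as root with DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
LIST=${LIST:-list}    # file name as in the FAQ; location is an assumption

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

reinstall_all() {
    # 1. Record the manually installed packages, as in the FAQ.
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: pkg_info -mz > $LIST"
    else
        pkg_info -mz > "$LIST" || return 1
    fi
    # 2. Remove everything that is currently registered.
    run pkg_delete /var/db/pkg/* || return 1
    # 3. Reinstall from the list on the same machine.
    run pkg_add -l "$LIST"
}

reinstall_all
```

Run it once as-is to see what it would do, then as root with DRY_RUN=0.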

If the pkg database is in too bad a state to pkg_delete, you could produce
a list, move /var/db/pkg out of the way, and pkg_add using that list
over the top - there will be "missing package registration, do you want
to fix?" questions which you can answer yes to.
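
Again only as an untested sketch: DRY_RUN, LIST and BACKUP are
illustrative names, the backup path is an assumption, and the mkdir is a
precaution in case pkg_add does not recreate the directory itself. If
pkg_info chokes on the damaged entries, the list may need fixing by hand
before it is usable.

```shell
#!/bin/sh
# Sketch only: rebuild the package database by reinstalling over the top.
# DRY_RUN=1 (default) only prints the commands; run as root with
# DRY_RUN=0 to execute.
DRY_RUN=${DRY_RUN:-1}
LIST=${LIST:-list}
BACKUP=${BACKUP:-/var/db/pkg.broken}    # hypothetical backup location

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

rebuild_pkgdb() {
    # 1. Produce the list while the old database is still in place.
    if [ "$DRY_RUN" = 1 ]; then
        echo "would run: pkg_info -mz > $LIST"
    else
        pkg_info -mz > "$LIST" || return 1
    fi
    # 2. Move the damaged database out of the way (kept, not deleted).
    run mv /var/db/pkg "$BACKUP" || return 1
    # 3. Precaution (assumption): make sure an empty database dir exists.
    run mkdir -p /var/db/pkg || return 1
    # 4. Install over the top; answer yes to the registration questions.
    run pkg_add -l "$LIST"
}

rebuild_pkgdb
```

Run the last step from a terminal so pkg_add can ask its questions.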



On 2024/07/05 15:01, Ronald Dahlgren wrote:
> Thank you for the reply, Stuart.
>
> Running pkg_check started out fine and then went off the rails. The output is captured here ->
> https://sw.gy/files/pkg_check.html
>
> The control characters passed through xterm and a clipboard so they may not be accurate. Here
> are some screenshots of the original:
>
> https://sw.gy/files/pkg_check-1.png
> https://sw.gy/files/pkg_check-2.png
>
> Thankfully this behavior did not crash the system :)
>
> Ron
>
> On Fri, Jul 5, 2024 at 12:33 PM Stuart Henderson <stu.lists@spacehopper.org> wrote:
>
> On 2024-07-05, Ronald Dahlgren <ronald.dahlgren@gmail.com> wrote:
> >
> > Hello,
> >
> > On July 2nd, I updated a machine to the latest snapshot and rebooted. It
> > came back without issue. I then issued `pkg_add -u`. This machine was last
> > updated on June 6th, so not terribly long ago. Partway during the process,
> > the disk indicated it was full (not true) and no commands were available
> > (ls, cd, etc). Unable to do anything, I terminated my SSH session and
> > attempted to reconnect. The machine failed to respond to pings. I had
> > someone onsite reboot the machine. It then came back up. I did not try the
> > `pkg_add -u` command again. Inspection showed that partitions had plenty of
> > available space and inodes.
> >
> > The daily insecurity output that ran the following day, on Wednesday the
> > 3rd, had this unusual snippet:
> >
> > ```
> > vmm-firmware-1.16.3p0 firmware binary images for vmm(4) driver
> > -xz-5.4.5            library and tools for XZ and LZMA compressed files
> > +xz-5.6.2
> > <FD>/??^L???.???<C5>/??<F4>???..??<FE>/??$???+DESC???<FF>/?????
> >  +CONTENTS????0??<C4>??^L+REQUIRED_BY??????????????????????????????????????=
> > ???????????????????????????????????????????????????????????????????????????=
> > ???????????????????????????????????????????????????????????????????????????=
> > ???????????????????????????????????????????????????????????????????????????=
> > ???????????????????????????????????????????????????????????????????????????=
> > ??????????????????????????????????????????????????????????????????????
> >  zsh-5.9p0           Z shell, Bourne shell-compatible
> > ```
>
> The filesystem holding /var/db/pkg has some corruption.
> I'd try running pkg_check and allow it to repair, reinstall xz
> "pkg_add -r -D installed xz", and see how you get on.
>
> > Given the package with the wacky description is `xz`, I'm more concerned
> > than I would be otherwise.
>
> The same could have happened to any package, there's nothing special
> about xz there.
>
> > I can see in `/var/log/messages` the snapshot update occurred without
> > issue. Logs after the physical reboot show no core dump and only have
> > complaints about filesystems not being properly unmounted - expected when
> > the plug is pulled.
> >
> > Are there any other logs I can check and share to help get to the bottom of
> > this? The impacted computer has been running current and humming along
> > happily in a network closet for over a year.
>
> Not sure about the disk full message (spurious seems unlikely - if space
> is ok, is some filesystem tight on inodes? df -hi) or the hang.
>
> --
> Please keep replies on the mailing list.
>
>
