On Mon, 27 Jan 2025 19:29:24 +0000
Chris Billington <emulti@disroot.org> wrote:
> On Mon, 27 Jan 2025 14:02:17 +0000
> Chris Billington wrote:
>
> > I am setting up net-booting of amd64 clients with an amd64 server
> > (HP Z400) running 7.6-release, using the diskless(8) manpage.
> >
> > I want to set up a shared /usr nfs mount for the clients as described
> > in the manpage, so that a single set of installed packages can be
> > managed centrally.
> >
> > tftpd, bootparams, dhcpd and nfs are set up as described in the manpage.
> >
> > Server IP (em0) 192.168.0.254
> > NFS rootfs /export is on a separate partition.
> >
> > Client IP (also em0) 192.168.0.2 as set statically by mac address in
> > dhcpd.conf
> >
> > /etc/exports on the server:
> > /export/client2 -maproot=root -alldirs 192.168.0.2
> > /usr -ro -network=192.168.0.0 -netmask=255.255.255.0
> > /var/db/pkg -ro -network=192.168.0.0 -netmask=255.255.255.0
> >
> > showmount -e:
> > /var/db/pkg 192.168.0.0
> > /usr 192.168.0.0
> > /export/client2 192.168.0.2
> >
> > /etc/fstab on client2:
> > 192.168.0.254:/export/client2 / nfs rw 0 0
> > 192.168.0.254:/usr /usr nfs ro 0 0
> > swap /tmp mfs rw,-s=512M 0 0
> > 192.168.0.254:/var/db/pkg /var/db/pkg nfs 0 0
> >
> > When booting multi-user, all goes normally until the boot hangs at the
> > point of mounting /usr in the client's /etc/rc (line 489):
> >
> > mount -s /usr >/dev/null 2>&1 # if NFS, fstab must use IP address
> >
> > By removing the redirection temporarily, I can see the following error
> > on the client:
> >
> > mount_nfs: bad MNT RPC: RPC: Unable to send; errno = Permission denied
> >
> > This is repeated at intervals of 30 seconds or so.
> >
> > However, showmount -a on the server thinks /usr is mounted:
> >
> > showmount -a:
> > client2:/export/client2
> > client2:/usr
> > client2:/var/db/pkg
> >
> > At this point if I interrupt the processing of /etc/rc the boot
> > continues, but fails miserably because /usr is not mounted. (verified
> > with mountd -d on the server)
> >
> > If I do 'boot -s', after going to a shell it is possible to
> > mount /usr, /tmp and /var/db/pkg without issue.
> >
> > If I add the bg (background the mount task) option to the client's
> > fstab for /usr (ro,bg) then boot proceeds but /usr never gets mounted.
> >
> > As a check, I tried booting with a non-shared /usr in
> > the /export/client2 directory. Booting then works without problems. But
> > that defeats the object of net booting, to have a shared set of
> > installed packages.
> >
> > One strange thing that may be relevant:
> > If I listen with 'tcpdump -nvi em0' on the server, I can see the rpc
> > request going to the server port 111 over udp each time the client
> > attempts to mount /usr :
> >
> > 192.168.0.2.xxx > 192.168.0.254.111: [udp sum ok] udp 56 (ttl 64, id
> > xxxxx, len 84)
> >
> > But the reply back to the client from the server from the same port has:
> >
> > 192.168.0.254.111 > 192.168.0.2.xxx: [bad udp csum 8682! -> zzzz] udp
> > 28 (ttl 64, id xxxxx, len 56)
> >
> > (xxxx, yyyy, zzzz are the random values chosen by the networking stack)
> >
> > Is it still possible to boot diskless clients with a shared /usr? What
> > could be the cause of the 'bad UDP csum' errors, and the 'mount_nfs bad
> > MNT RPC' error?
> > It's particularly odd because single-user boot allows /usr to be
> > mounted read-only without issue.
> >
> > I'm running out of things to try! All assistance in resolving this
> > gratefully accepted....
> >
> > Possibly unrelated: before the boot process of /bsd starts, I see the
> > following PXEboot error flash by:
> > pxe_netif_open : PXENV_UDP_OPEN failed: 0x60
> > net_open: netif_open() failed
> > However, after this the booting of /bsd continues as normal until the
> > nfs mount hang described above.
> >
> > --
> > Chris Billington
>
>
> Further information:
>
> - the bad checksum errors are a 'red herring', they occur because the
> network card supports checksum offloading and tcpdump sees the packets
> before the checksum is added.
>
> - changing the mount options for /usr to 'ro,tcp' results in slightly
> different error messages on the client when booting multiuser and
> trying to mount /usr readonly:
>
> first error:
> Cannot MNT RPC: RPC: Remote system error: Permission denied
>
> subsequent errors:
> mount_nfs: Bad MNT RPC: RPC: Unable to send; errno = Bad file descriptor
>
> --
> Chris Billington
>
>
I think I now understand what was going on with the read-only /usr over
NFS.
The temporary pf ruleset loaded in /etc/rc contains "don't kill NFS"
rules, which allow communication out to only the portmap/sunrpc and nfs
ports on the server, 111 and 2049.
But to mount a separate /usr, the client also needs to talk to the mountd
RPC service at the reserved port number it gets from portmap, and that
traffic is blocked by pf. The mountd port varies from boot to boot, so as
far as I know it can't easily be included in a rule.
I confirmed this was the issue by hard-coding the port number of the
currently running mountd into the ruleset.
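
A minimal sketch of how that test can be set up, assuming rpcinfo(8) is
available on the client and prints the usual five columns (program,
vers, proto, port, service). The function names mountd_ports and
mountd_pf_rule are hypothetical helpers of mine, and the emitted pf rule
is only good until mountd is restarted, so this is a diagnostic aid
rather than a fix:

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Print the port(s) on which mountd (RPC program 100005) is registered
# for the given protocol. Expects `rpcinfo -p` output on stdin.
mountd_ports() {
    proto=$1
    awk -v proto="$proto" '$1 == 100005 && $3 == proto { print $4 }' |
        sort -un
}

# Turn those ports into a single pf "pass out" rule for the NFS server.
# Returns non-zero if mountd is not registered for that protocol.
mountd_pf_rule() {
    server=$1
    proto=$2
    ports=$(mountd_ports "$proto" | tr '\n' ' ')
    ports=${ports% }
    [ -n "$ports" ] || return 1
    printf 'pass out inet proto %s from any to %s port { %s }\n' \
        "$proto" "$server" "$ports"
}

# Example use on the client, from a single-user shell where pf is not
# yet filtering:
#   rpcinfo -p 192.168.0.254 | mountd_pf_rule 192.168.0.254 udp
```

The printed rule can then be pasted next to the "don't kill NFS" rules
for one test boot. Matching on the program number instead of the service
name avoids depending on the contents of /etc/rpc.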
My workaround was to move the mount command for /usr to just before the
point where the temporary pf ruleset is loaded.
This also explains the other observations: a single-user boot does not
load the temporary pf ruleset, and if /usr is an integral part of the
root filesystem, it does not get remounted by the mount -s command
in /etc/rc.
I will send a report to bugs@ to see whether this small change could be
accepted for future releases.
--
Chris Billington