Saturday, July 31, 2021

Re: WireGuard host crashes roughly every week

Hi Todd!

You're right, the number of mbufs on the machine in question is steadily climbing.

This is a few minutes after a reboot, with an RC script starting wireguard automatically:

> 27836 mbufs in use:
> 27827 mbufs allocated to data
> 3 mbufs allocated to packet headers
> 6 mbufs allocated to socket names and addresses
> 0/16 mbuf 2048 byte clusters in use (current/peak)
> 20/75 mbuf 2112 byte clusters in use (current/peak)
> 0/8 mbuf 4096 byte clusters in use (current/peak)
> 0/0 mbuf 8192 byte clusters in use (current/peak)
> 0/0 mbuf 9216 byte clusters in use (current/peak)
> 0/0 mbuf 12288 byte clusters in use (current/peak)
> 0/0 mbuf 16384 byte clusters in use (current/peak)
> 0/0 mbuf 65536 byte clusters in use (current/peak)
> 7192/7192/524288 Kbytes allocated to network (current/peak/max)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines

And then, just a second or two later:

> 27874 mbufs in use:
> 27863 mbufs allocated to data
> 5 mbufs allocated to packet headers
> 6 mbufs allocated to socket names and addresses
> 0/16 mbuf 2048 byte clusters in use (current/peak)
> 20/75 mbuf 2112 byte clusters in use (current/peak)
> 0/8 mbuf 4096 byte clusters in use (current/peak)
> 0/0 mbuf 8192 byte clusters in use (current/peak)
> 0/0 mbuf 9216 byte clusters in use (current/peak)
> 0/0 mbuf 12288 byte clusters in use (current/peak)
> 0/0 mbuf 16384 byte clusters in use (current/peak)
> 0/0 mbuf 65536 byte clusters in use (current/peak)
> 7204/7204/524288 Kbytes allocated to network (current/peak/max)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines

From the nearly identical Pi (sans wireguard):

> 72 mbufs in use:
> 42 mbufs allocated to data
> 1 mbuf allocated to packet headers
> 29 mbufs allocated to socket names and addresses
> 12/64 mbuf 2048 byte clusters in use (current/peak)
> 0/0 mbuf 2112 byte clusters in use (current/peak)
> 0/8 mbuf 4096 byte clusters in use (current/peak)
> 0/0 mbuf 8192 byte clusters in use (current/peak)
> 0/0 mbuf 9216 byte clusters in use (current/peak)
> 0/0 mbuf 12288 byte clusters in use (current/peak)
> 0/0 mbuf 16384 byte clusters in use (current/peak)
> 0/0 mbuf 65536 byte clusters in use (current/peak)
> 216/216/131072 Kbytes allocated to network (current/peak/max)
> 0 requests for memory denied
> 0 requests for memory delayed
> 0 calls to protocol drain routines
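
For comparing snapshots like the ones above without eyeballing them, here is a minimal sketch in Python. It assumes the "N mbufs in use:" line format shown in this output; `parse_mbufs_in_use` and `growth` are hypothetical helper names made up for illustration, not part of netstat or any library.

```python
import re


def parse_mbufs_in_use(netstat_output: str) -> int:
    """Return the 'mbufs in use' count from `netstat -m` style text."""
    # Lines pasted from mail may carry a leading "> " quote marker, so allow it.
    match = re.search(
        r"^(?:>\s*)?(\d+) mbufs? in use", netstat_output, re.MULTILINE
    )
    if match is None:
        raise ValueError("no 'mbufs in use' line found")
    return int(match.group(1))


def growth(before: str, after: str) -> int:
    """Return how many more mbufs are in use in `after` than in `before`."""
    return parse_mbufs_in_use(after) - parse_mbufs_in_use(before)


# The first lines of the two WireGuard snapshots quoted above.
first = "27836 mbufs in use:\n27827 mbufs allocated to data\n"
second = "27874 mbufs in use:\n27863 mbufs allocated to data\n"

print(growth(first, second))  # 38
```

Fed with output captured at a fixed interval, the same difference gives a rough growth rate, which may help narrow down what traffic drives the leak.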


I tried disabling the WireGuard startup script. Right after boot the box has very few mbufs in use (around 50), like the other machine. Once I start WireGuard manually the count begins climbing again, though it is nowhere near the 27836 mbufs in use that I see when WireGuard loads at boot.

When I stop wireguard (with wg-quick, destroying the interface), the number of mbufs stays where it is but stops climbing.

What should I do next?

--Matt

> On Jul 30, 2021, at 9:31 AM, Todd C. Miller <Todd.Miller@sudo.ws> wrote:
> On Thu, 29 Jul 2021 20:09:12 -0500, "Matt P." wrote:
>
>> I have an OpenBSD box that breaks after a week or so of running. All network
>> traffic stops reaching the box. If I look at the screen or serial output, I
>> can get the "login:" prompt, and when I enter my name I get prompted for a
>> password, but once I enter a password it hangs. Key presses and control codes
>> still show on the screen, but the login never succeeds or fails. I thought
>> control-C might cause it to go back to the login prompt, but it doesn't. I
>> have to hard reboot the box to get it back.
>
> This may be due to a memory leak. You could monitor the output of
> "netstat -m" and also "vmstat -m" and watch for memory use increasing
> over time. The number of mbufs in use reported by "netstat -m"
> should be relatively stable.
>
> - todd
