Hi Cameron,
As a first guess, I would consider changing / implementing "set optimization". This made a massive difference on our customer's satellite internet connection.
man pf.conf

set optimization environment
    Optimize state timeouts for one of the following network
    environments:

    aggressive
        Aggressively expire connections. This can greatly reduce
        the memory usage of the firewall at the cost of dropping
        idle connections early.
    conservative
        Extremely conservative settings. Avoid dropping
        legitimate connections at the expense of greater memory
        utilization (possibly much greater on a busy network) and
        slightly increased processor utilization.
    high-latency
        A high-latency environment (such as a satellite
        connection).
    normal
        A normal network environment. Suitable for almost all
        networks.
    satellite
        Alias for high-latency.

    The default value is normal.

-----Original Message-----
From: owner-misc@openbsd.org <owner-misc@openbsd.org> On Behalf Of Cameron Simpson
Sent: Tuesday, 1 June 2021 8:26 AM
To: misc@openbsd.org
Subject: pf, relayd, TCP keep alive and NAT, oh my!
Can I enforce or implement TCP keep alives on a TCP stream via my firewall?
Background:
I've got a client with an OpenBSD firewall behind a Telstra NBN modem.
Their IMAP server is upstream in the cloud (Ubuntu, courier imap). I have this odd problem which I am beginning to suspect is the NBN modem getting bored and dropping its NAT entries. Let me explain...
At the firewall end I see about 30 ESTABLISHED connections to the IMAP server. At the IMAP server I see over 500, which is about where the IMAP service stops accepting new connections, leading to errors from the client mail readers.
My current theory is that the IMAP client connections issue the IMAP IDLE command and go passive, waiting for email notifications from the server. So we have an idle TCP connection across the firewall and across the NBN modem (which NATs).
My conjecture is that at some point the modem discards idle connection states. (This could just as well happen at any other intermediate stateful router too.) After that event, the client end does something which tries to use the connection, gets an RST from the modem, clean tidyup happens on the client and in the firewall.
At the server end, none of this is seen and the imapd just sits around idle, never releasing the connection and never stopping the matching daemon process. This gradually rises to hit the server's configured connection limit and it stops accepting new things.
If I had TCP keep alive turned on, both ends might tidy themselves up.
I can't enable that on the clients (various mail readers) or, apparently, on the server configuration. I can't do it in PF because PF just copies packets. I can't seem to do it in relayd either, though that seems the obvious way to intercept the connection for this purpose.
Any suggestions?
I haven't fully validated my conjecture yet, BTW. It just fits the symptoms I see.
Plan B is to build the latest courier-imap from source if I find the time, but there may be no build option for this. I guess a single setsockopt() call in the source would be enough, _if_ that can be done on the accept end, which I haven't checked.
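On the setsockopt() idea: SO_KEEPALIVE is an ordinary socket-level option, so it can be set on the descriptor returned by accept(2) as well as on a connecting socket. The following is only a minimal sketch of what such a call might look like. The helper names enable_keepalive() and keepalive_enabled() are hypothetical and are not taken from the courier-imap source; where such a call would belong in that code is not something this sketch tries to answer.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not from courier-imap): turn on SO_KEEPALIVE for a
 * TCP socket, for example the descriptor returned by accept(2).
 * Returns 0 on success, -1 on failure with errno set by setsockopt(2).
 */
int
enable_keepalive(int fd)
{
	int on = 1;

	return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/*
 * Hypothetical helper: report whether SO_KEEPALIVE is set on fd.
 * Returns 1 if set, 0 if not set, -1 if getsockopt(2) fails.
 */
int
keepalive_enabled(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) == -1)
		return -1;
	return val != 0;
}
```

Note that enabling the option only asks the kernel to send probes; when the first probe goes out is governed by system settings. On Linux the idle time before the first probe defaults to 7200 seconds (net.ipv4.tcp_keepalive_time), which may well be longer than a NAT device keeps idle state, so that value would probably need lowering too.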
Plan B0 might be to disable IMAP IDLE support. Hmm.
Cheers,
Cameron Simpson <cs@cskk.id.au>