Hello everyone!
Now here's a mysterious one -- I've been working on this for weeks and still have no clue what's causing it.
I have a couple of virtual servers running *the* latest 64-bit snapshots. They're backed up using rsync over ssh. For a long time (months, even years) everything worked fine, but recently (maybe a couple of months ago) I noticed that the rsync transfers keep getting cancelled.
I used to have a rather fancy /etc/pf.conf, with anti-brute-force rules for ssh, but even after disabling all of that, the issue persists.
As long as pf is enabled, rsync over ssh suddenly breaks down: sometimes before transferring anything, sometimes after a couple of megabytes, sometimes after tens or even hundreds of megabytes.
On the client side, the following error is shown:
"client_loop: send disconnect: Broken pipe
rsync: connection unexpectedly closed (15936376 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [receiver=3.1.3]
rsync: connection unexpectedly closed (1076081 bytes received so far) [generator]
rsync error: unexplained error (code 255) at io.c(226) [generator=3.1.3]"
On the server side, sshd logs this error message: "process_output: ssh_packet_write_poll: Connection from user TheUserName xxx.xxx.xxx.xxx port nnnnn: Permission denied"
I tried running the ssh daemon in debug mode, but even then the best I could get was the above "Permission denied" error, with no further details.
As soon as I disable pf entirely, the problem goes away.
Unlike the rsync over ssh sessions, my normal interactive ssh sessions stay up without any problems.
Looking at the traffic with tcpdump reveals that the server is sending TCP resets when the connection breaks down.
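A capture filter along these lines should isolate those resets. It is only a minimal sketch: the interface name vio0 and port 22 are assumptions, not values from the setup described here, so adjust both. The script prints the tcpdump command rather than running it, since the capture itself needs root.

```sh
#!/bin/sh
# Assumptions (hypothetical values): interface name and the port sshd listens on.
IFACE="vio0"
SSH_PORT="22"

# Byte 13 of the TCP header holds the flags; 0x04 is the RST bit.
FILTER="tcp port ${SSH_PORT} and tcp[13] & 4 != 0"

# Print the command to run as root; -n skips name resolution,
# -e also prints the link-level header.
echo "tcpdump -n -e -i ${IFACE} '${FILTER}'"
```

Running the printed command on the server while a transfer is in progress should show only the RST segments on that port, which makes it easier to line them up with the sshd log timestamps.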
Just to make sure it's not a memory issue, the user that runs the backups is in the staff login class.
Also, to make sure the problem is not in rsync itself, I tried scp, with exactly the same results.
Here's my simplified /etc/pf.conf:
================================================================================
set reassemble yes
set block-policy drop
set loginterface egress
block drop log all label default_deny
block return out log proto {tcp udp} user _pbuild
match in all scrub (no-df random-id max-mss 1440)
block in quick log from urpf-failed label uRPF_check_failed
pass in quick log (to pflog0) on egress proto tcp \
from any port > 1023 \
to (egress) port 20 \
user root \
flags S/SA keep state
================================================================================
Any ideas on how to debug this further?
Yours,
Jyri
--
+358-404-177133 (WhatsApp)
jyri.hovila@turvamies.fi