Monday, May 29, 2017

Re: isakmpd dies quietly with over 100 tunnels

Hi all,

I got to admit I've seen isakmpd dying on 5.9*
(amd64 on VMware). But after having to deal
with half a dozen peers all over Europe using
different proprietary solutions a cronjob like
"rcctl ls faulty | grep isakmpd && rcctl restart…"
worked well enough for me.

I won't be able to test with the setup at work
but I got a little VPS running 6.1 I could use
(and update to -STABLE if necessary).
We probably won't get to over 100 tunnels but
I've seen the problem with ~8 tunnels.
The question would be if this problem would
even show up in a homogeneous OpenBSD
network…

Regards, Florian

*) the central hub isn't my problem anymore,
and it will take some time to convince my
replacement there to update to 6.1…

Am 29. Mai 2017 09:26:18 MESZ schrieb Alexis VACHETTE <avachette@sisteer.com>:
>Hi Michał,
>
>I'm having same issue without 100 ipsec tunnels and dedicated hardware.
>
>Unfortunately it's a production environment so I can't really
>troubleshooting this issue to track down the culprit.
>
>Anyway maybe it's not related to your issue.
>
>Regards,
>Alexis.
>On 28/05/2017 14:31, Michał Koc wrote:
>> Hi all,
>>
>> I'm running 6.0/amd64 inside KVM/Quemu with over 100 ipsec tunnels.
>>
>> Everything was running just fine when the number of tunnels was
>lower.
>> But as we have been setting up more and more tunnels we suddenly run
>> on problems.
>> The isakmpd deaemon keeps dying quietly. Probably I'm running out of
>> something, but I need some help to find out what it is and how to
>> monitor it and tweak.
>>
>> Thank You in advance.
>>
>> Best Regards
>> M.K.
>>
>> root@vgate0:/root# netstat -m
>> 215 mbufs in use:
>> 163 mbufs allocated to data
>> 46 mbufs allocated to packet headers
>> 6 mbufs allocated to socket names and addresses
>> 160/920/6144 mbuf 2048 byte clusters in use (current/peak/max)
>> 0/8/6144 mbuf 4096 byte clusters in use (current/peak/max)
>> 0/8/6144 mbuf 8192 byte clusters in use (current/peak/max)
>> 0/14/6146 mbuf 9216 byte clusters in use (current/peak/max)
>> 0/10/6150 mbuf 12288 byte clusters in use (current/peak/max)
>> 0/8/6144 mbuf 16384 byte clusters in use (current/peak/max)
>> 0/8/6144 mbuf 65536 byte clusters in use (current/peak/max)
>> 2760 Kbytes allocated to network (13% in use)
>> 0 requests for memory denied
>> 0 requests for memory delayed
>> 0 calls to protocol drain routines
>>
>> Sample tail of the log:
>> When I run "isakmpd -K -d -DA=10":
>> 142043.246192 Sdep 10 pf_key_v2_set_spi: satype 2 dst xxx.xxx.xxx.xxx
>
>> SPI 0x42f03e5d
>> 142043.246209 Timr 10 timer_add_event: event
>> sa_soft_expire(0x1fb9d0bdf400) added before
>> sa_soft_expire(0x1fb9c8f05400), expiration in 25056s
>> 142043.246223 Timr 10 timer_add_event: event
>> sa_hard_expire(0x1fb9d0bdf400) added before
>> sa_soft_expire(0x1fb9dd458200), expiration in 28800s
>> 142043.246326 Sdep 10 pf_key_v2_set_spi: satype 2 dst xxx.xxx.xxx.xxx
>
>> SPI 0x3ffa5955
>> 142043.268229 Default responder_recv_HASH_SA_NONCE: KEY_EXCH payload
>> without a group desc. attribute
>> 142043.268250 Default dropped message from xxx.xxx.xxx.xxx port 500
>> due to notification type NO_PROPOSAL_CHOSEN
>> 142043.268281 Timr 10 timer_add_event: event
>> exchange_free_aux(0x1fb9a5336400) added before
>> sa_soft_expire(0x1fba0d6a2a00), expiration in 120s
>> 142043.268289 Exch 10 exchange_establish_p2: 0x1fb9a5336400 <unnamed>
>
>> <no policy> policy initiator phase 2 doi 1 exchange 5 step 0
>> 142043.268295 Exch 10 exchange_establish_p2: icookie 8c58f4e7f8269ed3
>
>> rcookie 0fe2d7657125a339
>> 142043.268301 Exch 10 exchange_establish_p2: msgid de2c5cc3 sa_list
>> 142043.269079 Timr 10 timer_add_event: event
>> message_send_expire(0x1fb994136900) added before
>> connection_checker(0x1fb9b2646280), expiration in 7s
>> 142043.269614 Exch 10 exchange_finalize: 0x1fb9a5336400 <unnamed> <no
>
>> policy> policy initiator phase 2 doi 1 exchange 5 step 1
>> 142043.269630 Exch 10 exchange_finalize: icookie 8c58f4e7f8269ed3
>> rcookie 0fe2d7657125a339
>> 142043.269637 Exch 10 exchange_finalize: msgid de2c5cc3 sa_list
>> 142043.269653 Timr 10 timer_remove_event: removing event
>> exchange_free_aux(0x1fb9a5336400)
>> 142043.289465 Timr 10 timer_remove_event: removing event
>> message_send_expire(0x1fb994136900)
>> 142043.289513 Exch 10 exchange_finalize: 0x1fb972b59400
>> from-xxx.xxx.xxx.xxx/24-to-xxx.xxx.xxx.xxx/24 <no policy> policy
>> responder phase 2 doi 1 exchange 32 step 2
>> 142043.289521 Exch 10 exchange_finalize: icookie 8c58f4e7f8269ed3
>> rcookie 0fe2d7657125a339
>> 142043.289528 Exch 10 exchange_finalize: msgid de079ef6 sa_list
>> 0x1fb9dd458800 0x1fb985d09e00
>> 142043.289578 Sdep 10 pf_key_v2_set_spi: satype 2 dst xxx.xxx.xxx.xxx
>
>> SPI 0xe5d04953
>> 142043.289594 Timr 10 timer_add_event: event
>> sa_soft_expire(0x1fb9dd458800) added before
>> sa_soft_expire(0x1fba1d81de00), expiration in 3279s
>> 142043.289608 Timr 10 timer_add_event: event
>> sa_hard_expire(0x1fb9dd458800) added before
>> sa_soft_expire(0x1fba2c980800), expiration in 3600s
>> 142043.289710 Sdep 10 pf_key_v2_set_spi: satype 2 dst xxx.xxx.xxx.xxx
>
>> SPI 0x4d895568
>> root@vgate0:/root#
>>
>> OpenBSD 6.0-stable (GENERIC.MP) #0: Sat Feb 4 21:55:17 CET 2017
>> root@amd64.vcomp:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> real mem = 1056956416 (1007MB)
>> avail mem = 1020506112 (973MB)
>> mpath0 at root
>> scsibus0 at mpath0: 256 targets
>> mainbus0 at root
>> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf1dc0 (11 entries)
>> bios0: vendor Bochs version "Bochs" date 01/01/2011
>> bios0: Bochs Bochs
>> acpi0 at bios0: rev 0
>> acpi0: sleep states S3 S4 S5
>> acpi0: tables DSDT FACP SSDT APIC HPET
>> acpi0: wakeup devices
>> acpitimer0 at acpi0: 3579545 Hz, 24 bits
>> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
>> cpu0 at mainbus0: apid 0 (boot processor)
>> cpu0: QEMU Virtual CPU version 2.1.2, 3492.32 MHz
>> cpu0:
>>
>FPU,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT,HV,NXE,LONG,LAHF,ABM
>> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
>
>> 64b/line 16-way L2 cache
>> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries
>direct-mapped
>> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries
>direct-mapped
>> cpu0: smt 0, core 0, package 0
>> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
>> cpu0: apic clock running at 1000MHz
>> cpu1 at mainbus0: apid 1 (application processor)
>> cpu1: QEMU Virtual CPU version 2.1.2, 3491.95 MHz
>> cpu1:
>>
>FPU,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT,HV,NXE,LONG,LAHF,ABM
>> cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
>
>> 64b/line 16-way L2 cache
>> cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries
>direct-mapped
>> cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries
>direct-mapped
>> cpu1: smt 0, core 0, package 1
>> ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 11, 24 pins
>> acpihpet0 at acpi0: 100000000 Hz
>> acpiprt0 at acpi0: bus 0 (PCI0)
>> acpicpu0 at acpi0: C1(@1 halt!)
>> acpicpu1 at acpi0: C1(@1 halt!)
>> "ACPI0006" at acpi0 not configured
>> "PNP0303" at acpi0 not configured
>> "PNP0F13" at acpi0 not configured
>> "PNP0700" at acpi0 not configured
>> "ACPI0007" at acpi0 not configured
>> "ACPI0007" at acpi0 not configured
>> pvbus0 at mainbus0: KVM
>> pci0 at mainbus0 bus 0
>> pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
>> pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
>> pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA,
>> channel 0 wired to compatibility, channel 1 wired to compatibility
>> pciide0: channel 0 disabled (no drives)
>> pciide0: channel 1 disabled (no drives)
>> uhci0 at pci0 dev 1 function 2 "Intel 82371SB USB" rev 0x01: apic 0
>> int 11
>> piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: apic
>
>> 0 int 9
>> iic0 at piixpm0
>> vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
>> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
>> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
>> virtio0 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00
>> vio0 at virtio0: address 00:50:56:00:0e:94
>> virtio0: msix shared
>> virtio1 at pci0 dev 4 function 0 "Qumranet Virtio Network" rev 0x00
>> vio1 at virtio1: address 52:54:00:05:43:8a
>> virtio1: msix shared
>> virtio2 at pci0 dev 5 function 0 "Qumranet Virtio Storage" rev 0x00
>> vioblk0 at virtio2
>> scsibus1 at vioblk0: 2 targets
>> sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, > SCSI3 0/direct
>
>> fixed
>> sd0: 4096MB, 512 bytes/sector, 8388608 sectors
>> virtio2: msix shared
>> virtio3 at pci0 dev 6 function 0 "Qumranet Virtio Memory" rev 0x00
>> viomb0 at virtio3
>> virtio3: apic 0 int 10
>> isa0 at pcib0
>> isadma0 at isa0
>> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
>> fd0 at fdc0 drive 1: density unknown
>> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
>> pckbd0 at pckbc0 (kbd slot)
>> wskbd0 at pckbd0: console keyboard, using wsdisplay0
>> pms0 at pckbc0 (aux slot)
>> wsmouse0 at pms0 mux 0
>> pcppi0 at isa0 port 0x61
>> spkr0 at pcppi0
>> usb0 at uhci0: USB revision 1.0
>> uhub0 at usb0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
>> nvram: invalid checksum
>> vscsi0 at root
>> scsibus2 at vscsi0: 256 targets
>> softraid0 at root
>> scsibus3 at softraid0: 256 targets
>> root on sd0a (51b2d1225ee6d760.a) swap on sd0b dump on sd0b
>>
>> --
>> M.K.
>>

No comments:

Post a Comment