Monday, March 22, 2021

Re: aggr+vlan lost packets

Dear List!

We ran some more tests, and I think this is a bug in the Intel em driver (82571EB).

* If I move aggr0 from the em devices to the bnx devices, everything works
(the only change is the trunkport lines, from em to bnx).
* If I swap the Intel network card for another Intel card with the
82571EB chipset, it still does not work.
* If I copy the network interface configuration to another server (clean
OpenBSD 6.8 install) with 6x Intel I210 network cards, everything works.
* If I move the SSD from the working Intel server (I210) to the
PE210 (82571EB), it does not work.
* I also tested OpenBSD 6.7, and the problem exists there too. Before the
reinstall, however, this server ran OpenBSD 6.1, and LACP + 82571EB worked
correctly.

We have many OpenBSD installations (routers, firewalls) and have not run
into this problem before. Where possible, we use Intel network cards.

--
Best regards,
Szél Gábor

WanTax Kft.
------------
tel.: +36 20 3838 171
fax: +36 82 357 585
email:gabor.szel@wantax.hu
web:http://wantax.hu
web:http://halozatom.hu



On 2021-03-22 at 12:06, Szél Gábor wrote:
> Dear List!
>
> We have a very interesting problem!
> We are reinstalling an OpenBSD firewall (6.1 -> 6.8) and connecting
> new servers to the firewall.
> We swapped in a replacement firewall for the duration of the upgrade, so
> this one is not currently in production use.
>
> Test configuration:
> - Dell PE210 II (Firewall) 2x Broadcom BCM5716 (bnx) integrated for
> WANs, 4x Intel 82571EB (em) PCIexp, for LANs
> - Dell PE740 (Proxmox) - 2x 1G, 2x10G (currently 1G links) (node1)
> - Dell PE740 (Proxmox) - 2x 1G, 2x10G (currently 1G links) (node2)
> - Cisco 2960s-48ts-s switch
>
> All devices are connected to the switch with LACP:
> - firewall: 4x em{0|1|2|3} - aggr0
> - nodes: 2x 1G (eno)
> - we configured only one VLAN, VLAN 2, for management (10.110.2.0/24)
> - all devices use a tagged VLAN for management (we also tested untagged,
> native VLAN; no difference)
> - no PF rules, clean OpenBSD install
>
> The configurations are at the end of this message.
>
> IPs:
> - FW - 10.110.2.1 (on the management VLAN interface)
> - switch - 10.110.2.11 (on the management VLAN interface)
> - node 1 - 10.110.2.51 (on vmbr2 if tagged, or vmbr0 if untagged)
> - node 2 - 10.110.2.52 (on vmbr2 if tagged, or vmbr0 if untagged)
>
> Problem:
> - all LACP bundles are up, no problems reported
> - the nodes see each other
> - the nodes see the switch's management VLAN IP address
> - the firewall sees the switch's management VLAN IP address
> but:
> - node 1 sees the firewall IP address
> - node 2 does NOT see the firewall IP address
> - if we change the bond members on the nodes from the 2x 10G interfaces
> to the 2x 1G interfaces, node 2 sees the firewall and node 1 does NOT
> (different MAC address)
> The interesting part:
> - if I start tcpdump on the firewall's vlan2 or aggr0 interface,
> everything works! When I stop tcpdump, it breaks again - what??? :)
> - while node 2 has packet loss, if I start tcpdump on node 2, I see the
> ICMP request and reply packets from/to the firewall!
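
One possible reading of the MAC-address dependence above, offered as a sketch rather than a diagnosis: an aggregated link sends all frames of a given conversation over a single member link, chosen by hashing address fields. If one member port silently dropped received frames, only the peers whose hash selects that port would lose the firewall. The Python below illustrates the idea with a made-up XOR-and-modulo hash; it is not the algorithm the Cisco 2960S uses, and `pick_member` as well as every MAC address in it are hypothetical placeholders.

```python
"""Toy model of per-conversation link selection on an aggregated link."""


def mac_to_int(mac: str) -> int:
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def pick_member(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Made-up hash: XOR both addresses, then reduce modulo the link count."""
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_links


# Placeholder addresses from the RFC 7042 documentation range,
# NOT the real hosts from this thread.
FIREWALL = "00:00:5e:00:53:01"
PEERS = {
    "node 1": "00:00:5e:00:53:10",
    "node 2": "00:00:5e:00:53:13",
}
LINKS = ["em0", "em1", "em2", "em3"]

if __name__ == "__main__":
    for name, mac in PEERS.items():
        link = LINKS[pick_member(mac, FIREWALL, len(LINKS))]
        print(f"{name} -> firewall frames arrive on {link}")
```

In this toy model, changing which NICs form the bond changes the source MAC, so the same node can land on a different member link. That would be consistent with the failure moving from node 2 to node 1 after the bond change, but it does not by itself explain why running tcpdump makes the problem disappear.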
>
> We run a lot of OpenBSD 6.8 firewalls with LACP + VLANs and have not
> experienced this before.
>
> Configurations:
>
> OpenBSD:
>
> */etc/hostname.aggr0*
> trunkport em0
> trunkport em1
> trunkport em2
> trunkport em3
> 172.19.253.1 netmask 255.255.255.255
> description "c1 LACP"
> up
>
> */etc/hostname.vlan2*
> inet 10.110.2.1 255.255.255.0 10.110.2.255 vnetid 2 parent aggr0
> description "Managment"
>
> *sysctl.conf*
> net.inet.ip.forwarding=1       # 1=Permit forwarding (routing) of IPv4 packets
> net.inet.carp.log=3            # log level of carp(4) info, default 2
> machdep.kbdreset=1             # permit console CTRL-ALT-DEL to do a nice halt
> ddb.panic=0                    # do not enter ddb console on kernel panic, reboot if possible
> kern.bufcachepercent=90        # Allow the kernel to use up to 90% of the RAM for cache (default 10%)
> net.inet.ip.forwarding=1       # Permit forwarding (routing) of packets through the firewall
> net.inet.ip.mtudisc=0          # TCP MTU (Maximum Transmission Unit) discovery off since our mss is small enough
> net.inet.tcp.rfc3390=1         # Enable RFC3390 TCP window increasing so larger CWND can take affect
> vm.swapencrypt.enable=1        # encrypt pages that go to swap
> machdep.kbdreset=1             # permit console CTRL-ALT-DEL to do a nice halt
> hw.allowpowerdown=1            # 0=Disable power button shutdown
> hw.smt=1                       # HT
>
> *Cisco 2960S*
>
> interface Port-channel1
>  description FW
>  switchport mode trunk
>  switchport nonegotiate
> !
> interface Port-channel2
>  description n1.pve
>  switchport mode trunk
>  switchport nonegotiate
> !
> interface Port-channel3
>  description n2.pve
>  switchport mode trunk
>  switchport nonegotiate
> !
> interface GigabitEthernet0/1
>  description n1.pve
>  switchport mode trunk
>  switchport nonegotiate
>  spanning-tree portfast trunk
>  channel-group 2 mode active
> !
> interface GigabitEthernet0/2
>  description n1.pve
>  switchport mode trunk
>  switchport nonegotiate
>  spanning-tree portfast trunk
>  channel-group 2 mode active
> !
> interface GigabitEthernet0/3
>  description n2.pve
>  switchport mode trunk
>  switchport nonegotiate
>  channel-group 3 mode active
> !
> interface GigabitEthernet0/4
>  description n2.pve
>  switchport mode trunk
>  switchport nonegotiate
>  channel-group 3 mode active
> !
> interface GigabitEthernet0/45
>  description FW-LACP
>  switchport mode trunk
>  switchport nonegotiate
>  channel-group 1 mode active
> !
> interface GigabitEthernet0/46
>  description FW-LACP
>  switchport mode trunk
>  switchport nonegotiate
>  channel-group 1 mode active
> !
> interface GigabitEthernet0/47
>  description FW-LACP
>  switchport mode trunk
>  switchport nonegotiate
>  channel-group 1 mode active
> !
> interface GigabitEthernet0/48
>  description FW-LACP
>  switchport mode trunk
>  switchport nonegotiate
>  channel-group 1 mode active
