On Wed, Sep 18, 2024 at 11:17:45AM +1000, David Gwynne wrote:
> On Mon, Sep 16, 2024 at 09:57:18PM -0700, Bryan Vyhmeister wrote:
> > On Tue, Sep 17, 2024 at 02:31:09PM +1000, David Gwynne wrote:
> > >
> > > On Mon, Sep 16, 2024 at 12:25:35PM -0700, Bryan Vyhmeister wrote:
> > > > I am attempting to build a proof of concept of how to use vxlan(4)
> > > > on OpenBSD in a fully meshed OSPF network with [wireless] links
> > > > between sites under my full control so mtu is not an issue (mtu 1550
> > > > for vxlan0 and mtu 1600 or higher for hardware interfaces). The goal
> > > > is to bridge a group of VLANs between sites A, B, and C.
> > <snip>
> > >
> > > vxlan(4) in learning mode relies on a single multicast capable
> > > underlay network between all sites/points. if you are using separate
> > > interfaces on A to talk to B and C, then this requirement isn't
> > > satisfied.
> > >
> > > i dont know enough about multicast routing to know if or how i should
> > > support vxlan in learning mode with routes to multiple interfaces.
> >
> > Thanks for your response. That makes sense then if that is how things
> > are underneath. I'm not that familiar with how multicast routing works
> > either but that does appear to be how commercial vendors'
> > implementations work from what I have read.
>
> they rely on routes?

I think it relies on PIM, which I just found out is not supported. Again,
I'm not too familiar with PIM. I could also use a Juniper or some
other switch to do all of the OSPF routing and provide the multicast
routing environment and then just attach OpenBSD routers to run
vxlan(4) only, but I would prefer to do everything in OpenBSD.

> > > > I also tried using a WireGuard overlay on top of this network. With
> > > > wg0 as the parent but that does not seem to work either in vxlan(4)
> > > > learning mode unless I am missing something.
> > >
> > > wireguard as an underlay for vxlan in learning mode doesn't work
> > > because wg isn't multicast capable. the cryptokey routing thing doesnt
> > > support sending a packet destined to a single address (eg, 239.0.0.1)
> > > to multiple peers (ie, B and C).
> >
> > I was testing BGP over tunnels and noticed that ospf6d will not function
> > over wg(4) either.
>
> wg is neither multicast nor point-to-point, and it completely ignored
> existing point to multipoint semantics. so yeah. it feels pretty clumsy
> when you try to do interesting stuff beyond what it was specifically
> created for.

Once I realized wg(4) wouldn't work, my solution was to use a gif(4)
tunnel or etherip(4) bridged with veb(4) to a vport(4) but I think the
gif(4) solution is simpler. Either solution worked fine for ospfd and
ospf6d as well as BGP over IPv4 and IPv6. Is there a performance benefit
with etherip(4) and vport(4) rather than gif(4)?

> openbsd lets you combine vlans and bridges/vebs/tpmr and tunnels in
> pretty arbitrary ways. there's advantages to doing everything in
> software sometimes.

It's quite nice to have so many flexible options.

> etherip(4) is the lowest overhead ethernet over ip tunnel interface, but
> you can only have one etherip tunnel between 2 endpoints. you can add
> vlans on top of etherip, or you can use egre/vxlan/etc with different
> vnetids instead.

I had not tried using VLANs over etherip(4) but that is a good idea and
maybe better than trying to get vxlan(4) to do what I want. My plan is
to feed the site A hardware ethernet interface from a switch with all
traffic being tagged with VLAN tags. At sites B and C (and D, E, etc.),
the hardware ethernet interface would plug right into a switch port that
will be prepared for the tagged traffic as well. I'm essentially
building a network ring and that's where I thought vxlan(4) would work
well. Once I have this set up properly, I don't anticipate needing to
make that many changes to the OpenBSD setup and can just add and remove
VLANs from the managed switches as needed.

> a couple of notes though:
>
> veb (and bridge) are not vlan aware. this means they will not scope the
> mac addresses they learn by vlan ids, and apart from the link0 flag on
> veb they don't let you filter vlans. if you want to control individual
> vlans, create a veb for a specific network and add vlan (or
> egre/vxlan/etc) interfaces to it.

That will not be necessary in this design but I appreciate that
explanation. That makes sense. I won't need to filter at this level but
can leave that to the switch.

> it can be helpful to know the order of processing in the ethernet stack
> for packets rxed on an interface, which is currently best documented
> by comments in ether_input():
>
> * Process a received Ethernet packet.
> *
> * Ethernet input has several "phases" of filtering packets to
> * support virtual/pseudo interfaces before actual layer 3 protocol
> * handling.
> *
> * First phase:
> *
> * The first phase supports drivers that aggregate multiple Ethernet
> * ports into a single logical interface, ie, aggr(4) and trunk(4).
>
> * Second phase: service delimited packet filtering.
> *
> * Let vlan(4) and svlan(4) look at "service delimited"
> * packets. If a virtual interface does not exist to take
> * those packets, they're returned to ether_input() so a
> * bridge can have a go at forwarding them.
>
> * Third phase: bridge processing.
> *
> * Give the packet to a bridge interface, ie, bridge(4),
> * veb(4), or tpmr(4), if it is configured. A bridge
> * may take the packet and forward it to another port, or it
> * may return it here to ether_input() to support local
> * delivery to this port.
>
> * Fourth phase: drop service delimited packets.
> *
> * If the packet has a tag, and a bridge didn't want it,
> * it's not for this port.
>
> * Fifth phase: destination address check.
> *
> * Is the packet specifically addressed to this port?
>
> * Sixth phase: protocol demux.
> *
> * At this point it is known that the packet is destined
> * for layer 3 protocol handling on the local port.

That's very helpful in understanding how this works. Thank you.
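
To check my reading of those phases, here is a rough Python model of the
order described in the comment. It is only a sketch: the Port and Frame
classes, their fields, and the returned strings are made up for
illustration, and the bridge rule is my own simplification (forward
everything not addressed to the port itself), not what the kernel does.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class Port:
    """Hypothetical stand-in for an Ethernet interface (not a kernel struct)."""
    lladdr: str
    aggr: Optional[str] = None                     # aggr(4)/trunk(4) parent, if any
    vlans: Set[int] = field(default_factory=set)   # VLAN IDs that have a vlan(4) child
    bridge: Optional[str] = None                   # bridge(4)/veb(4)/tpmr(4) membership


@dataclass
class Frame:
    dst: str
    vlan: Optional[int] = None                     # None means untagged


def ether_input(port: Port, frame: Frame) -> str:
    """Return which phase disposes of the frame, in the comment's order."""
    # First phase: aggregation drivers take everything from member ports.
    if port.aggr is not None:
        return f"aggr:{port.aggr}"
    # Second phase: a matching vlan(4)/svlan(4) child takes tagged frames.
    if frame.vlan is not None and frame.vlan in port.vlans:
        return f"vlan:{frame.vlan}"
    # Third phase: a bridge may forward the frame. Assumed simplification:
    # it forwards everything except frames addressed to the port itself.
    if port.bridge is not None and frame.dst != port.lladdr:
        return f"bridge:{port.bridge}"
    # Fourth phase: tagged frames that nobody claimed are dropped.
    if frame.vlan is not None:
        return "drop:tagged"
    # Fifth phase: destination address check.
    if frame.dst not in (port.lladdr, BROADCAST):
        return "drop:not-for-us"
    # Sixth phase: hand the frame to layer 3.
    return "layer3"


tunnel = Port(lladdr="fe:e1:ba:d0:00:01", vlans={10}, bridge="veb0")
print(ether_input(tunnel, Frame(dst="00:00:5e:00:53:01", vlan=10)))  # vlan:10
print(ether_input(tunnel, Frame(dst="00:00:5e:00:53:01", vlan=20)))  # bridge:veb0
```

If this is right, tagged frames with no matching vlan(4) interface reach
the veb with their tags intact, which is what I want for carrying the
switch VLANs across etherip(4).
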
I'm still not clear on exactly what protected accomplishes with veb(4).
You mentioned that it prevents loops but I don't understand how.

Essentially, at this point, I think I can have etherip(4) links between
each site, maybe in a close to fully meshed layout, particularly back to
site A, and, as long as I put the etherip(4) interfaces into the veb(4)
as protected, I will not have loops? Is that a correct understanding of
what you said?
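
To make the question concrete, here is a small Python model of what I am
assuming protected does, going by the ifconfig(8) wording that interfaces
in a protected domain cannot forward traffic to any other interface in
that domain. The port names, the domain ID 1, and the flood_targets
helper are all hypothetical; this is not how veb(4) is implemented.

```python
from typing import Dict, FrozenSet, List

# Hypothetical veb(4) at site A: port name -> protected domain IDs.
PORTS: Dict[str, FrozenSet[int]] = {
    "em0": frozenset(),          # local switch uplink, unprotected
    "etherip0": frozenset({1}),  # tunnel to site B
    "etherip1": frozenset({1}),  # tunnel to site C
}


def flood_targets(ports: Dict[str, FrozenSet[int]], rx_port: str) -> List[str]:
    """Ports that a broadcast frame received on rx_port would be copied to."""
    rx_domains = ports[rx_port]
    return sorted(
        name
        for name, domains in ports.items()
        # Never send back out the receiving port, and never send to a port
        # that shares a protected domain with the receiving port.
        if name != rx_port and not (rx_domains & domains)
    )


print(flood_targets(PORTS, "etherip0"))  # ['em0']
print(flood_targets(PORTS, "em0"))       # ['etherip0', 'etherip1']
```

If that model is right, a broadcast arriving from site B goes to the
local switch but is never relayed on to site C, so C has to receive it
directly from B, which is why the tunnels would need to be fully meshed.
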
Bryan