Wednesday, September 29, 2021

Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

Ahhhhh,

Your diagram makes perfect sense now :) Thank you - So it does not have to
undergo a full rehashing of all links (which breaks _lots_ of sessions when
NAT is involved), but also does not have to explicitly track anything in
memory like you say 👍 So better than full re-hashing and cheaper than
tracking.
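
To convince myself, here is a rough sketch of the comparison (my own illustration, not OpenBSD's code). It assumes a 16-bit key such as a CRC16 output, splits the key space into equal contiguous regions as RFC 2992 describes for hash-threshold, and counts how many keys change nexthop when link 2 of 4 is removed, next to a plain modulo split. The function and link names are made up for the example.

```python
KEYSPACE = 1 << 16  # assumption: 16-bit hash key, e.g. a CRC16 result


def hash_threshold(key, nexthops):
    """Equal contiguous regions of the key space, one per nexthop."""
    return nexthops[key * len(nexthops) // KEYSPACE]


def modulo(key, nexthops):
    """Naive alternative: key modulo the number of nexthops."""
    return nexthops[key % len(nexthops)]


def moved_fraction(select, before, after):
    """Fraction of all keys whose nexthop differs between the two sets."""
    moved = sum(1 for key in range(KEYSPACE)
                if select(key, before) != select(key, after))
    return moved / KEYSPACE


before = ["link1", "link2", "link3", "link4"]
after = ["link1", "link3", "link4"]  # link 2 has died

print(f"hash-threshold: {moved_fraction(hash_threshold, before, after):.3f}")
print(f"modulo:         {moved_fraction(modulo, before, after):.3f}")
```

In this sketch about a third of the keys move with hash-threshold (all of link 2 plus a slice of link 3), against three quarters with modulo, and neither approach stores anything per flow.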

PS; Thank you for confirming; "It therefore routes the same src/dst pair
over the same nexthop as long as there are no changes to the route".
I was getting hung up on the bit in the RFC that says "hash over the packet
header fields that identify a flow", so I was imagining the hashing was
using a lot of entropy including the ports. I guess I should have thought
around that more and read it as "hash over the IP packet header fields that
identify a flow" ;)
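
In other words, something like the sketch below, where the key is built from the source and destination addresses only. This is just my mental model, not the kernel code: zlib's CRC32 stands in for whatever hash is really used, the addresses are documentation examples, and `flow_key` / `pick_nexthop` are names I made up. The ports are accepted only to show that they play no part in the choice.

```python
import ipaddress
import zlib

KEYSPACE = 1 << 32  # zlib.crc32 returns an unsigned 32-bit value


def flow_key(src, dst):
    """Hash over the IP header fields only: source and destination address."""
    packed = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return zlib.crc32(packed)


def pick_nexthop(src, dst, nexthops, sport=None, dport=None):
    """Hash-threshold style pick; sport/dport are deliberately ignored."""
    return nexthops[flow_key(src, dst) * len(nexthops) // KEYSPACE]


gateways = ["GW1", "GW2", "GW3", "GW4"]
src, dst = "192.0.2.10", "198.51.100.7"

# Every new socket to the same destination lands on the same gateway,
# whatever source port it happens to use.
choices = {pick_nexthop(src, dst, gateways, sport=port, dport=443)
           for port in range(49152, 49162)}
print(choices)
```

The set printed at the end holds a single gateway, which is exactly the stickiness the SSO sites need.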

I shall go and experiment :)


On Wed, Sep 29, 2021 at 8:45 PM Claudio Jeker <cjeker@diehard.n-r-g.com>
wrote:

> On Wed, Sep 29, 2021 at 08:07:43PM +1000, Andrew Lemin wrote:
> > Hi Claudio,
> >
> > So you probably guessed I am using 'route-to { GW1, GW2, GW3, GW4 }
> > random' (and was wanting to add 'sticky-address' to this) based on your
> > reply :)
> >
> > "it will make sure that selected default routes are sticky to source/dest
> > pairs" - Are you saying that even though multipath routing uses hashing
> > to select the path (https://www.ietf.org/rfc/rfc2992.txt - "The router
> > first selects a key by performing a hash (e.g., CRC16) over the packet
> > header fields that identify a flow."), subsequent new sessions to the
> > same dest IP with different source ports will still get the same path? I
> > thought a new session with a new tuple to the same dest IP would get a
> > different hashed path with multipath?
>
> OpenBSD multipath routing implements gateway selection by Hash-Threshold
> from RFC 2992. It therefore routes the same src/dst pair over the same
> nexthop as long as there are no changes to the route. If one of your
> links drops then some sessions will move links, but the goal of
> hash-threshold is to minimize the number of affected sessions.
>
> > "On rerouting the multipath code reshuffles the selected routes in a way
> > to minimize the affected sessions." - Are you saying, in the case where
> > one path goes down, it will migrate all the entries only for that failed
> > path onto the remaining good paths (like ecmp-fast-reroute ?)
>
> No, some sessions on good paths may also migrate to other links; this is
> how the hash-threshold algorithm works.
>
> Split with 4 nexthops; now let's assume link 2 dies and stuff gets
> reshuffled:
> +=================+=================+=================+=================+
> |     link 1      |     link 2      |     link 3      |     link 4      |
> +=================+=====+===========+===========+=====+=================+
> |        link 1         |        link 3         |        link 4         |
> +=======================================================================+
> Unaffected sessions for drop
>  ^^^^^^^^^^^^^^^^^                   ^^^^^^^^^^^       ^^^^^^^^^^^^^^^^^
> Affected sessions because of drop
>                    #################             #####
> Using other ways to split the hash into buckets (e.g. a simple modulo)
> causes more change.
>
> Btw. using route-to with 4 gateways will not detect a link failure, so
> when one link dies the 25% of your traffic assigned to it will be dropped.
> This is another advantage of multipath routing.
>
> Cheers
> --
> :wq Claudio
>
> > Thanks for your time, Andy.
> >
> > On Wed, Sep 29, 2021 at 5:21 PM Claudio Jeker <cjeker@diehard.n-r-g.com>
> > wrote:
> >
> > > On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> > > > I see this question died on its arse! :)
> > > >
> > > > This is still an issue for outbound load-balancing over multiple
> > > > internet links.
> > > >
> > > > PF's 'sticky-address' parameter only works on source IPs (because it
> > > > was originally designed for use when hosting your own server pools -
> > > > inbound load balancing).
> > > > I.e. there is no way to configure 'sticky-address' to consider
> > > > destination IPs for outbound load balancing, so that all subsequent
> > > > outbound connections to the same target IP originate from the same
> > > > internet connection.
> > > >
> > > > The reason why this is desirable is because an increasing number of
> > > > websites use single sign on mechanisms (quite a few different
> > > > architectures expose the issue described here). After a user's
> > > > outbound connection is initially randomly load balanced onto an
> > > > internet connection, their browser is redirected into opening
> > > > multiple additional sockets towards the website's load balancers /
> > > > cloud gateways, which redirect the connections to different internal
> > > > servers for different parts of the site/page, and the SSO
> > > > authentication/cookies passed on the additional sockets must
> > > > originate from the same IP as the original socket. As a result
> > > > outbound load-balancing does not work for these sites.
> > > >
> > > > The ideal functionality would be for 'sticky-address' to consider
> > > > both source IP and destination IP after initially being load
> > > > balanced by round-robin or random.
> > >
> > > Just use multipath routing, it will make sure that selected default
> > > routes are sticky to source/dest pairs. You may want the states to be
> > > interface bound if you need to nat-to on those links.
> > >
> > > On rerouting the multipath code reshuffles the selected routes in a
> > > way to minimize the affected sessions. All this is done without any
> > > extra memory usage since the hashing function is smart.
> > >
> > > --
> > > :wq Claudio
> > >
> > >
> > > > Thanks again, Andy.
> > > >
> > > > On Sat, Apr 3, 2021 at 12:40 PM Andy Lemin <andrew.lemin@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi smart people :)
> > > > >
> > > > > The current implementation of 'sticky-address' relates only to a
> > > > > sticky source IP.
> > > > > https://www.openbsd.org/faq/pf/pools.html
> > > > >
> > > > > This is used for inbound server load balancing, by ensuring that
> > > > > all socket connections from the same client/user/IP on the internet
> > > > > go to the same server on your local server pool.
> > > > >
> > > > > This works great for ensuring simplified memory management of
> > > > > session artefacts on the application being hosted (the servers do
> > > > > not have to synchronise the user's session data as extra sockets
> > > > > from that user will always connect to the same local server).
> > > > >
> > > > > However sticky-address does not have an equivalent for sticky
> > > > > destination IPs. For example when doing outbound load balancing
> > > > > over multiple ISP links, every single socket is load balanced
> > > > > randomly. This causes many websites to break (especially cookie
> > > > > login and single-sign-on style enterprise services), as the first
> > > > > outbound socket will originate randomly from one of the local ISP
> > > > > IPs, and the user's login session/SSO (on the server side) will
> > > > > belong to that first random IP.
> > > > >
> > > > > When the user then browses to or uses another part of that same
> > > > > website which requires additional sockets, the additional sockets
> > > > > will pass the SSO credentials from the first socket, but the extra
> > > > > socket connection will again be randomly load-balanced, and so the
> > > > > remote server will reject the connection as it is originating from
> > > > > the wrong source IP etc.
> > > > >
> > > > > Therefore can I please propose a "sticky-address for destination
> > > > > IPs" as an analogue to the existing sticky-address for source IPs?
> > > > >
> > > > > This is now such a problem that we have to use sticky-address even
> > > > > on outbound load-balancing connections, which causes internal user1
> > > > > to always use the same ISP for _everything_ etc. While this does
> > > > > stop the breakage, it does not result in evenly distributed
> > > > > balancing of traffic, as users are locked to one single transit,
> > > > > for all their web browsing for the rest of the day after being
> > > > > randomly balanced once first-thing in the morning, rather than all
> > > > > users balancing over all transits throughout the day.
> > > > >
> > > > > Another pain; using the current source-ip sticky-address for
> > > > > outbound balancing makes it hard to drain transits for
> > > > > maintenance. For example without source sticky-address balancing,
> > > > > you can just remove the transit from the PF rule, and after some
> > > > > time, all traffic will eventually move over to the other transits,
> > > > > allowing the first to be shut down for whatever needs. But with the
> > > > > current source-ip sticky-address, that first transit will take
> > > > > months to drain in real-world situations.
> > > > >
> > > > > Lastly, just as a nice-to-have, how feasible would a deterministic
> > > > > load balancing algorithm be? So that balancing selection is done
> > > > > based on the "least utilised" path?
> > > > >
> > > > > Thanks for your time and consideration,
> > > > > Kindest regards Andy
> > > > >
> > > > >
> > > > >
> > > > > Sent from a teeny tiny keyboard, so please excuse typos.
> > > > >
> > >
> > >
>
>
