Wednesday, September 29, 2021

Re: problems with outbound load-balancing (PF sticky-address for destination IPs)

Hi Claudio,

So you probably guessed I am using 'route-to { GW1, GW2, GW3, GW4 } random'
(and was wanting to add 'sticky-address' to this) based on your reply :)

"it will make sure that selected default routes are sticky to source/dest
pairs" - Are you saying that even though multipath routing uses hashing to
select the path (https://www.ietf.org/rfc/rfc2992.txt - "The router first
selects a key by performing a hash (e.g., CRC16) over the packet header
fields that identify a flow."), subsequent new sessions to the same dest IP
with different source ports will still get the same path? I thought a new
session with a new tuple to the same dest IP would get a different hashed
path with multipath?
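
To make the distinction in that question concrete, here is a minimal
sketch of the two hashing choices. It is an illustration only, not the
OpenBSD multipath code: the SHA-256 based hash, the function names and the
example addresses are all assumptions, and GW1..GW4 are just the
placeholder gateway names from the rule above.

```python
import hashlib

# Placeholder gateway names, as in the route-to rule above.
GATEWAYS = ["GW1", "GW2", "GW3", "GW4"]


def _bucket(key: str, n: int) -> int:
    """Map a flow key onto one of n paths (hash choice is an assumption)."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % n


def path_by_pair(src_ip: str, dst_ip: str) -> str:
    """Key on source/destination addresses only: ports do not matter."""
    return GATEWAYS[_bucket(f"{src_ip}|{dst_ip}", len(GATEWAYS))]


def path_by_tuple(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> str:
    """Key on addresses and ports: a new source port may pick a new path."""
    key = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}"
    return GATEWAYS[_bucket(key, len(GATEWAYS))]


if __name__ == "__main__":
    for port in (40001, 40002, 40003):
        print(port,
              path_by_pair("10.0.0.5", "203.0.113.10"),
              path_by_tuple("10.0.0.5", port, "203.0.113.10", 443))
```

If the key only covers the source/dest pair, every new session from one
host to one destination lands on the same path whatever the source port,
which is the behaviour the SSO sites need. With the ports in the key, the
sessions spread over the paths.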

"On rerouting the multipath code reshuffles the selected routes in a way to
minimize the affected sessions." - Are you saying, in the case where one
path goes down, it will migrate all the entries only for that failed path
onto the remaining good paths (like ecmp-fast-reroute)?
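
As an illustration of the behaviour being asked about, one generic
technique with this property is rendezvous (highest random weight)
hashing. The sketch below is an assumption for discussion, not a
description of what the OpenBSD multipath code does; the function names,
the SHA-256 scoring and the example addresses are all made up.

```python
import hashlib


def _score(flow_key: str, gateway: str) -> int:
    """Pseudo-random weight for a (flow, gateway) combination."""
    digest = hashlib.sha256(f"{flow_key}|{gateway}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def pick_gateway(src_ip: str, dst_ip: str, gateways: list) -> str:
    """Choose the gateway with the highest weight for this source/dest pair."""
    flow_key = f"{src_ip}|{dst_ip}"
    return max(gateways, key=lambda gw: _score(flow_key, gw))


def moved_flows(flows: list, before: list, after: list) -> list:
    """Return the flows whose gateway differs between two gateway sets."""
    return [f for f in flows
            if pick_gateway(f[0], f[1], before) != pick_gateway(f[0], f[1], after)]


if __name__ == "__main__":
    all_gw = ["GW1", "GW2", "GW3", "GW4"]
    remaining = ["GW1", "GW2", "GW4"]  # GW3 has gone down
    flows = [(f"10.0.0.{i}", f"203.0.113.{j}")
             for i in range(1, 21) for j in range(1, 21)]
    moved = moved_flows(flows, all_gw, remaining)
    print(f"{len(moved)} of {len(flows)} flows moved")
```

With this scheme, a flow only changes path if its previous winner was the
gateway that disappeared; every other source/dest pair keeps its path, and
no per-flow table is needed.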

Thanks for your time, Andy.

On Wed, Sep 29, 2021 at 5:21 PM Claudio Jeker <cjeker@diehard.n-r-g.com>
wrote:

> On Wed, Sep 29, 2021 at 02:17:59PM +1000, Andrew Lemin wrote:
> > I see this question died on its arse! :)
> >
> > This is still an issue for outbound load-balancing over multiple internet
> > links.
> >
> > PF's 'sticky-address' parameter only works on source IPs (because it was
> > originally designed for use when hosting your own server pools - inbound
> > load balancing).
> > I.e. there is no way to configure 'sticky-address' to consider
> > destination IPs for outbound load balancing, so that all subsequent
> > outbound connections to the same target IP originate from the same
> > internet connection.
> >
> > The reason why this is desirable is that an increasing number of
> > websites use single sign-on mechanisms (quite a few different
> > architectures expose the issue described here). After a user's outbound
> > connection is initially randomly load balanced onto an internet
> > connection, their browser is redirected into opening multiple additional
> > sockets towards the website's load balancers / cloud gateways, which
> > redirect the connections to different internal servers for different
> > parts of the site/page, and the SSO authentication/cookies passed on the
> > additional sockets must originate from the same IP as the original
> > socket. As a result outbound load-balancing does not work for these
> > sites.
> >
> > The ideal functionality would be for 'sticky-address' to consider both
> > source IP and destination IP after initially being load balanced by
> > round-robin or random.
>
> Just use multipath routing, it will make sure that selected default routes
> are sticky to source/dest pairs. You may want the states to be interface
> bound if you need to nat-to on those links.
>
> On rerouting the multipath code reshuffles the selected routes in a way to
> minimize the affected sessions. All this is done without any extra memory
> usage since the hashing function is smart.
>
> --
> :wq Claudio
>
>
> > Thanks again, Andy.
> >
> > On Sat, Apr 3, 2021 at 12:40 PM Andy Lemin <andrew.lemin@gmail.com> wrote:
> >
> > > Hi smart people :)
> > >
> > > The current implementation of 'sticky-address' relates only to a sticky
> > > source IP.
> > > https://www.openbsd.org/faq/pf/pools.html
> > >
> > > This is used for inbound server load balancing, by ensuring that all
> > > socket connections from the same client/user/IP on the internet go to
> > > the same server in your local server pool.
> > >
> > > This works great for ensuring simplified memory management of session
> > > artefacts on the application being hosted (the servers do not have to
> > > synchronise the user's session data, as extra sockets from that user
> > > will always connect to the same local server).
> > >
> > > However, sticky-address does not have an equivalent for sticky
> > > destination IPs. For example, when doing outbound load balancing over
> > > multiple ISP links, every single socket is load balanced randomly. This
> > > causes many websites to break (especially cookie login and
> > > single-sign-on style enterprise services), as the first outbound socket
> > > will originate randomly from one of the local ISP IPs, and the user's
> > > login session/SSO (on the server side) will belong to that first random
> > > IP.
> > >
> > > When the user then browses to or uses another part of that same website
> > > which requires additional sockets, the additional sockets will pass the
> > > SSO credentials from the first socket, but the extra socket connection
> > > will again be randomly load-balanced, and so the remote server will
> > > reject the connection as it is originating from the wrong source IP etc.
> > >
> > > Therefore can I please propose a "sticky-address for destination IPs"
> > > as an analogue to the existing sticky-address for source IPs?
> > >
> > > This is now such a problem that we have to use sticky-address even on
> > > outbound load-balancing connections, which causes internal user1 to
> > > always use the same ISP for _everything_ etc. While this does stop the
> > > breakage, it does not result in evenly distributed balancing of traffic,
> > > as users are locked to one single transit for all their web browsing
> > > for the rest of the day after being randomly balanced once first thing
> > > in the morning, rather than all users balancing over all transits
> > > throughout the day.
> > >
> > > Another pain: using the current source-IP sticky-address for outbound
> > > balancing makes it hard to drain transits for maintenance. For example,
> > > without source sticky-address balancing, you can just remove the transit
> > > from the PF rule, and after some time all traffic will eventually move
> > > over to the other transits, allowing the first to be shut down for
> > > whatever needs. But with the current source-IP sticky-address, that
> > > first transit will take months to drain in a real-world situation.
> > >
> > > Lastly, just as a nice-to-have, how feasible would a deterministic load
> > > balancing algorithm be, so that balancing selection is done based on the
> > > "least utilised" path?
> > >
> > > Thanks for your time and consideration,
> > > Kindest regards Andy
> > >
> > >
> > >
> > > Sent from a teeny tiny keyboard, so please excuse typos.
> > >
>
>
