Monday, April 30, 2018

Re: Troubleshooting rl instability on OpenBSD 6.1

On 04/30/18 18:04, Stuart Longland wrote:
> On 01/05/18 03:00, Solene Rapenne wrote:
>>
>> Stuart Longland writes:
>>
>>> On 29/04/18 18:08, Solene Rapenne wrote:
>>>>
>>>> Stuart Longland writes:
>>>>
>>>>> Hi all,
>>>>>
>>>>> I've got an Advantech UNO-1150G industrial PC running OpenBSD 6.1 acting
>>>>> as an ADSL router, public NTP server and DNS server. dmesg info:
>>>>>
>>>>>> OpenBSD 6.1 (GENERIC) #291: Sat Apr 1 13:49:08 MDT 2017
>>>>
>>>> OpenBSD 6.1 isn't supported anymore, please upgrade.
>>>>
>>>
>>> Upgrade what? The OS, the router? If I'm 100% certain that moving to
>>> 6.2/6.3 will fix rl, then sure, but this answer is not helpful, as I've
>>> been battling this problem for over a month.
>>
>> Maybe your issue is fixed in 6.2 or 6.3, who knows. 6.1 isn't supported
>> anymore and you use it on a router connecting to the Internet. I can
>> only recommend upgrading.
>>
>
> It might conversely also be made worse by 6.2 or 6.3. In theory, it
> shouldn't, but then again, in theory, I shouldn't have been getting this
> problem either.
>
> An update of the OS will have to wait until I can purchase another CF
> card to load with OpenBSD 6.3 and migrate the configuration.
>
> Alternatively, if the problem is hardware, I can just replace the whole
> box. Updating OpenBSD on the existing one would be a waste of time.
>
> I need a way of ruling out the hardware as being an issue. Until then,
> OpenBSD 6.1 stays, unless the debugging facilities in 6.2/6.3 differ
> drastically enough to make troubleshooting this problem easier.
>
> I think I've tracked down the driver source here:
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/dev/ic/rtl81x9.c
> The log suggests it has not changed since the release of OpenBSD 6.1.
>

Here's the thing. There are rules to the game with every OS. With
OpenBSD, you have to stay up to date -- the support tail is only
about a year long, and that really only covers security issues.

So, what are you after? A magic, secret sysctl, "sysctl
rl.work.properly=1"? Nope, no such thing. Sorry. A patch to fix it?
Not going to happen against 6.1, 6.2, or even 6.3, most likely. -current
is where development happens; only security issues and maybe some
behavior regressions are ever pushed back to old releases...not
operational improvements, new features, or new hw support.

Now, rl chips were considered the worst pieces of network junk around
until the ARM systems started sprouting networking chips. Don't get me
wrong, I've used a lot of them, and had pretty good luck with them, but
a lot of people I respect and who know better than me hate the #$%^ things.

You say a couple of things that catch my eye -- 1) 6.1 is over a year
old, and you say you have been battling the problem for a month. So
something changed. That hints at hw, not sw (typically; or the load
changed, or something). 2) you say you had "similar" problems with
another OS. Similar to what, I'm not sure, but that sounds like you
have a HW problem. Keep in mind, when it comes to networks, it's not
just the computer -- the wire and the switch are also all suspect.

But it boils down to this: if you want help on OpenBSD, you play by the
rules and run either -current or at least a supported release (and if
you contend it's an OS issue, you verify it still exists in -current!).
If you don't need OpenBSD help...this isn't the place. And if you can
say with certainty, "everything is the same", you will have no trouble
adding debugging info and figuring out your own problem.

Nick.
