Friday, September 20, 2024

Re: unbound(8) + host(1) + AAAA-only issue

From what I understand, the newer versions of unbound(8) in -current (to be shipped in OpenBSD 7.6) will mask the perceived problem with host(1)?

And the way host(1) now behaves, aborting at the first SERVFAIL, might be intentional due to misbehaving DNS forwarders encountered in the past?

I'm not sure that I would agree with this logic, but I can live with it.
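
For illustration only, the two policies under discussion (give up on the whole name at the first SERVFAIL, or keep going and report what did resolve) could be sketched in Python roughly like this. Everything here is hypothetical: the function names, the default list of query types and the fake resolver are made up for the example and say nothing about how host(1) is actually implemented.

```python
# Hypothetical sketch, NOT host(1)'s real code.
NOERROR, SERVFAIL = 0, 2  # standard DNS RCODE values


def lookup_all(name, resolve, qtypes=("A", "AAAA", "MX"), stop_on_servfail=True):
    """Query each type in turn.

    resolve(name, qtype) must return (rcode, records).
    Returns (failed_types, results_by_type).
    """
    failed, results = [], {}
    for qtype in qtypes:
        rcode, records = resolve(name, qtype)
        if rcode == SERVFAIL:
            failed.append(qtype)
            if stop_on_servfail:
                # Strict policy: treat the whole name as failed.
                return failed, {}
            continue  # Lenient policy: skip this type, try the next one.
        results[qtype] = records
    return failed, results


def fake_resolver(name, qtype):
    # Simulated answers modelled on the symptom in this thread:
    # the A query fails, the AAAA query succeeds.
    if qtype == "A":
        return SERVFAIL, []
    if qtype == "AAAA":
        return NOERROR, ["2001:db8::dead:beaf"]
    return NOERROR, []
```

With the strict policy the AAAA record is never reported, because the A query is tried first and fails. With stop_on_servfail=False the same fake resolver yields the AAAA record and merely notes that A failed.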


Thanks for everyone's help in clarifying what is going on.

Mike


> On 20.09.2024 at 14:26, Mike Fischer <fischer+obsd@lavielle.com> wrote:
>
>
>> On 20.09.2024 at 13:56, Stuart Henderson <stu.lists@spacehopper.org> wrote:
>>
>> On 2024-09-20, Mike Fischer <fischer+obsd@lavielle.com> wrote:
>>>
>>>> On 20.09.2024 at 12:13, Stuart Henderson <stu.lists@spacehopper.org> wrote:
>>>>
>>>> From what you've shown I can only assume the auth servers are broken
>>>> and probably refusing to respond for A (rather than an empty NOERROR
>>>> response).
>>>
>>> I agree, that is probably the root cause.
>>>
>>> So that would cause host(1) to abort looking for other RRsets? Is that not a bug in host(1)?
>>>
>>> Note: I tried looking at the source code of host(1) but I can't figure out how it works.
>>
>> I think it's generally been fairly common to regard a fqdn (or a fqdn
>> + server combination) as failing if any RRset for that fqdn fails with
>> certain errors.
>>
>> Certainly there have been problems in the past where a client has made
>> an AAAA request, the recursive NS has received no response (usually in
>> this case because the site was using one of the common load-balancing
>> auth servers that were broken in this way) and negatively cached this
>> against the fqdn, then a followup A request has failed.
>
> So you are saying, this behaviour of host(1) is intentional?
>
>
> [snip]
>
>>
>>>> If you show the real hostname, maybe someone can figure it out in
>>>> more detail.
>>>
>>> This is an example hostname I created at dynv6.com for the purpose of figuring out this issue:
>>> test.fwml42.v6.rocks
>>>
>>> $ dig +short test.fwml42.v6.rocks aaaa
>>> 2001:db8::dead:beaf
>>> $ host test.fwml42.v6.rocks
>>> Host test.fwml42.v6.rocks not found: 2(SERVFAIL)
>>
>> Well that's interesting.
>>
>> Querying any of the auth servers directly with host or dig, I do get
>> what looks like a sensible response to A queries
>>
>> $ host test.fwml42.v6.rocks. ns1.dynv6.com.
>> Using domain server:
>> Name: ns1.dynv6.com.
>> Address: 95.216.144.82#53
>> Aliases:
>>
>> test.fwml42.v6.rocks has IPv6 address 2001:db8::dead:beaf
>> $ host -t a test.fwml42.v6.rocks. ns1.dynv6.com.
>> Using domain server:
>> Name: ns1.dynv6.com.
>> Address: 95.216.144.82#53
>> Aliases:
>>
>> test.fwml42.v6.rocks has no A record
>>
>> Testing with unbound 1.20.0 or 1.21.0 and there's no problem.
>> From unbound (1.18.0) I get various of these,
>>
>> unbound: [93237:0] error: SERVFAIL <test.fwml42.v6.rocks. NS IN>: exceeded the maximum nameserver nxdomains
>> unbound: [93237:0] error: SERVFAIL <test.fwml42.v6.rocks. A IN>: all servers for this domain failed, at zone v6.rocks. from 2a01:4f9:c010:95b:: nodata answer
>> unbound: [71830:1] error: SERVFAIL <test.fwml42.v6.rocks. NS IN>: all servers for this domain failed, at zone v6.rocks. from 95.216.144.82 nodata answer
>>
>> I see this in changelog for 1.19.0 -
>>
>> Fix #946: Forwarder returns servfail on upstream response noerror no data.
>>
>> - the problem this fixes was introduced in 1.18.0 - this doesn't
>> look from the description like it should be directly relevant (as no
>> forwarder is involved), but it seems quite a similar situation.
>> #946 is https://github.com/NLnetLabs/unbound/issues/946
>
> So the dynv6.com NS and unbound(8) in 7.5 stable (Version 1.18.0) may be involved in triggering this?
>
> That leaves the question of how clients such as host(1) should deal with this situation. But that is already being discussed above.
>
>
> Mike
