Tuesday, April 29, 2025

Re: unbound EAGAIN sendto errors

talked too soon... it seems it's now using only one thread...

unbound-control stats_noreset | grep num.queries=
thread0.num.queries=0
thread1.num.queries=0
thread2.num.queries=0
thread3.num.queries=32416
total.num.queries=32416

wtf

G
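
To keep an eye on this kind of imbalance, the per-thread counters can be summarized with a few lines of Python. This is only a sketch: it assumes the threadN.num.queries= line format shown above, and the helper names (per_thread_queries, busiest_share) are made up for illustration.

```python
import re


def per_thread_queries(stats_text):
    """Return {thread_index: query_count} parsed from stats output."""
    counts = {}
    for line in stats_text.splitlines():
        # Only per-thread lines match; total.num.queries is skipped.
        m = re.match(r"thread(\d+)\.num\.queries=(\d+)$", line.strip())
        if m:
            counts[int(m.group(1))] = int(m.group(2))
    return counts


def busiest_share(counts):
    """Fraction of all queries handled by the busiest thread (0.0 if none)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return max(counts.values()) / total


# Sample taken from the output quoted in this message.
sample = """thread0.num.queries=0
thread1.num.queries=0
thread2.num.queries=0
thread3.num.queries=32416
total.num.queries=32416"""

counts = per_thread_queries(sample)
print(counts)
print(busiest_share(counts))
```

A share close to 1.0 means a single thread is answering nearly everything, which is the situation shown above; with four evenly loaded threads it would be around 0.25.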

On 29/04/2025 21:24, Kapetanakis Giannis wrote:
> ok, I've managed to use multiple threads without errors by setting
> so-reuseport: yes
>
> Although the manual says that it's on by default it's not. It is only
> enabled on Linux and Dragonfly.
> Does it make sense to report this upstream and enable
> REUSEPORT_DEFAULT on OpenBSD as well?
>
> Other findings just for the record:
>
> net.inet.udp.recvspace / net.inet.udp.sendspace are just default
> values and not max values.
> In unbound these are probably controlled by so-rcvbuf / so-sndbuf, up
> to 2m which is system max.
>
> Thanks for all replies,
>
> G
> ps. It would also help, instead of rdr-to all incoming 53 traffic to
> localhost, to listen on all interfaces
>
> On 29/04/2025 18:11, Otto Moerbeek wrote:
>> On Tue, Apr 29, 2025 at 04:32:01PM +0300, Kapetanakis Giannis wrote:
>>
>>> Can it be related to this: dropped due to no socket?
>> IIRC OpenBSD does not handle sending to a non-blocking socket in the
>> multi-threaded case very well. If multiple threads try to do that
>> simultaneously, some of them will get an EAGAIN. To test this
>> hypothesis, try running with a single thread and see if the errors
>> disappear.
>>
>>     -Otto
>>
>>> # netstat -ss|grep -A7 ^udp
>>> udp:
>>>          819120 datagrams received
>>>          15494 with no checksum
>>>          22621 dropped due to no socket
>>>          170569 broadcast/multicast datagrams dropped due to no socket
>>>          625930 delivered
>>>          6497322 datagrams output
>>>          341480 missed PCB cache
>>>
>>> G
>>>
>>> On 29/04/2025 15:36, Kapetanakis Giannis wrote:
>>>> Hi,
>>>>
>>>> I have a busy router with multiple private networks behind it,
>>>> which is also doing DNS caching.
>>>> vlans and carp are also involved.
>>>>
>>>> I'm getting this almost every second in my logs, from multiple
>>>> different vlans and IPs (had it on 7.6, and have it on 7.7 now
>>>> as well).
>>>>
>>>> Apr 29 15:11:49 unbound: [18412:1] notice: remote address is
>>>> 10.14.0.196 port 44886
>>>> Apr 29 15:11:54 unbound: [18412:1] notice: sendto failed: Resource
>>>> temporarily unavailable
>>>>
>>>> tcpdump on this shows:
>>>> 15:11:49.420280 10.14.0.196.44886 > 10.14.0.1.domain: 13+ [2au] A?
>>>> google.com.(74) (DF) [tos 0xe0]
>>>> 15:11:52.519896 10.14.0.196.44886 > 10.14.0.1.domain: 13+ [2au] A?
>>>> google.com.(74) (DF) [tos 0xe0]
>>>> 15:11:52.520048 10.14.0.1.domain > 10.14.0.196.44886: 13 FormErr-
>>>> 0/0/2(74)
>>>>
>>>> I believe I'm hitting some kind of limit either in the OS or in
>>>> unbound.
>>>>
>>>> What I have and tested so far:
>>>> kern.maxfiles=32768
>>>>
>>>> login.conf:
>>>> unbound:\
>>>>     :openfiles=32768:\
>>>>     :tc=daemon:
>>>>
>>>> unbound.conf:
>>>>     num-threads: 4
>>>>     num-queries-per-thread: 4096
>>>>     outgoing-range: 16384
>>>>     so-rcvbuf: 2m
>>>>     so-sndbuf: 2m
>>>>
>>>> no luck so far.
>>>>
>>>> pf states ~ 30K (hard limit 200K)
>>>> load 1.2 (mostly by pmacctd)
>>>> hw.machine=amd64
>>>> hw.model=Intel(R) Xeon(R) CPU X5660 @ 2.80GHz
>>>> hw.ncpu=6
>>>>
>>>> Any ideas?
>>>>
>>>> Thanks,
>>>>
>>>> G
>>>>
>>>>
>>>>
>>>>
>
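
For reference, the usual application-side pattern for the EAGAIN case Otto describes: when a non-blocking sendto fails with EAGAIN, wait until the socket is writable again and retry. The sketch below is generic Python, not unbound's actual code; send_with_retry and its retries/wait parameters are hypothetical names chosen for illustration.

```python
import select
import socket


def send_with_retry(sock, data, addr, retries=3, wait=0.05):
    """Send one datagram on a non-blocking socket, retrying on EAGAIN.

    Returns the number of bytes sent, or None if every attempt failed
    with EAGAIN/EWOULDBLOCK.
    """
    for _ in range(retries + 1):
        try:
            return sock.sendto(data, addr)
        except BlockingIOError:
            # EAGAIN/EWOULDBLOCK: wait until the socket is writable
            # (or the timeout expires), then try again.
            select.select([], [sock], [], wait)
    return None


# Loopback demo: a receiver on an ephemeral port and a non-blocking sender.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.setblocking(False)

sent = send_with_retry(sender, b"hello", receiver.getsockname())
payload, _ = receiver.recvfrom(64)
print(sent, payload)

sender.close()
receiver.close()
```

Bounding the number of retries matters on a busy resolver: a reply that cannot be sent after a short wait is better dropped (the client will retry) than allowed to stall the thread.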
