Bug 639

Summary: server.c select() is hit by __FD_SETSIZE when listening on many interfaces
Product: NSD
Reporter: matt <matt.singh>
Component: NSD Code
Assignee: NSD team <nsd-team>
Status: ASSIGNED
Severity: normal
CC: willem
Priority: P5
Version: 4.1.x   
Hardware: x86_64   
OS: Linux   

Description matt 2015-01-07 13:01:33 CET
I have a configuration file that is 600 lines long.  It contains 565 ipv4/ipv6 anycast addresses and 2 unicast addresses that nsd listens on.  I have changed max-ips within the nsd-3.2.18 source to 2048, re-compiled and re-installed.
When adding an include directive to one of the zones like this:

zone:
        name:           "abc"
        zonefile:       "abc"
        include         "/export/dns/master.conf"

Then when querying for the zone using dig, no results are returned.
If I remove the include directive, the same query is successful.
Comment 1 Willem Toorop 2015-01-08 17:13:31 CET
Hi Matt,

The include directive needs a colon, but I assume that this is a typo.

Is the content of /export/dns/master.conf indented?

With includes, processing continues as if the text of the included file had been copied into the config file at that point.  So if the directives in /export/dns/master.conf are not indented, its content is not part of the "abc" zone section.

Regards,
-- Willem
Comment 2 matt 2015-01-27 17:11:18 CET
Yes, that include directive had a typo from typing it into this bug report.  It was typed correctly in the actual configuration.
I have removed the include so I now have flat zones:

key:
        name:           tsigkey
        algorithm:      hmac-md5
        secret:         "xxxxx"

zone:
        name:           "abc"
        zonefile:       "abc"
        request-xfr:    1.2.3.4 tsigkey

zone:
        name:           "xyz"
        zonefile:       "xyz"
        request-xfr:    1.2.3.4 tsigkey

zone:
        name:           "ghj"
        zonefile:       "ghj"
        request-xfr:    1.2.3.4 tsigkey

If I have 510 or more ip-address directives in the server section of the config file, the dns server stops responding to dns queries against those configured zones with a SERVFAIL.
Comment 3 Willem Toorop 2015-01-28 11:42:14 CET
(In reply to matt from comment #2)
> If I have 510 or more ip-address directives in the server section of the
> config file, the dns server stops responding to dns queries against those
> configured zones with a SERVFAIL.

That is an interesting number... maybe it is the size of the fd_set that causes problems.  On what system are you building?  Could you compile and run this little C program on your system to detect the fd_set size?

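#include <stdio.h>
#include <sys/select.h>

int main(void)
{
    /* Print the size of fd_set in bytes; each byte holds 8 descriptor bits. */
    printf("%d\n", (int)sizeof(fd_set));
    return 0;
}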

You might want to have a look at nsd4 configured with the --with-libevent option.
Regards,

-- Willem
Comment 4 matt 2015-01-28 13:42:50 CET
I am building on CentOS release 6.5 (Final) 2.6.32-431.el6.x86_64
The output of the program is 128.
Comment 5 Willem Toorop 2015-01-28 14:00:21 CET
(In reply to matt from comment #4)
> I am building on CentOS release 6.5 (Final) 2.6.32-431.el6.x86_64
> The output of the program is 128.

That would fit 1024 file descriptors in an fd_set (provided their values are below 1024, which is likely), which would be enough.  It is also not immediately clear to me why that would result in SERVFAIL responses.  That would mean that communication (i.e. select) at least was working.  Sorry, thinking out loud here...
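
The arithmetic behind the 1024 figure can be sketched as follows. This is a minimal illustration, not nsd code; fd_set_capacity is a hypothetical helper name, and it assumes fd_set is a plain bitmap with one bit per descriptor (as it is on Linux).

```c
#include <limits.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: number of descriptor slots in an fd_set of the
 * given size in bytes, assuming one bit per descriptor. */
static size_t fd_set_capacity(size_t fd_set_bytes)
{
    return fd_set_bytes * CHAR_BIT;
}
```

Called as fd_set_capacity(128), this gives 1024, which matches the reported sizeof(fd_set) of 128 bytes.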

I guess I have to try to reproduce your problem myself.

So the number of zones does not matter?
Only the number of ip-address directives, correct?

Also, in your first comment you said: "I have changed  max-ips within the nsd-3.2.18 source to 2048,"
Does that mean you have run configure with --with-max-ips=2048 as argument?

Thanks for the feedback,
-- Willem
Comment 6 matt 2015-01-28 14:27:42 CET
So the number of zones does not matter?
- Yes the number of zones does not matter

Only the number of ip-address directives, correct?
- Yes

Does that mean you have run configure with --with-max-ips=2048 as argument?
- Yes

Thanks
Comment 7 Willem Toorop 2015-01-28 16:00:18 CET
Hi Matt,

As Wouter just pointed out to me, for each ip-address two sockets are opened, one for UDP and one for TCP, so we probably do have the issue with the limited size of fd_set and the fact that nsd3 uses select.
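
A rough sketch of that descriptor arithmetic is given here. The helper names are hypothetical and not taken from nsd, and descriptors used for stdio, logging and internal communication are deliberately ignored, so the real margin is smaller than this estimate.

```c
/* Hypothetical helper: one UDP and one TCP socket per ip-address. */
static int listening_sockets(int ip_addresses)
{
    return 2 * ip_addresses;
}

/* Hypothetical helper: descriptor numbers still below the select() limit
 * once the listening sockets are open. A negative result means the
 * listening sockets alone already exceed the limit. */
static int fd_numbers_left(int ip_addresses, int fd_setsize)
{
    return fd_setsize - listening_sockets(ip_addresses);
}
```

With 510 addresses and a limit of 1024 this leaves only 4 descriptor numbers for everything else, which is consistent with the threshold reported in comment #2.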

Of course nsd3 should in such cases just exit with a descriptive error text instead of returning SERVFAIL (which I still don't understand), but that wouldn't solve it for you.

Again, I strongly recommend you to try to use nsd4 configured with the --with-libevent option.

Regards,

-- Willem
Comment 8 matt 2015-02-17 15:12:23 CET
Have upgraded to nsd 4.1.1 and am now seeing the same error in the already reported bug

"Bug 615 - NSD fails to write slave zone files".
Zones are successfully xfr'd from master server and served out of the NSD process (into the xfrdir) on the slave server, but zone files are never written to disk.

Verbosity level 2

[2015-02-17 14:08:25.741] nsd[23530]: notice: nsd started (NSD 4.1.0), pid 23529
[2015-02-17 14:08:26.359] nsd[23529]: info: xfrd: zone abc committed "received update to serial 5124 at 2015-02-17T14:08:26 from 10.10.10.10 TSIG verified with key xxxxxx"

No errors reported, but the xfr's are all building up in /tmp/nsd-xfr-23529 in binary form.  xfr.0, xfr.1, xfr.2 etc etc
Comment 9 Willem Toorop 2015-02-17 16:07:17 CET
Hi Matt,

Does the nsd process actually serve the zones for which it is secondary?
Can you query for the names?

That the xfr.* are not deleted (and maybe even not processed) is problematic.
Maybe the reload process crashed?  Do you have a core dump we can inspect?  Or more info from the log files?

Regards,
-- Willem
Comment 10 matt 2015-02-17 17:42:20 CET
I have compiled nsd 4.1.1 with max-ips of 2048.

The only queries it responds to are "CH TXT id.server" and "version.bind" when zone transfers have not occurred.

It does not serve the zones for which it is secondary, but I managed to get zone transfers to work by going in a bit of a circle.

If I have 504 or more ip-address statements in the nsd conf file, zone transfers are not written to disk.  If I manually copy the zone file over into /var/lib/nsd, restart nsd, it will serve requests for the zone.
If I have 503 or fewer ip-address statements, zone transfers are written to disk and the zone is served correctly.

Seems to be another limit that I'm overlooking?
Comment 11 matt 2015-02-17 18:26:03 CET
Found out some more strange behaviour;

If I have a server count:8
 then I can have 503 ip-address statements

If I have a server count:1
 then I can have 506 ip-address statements

When zone transfers are working
[2015-02-17 16:36:04.921] nsd[19771]: warning: /var/lib/nsd/nsd.db: not cleanly closed 0
[2015-02-17 16:36:04.921] nsd[19771]: warning: can not use /var/lib/nsd/nsd.db, will create anew
[2015-02-17 16:36:04.991] nsd[19771]: info: zone abc read with no errors
[2015-02-17 16:36:04.991] nsd[19771]: info: rehash of zone abc. with parameters 1 0 1 -
[2015-02-17 16:36:04.991] nsd[19771]: info: zone abc written to db
[2015-02-17 16:36:05.007] nsd[19771]: info: zone xyz read with no errors
[2015-02-17 16:36:05.007] nsd[19771]: info: rehash of zone xyz. with parameters 1 0 1 -
[2015-02-17 16:36:05.020] nsd[19771]: info: zone xyz written to db

Directory listing
-rw-------   1 nsd  nsd  602013696 Feb 17 16:33 nsd.db

When zone transfers are NOT working

[2015-02-17 16:30:33.457] nsd[19287]: info: zonefile abc does not exist
[2015-02-17 16:30:39.868] nsd[19286]: info: xfrd: zone abc committed "received update to serial 3000105505 at 2015-02-17T16:30:39 from 10.10.10.10 TSIG verified with key xxxxxx"

Directory listing
-rw-r--r--   1 nsd  nsd  28019 Feb 17 16:31 ixfr.state
-rw-------   1 nsd  nsd  18432 Feb 17 16:31 nsd.db

There are no core dumps.
Comment 12 matt 2015-04-08 17:33:01 CEST
I have got around the problem by increasing the __FD_SETSIZE from 1024 to 2048 and recompiled nsd4 in this environment.

I can now have 562 ip-address statements and zone transfers are written to disk like normal.

#####
#include <sys/types.h>
#undef __FD_SETSIZE
#define __FD_SETSIZE 2048
#include <sys/select.h>
#####

But I wonder at what cost.  I hope there will be no memory leaks or further issues, so I am going to try it on a test system.
Comment 13 Willem Toorop 2015-04-08 21:04:11 CEST
(In reply to matt from comment #12)
> I have got around the problem by increasing the __FD_SETSIZE from 1024 to
> 2048 and recompiled nsd4 in this environment.
> 
> I can now have 562 ip-address statements and zone transfers are written to
> disk like normal.
> 
> #####
> #include <sys/types.h>
> #undef __FD_SETSIZE
> #define __FD_SETSIZE 2048
> #include <sys/select.h>
> #####
> 
> But I wonder at what cost.  I hope there will be no memory leaks or further
> issues, so I am going to try it on a test system.

Hi Matt,

This is interesting.

You say you have configured nsd4 with the --with-libevent option, but that did not resolve the problems you have when listening on more than 500 ip addresses?

And increasing the __FD_SETSIZE does, even though you had already built against libevent (which should have dealt with the problem in the first place)?

Increasing that constant feels quite dangerous...

-- Willem
Comment 14 matt 2015-04-09 12:26:07 CEST
I can confirm I have compiled nsd with the  --with-libevent option.

ldd /usr/sbin/nsd
        linux-vdso.so.1 =>  (0x00007fff045ff000)
        libssl.so.10 => /usr/lib64/libssl.so.10 (0x0000003587a00000)
        libcrypto.so.10 => /usr/lib64/libcrypto.so.10 (0x0000003586600000)
        libevent-1.4.so.2 => /usr/lib64/libevent-1.4.so.2 (0x0000003a49200000)
        librt.so.1 => /lib64/librt.so.1 (0x0000003aff400000)
        libc.so.6 => /lib64/libc.so.6 (0x000000357fe00000)
        libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000003586e00000)
        libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000003583a00000)
        libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000003582e00000)
        libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000003584e00000)
        libdl.so.2 => /lib64/libdl.so.2 (0x0000003580600000)
        libz.so.1 => /lib64/libz.so.1 (0x0000003580a00000)
        libnsl.so.1 => /lib64/libnsl.so.1 (0x0000003583600000)
        libresolv.so.2 => /lib64/libresolv.so.2 (0x0000003581a00000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003580200000)
        /lib64/ld-linux-x86-64.so.2 (0x000000357fa00000)
        libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000003584a00000)
        libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000003585200000)
        libselinux.so.1 => /lib64/libselinux.so.1 (0x0000003581600000)

The problem was that upon upgrading to nsd4 (with a server count:4), it was listening on 504 or more ip addresses and responding to dns queries ok.  But neither full zone transfers nor zone updates were being written to disk unless I decreased to 503 or fewer ip addresses.
The XFRs would be building up in /tmp/nsd-xfr-23529 in binary form.  xfr.0, xfr.1, xfr.2 etc etc and stay there.
If I had a lesser server count like "server count:1" then I could have 506 ip addresses with zone transfers working.

Increasing __FD_SETSIZE from 1024 to 2048 and recompiling nsd4 with libevent means that the zone transfers/updates are now being written to disk and /tmp/nsd-xfr-23529 is empty.  I have 562 ip-address statements in total.
What people are generally saying is that poll() is a simple drop-in replacement for select() and isn't limited by the 1024 of FD_SETSIZE.
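
As an illustration of that point, a minimal sketch of waiting on a single descriptor with poll() instead of select() is shown here. wait_readable is a hypothetical helper name, not an nsd function. poll() takes the descriptor number in a struct pollfd rather than as a bit index, so the number itself is not bounded by FD_SETSIZE.

```c
#include <poll.h>

/* Hypothetical helper: wait until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, and -1 if poll() failed or only
 * an error/hangup condition was reported for fd. */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc <= 0)
        return rc;              /* 0: timeout, -1: poll() error */
    return (pfd.revents & POLLIN) ? 1 : -1;
}
```

Whether this is a drop-in change for a given select() call site depends on how that code builds and reuses its descriptor sets.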
Comment 15 Willem Toorop 2015-04-09 14:01:14 CEST
Hi Matt,

Thank you for your feedback!

Today I learned (from Wouter) that redefining and increasing the __FD_SETSIZE up to a certain limit is an actual feature with some operating systems (including Linux and Windows).  So what you are doing is safe.

The problem is that nsd still uses select for internal communication purposes.  It doesn't need many file descriptors, but that doesn't matter: the __FD_SETSIZE is a restriction on the actual value of the file descriptor, not on how many are watched.  Because all the lower-numbered file descriptors are in use for listening on the interfaces (UDP and TCP), new file descriptors for internal communication get higher numbers and are no longer usable with select.
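
A minimal sketch of a guard against this is shown here. checked_fd_set is a hypothetical helper, not part of nsd. It refuses descriptors whose value select() cannot represent, so the caller can report a descriptive error instead of passing an out-of-range value to FD_SET.

```c
#include <sys/select.h>

/* Hypothetical helper: add fd to set only if select() can represent it.
 * Returns 0 on success, -1 if fd is negative or not below FD_SETSIZE. */
static int checked_fd_set(int fd, fd_set *set)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;
    FD_SET(fd, set);
    return 0;
}
```

The check is on the descriptor's value, so it fails for a single high-numbered descriptor even when the set is otherwise empty.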

We are still considering how to address this issue.
But already many thanks for reporting and bringing this to our attention!
I will keep this ticket open (but with changed attributes and subject) until we've addressed the issue. OK?

Thanks again!

-- Willem