Bug 1174

Summary: 4.1.14-rc1 nsd-control stats_noreset takes 14 seconds to respond
Product: NSD Reporter: Jared Mauch <jared>
Component: NSD CodeAssignee: NSD team <nsd-team>
Status: ASSIGNED ---    
Severity: enhancement CC: wouter
Priority: P5    
Version: 4.1.x   
Hardware: Other   
OS: All   

Description Jared Mauch 2016-12-04 17:15:29 CET
When running 4.1.14-rc1 as built from SVN after a few hours of running and issuing /sbin/nsd-control reconfig via cron every 5 minutes to pick up any configuration changes in the automated config file generation process it seems the nsd process is unhappy with the stats_noreset request.

Relevant logs:

Dec  4 11:08:09 puck nsd[8046]: new control connection from 127.0.0.1
Dec  4 11:08:09 puck nsd[8046]: remote control connection authenticated
Dec  4 11:08:09 puck nsd[8046]: control cmd:  stats_noreset
Dec  4 11:08:20 puck nsd[8046]: zone lechner-rau.de serial 2016110316 is updated to 2016110317.
Dec  4 11:08:20 puck nsd[8046]: remote control stats printed
Dec  4 11:08:30 puck nsd[8046]: new control connection from 127.0.0.1
Dec  4 11:08:30 puck nsd[8046]: remote control connection authenticated
Dec  4 11:08:30 puck nsd[8046]: control cmd:  stats_noreset
Dec  4 11:08:39 puck nsd[4799]: notify for mytfg.de. from 178.15.12.45 serial 2016112806
Dec  4 11:08:45 puck nsd[8046]: remote control stats printed

I have verbosity set to 5 right now, and it doesn't seem to exist after a process restart.

Detailed stats as well:

# time nsd-control stats_noreset
server0.queries=33391048
server1.queries=33256550
num.queries=66647598
time.boot=170246.099484
time.elapsed=170246.099484
size.db.disk=0
size.db.mem=147852613
size.xfrd.mem=49830576
size.config.disk=0
size.config.mem=6740200
num.type.TYPE0=5
num.type.A=35064995
num.type.NS=828687
num.type.MD=0
num.type.MF=0
num.type.CNAME=79871
num.type.SOA=232319
num.type.MB=2
num.type.MG=0
num.type.MR=2
num.type.NULL=0
num.type.WKS=0
num.type.PTR=2207275
num.type.HINFO=51
num.type.MINFO=0
num.type.MX=2732094
num.type.TXT=632252
num.type.RP=2
num.type.AFSDB=596
num.type.X25=52
num.type.ISDN=0
num.type.RT=0
num.type.NSAP=0
num.type.SIG=0
num.type.KEY=2
num.type.PX=0
num.type.AAAA=23499092
num.type.LOC=123
num.type.NXT=195
num.type.TYPE31=277
num.type.SRV=405700
num.type.NAPTR=4584
num.type.KX=0
num.type.CERT=2
num.type.TYPE38=11976
num.type.DNAME=2
num.type.OPT=0
num.type.APL=0
num.type.DS=25141
num.type.SSHFP=468
num.type.IPSECKEY=0
num.type.RRSIG=236
num.type.NSEC=2497
num.type.DNSKEY=671688
num.type.DHCID=0
num.type.NSEC3=2
num.type.NSEC3PARAM=386
num.type.TLSA=4982
num.type.CDS=0
num.type.CDNSKEY=0
num.type.OPENPGPKEY=0
num.type.CSYNC=0
num.type.SPF=121745
num.type.TYPE103=3
num.type.NID=0
num.type.L32=0
num.type.L64=0
num.type.LP=0
num.type.EUI48=0
num.type.EUI64=0
num.type.TYPE251=1895
num.type.TYPE252=567
num.type.TYPE255=108720
num.opcode.QUERY=66626212
num.opcode.NOTIFY=12936
num.class.IN=66637549
num.class.CH=1599
num.rcode.NOERROR=29176993
num.rcode.FORMERR=224
num.rcode.SERVFAIL=333431
num.rcode.NXDOMAIN=5613090
num.rcode.NOTIMP=8212
num.rcode.REFUSED=31421009
num.rcode.YXDOMAIN=0
num.edns=59742684
num.ednserr=62
num.udp=54662106
num.udp6=11890905
num.tcp=73056
num.tcp6=21531
num.answer_wo_aa=363406
num.rxerr=0
num.txerr=8
num.raxfr=10
num.truncated=103704
num.dropped=44
zone.master=70
zone.slave=9101
0.058u 0.006s 0:14.73 0.3%	0+0k 0+0io 0pf+0w
Comment 1 Wouter Wijngaards 2016-12-05 09:17:44 CET
Hi Jared,

This stats things makes NSD perform a couple of forks.  Nothing else really; but if that takes too much time.  Your system is overloaded, forks take a very long time.  Perhaps also if reload takes more memory it is exited as the system runs out of memory?  That could explain the other bug.

Note that in any case, even if that took 14 seconds, NSD kept responding to queries (another process responds to port 53).

Best regards, Wouter