Bug 741 - dnstap is non-functional
Status: RESOLVED FIXED
Product: unbound
Classification: Unclassified
Component: server
Version: 1.5.7
Hardware: x86_64 Linux
Importance: P5 normal
Assigned To: unbound team
Reported: 2016-01-27 16:06 CET by Igor Novgorodov
Modified: 2016-01-27 20:06 CET
Description Igor Novgorodov 2016-01-27 16:06:33 CET
Hello, dear developers! :)

I've built Unbound with dnstap support. According to the logs, it should have created a dnstap socket after startup:
[1453906769] unbound[9812:0] notice: opening dnstap socket /tmp/dnstap.sock
[1453906769] unbound[9812:0] notice: dnstap Message/CLIENT_QUERY enabled

But it does not:
# ls /tmp/dnstap.sock

If I create it (and listen on it) manually with socat and then start Unbound, I don't see anything coming through it.

Config:

server:
    verbosity: 1
    num-threads: 4

    username: root
    do-daemonize: yes

    logfile: "unbound.log"

    use-syslog: no

dnstap:
    dnstap-enable: yes
    dnstap-socket-path: "/tmp/dnstap.sock"
    dnstap-log-client-query-messages: yes

Am I doing something wrong? There's so little documentation about this subsystem.

fstrm, protobuf-c and protobuf are the latest from git; I tried earlier versions too, with no difference.

Thanks!
Comment 1 Wouter Wijngaards 2016-01-27 16:08:36 CET
Hi Igor,

Does your file system allow named pipes?  Not all of them do; for example, the default OSX filesystem (on Apple) does not allow the dnstap socket to be created ...  But its memory-backed fs may allow it, I believe.

Best regards, Wouter
Comment 2 Igor Novgorodov 2016-01-27 16:11:54 CET
I have no problems creating pipes by hand (mkfifo) or with socat. Unbound's control socket is also in /tmp and works fine. Filesystem is traditional ext4, so it should not be a problem in any way...
Comment 3 Wouter Wijngaards 2016-01-27 16:17:47 CET
Hi Igor,

From what I can see this should just work ... it looks fine, I mean, unbound looks like it is putting messages in fstrm/protobuf.

So what is happening is that the .sock file does not exist?
This is dnstap/dnstap.c:145:
        fstrm_unix_writer_options_set_socket_path(fuwopt, socket_path);
        fw = fstrm_unix_writer_init(fuwopt, fwopt);
and this seems to succeed for Unbound, so is it something inside fstrm?

Best regards, Wouter
Comment 4 Wouter Wijngaards 2016-01-27 16:19:22 CET
Hi Igor,

Does fstrm delay socket creation until the first query arrives?  I'm not sure whether it does.

Best regards, Wouter
Comment 5 Igor Novgorodov 2016-01-27 16:24:55 CET
It should be created after startup - the log message is printed in the dt_create() function, which creates the socket:

verbose(VERB_OPS, "opening dnstap socket %s", socket_path);
...
fw = fstrm_unix_writer_init(fuwopt, fwopt);
log_assert(fw != NULL);

and since the assert does not trigger, it should execute fine... but somehow it doesn't.

Anyway, I've tried querying the server: it works fine, but there is no dnstap socket.
Maybe Unbound requires some specific fstrm version?
Comment 6 Wouter Wijngaards 2016-01-27 16:27:00 CET
Hi Igor,

I used fstrm 0.2.0 when I tested it.

strace may reveal syscalls (used to create the pipe)?

Best regards, Wouter
Comment 7 Igor Novgorodov 2016-01-27 16:29:58 CET
I've tried the same 0.2.0 and the git version (which are the same, I presume).
I've tried strace already; it does not show anything about .sock or dnstap at all.
Maybe I should try gdb, but I'm not an expert in it :)
Comment 8 Wouter Wijngaards 2016-01-27 16:30:33 CET
Hi Igor,

The author of libfstrm, Robert Edmonds, contributed the dnstap code for Unbound.  Perhaps, if there seems to be no other Unbound-related component here, you could talk to him, since he would know both libfstrm and Unbound's dnstap very well?  Also, no socket being created gives me the impression of a bug in fstrm.

Oh wait, you are not chrooted are you?  Because then the socket is created inside the chroot of course!

Best regards, Wouter
Comment 9 Wouter Wijngaards 2016-01-27 16:31:28 CET
Even more so, chroot is enabled by default, so you need chroot: "" to disable it.

Best regards, Wouter
Comment 10 Robert Edmonds 2016-01-27 17:15:04 CET
Hi, Igor:

The dnstap code in the DNS server (like Unbound) functions as a client rather than a server. It needs a server on the other end of the socket to accept connections. I did it this way instead of the other way around so that a single socket could be used to accept messages from several dnstap senders. This is similar to how a syslog daemon listens for log messages from multiple processes at a well-known AF_UNIX socket like /dev/log.

The server has to handshake with the client so that the client knows it is connected to the right kind of server before it will begin sending frames. You cannot use mkfifo because the connection is not a named pipe, and unless you can send the binary sequences used for the handshake frames I doubt socat will work that well either.

If the AF_UNIX socket is not present, or the connection fails somehow, the connection will automatically be reattempted, no more often than every 5 seconds. (This is the default, but can be changed with fstrm_iothr_options_set_reopen_interval(), though the Unbound dnstap implementation just uses the default and doesn't expose this as a configuration knob.)
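
Since the DNS server is the connecting side, a quick way to see whether anything is actually listening at the socket path is to try connecting to it yourself. The following is only a diagnostic sketch: the function name dnstap_listener_present is made up for illustration, and it assumes the capture server listens on a stream-type AF_UNIX socket. It does not perform the Frame Streams handshake, so it only tells you whether a listener exists, not whether it speaks dnstap.

```python
import socket


def dnstap_listener_present(socket_path, timeout=1.0):
    """Return True if some process accepts connections at socket_path.

    Hypothetical helper: it only tests that a listening AF_UNIX stream
    socket exists; it does not attempt any Frame Streams handshake.
    """
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(socket_path)
    except OSError:
        # Covers a missing path, a stale socket file with no listener,
        # and a path that is not a socket at all (e.g. made by mkfifo).
        return False
    else:
        return True
    finally:
        s.close()
```

If this returns False for /tmp/dnstap.sock, the dnstap client has nothing to connect to and will just keep retrying at the reopen interval.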

In the fstrm distribution is a utility called fstrm_capture that can be used to serve a capture socket (it uses libevent and can multiplex traffic from multiple connected clients). Usage looks like:

"fstrm_capture -t protobuf:dnstap.Dnstap -u /tmp/dnstap.sock -w /tmp/dnstap.out -ddddd"

Run this command before you start Unbound. Otherwise, if you start it after Unbound, you may need to send queries to the Unbound server (to force data into the fstrm queue) and wait for the reopen interval to expire before Unbound attempts to reconnect.

This will listen on /tmp/dnstap.sock for connections of type "protobuf:dnstap.Dnstap" (the type used by dnstap) and write the output to /tmp/dnstap.out. "-ddddd" will produce very high verbosity trace output.
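
If you would rather start the capture from a script (so that it is certain to be running before Unbound starts), something like the sketch below could work. The helper names fstrm_capture_argv and start_capture are made up for illustration; the only flags used are the ones from the command line quoted above, and it assumes fstrm_capture is on your PATH.

```python
import subprocess


def fstrm_capture_argv(socket_path, output_path, debug_level=0):
    """Build the fstrm_capture command line quoted above as an argv list."""
    argv = [
        "fstrm_capture",
        "-t", "protobuf:dnstap.Dnstap",  # content type used by dnstap
        "-u", socket_path,               # AF_UNIX socket to listen on
        "-w", output_path,               # file to write captured frames to
    ]
    if debug_level > 0:
        # "-ddddd" in the example above is five repetitions of "d".
        argv.append("-" + "d" * debug_level)
    return argv


def start_capture(socket_path, output_path, debug_level=0):
    """Start fstrm_capture in the background and return the Popen handle."""
    return subprocess.Popen(fstrm_capture_argv(socket_path, output_path, debug_level))
```

Calling start_capture("/tmp/dnstap.sock", "/tmp/dnstap.out", 5) and only then starting Unbound reproduces the ordering recommended above.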

There is some more info relevant to setting up Unbound and dnstap in a comment on the bug report from when dnstap was originally integrated into Unbound: https://open.nlnetlabs.nl/bugs-script/show_bug.cgi?id=621#c1. (Step #5 is obsolete now, since the dnstap patch is in the mainline source tree.)

There was also a very nice webinar produced last month by Carsten Strotmann and Men and Mice that covers using dnstap with Unbound and a few other servers, available here:

https://www.menandmice.com/resources/educational-resources/webinars/dnstap-webinar/

Hope this helps!
Comment 11 Robert Edmonds 2016-01-27 17:23:38 CET
Also, maybe it would make it more clear what is happening if Unbound logged something like "notice: attempting to connect to dnstap socket ..." instead of "notice: opening dnstap socket ...".
Comment 12 Wouter Wijngaards 2016-01-27 18:13:30 CET
Hi Igor,

Changed, thanks!  Committed that.

Thank you for the assistance Robert!  I was completely flummoxed too.

Best regards, Wouter
Comment 13 Igor Novgorodov 2016-01-27 18:41:51 CET
Oh, thanks for the very detailed explanation, Robert, it makes perfect sense now! I'll try tomorrow to rotate the gun the other way around :)

P.S.
I have an off-topic but related question:
If I want to capture dnstap messages from Unbound with some Python script, it won't be sufficient to use Python's Protocol Buffers module, and I'll need some logic for decoding the fstrm stream first?
Comment 14 Robert Edmonds 2016-01-27 19:34:08 CET
Yes. Protobufs unfortunately have no defined streaming/delimiting protocol, so I had to invent one. That means it's not possible to take the dnstap output and pass it directly into a protobuf decoder; you need to de-frame the stream first.
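
To give a rough idea of what de-framing involves, here is a minimal sketch. It is not taken from libfstrm; it assumes the Frame Streams layout as I understand it from the fstrm documentation: every data frame is a 4-byte big-endian length followed by that many payload bytes, and a length of zero is an escape announcing a control frame, which carries its own 4-byte length. The function names are made up for illustration, and control frames are simply skipped.

```python
import struct


def _read_exact(stream, n):
    """Read exactly n bytes from a binary file-like object or raise EOFError."""
    data = stream.read(n)
    if len(data) != n:
        raise EOFError("truncated Frame Streams input")
    return data


def iter_data_frames(stream):
    """Yield data-frame payloads from stream, skipping control frames.

    Assumed framing: 4-byte big-endian length, then the payload; a zero
    length is an escape that introduces a length-prefixed control frame.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of input
        if len(header) != 4:
            raise EOFError("truncated frame length")
        (length,) = struct.unpack("!I", header)
        if length == 0:
            # Escape sequence: read the control frame length, then skip it.
            (control_length,) = struct.unpack("!I", _read_exact(stream, 4))
            _read_exact(stream, control_length)
            continue
        yield _read_exact(stream, length)
```

Each yielded payload would then be one serialized dnstap.Dnstap protobuf. This works on files opened in binary mode; a socket needs a receive loop instead, because a single recv() may return fewer bytes than requested.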

If you are just trying to decode a dnstap output file (like the fstrm_capture -w ... output) in Python, that's hopefully not too hard. Here is some sample Python code for that:

https://gist.github.com/edmonds/071737d0e9aac530bb9b

You should be able to do something like:

import framestream
# From https://gist.github.com/edmonds/071737d0e9aac530bb9b

import dnstap_pb2
# Need python-protobuf and protobuf compiler output of
# https://raw.githubusercontent.com/dnstap/dnstap.pb/master/dnstap.proto

# Open in binary mode: the capture file is a binary Frame Streams file.
for frame in framestream.reader(open('/tmp/dnstap.out', 'rb')):
    d = dnstap_pb2.Dnstap()
    d.ParseFromString(frame)

    # Now 'd' is a parsed dnstap.Dnstap payload.

    # We might be able to pass the raw bytes in d.message.query_message
    # or d.message.response_message (if existing) to the DNS message parser.

This documentation may also be useful:

http://farsightsec.github.io/fstrm/group__fstrm__control.html

If you are looking at developing a complete dnstap capture server that can do the Frame Streams handshake and de-framing and then directly process the resulting protobufs, the handshaking protocol is not too complex, but I haven't thoroughly documented it. Probably you will have to look at the C code for libfstrm:

https://github.com/farsightsec/fstrm/blob/v0.2.0/fstrm/reader.c#L203-L243

https://github.com/farsightsec/fstrm/blob/v0.2.0/fstrm/writer.c#L143-L221

The source code for fstrm_capture may also be useful. That shows how to build an event-driven capture server that can process data from multiple dnstap clients.
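
As a starting point, the receiving side of the handshake might look roughly like the sketch below. Treat it as an assumption-laden illustration rather than a reference implementation: the control type numbers (ACCEPT=1, START=2, STOP=3, READY=4, FINISH=5), the content-type field number (1) and the READY/ACCEPT/START ... STOP/FINISH sequence are my reading of the libfstrm control frame documentation and should be checked against the C code linked above. All function names are made up, and the sketch handles a single connection with no timeouts or error recovery.

```python
import struct

# Assumed values; verify against fstrm's control.h before relying on them.
CONTROL_ACCEPT, CONTROL_START, CONTROL_STOP = 1, 2, 3
CONTROL_READY, CONTROL_FINISH = 4, 5
FIELD_CONTENT_TYPE = 1
DNSTAP_TYPE = b"protobuf:dnstap.Dnstap"


def recv_exact(conn, n):
    """Receive exactly n bytes from a connected socket."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise EOFError("peer closed the connection")
        buf += chunk
    return buf


def encode_control(ctype, content_type=None):
    """Build a control frame: escape, length, type, optional content type."""
    body = struct.pack("!I", ctype)
    if content_type is not None:
        body += struct.pack("!II", FIELD_CONTENT_TYPE, len(content_type))
        body += content_type
    return struct.pack("!II", 0, len(body)) + body


def read_frame(conn):
    """Return ("data", payload) or ("control", (type, [content types]))."""
    (length,) = struct.unpack("!I", recv_exact(conn, 4))
    if length:
        return "data", recv_exact(conn, length)
    (control_length,) = struct.unpack("!I", recv_exact(conn, 4))
    body = recv_exact(conn, control_length)
    (ctype,) = struct.unpack("!I", body[:4])
    types, pos = [], 4
    while pos + 8 <= len(body):
        ftype, flen = struct.unpack("!II", body[pos:pos + 8])
        if ftype == FIELD_CONTENT_TYPE:
            types.append(body[pos + 8:pos + 8 + flen])
        pos += 8 + flen
    return "control", (ctype, types)


def serve_one(conn, wanted=DNSTAP_TYPE):
    """Do the handshake on one connection, then yield data-frame payloads."""
    kind, value = read_frame(conn)
    if kind != "control" or value[0] != CONTROL_READY or wanted not in value[1]:
        raise ValueError("expected READY offering %r" % wanted)
    conn.sendall(encode_control(CONTROL_ACCEPT, wanted))
    kind, value = read_frame(conn)
    if kind != "control" or value[0] != CONTROL_START:
        raise ValueError("expected START")
    while True:
        kind, value = read_frame(conn)
        if kind == "data":
            yield value  # one serialized dnstap.Dnstap message
        elif value[0] == CONTROL_STOP:
            conn.sendall(encode_control(CONTROL_FINISH))
            return
```

To use it, you would bind and listen on an AF_UNIX stream socket at the configured path (for example /tmp/dnstap.sock), accept a connection, and iterate over serve_one(conn), parsing each payload with dnstap_pb2.Dnstap().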

There's also a dnstap mailing list (http://lists.redbarn.org/mailman/listinfo/dnstap) that you are welcome to join for general dnstap development discussion.

Hope this helps!
Comment 15 Igor Novgorodov 2016-01-27 20:06:40 CET
Thank you very, very much!