Bug 660 - interface-automatic broken in the presence of asymmetric routing
Status: RESOLVED FIXED
Product: unbound
Classification: Unclassified
Component: server
Version: unspecified
Hardware: x86_64 Linux
Importance: P5 normal
Assigned To: unbound team
 
Reported: 2015-04-02 11:51 CEST by Apollon Oikonomopoulos
Modified: 2015-04-02 12:04 CEST (History)

Attachments
[PATCH] Set ipi_ifindex to zero on outgoing packets (5.05 KB, patch)
2015-04-02 11:51 CEST, Apollon Oikonomopoulos

Description Apollon Oikonomopoulos 2015-04-02 11:51:03 CEST
Created attachment 274
[PATCH] Set ipi_ifindex to zero on outgoing packets

Hi,

We had a case where unbound running on Linux would not respond to some clients
in our network over UDP. We used interface-automatic: yes to ease service IP
failover, and unbound was running on a multi-homed system (a router). It turns
out that this is due to the way unbound forces the output interface using
IP_PKTINFO. In short, it would suffice to set only ipi_spec_dst to pick the
correct source address for the replies, but unbound also passes ipi_ifindex,
which behaves in a surprising way under Linux. A long explanation follows:

To provide single-socket UDP multihoming, unbound uses the ancillary
IP_PKTINFO data received during recvmsg(2) and passes it as-is
(including ipi_ifindex) to sendmsg(2). The in_pktinfo structure contains
the following members:

    struct in_pktinfo {
        unsigned int   ipi_ifindex;  /* Interface index */
        struct in_addr ipi_spec_dst; /* Local address */
        struct in_addr ipi_addr;     /* Header Destination address */
    };
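
For illustration, the receive side could look like the following minimal
sketch. It assumes a Linux/glibc system; enable_pktinfo() and find_pktinfo()
are hypothetical helper names, not functions from unbound. The first asks the
kernel to attach IP_PKTINFO ancillary data to incoming datagrams, the second
copies the in_pktinfo structure out of a msghdr that recvmsg(2) has already
filled in.

```c
#define _GNU_SOURCE           /* make sure glibc exposes struct in_pktinfo */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: request IP_PKTINFO ancillary data on a UDP socket.
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
static int
enable_pktinfo(int fd)
{
    int on = 1;
    return setsockopt(fd, IPPROTO_IP, IP_PKTINFO, &on, sizeof(on));
}

/* Hypothetical helper: find the IP_PKTINFO control message in msg and
 * copy it to *out. Returns 1 if one was found, 0 otherwise. */
static int
find_pktinfo(struct msghdr *msg, struct in_pktinfo *out)
{
    struct cmsghdr *cmsg;

    assert(out != NULL);
    for (cmsg = CMSG_FIRSTHDR(msg); cmsg != NULL;
         cmsg = CMSG_NXTHDR(msg, cmsg)) {
        if (cmsg->cmsg_level == IPPROTO_IP &&
            cmsg->cmsg_type == IP_PKTINFO) {
            /* memcpy instead of a cast avoids alignment and
             * pointer-punning problems */
            memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
            return 1;
        }
    }
    return 0;
}
```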

At recvmsg(2) time ipi_ifindex contains the ifindex of the interface the packet
arrived on, while ipi_spec_dst contains the local interface address the packet
matched. Regarding sendmsg(2), man 7 ip states:

    If IP_PKTINFO is passed to sendmsg(2) and ipi_spec_dst is not
    zero, then it is used as the local source address for the routing table
    lookup and for setting up IP source route options. When ipi_ifindex is
    not zero, the primary local address of the interface specified by
    the index overwrites ipi_spec_dst for the routing table lookup.

So it appears as if passing ipi_ifindex should only affect cases where source
routing is performed. However, the actual Linux kernel implementation
(as of 4.0-rc6) does the following (__ip_route_output_key() in
net/ipv4/route.c):

    if (fib_lookup(net, fl4, &res)) {
            res.fi = NULL;
            res.table = NULL;
            if (fl4->flowi4_oif) {
                    /* Apparently, routing tables are wrong. Assume,
                       that the destination is on link.

                       WHY? DW.
                       Because we are allowed to send to iface
                       even if it has NO routes and NO assigned
                       addresses. When oif is specified, routing
                       tables are looked up with only one purpose:
                       to catch if destination is gatewayed, rather than
                       direct. Moreover, if MSG_DONTROUTE is set,
                       we send packet, ignoring both routing tables
                       and ifaddr state. --ANK

                       We could make it even if oif is unknown,
                       likely IPv6, but we do not.
                     */

                    if (fl4->saddr == 0)
                            fl4->saddr = inet_select_addr(dev_out, 0,
                                                          RT_SCOPE_LINK);
                    res.type = RTN_UNICAST;
                    goto make_route;
            }

In effect, a non-zero ipi_ifindex overrides any routing table decision
and forces the packet out of that very interface. Furthermore, if there
is no routing table entry for the destination via that interface, the
destination is assumed to be on-link and the packet is not routed via a
gateway. Note that no error is returned to userspace; if the destination
does not respond to an ARP request on that very link, the packet is
silently dropped.

So, for unbound this means that reply packets on interface-automatic
sockets will always attempt to leave the system through the same
(physical/logical) interface the query came in on. This is fine for a
single-homed server. On multi-homed systems, however (e.g. a router with
multiple interfaces on a meshed network), there will be cases of
asymmetric routing where the return route to the client goes through a
different interface than the one the query came in on. In those cases
unbound's replies to clients that are not directly connected will be
silently dropped.

Since using the correct source address is all that interface-automatic
is about, we should really pass an ipi_ifindex set to 0 to sendmsg(2)
and let the system's routing tables decide the actual interface that
should be used to send the reply, while still retaining the correct
source address in ipi_spec_dst.
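
A minimal sketch of what this could look like when building the control
message for a reply, assuming Linux/glibc. This is not the attached patch:
set_reply_source() is a hypothetical helper, and the caller is assumed to
supply a control buffer of at least CMSG_SPACE(sizeof(struct in_pktinfo))
bytes plus the in_pktinfo received with the query.

```c
#define _GNU_SOURCE           /* make sure glibc exposes struct in_pktinfo */
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: attach an IP_PKTINFO control message to msg that
 * pins only the source address of the reply. cbuf is caller-provided
 * storage for the control message. */
static void
set_reply_source(struct msghdr *msg, void *cbuf, size_t cbuflen,
                 const struct in_pktinfo *received)
{
    struct in_pktinfo pi;
    struct cmsghdr *cmsg;

    assert(cbuflen >= CMSG_SPACE(sizeof(pi)));
    memset(cbuf, 0, cbuflen);
    msg->msg_control = cbuf;
    msg->msg_controllen = CMSG_SPACE(sizeof(pi));

    cmsg = CMSG_FIRSTHDR(msg);
    cmsg->cmsg_level = IPPROTO_IP;
    cmsg->cmsg_type = IP_PKTINFO;
    cmsg->cmsg_len = CMSG_LEN(sizeof(pi));

    memset(&pi, 0, sizeof(pi));
    /* 0 = do not force an output interface; the routing table decides */
    pi.ipi_ifindex = 0;
    /* keep the local address the query matched as the reply's source */
    pi.ipi_spec_dst = received->ipi_spec_dst;
    memcpy(CMSG_DATA(cmsg), &pi, sizeof(pi));
}
```

The caller would then fill in msg_name and msg_iov as usual and pass the
msghdr to sendmsg(2).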

The attached patch fixes this for Linux. AFAICT it should not affect other platforms, since ipi_spec_dst alone should be enough to ensure replies carry the correct source address.

Regards,
Apollon

P.S.: This has also been filed as Debian bug #781732 (https://bugs.debian.org/781732)
Comment 1 Wouter Wijngaards 2015-04-02 12:04:22 CEST
Hi Apollon,

Thank you for the detailed bug report.  The patch is committed to the code repository.  I did not know that such asymmetric routing cases existed, or that Linux behaved in such a way that the interface number overrides the other information.

I have slightly edited the patch to avoid compiler warnings about punned pointers, but the functionality is identical.

Best regards,
   Wouter