fuzzix dot org

So You've Decided to Implement Zeroconf

23 Mar 2023

Introduction

While working on a distributed computing infrastructure project, I found myself wrangling with the problem of service discovery - how might my nodes find each other on the network without requiring a lot of manual configuration? As time was of the essence, I ended up with a custom, hand-rolled REST service which nodes used to advertise their presence and capabilities to a centralised name service.

Now that time is no longer of the essence, I have decided to revisit this system.

Enter Zeroconf

Or, more accurately, DNS-based Service Discovery (DNS-SD) with Multicast DNS (mDNS). A full Zeroconf stack also includes IP allocation, with hosts attempting to claim and defend IPs in the link-local address space. Finding a network which doesn't use DHCP or static address allocation is rare, so having a link-local address is often a sign some configuration has failed.

The combination of DNS-SD and mDNS is known by Apple as Bonjour (they came up with it, so they get to name it). Avahi provides a complete Zeroconf implementation, deployable on a Linux or BSD system which runs D-Bus. On Windows ... well, we'll get to Windows.

DNS-SD and mDNS are deployed widely in a number of contexts, especially in home networking. The canonical example is automatic discovery of a networked printer. Running an mDNS browser on my home network, besides a printer I can see the speaker in my kitchen, a bunch of light bulbs, switches & dimmers, as well as any number of network services - SSH, Kodi RPC, file sharing ... the list goes on (and the sysadmin in me shudders). I have also used Bonjour explicitly in the past for RTP-MIDI.

DNS-SD is often conflated with mDNS because of its use in this home-networking context, but DNS-SD may just as easily be deployed using traditional unicast DNS, allowing for internet-scale service discovery and dynamic allocation.

DNS-SD (RFC 6763) is built upon a number of extant DNS standards. PTR records, most commonly used for reverse-DNS, can enumerate relevant domains, advertise the presence of a directory of services on the domain, and provide this directory of services. Once a desired service has been identified, SRV records provide a list of hosts providing the service, with priority information to assist with failover and capacity planning. A TXT record may provide further basic information about the capabilities of the provided service - you don't want to send an A4 letter to an A0 plotter (this would be a mistake writ large). Finally, the host's IP is found with a simple A or AAAA record lookup.
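To make that chain of lookups concrete, here is a minimal sketch which walks PTR, then SRV and TXT, then A records over a hand-written in-memory table. Nothing here touches the network. The instance name, host name, address and TXT contents are all invented for illustration, as are the helper names lookup and resolve_service.

```python
# Hypothetical records standing in for real DNS answers; every name,
# port and address below is made up for illustration.
RECORDS = {
    # Service type enumeration: "which service types exist here?"
    ("_services._dns-sd._udp.local.", "PTR"): ["_ipp._tcp.local."],
    # Service browsing: "which instances of this type exist?"
    ("_ipp._tcp.local.", "PTR"): ["Office Printer._ipp._tcp.local."],
    # SRV tuples are (priority, weight, port, target host)
    ("Office Printer._ipp._tcp.local.", "SRV"): [(0, 0, 631, "printer.local.")],
    ("Office Printer._ipp._tcp.local.", "TXT"): [{"paper": "A4"}],
    ("printer.local.", "A"): ["192.168.0.50"],
}


def lookup(name, rtype):
    """Return the list of records for (name, rtype), or an empty list."""
    return RECORDS.get((name, rtype), [])


def resolve_service(service_type):
    """Follow PTR -> SRV/TXT -> A for every instance of service_type."""
    results = []
    for instance in lookup(service_type, "PTR"):
        txt = lookup(instance, "TXT")
        # A lower SRV priority value means "try this host first"
        for priority, weight, port, target in sorted(
            lookup(instance, "SRV"), key=lambda record: record[0]
        ):
            for address in lookup(target, "A"):
                results.append(
                    {
                        "instance": instance,
                        "host": target,
                        "address": address,
                        "port": port,
                        "priority": priority,
                        "txt": txt[0] if txt else {},
                    }
                )
    return results


found = resolve_service("_ipp._tcp.local.")
```

A real client would replace lookup with actual DNS or mDNS queries; the shape of the walk stays the same.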

This sounds ideal for my purposes. I want my distributed computing services to be able to advertise their presence, while they browse the network for other components of the system. Dynamically retiring offline nodes and routing requests to new nodes without config changes is also compelling. In the unlikely event the system escapes a single broadcast domain, the DNS-SD component may be upgraded to use unicast DNS.

It may seem tempting now to overload the TXT record with additional service info about specific capabilities and capacity (outside priority provided in SRV records), but once a service is discovered, these specifics should be queried from the service itself.
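The wire format itself nudges you towards restraint. A DNS-SD TXT record is a sequence of key=value strings, each prefixed with a single length byte, so no one entry can exceed 255 bytes. A small sketch of that encoding follows; the function name encode_txt and the sample key are my own inventions for illustration.

```python
def encode_txt(pairs):
    """Encode a dict as DNS-SD TXT record data (length-prefixed key=value strings)."""
    if not pairs:
        # An empty DNS-SD TXT record is represented by a single zero byte
        return b"\x00"
    out = bytearray()
    for key, value in pairs.items():
        entry = f"{key}={value}".encode("utf-8")
        if len(entry) > 255:
            # The length prefix is one byte, so this entry cannot be represented
            raise ValueError(f"TXT entry for {key!r} exceeds 255 bytes")
        out.append(len(entry))
        out += entry
    return bytes(out)


small = encode_txt({"paper": "A4"})
```

Anything which does not fit comfortably in a handful of short entries is a good candidate for querying from the service directly.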

Learning to Implement and Integrate Zeroconf

Such a popular and widespread protocol suite is surely well described in a number of accessible and easy to read books, right?

To my knowledge there is precisely one book - Zero Configuration Networking: The Definitive Guide (2005) by Daniel H Steinberg and Stuart Cheshire. Perhaps there only needs to be one book - this is the definitive guide after all, and Stuart Cheshire is one of the authors of the mDNS and DNS-SD RFCs!

As far as it goes, this book is pretty good, giving a nice high-level overview of the various moving parts of Zeroconf. There is some code, describing an API with 2005 style concurrency primitives. I'm not sure how much this API has changed in the intervening decades, but with the existence of Avahi, and the Windows situation (I'll get to that, I promise), it is not a complete picture.

I thought Andrew S. Tanenbaum and David J. Wetherall's Computer Networks (2010) might provide further context - it has helped in the past. Of Multicast DNS it says:

224.0.0.251 All DNS Servers on a LAN

...I guess.

I think I'll take the Web Server Gateway Interface (WSGI) approach. WSGI provided a simple interface for web framework authors and web server programmers and administrators. This was a roaring success, and the specification has since been ported to a number of programming languages. Sometimes an idiomatic native implementation is the right solution.

How Much mDNS Can You Do with DNS?

There is a reserved TLD for mDNS, '.local'. On an appropriately configured system, you may look up .local A and AAAA records with any resolver library which delegates lookup to the system.
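In Python, for example, that delegation is what socket.getaddrinfo does. This sketch assumes the host has mDNS resolution wired into its system resolver; without that, a .local lookup will simply fail. The helper name resolve_local is hypothetical, and myrouter.local is just the example host used in this post.

```python
import socket


def resolve_local(hostname):
    """Return the sorted set of addresses the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP
    return sorted({info[4][0] for info in infos})


# On a suitably configured system this would be called as:
#   resolve_local("myrouter.local")
# It raises socket.gaierror if the name cannot be resolved.
```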

You may also be able to trick DNS tooling into doing mDNS lookups. Older versions of dig could be coerced into performing mDNS lookups by querying the multicast address explicitly:

$ dig -p 5353 @224.0.0.251 myrouter.local A
...
;; QUESTION SECTION:
;myrouter.local.                        IN      A

;; ANSWER SECTION:
myrouter.local.         10      IN      A       192.168.0.1

;; Query time: 0 msec
...

Though this does not extend to browsing for services:

$ dig -p 5353 @224.0.0.251 _http._tcp.myrouter.local SRV
...
;; connection timed out; no servers could be reached

I think subclassing, or otherwise extending a DNS library may be a good idea in order to take advantage of its response-parsing capabilities. The querying approach diverges from DNS. Instead of requesting a response from a single node, mDNS casts a wide net to capture all capable nodes on a domain, then those which feel themselves responsible for answering the request respond. This may take time, and may require a series of responses to be coalesced, or even a series of requests to be issued.
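As a rough illustration of how the querying side differs, the sketch below builds a DNS question by hand, sends it to the mDNS multicast group, then gathers whatever replies arrive within a time window rather than waiting for one answer. It asks for PTR records on a service type (_http._tcp.local), which is how DNS-SD browsing is phrased. It is not a conforming mDNS querier: it sends a single query from an ephemeral port, does no retransmission or known-answer handling, and leaves reply parsing to a DNS library. The function names and the two second window are my own assumptions.

```python
import socket
import struct
import time

MDNS_GROUP = "224.0.0.251"
MDNS_PORT = 5353
TYPE_PTR = 12
CLASS_IN = 1


def encode_name(name):
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("utf-8")
        out.append(len(raw))
        out += raw
    out.append(0)
    return bytes(out)


def build_query(name, qtype=TYPE_PTR):
    """Build a DNS message containing a single question and no other records."""
    # Header fields: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    question = encode_name(name) + struct.pack("!2H", qtype, CLASS_IN)
    return header + question


def browse(name, window=2.0):
    """Send one query to the mDNS group and collect raw replies until the window closes."""
    replies = []
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(build_query(name), (MDNS_GROUP, MDNS_PORT))
        deadline = time.monotonic() + window
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break
            sock.settimeout(remaining)
            try:
                data, addr = sock.recvfrom(9000)
            except socket.timeout:
                break
            # Keep the sender and the unparsed payload for a DNS library to decode
            replies.append((addr[0], data))
    finally:
        sock.close()
    return replies


# Usage on a network with mDNS responders: browse("_http._tcp.local")
```

The loop in browse is the "coalescing" step in miniature: several responders may each send their own packet, and the caller decides how long it is willing to wait for stragglers.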

The Path Forward

I think the following sections describe a reasonable approach to the implementation of mDNS responder services.

Avahi and mDNSResponder

In the presence of an existing service, every effort should be made to integrate with it. Service advertisements should be propagated to and handled by the native system as much as possible. I don't yet know how I might do that with mDNSResponder (i.e. Apple) systems, though I have a path forward for systems running Avahi.

Those familiar with Avahi may know that it opens its UDP socket with SO_REUSEPORT enabled. This means other services may bind to UDP:5353 and advertise their capabilities. This may be disabled with the disallow-other-stacks config option. The man page for this option states:

disallow-other-stacks= Takes a boolean value ("yes" or "no"). If set to "yes" no other process is allowed to bind to UDP port 5353. This effectively impedes other mDNS stacks from running on the host. Use this as a security measure to make sure that only Avahi is responsible for mDNS traffic. Please note that we do not recommend running multiple mDNS stacks on the same host simultaneously. This hampers reliability and is a waste of resources. However, to not annoy people this option defaults to "no".

That is, other mDNS responders may set themselves up in an ad-hoc fashion and I guess we shouldn't stop them by default. The observation that it "hampers reliability" I think understates things a little.

Advertising services on systems using systemd.dnssd should also be possible using D-Bus.

Rolling your own and SO_REUSEPORT

In the absence of an existing responder, I think the only option is to create one. Reading the above, it might seem that we can use a hand-rolled responder alongside Avahi, but this excellent StackOverflow answer details some of the vagaries of SO_REUSEPORT and SO_REUSEADDR.

In short you may be able to reuse the IP and port combination, if your kernel allows it. More recent Linux versions implement "port hijacking" prevention - "All sockets that want to share the same address and port combination must belong to processes that share the same effective user ID".

Once you have bound to your port, again depending on your kernel, you may not actually receive all mDNS requests to the system - your kernel may decide to distribute requests across services bound to the same port.

Though this behaviour might be different if you bind to a multicast address (or rather join a multicast group) - some sources say this will propagate requests to all listeners. Time to hit the books again...

Advanced Programming in the UNIX Environment Third Edition (2013) by W. Richard Stevens and Stephen A. Rago should tell me everything I need to know about socket options. (I have the Second Edition on my bookshelf - check that out if you want to see a cover that has not aged well).

Its index does not mention multicast at all ... nor does it mention SO_REUSEPORT. The section on SO_REUSEADDR is part of a single page and mostly consists of an example regarding a TCP service. So there's another rabbit hole.

For now, binding a hand-rolled responder to a socket and allowing for reuse seems to be the least-worst approach in the absence of a canonical responder service.
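A minimal sketch of that socket setup, for IPv4 only, might look like the following. It sets SO_REUSEADDR, sets SO_REUSEPORT where the platform defines it, binds to port 5353 and joins the mDNS multicast group. Whether the bind succeeds alongside another responder, and whether every listener then sees every packet, depends on the kernel and on the other stack's settings as discussed above; this code makes no promises either way. The function names are hypothetical.

```python
import socket

MDNS_GROUP = "224.0.0.251"
MDNS_PORT = 5353


def membership_request(group, interface="0.0.0.0"):
    """Pack an ip_mreq structure: the group address, then the local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)


def open_shared_mdns_socket():
    """Open a UDP socket on the mDNS port with address/port reuse and group membership."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # SO_REUSEPORT is not defined on every platform (Windows, for one)
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind(("", MDNS_PORT))
    # 0.0.0.0 lets the kernel choose the interface; a real responder
    # would probably join the group on each interface explicitly
    sock.setsockopt(
        socket.IPPROTO_IP,
        socket.IP_ADD_MEMBERSHIP,
        membership_request(MDNS_GROUP),
    )
    return sock
```

Calling open_shared_mdns_socket() on a host where another stack has claimed the port exclusively should raise OSError, which is as good a signal as any to integrate with that stack instead.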

Windows

I'm still feeling my way around here, but Windows these days appears to ship with DNS-SD support in the Win32 API, though this blog post on mDNS security issues appears to suggest that services are responsible for implementing their own listeners. Either way, some experimentation is called for.

The path forward for now appears to be to use the existing DNS-SD infrastructure where it exists (i.e. Windows 10+), falling back to the hand-rolled responder where infrastructure is not available. Some installations may have Apple Bonjour installed, but I'm going to call that a rarity and not (yet) consider that setup for inclusion.

Wait, you haven't built it yet?!

Nope! Consider this an experiment in feasibility studies. There's been some research, some exploration of the landscape as it exists, and some marginal experimentation (with much more to come). I know this sort of blog post usually appears when one has a shiny new thing to show off, so let's call this "Part One", with "Part Two" to appear in the next week or in 2026.

If I can shake an implementation out of the above without having to burrow down too many more rabbit holes, I'll be happy enough... Though that's not very likely, is it? 😁

Perhaps this post belongs in a notebook instead, but I don't use a notebook.

Whyeeee?!

Why don't I just use an existing zeroconf library? Because it doesn't completely exist in my ecosystem. There are pieces, and there are complete implementations (with Avahi integration) which won't work for me.

It's space year 2023. I don't want to write IO-bound infrastructure which blocks while awaiting input. I could kick my blocking service to another process, but then instead of a cross-platform multicast DNS headache, I have a cross-platform IPC and process management headache. I know which headache I want.

...and perhaps if I bring it into existence, it will help enrich the ecosystem just the tiniest amount.

Why is this needed?

The original distributed computing project focussed on wrapping a distributed object framework in an abstraction. Most concerns were eventually hidden behind a single keyword, apart from the sticky business of network configuration. As mentioned in the intro, an ugly second system was bolted on to maintain the illusion.

While the system should accommodate as much flexibility (or rigidity) as required, I think its default operation should be as simple as plugging in a printer, with clients finding the new service with ease ("PC Load Letter" issues notwithstanding).

I plan to go into some detail on this project once I get the service discovery detail worked out. That is, in the next couple months or in 2027.

Conclusion

In conclusion, DNS-SD is a land of contrasts.

Forgive me if I got some detail wrong here - I tried to keep things high-level as this is exploratory work for a side-project. That is, my usual "I am far from an expert on this" disclaimer applies.