Non-blocking (async) DNS resolving in Java

GreyCat · Aug 14, 2012 · Viewed 9.6k times

Is there a clean way to resolve a DNS query (get an IP address by hostname) in Java asynchronously, in a non-blocking way (i.e. as a state machine, not 1 query = 1 thread)? I'd like to run tens of thousands of queries simultaneously, but not tens of thousands of threads.

What I've found so far:

  • The standard InetAddress.getByName() implementation is blocking, and the standard Java libraries appear to lack any non-blocking implementation.
  • The "Resolving DNS in bulk" question discusses a similar problem, but the only solution offered there is a multi-threaded approach (i.e. each thread works on only 1 query at any given moment), which is not really scalable.
  • The dnsjava library is also blocking-only.
  • There are ancient non-blocking extensions to dnsjava dating from 2006; they lack modern Java concurrency idioms such as Future and, alas, offer only a very limited queue-based implementation.
  • The dnsjnio project is also an extension to dnsjava, but it too uses a threaded model (i.e. 1 query = 1 thread).
  • asyncorg seems to be the best available solution I've found so far targeting this issue, but:
    • it's also from 2007 and looks abandoned
    • lacks almost any documentation/javadoc
    • uses lots of non-standard techniques such as Fun class
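
For reference, the thread-per-query approach that the bullets above rule out might look like the following minimal sketch. It assumes Java 8 or later for CompletableFuture (which postdates this question); the class name PooledResolver and the method resolve are hypothetical. Each lookup still occupies one pool thread for its whole duration, so throughput is capped by the pool size.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;

// Hypothetical helper: wraps the blocking lookup in a Future-style API.
class PooledResolver {
    /** Runs InetAddress.getByName() on the given pool and returns a future for the result. */
    static CompletableFuture<InetAddress> resolve(String host, ExecutorService pool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                // Blocks this pool thread until the resolver answers or fails.
                return InetAddress.getByName(host);
            } catch (UnknownHostException e) {
                // Surface the failure through the future instead of the lambda.
                throw new CompletionException(e);
            }
        }, pool);
    }
}
```

With a pool from Executors.newFixedThreadPool(n), at most n queries are in flight at once, which is exactly the scaling limit described above.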

Any other ideas/implementations I've missed?

Clarification. I have a fairly large amount of logs (several TB per day). Every log line contains a host name that can come from pretty much anywhere on the internet, and I need the IP address for that host name for my further statistics calculations. The order of lines doesn't really matter, so my idea is to start 2 threads. The first iterates over the lines:

  • Read a line, parse it, get the host name
  • Send a query to DNS server to resolve a given host name, don't block for answer
  • Store the line and DNS query socket handle in some buffer in memory
  • Go to the next line
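
The sending steps above could be sketched with plain java.nio as follows. This is only an illustration under stated assumptions, not a complete resolver: it uses a single non-blocking DatagramChannel and the 16-bit DNS transaction ID to identify queries instead of one socket handle per query, and the names DnsQuerySender, buildQuery, submit and pending are hypothetical. The query layout follows the RFC 1035 wire format; label length limits and non-ASCII names are not handled.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sender side; submit() is meant to be called from one thread only.
class DnsQuerySender {
    // In-flight log lines keyed by transaction ID (16 bits, so at most 65536 at once).
    final Map<Integer, String> pending = new ConcurrentHashMap<>();
    private int nextId = 0;

    /** Encodes a recursive query for the A record of host. */
    static byte[] buildQuery(int id, String host) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(id >> 8); out.write(id);          // transaction ID
        out.write(0x01); out.write(0x00);           // flags: recursion desired
        out.write(0); out.write(1);                 // QDCOUNT = 1
        for (int i = 0; i < 6; i++) out.write(0);   // ANCOUNT, NSCOUNT, ARCOUNT = 0
        for (String label : host.split("\\.")) {
            byte[] b = label.getBytes(StandardCharsets.US_ASCII);
            out.write(b.length);                    // length-prefixed label
            out.write(b, 0, b.length);
        }
        out.write(0);                               // root label ends the name
        out.write(0); out.write(1);                 // QTYPE = A
        out.write(0); out.write(1);                 // QCLASS = IN
        return out.toByteArray();
    }

    /**
     * Sends one query without waiting for the answer. Returns false if no
     * transaction ID is free or the socket send buffer is full.
     */
    boolean submit(DatagramChannel channel, InetSocketAddress resolver,
                   String host, String line) throws IOException {
        int id = nextId;
        nextId = (nextId + 1) & 0xFFFF;
        if (pending.putIfAbsent(id, line) != null) {
            return false;                           // this ID is still in flight
        }
        int sent = channel.send(ByteBuffer.wrap(buildQuery(id, host)), resolver);
        if (sent == 0) {                            // non-blocking send had no room
            pending.remove(id);
            return false;
        }
        return true;
    }
}
```

A caller would open the channel with DatagramChannel.open(), call configureBlocking(false) on it, and back off or retry when submit returns false. Sequential IDs are predictable, so anything facing untrusted networks would want randomized IDs instead.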

And a second thread that will:

  • Wait for DNS server to answer any query (using epoll / kqueue like technique)
  • Read the answer, find which line it was for in a buffer
  • Write line with resolved IP to the output
  • Proceed to waiting for the next answer
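
The receiving steps above could be sketched with a java.nio Selector, which sits on top of epoll or kqueue where the platform provides them. This is a minimal sketch under assumptions: answers are matched to lines by DNS transaction ID through a hypothetical pending map, only the first A record is extracted, and the names DnsAnswerReader, firstA, skipName and readLoop are invented for illustration. Malformed packets, truncated responses, timeouts and retries are not handled.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.DatagramChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.util.Map;

// Hypothetical receiver side: one thread, one socket, many outstanding queries.
class DnsAnswerReader {
    /** Returns the transaction ID from the DNS header. */
    static int transactionId(ByteBuffer msg) {
        return msg.getShort(0) & 0xFFFF;
    }

    /** Returns the first A record as dotted-quad text, or null if there is none. */
    static String firstA(ByteBuffer msg) {
        int qdCount = msg.getShort(4) & 0xFFFF;
        int anCount = msg.getShort(6) & 0xFFFF;
        int pos = 12;                                    // fixed header size
        for (int i = 0; i < qdCount; i++) {
            pos = skipName(msg, pos) + 4;                // QTYPE + QCLASS
        }
        for (int i = 0; i < anCount; i++) {
            pos = skipName(msg, pos);
            int type = msg.getShort(pos) & 0xFFFF;
            int rdLength = msg.getShort(pos + 8) & 0xFFFF;
            pos += 10;                                   // TYPE, CLASS, TTL, RDLENGTH
            if (type == 1 && rdLength == 4) {            // A record holding an IPv4 address
                return (msg.get(pos) & 0xFF) + "." + (msg.get(pos + 1) & 0xFF) + "."
                        + (msg.get(pos + 2) & 0xFF) + "." + (msg.get(pos + 3) & 0xFF);
            }
            pos += rdLength;                             // skip CNAME and other types
        }
        return null;
    }

    /** Skips a name: labels ended by a zero byte or by a 2-byte compression pointer. */
    static int skipName(ByteBuffer msg, int pos) {
        while (true) {
            int len = msg.get(pos) & 0xFF;
            if (len == 0) return pos + 1;
            if ((len & 0xC0) == 0xC0) return pos + 2;
            pos += len + 1;
        }
    }

    /** Waits for answers on a non-blocking channel and pairs each with its pending line. */
    static void readLoop(DatagramChannel channel, Map<Integer, String> pending)
            throws IOException {
        try (Selector selector = Selector.open()) {
            channel.register(selector, SelectionKey.OP_READ);
            ByteBuffer buf = ByteBuffer.allocate(512);   // classic UDP DNS message limit
            while (!Thread.currentThread().isInterrupted()) {
                selector.select(1000);                   // sleep until readable or 1 s passes
                selector.selectedKeys().clear();
                buf.clear();
                while (channel.receive(buf) != null) {   // drain every queued datagram
                    buf.flip();
                    if (buf.limit() >= 12) {
                        String line = pending.remove(transactionId(buf));
                        if (line != null) {
                            String ip = firstA(buf);
                            System.out.println(line + "\t" + (ip == null ? "-" : ip));
                        }
                    }
                    buf.clear();
                }
            }
        }
    }
}
```

The channel passed to readLoop must already be in non-blocking mode, otherwise register throws IllegalBlockingModeException. Because UDP can drop packets, a real version would also expire entries in the pending map and resend those queries.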

A simple model implementation in Perl using AnyEvent shows that the idea is generally sound: I can easily reach 15-20K queries per second this way. For comparison, a naive blocking implementation manages about 2-3 queries per second, so that's roughly 4 orders of magnitude difference. Now I need to implement the same in Java, and I'd like to avoid rolling my own DNS implementation ;)

Answer

andersoj · Aug 18, 2012

It may be that the Apache Directory Services implementation of DNS on top of MINA is what you're looking for. The JavaDocs and other useful guides are on that page, in the left-hand side-bar.