Web pages and DNS requests

Since Google DNS was announced, I've noticed people repeating Google's message:

Complex pages often require multiple DNS lookups before they start loading, so your computer may be performing hundreds of lookups a day.

Is this really true? Well, let's break this up into more verifiable key phrases, starting from the end of the message.

“Your computer may be performing hundreds of lookups a day” is definitely true in general. On the other hand, I'd like to note a few things here. Most decent operating systems (Linux, FreeBSD, OS X, and I think Windows as well) tend to cache DNS results. On Linux and FreeBSD the piece of software that does this is called nscd (Name Service Cache Daemon). If it's not running, multiple requests do tend to be quite taxing (I found that out myself when I forgot to start it in the tinderbox after moving to containers), so set it to auto-start if you haven't already.
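As a sketch, on glibc systems the hosts cache is controlled from /etc/nscd.conf; the attribute names below are real nscd options, but the values are only illustrative, not a recommendation:

```conf
# /etc/nscd.conf (fragment) - cache hosts lookups
enable-cache            hosts   yes
# keep successful lookups for up to an hour (illustrative value)
positive-time-to-live   hosts   3600
# keep failed lookups only briefly (illustrative value)
negative-time-to-live   hosts   20
```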

Now, it is true that the kind of caching nscd does is quite limited, so it's definitely not perfect, but there is probably another problem here, one well summarised by Paul Vixie's rant about DNS abuse, where he explains how DNS has been abused, among other things, to provide policy results rather than absolute truths. Expressing policies also means that the response cannot be cached for long periods of time, and that's definitely a problem for any caching system: a low time-to-live (TTL) means that a full resolution has to be performed again and again.

And this is probably the main reason why DNS caches, whoever provides them, are getting more interesting: while they have to perform a full recursive resolution every time the TTL expires, it's likely that somebody else has requested the same entry already, so common entries (like Google's hosts) tend to always be fresh. That still doesn't solve one problem though: you need multi-level caches, just like a CPU has, since making a (cached) request to an external server always takes longer than getting the answer out of a local cache. To solve that problem myself, I usually have nscd running locally as well as pdnsd running on my router to cache requests (yes, that's why most home routers don't hand your ISP's DNS servers to the individual boxes but try to act as the main DNS themselves, sometimes failing badly because of bad caches).
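To illustrate why a low TTL hurts any cache, here is a minimal Python sketch (with made-up names and addresses) of the kind of expiring cache that daemons like nscd or pdnsd implement internally: once the TTL runs out, the entry is useless and a full resolution is needed again.

```python
import time


class TTLCache:
    """Minimal DNS-style cache: entries expire after their TTL."""

    def __init__(self):
        self._store = {}  # name -> (address, expiry timestamp)

    def put(self, name, address, ttl, now=None):
        now = time.time() if now is None else now
        self._store[name] = (address, now + ttl)

    def get(self, name, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(name)
        if entry is None:
            return None  # never resolved: full lookup needed
        address, expiry = entry
        if now >= expiry:
            del self._store[name]
            return None  # TTL expired: full lookup needed again
        return address


cache = TTLCache()
# A short 60-second TTL, as a policy-driven zone might use:
cache.put("www.example.com", "192.0.2.10", ttl=60, now=1000.0)
print(cache.get("www.example.com", now=1030.0))  # within TTL: cached answer
print(cache.get("www.example.com", now=1061.0))  # expired: must re-resolve
```

The lower the TTL, the more often `get()` comes back empty, and the more full recursive resolutions the upstream servers have to absorb; a shared cache only helps because somebody else's lookup keeps the popular entries fresh.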

But what I really have a problem with is the other half of the message: “Complex pages often require multiple DNS lookups before they start loading”. This is also true of course, but has anybody thought about why that is? Well, I sincerely think this is also Google's own fault!

Indeed, even my own blog pages load resources from a number of different hosts: Google (for the custom search), PayPal (for the donation button I restored a couple of weeks ago), Amazon (for the rare associate links) and Creative Commons (for the logo at the end of the page); single-post pages also load Gravatar images. While it's probably impossible to avoid external resources in a web page entirely, it doesn't help that a lot of services, including Amazon's associates website, Google Analytics and AdSense, each use their own hostnames: a web page with Google custom search, AdSense and Analytics has to perform three resolutions rather than one.
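As a quick illustration of the point (the page and URLs below are made up), a small Python sketch can count how many distinct hosts, and therefore how many separate DNS resolutions, a page's resources pull in:

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HostCounter(HTMLParser):
    """Collect the hostnames a page's resources would need to resolve."""

    def __init__(self):
        super().__init__()
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href") and value:
                host = urlparse(value).hostname
                if host:
                    self.hosts.add(host)


# A made-up page pulling widgets from several services:
page = """
<img src="http://www.gravatar.com/avatar/abc.png">
<script src="http://www.google-analytics.com/ga.js"></script>
<script src="http://pagead2.googlesyndication.com/pagead/show_ads.js"></script>
<a href="http://www.example.com/about">about</a>
"""

counter = HostCounter()
counter.feed(page)
print(len(counter.hosts))  # each distinct host is one more DNS resolution
```

Run against a real page, a counter like this makes it obvious that the lookups Google complains about largely come from third-party widgets, each sitting on its own hostname.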

And this made me think about some self-cleansing as well. My main reason for keeping my website and my blog on separate hostnames is that, at the time I first set it up, Typo was hard to set up properly in a sub-directory. I wonder if I should now consolidate the two onto just the flameeyes.eu domain, with a single /blog/ subdirectory for Typo.

But I've also noticed other people creating separate subdomains for “static” content, which end up requiring further resolutions for no good reason at all. I think most people simply don't realise this (I hadn't thought much about it before either), and I wonder whether Google shouldn't start teaching people the right way to do things, rather than coming up with a strangely boring, badly hyped, privacy-concerning project like this DNS service.

8 thoughts on “Web pages and DNS requests”

  1. I’ve tried nscd, but I found unscd to be more stable on this machine, although my mom’s laptop uses nscd without a hassle. On the other hand, even without a DNS cache, load times don’t seem to be a problem here (images.google.com is light speed). My main problem is with odd names that don’t resolve every time, like overlays.gentoo.org; sometimes eix-sync (layman -S) takes forever, and that’s why I start a DNS cache.

  2. It is true that many people use subdomains for no reason. But using them can actually serve a function, i.e.:
     1) the static-content domain uses a different HTTP server configuration to improve performance (i.e. doesn’t load mod_php);
     2) offloading intensive traffic to a different machine, load balancing;
     3) improving page-load time on pages with many images (more domains = more parallel requests).
     As for your site: once you have set up your site’s structure, you shouldn’t change it. For one, this means you would have to keep a redirect map from the old paths to the new ones, so every request via the old path scheme would first return a 3xx response: you would be replacing DNS traffic with HTTP traffic. Your blog is famous enough to have many links back to you, so you wouldn’t be helping anything, imho.

  3. 4) use of content-distribution networks;
     5) cookie-less domains;
     6) different domain = different origin, so the same-origin policy helps sandbox user-generated or external content.
     Anyway, I agree on the point that caching and sane TTL policies are the real solution to the problem. Actually, when long-lived cookies are not needed, I wonder why they don’t just put in the IP address of the server.

  4. While Google boasts about its DNS, it’s really not any better than OpenDNS when it comes to performance, and it certainly doesn’t have the features. Take a look at some interesting stats on Google DNS vs. OpenDNS at http://www.pulsewise.com/bl

  5. nightbow ~ # emerge -pv nscd
     These are the packages that would be merged, in order:
     Calculating dependencies… done!
     emerge: there are no ebuilds to satisfy “nscd”.
     :) Guess it’s not a Gentoo thing.

  6. Thank you. I’ve been using Linux for about ten or twelve years and this is the first time I’ve ever heard of it. Makes me wonder what other basic stuff is out there that I’ve never heard or seen anything about.
