The stats for my blog were all messed up yesterday. With a fairly typical number of unique visitors (about 900), the number of requested pages and of total hits skyrocketed, adding up to more than 400 MB of traffic against a usual ~100 MB (depending on the blog posts and on whether they get picked up by other sites).
Sure, it was nothing like the stats from last year’s Slashdot effect, but it was still quite a bit of bandwidth. AWStats wasn’t picking up any new bot, nor a single IP skewing the stats, so I had to do some manual analysis to find the cause…
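For anyone wanting to do the same kind of manual digging, here is a minimal Python sketch that tallies hits and bytes per client. It assumes the web server writes the common "combined" access log format; the function name `traffic_by_client` is made up for illustration and is not part of AWStats or any other tool.

```python
import re
from collections import Counter

# "Combined" access log format (an assumption; adjust the pattern if your
# server logs differently).
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

def traffic_by_client(lines):
    """Return two Counters (hits, bytes sent) keyed by (ip, user agent)."""
    hits, sent = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # skip lines that do not look like combined-format entries
        key = (m["ip"], m["agent"])
        hits[key] += 1
        if m["size"] != "-":  # "-" means no response body was sent
            sent[key] += int(m["size"])
    return hits, sent
```

Feeding it an open log file and printing `sent.most_common(10)` lists the ten heaviest (IP, user agent) pairs, which is usually enough to spot a culprit that an aggregated report smooths over.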
There are two possible culprits. One is a German IP (reporting Opera as its user agent), which seemed to refresh my last post on Gentoo’s “issue” constantly from 9 AM till 10 PM. It seems a legitimate request, although I’d suggest that reader, if (s)he’s reading this, use the RSS feed for the comments instead; that would save both their bandwidth and mine 😉
The other is clearly a bot, as it advertises itself as such: “Yeti/0.01 (nhn/1noon, email@example.com, check robots.txt daily and follow it)”. The requests from this bot come from a single class B network, although mixed with those of a different “NaverBot” (which points to http://help.naver.com/delete_main.asp in its user agent).
The netblock owner is NHN Corporation, which seems to be the entity behind that Naver site. Naver, in turn, seems to be some kind of search engine, likely something similar to Technorati, but my Korean is… well, let’s just say the only Asian language I can barely understand is Japanese.
I don’t mind indexing, I don’t block any robot in my robots.txt, and right now bandwidth is far from being a problem (it would have been a very big problem if the blog were still hosted on my home connection, though). But they hit the robots.txt file 384 times yesterday alone, out of 542 hits total in the day! I’d very much like to write to them about this at this point.
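A count like this can be pulled straight from the access log. The following is only a sketch, again assuming the "combined" log format; `robots_share` is a hypothetical helper name, and matching on a user agent substring such as "Yeti" is a simplification.

```python
import re

# Pulls the request path and the user agent out of a combined-format line
# (the format is an assumption, not something the server guarantees).
REQUEST_AGENT_RE = re.compile(
    r'"(?:[A-Z]+) (?P<path>\S+)[^"]*" \d{3} (?:\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

def robots_share(lines, agent_substring):
    """Return (robots.txt hits, total hits) for user agents containing agent_substring."""
    robots = total = 0
    for line in lines:
        m = REQUEST_AGENT_RE.search(line)
        if m is None or agent_substring not in m["agent"]:
            continue
        total += 1
        # Ignore any query string when comparing the path.
        if m["path"].split("?", 1)[0] == "/robots.txt":
            robots += 1
    return robots, total
```

For scale, 384 out of 542 works out to roughly 71%.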
So the question is: am I the only one out there being hit by this “Yeti bot”? Do any of my readers understand Korean, and can they tell me what the NaverBot page linked above says?
Sorry for the service posting.