If you follow my Delicious bookmarks, you might have noticed some recently tagged content about Ruby and Gtk+. As you might guess, I’m going to resume working with Ruby, and in particular I’m going to write a graphical application using Ruby-Gtk2.
The problem I’m trying to solve is related to the downtime I had: I cannot stay logged in to my vserver over SSH with top open at every hour of the day to make sure everything is all right, and so I ended up in trouble because a script possibly went haywire (I’m not sure whether it did so before or after the vserver was moved to new hardware).
Since Nagios is a bit of overkill, considering that I have to monitor a single box and I don’t want to keep looking at something (my email included), I’ve decided that the solution is to write a desktop application that monitors the status of the box and notifies me right away when something is not going as it should. Of course this is a very nice target but a difficult one to achieve, starting with “how the heck do you get the data out of the box?”.
Luckily, for my high school final exam I presented a piece of software that was already a step toward the solution: ATMOSphere (yes, I know the site is lame and the project is well dead). It was a tool to monitor and configure my router, a D-Link DSL-500 (Generation I) that used ATMOS by GlobespanVirata as its operating system; I still have the printed in-depth manuals for its complex CLI, available over both serial and telnet. Together with the CLI protocol for setting up basic parameters, I used SNMP to read most parameters out of it. This is the reason why you might find my name related to a library called libksnmp: it was a KDE-like interface to the net-snmp library (which, at least at the time, was a mess to develop with), and I used it not only for ATMOSphere but also for KNetStat, to access remote interfaces (like the ones on my router). Since then I haven’t worked with SNMP at all, although I’m sure my current router also supports it.
Despite the name (anything but) Simple Network Management Protocol, I’d expect SNMP to be used much more often for querying than for actual management, especially considering the poor excuse for an authentication system included in the first two versions (not that the one in version 3 is much better). The name is almost certainly a misnomer too, since the OID approach is probably one of the worst I’ve seen in my life for a protocol. Besides this, though, the software is well established (net-snmp) and nowadays there is a decent client library in Ruby too, which makes it possible to write monitoring software relatively quickly.
My idea was to write something that sits in my desktop tray and queries the server for its status at a given interval. The nice thing would be that it could notify me as soon as there’s a problem, both by putting a big red icon in my tray and by showing a message through libnotify to tell me what the problem is. This would let me know immediately if something went haywire. The problem is: how do you define “there’s a problem”? This is the part I’m trying to solve right now.
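A minimal sketch of that polling idea, in plain Ruby and without any SNMP or Gtk code. Everything here is hypothetical: StatusWatcher, fetch and notify are names made up for illustration. The fetch callable stands in for whatever ends up querying the server, and the notify callable stands in for the tray icon and libnotify message. It assumes that a notification is wanted only when the state changes, not on every poll.

```ruby
# Hypothetical sketch: StatusWatcher, fetch and notify are illustrative names,
# not part of Ruby-Gtk2 or of any SNMP library.
class StatusWatcher
  attr_reader :state

  # fetch:  callable returning nil when all is well, or a String describing the problem
  # notify: callable invoked with (new_state, message) whenever the state changes
  def initialize(fetch:, notify:)
    @fetch = fetch
    @notify = notify
    @state = :ok
  end

  # Run a single check; a query that raises counts as a problem too.
  def poll_once
    message = begin
      @fetch.call
    rescue StandardError => e
      "query failed: #{e.message}"
    end
    new_state = message.nil? ? :ok : :problem
    # Only notify on transitions, so a persistent problem does not spam the desktop.
    @notify.call(new_state, message) if new_state != @state
    @state = new_state
  end

  # Poll forever, sleeping `interval` seconds between checks.
  def run(interval)
    loop do
      poll_once
      sleep interval
    end
  end
end
```

Keeping the query and the notification behind two callables means the same loop could be tried with a fake data source first and wired to the real ones later. Note that, as written, a change in the problem message while the state stays :problem does not trigger a new notification.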
The SNMP specifications allow error conditions to be set up on the monitored side, so you could just tell snmpd when to report an error; it would then be the server, not the desktop client, that knows when to report problems. This is very nice, since you only need to configure it on the server, and even if you change workstation you’ll have the same parameters. Unfortunately it has limited scope: on most routers or SoHo network equipment you won’t find much configuration for SNMP. The D-Link ones, although they supported SNMP quite well, didn’t advertise it in the manual and had no configuration options for it on the web pages; the 3Com I have now has some configuration for SNMP traps and supports writing through SNMP (luckily, disabled by default). I guess I’ll have to add support for writing at least some parameters, so that I can set up the alarms on devices like these that support SNMP writes. For devices that lack writing support too, I suppose the only way would be to add client-side rules that tell the desktop client when to issue a warning. I guess that might be a further extension.
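The client-side rules could be as simple as a list of thresholds checked against the latest readings. The following is only a sketch under assumptions: the Rule structure, the reading names and the threshold values are all invented for illustration, and the readings are assumed to arrive as a plain Hash of numbers, however they were fetched.

```ruby
# Hypothetical rule format; names and thresholds are illustrative only.
Rule = Struct.new(:name, :operator, :threshold, :message)

# Return a warning for every rule violated by `readings`
# (a Hash mapping a reading name to a numeric value).
# A reading that is missing altogether is reported as well.
def violations(rules, readings)
  rules.each_with_object([]) do |rule, found|
    value = readings[rule.name]
    if value.nil?
      found << "#{rule.name}: no data"
    elsif value.public_send(rule.operator, rule.threshold)
      found << "#{rule.message} (#{rule.name} = #{value})"
    end
  end
end

# Example rule set (made-up values).
RULES = [
  Rule.new("load_1min", :>, 4.0, "load average too high"),
  Rule.new("disk_free_percent", :<, 10, "disk almost full"),
].freeze
```

Calling violations(RULES, readings) after each poll would yield the list of messages to show; an empty list means everything is fine. Since the rules are plain data, they could later be loaded from a configuration file instead of being hard-coded.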
Right now I’m at a bit of a standstill because the version of Ruby-Gtk2 in Portage does not support GtkBuilder, which makes writing the interface quite a bit of an issue; once the new version is in, I’ll certainly start working on it. In the meantime, I’m open to suggestions for other monitoring applications that might save me from writing my own, or ideas on how to approach the problems that will present themselves. I think I’ll at least add a drop-down widget like the one for the world clock in GNOME (where the time zones are shown), with graphs of the interface in/out bandwidth use (which would be nice, so that I could resume monitoring my router too).
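For the bandwidth graphs, the values to plot would have to be derived from two successive counter readings. Here is a small sketch of that arithmetic, assuming the figures come from 32-bit octet counters such as ifInOctets and ifOutOctets in the standard interfaces MIB, which wrap around to zero after 2^32. The helper names are made up, and only a single wrap between two readings is handled.

```ruby
# 32-bit SNMP counters wrap around after reaching 2**32 - 1.
COUNTER32_MODULUS = 2**32

# Octets transferred between two readings of a 32-bit counter,
# allowing for at most one wrap-around in between.
def counter_delta(previous, current)
  (current - previous) % COUNTER32_MODULUS
end

# Average rate in bits per second over an interval of `seconds`.
def bits_per_second(previous, current, seconds)
  raise ArgumentError, "seconds must be positive" unless seconds.positive?

  counter_delta(previous, current) * 8.0 / seconds
end
```

On a fast link a 32-bit counter can wrap more than once between two polls, in which case this calculation silently under-reports; a shorter polling interval keeps that from happening.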
Okay, I suppose I’ll stop here for now and wait for the dependencies I need to land in Portage; maybe in the meantime someone will find me something better to do, or some software that already does what I’m looking for.
Using Nagios doesn’t mean that you need to use email notifications. It can be pretty lightweight if you need it to be, from passive checks and snmptraps to any check interval you want, from 1 second to 60 minutes between checks. For the actual alerts, there are tons of options: IRC, XMPP, SMS, libnotify/xosd and more.
Instead of using SNMP you could write a web interface and use a REST API for your client software to interact with the server. This would give you the added benefit of being able to access the information from a computer that doesn’t have your client software installed.
Why not use gkrellm?
Maybe a Func [1] plugin? [1] https://fedorahosted.org/func
Have you looked at monit instead of nagios for services monitoring? http://www.tildeslash.com/m… It’s pretty easy to configure and can check a lot of things. Also, for getting pretty graphs I like cacti.