So, what started as "take a week to set up a new nagios server at work" ended up taking almost a month...because there were many days where I'd only have an hour or less to put into the side task. The other stumbling block was that I had decided the new nagios server's configuration files would be managed under subversion, instead of RCS as had been done in the previous two incarnations. New SAs don't seem to understand RCS and that the file is read-only for a reason...and it's not to make them use :w! ... which lately has resulted in the sudden reappearance of monitors for systems that had been shut down long ago.
Though now that I think of it, there used to be a documented procedure for editing zone files (back when it was done directly on the master nameserver and version controlled by RCS), which as I recall was to perform an rcsdiff and then use the appropriate workflow to edit the zone file:
    % rcsdiff zonefile
    if differences
        % rcs -l zonefile
        % ci -l zonefile
          make rude comment that somebody made edits
        % vi zonefile
        % ci -u zonefile
    else
        % co -l zonefile
        % vi zonefile
        % ci -u zonefile
    fi
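For what it's worth, that procedure could be wrapped in a small shell function so nobody has to remember which branch to take. This is only a sketch: the name edit_zone and the canned log message are my own inventions, not part of the documented procedure. It leans on rcsdiff's exit status (0 for no differences, 1 for differences, 2 for trouble).

```shell
#!/bin/sh
# Sketch only: edit_zone is a hypothetical helper, not the documented procedure.
edit_zone() {
    zonefile=$1

    # rcsdiff exits 0 if the working file matches the last check-in,
    # 1 if it differs, 2 on trouble.
    status=0
    rcsdiff -q "$zonefile" >/dev/null 2>&1 || status=$?

    case $status in
        0)
            # Clean: check out with a lock as usual.
            co -l "$zonefile" || return 1
            ;;
        1)
            # Somebody edited around the lock: take the lock and record
            # their changes as a separate revision before touching anything.
            rcs -l "$zonefile" || return 1
            ci -l -m"unrecorded edits found before my change" "$zonefile" || return 1
            ;;
        *)
            echo "rcsdiff failed on $zonefile" >&2
            return 2
            ;;
    esac

    # Edit, then check in and leave a read-only, unlocked working copy.
    ${EDITOR:-vi} "$zonefile"
    ci -u "$zonefile"
}
```

Usage would be something like edit_zone zonefile from the directory holding the zone file and its RCS history; the final ci -u still prompts for a log message.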
But when I took over managing the DNS servers, I switched to having cfengine manage them, and the zone files now live under masterfiles, so version control is now done using subversion. I had started butchering the DNS section in the wiki, and should probably see about writing something up on all the not-so-simple things I've done to DNS since taking it over...like split, stealth, sed processing of the master zone for different views, DNSSEC, the incomplete work to allow an outside secondary to take over as master should we ever get a DR site, and other gotchas, like consistent naming of slave zone files now that they are binary.
Additionally, work on the nagios at work was hampered by the fact that Solaris and legacy provisioning is CF2, and the new chef-based provisioning is still a work in progress...which I haven't had time to get into at all yet. So I had to recreate my CF3 promises for nagios in CF2.
But on the Friday before last weekend it finally reached the point where it was ready to go live. Though I've since been rolling in other wishlist items and smashing bugs in its configuration, and I still need to decide what the actual procedure will be for delegating sections of nagios to other groups.
One of the things I had done with the new nagios at work was set up PNP4Nagios...as I had done at home. And while looking to see if I needed to apply performance tweaks to the work nagios, all the pointers were to have mrtg or cacti collect and plot data from nagiostats. Well, a new work cacti is probably not going to happen anytime soon, and the old cacti(s) are struggling to monitor what they have now. (I spent some time a while back trying to tune one of them...but it's probably partly hampered by the fact that its mysql can use double the memory that is allocated to the VM, though reducing it from running 2 spines of 200 threads each on the 2 CPU VM to a single spine with fewer threads has helped. Something like the boost plugin would probably help in this case, but the version of cacti is pre-PIA.) But it could be a long time before it gets replaced (not sure if an upgrade is possible....) Our old cacti is running on a Dell PowerEdge server that has been out of service over 6 years...with the cacti instance over 8 years old (Jul 8, 2005)...and the OS is RHEL3.
Anyways, it occurs to me that there should be a way to get PNP4Nagios to generate the graphs, and I search around and find check_nagiostats. Though no template for it. Oh, there's a template nagiostats.php; if I create a link named check_nagiostats.php that points to it, it should get me 'better' graphs. Which is what I have CF2 do at work.
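Done by hand instead of through CF2, the link would look roughly like the sketch below. The function name link_template is made up for illustration, and the template directory is an assumption that depends on how PNP4Nagios was installed, so it is passed in as an argument rather than hard-coded.

```shell
#!/bin/sh
# Sketch only: link_template is a hypothetical helper name.
# $1 is the PNP4Nagios template directory that contains nagiostats.php;
# its location varies by install, so it is not hard-coded here.
link_template() {
    template_dir=$1

    if [ ! -f "$template_dir/nagiostats.php" ]; then
        echo "no nagiostats.php in $template_dir" >&2
        return 1
    fi

    # PNP4Nagios picks a template by check command name, so give the
    # existing template a second name. The target is relative so the
    # link keeps working if the directory is moved.
    ln -sf nagiostats.php "$template_dir/check_nagiostats.php"
}
```

Calling link_template with the directory holding the shipped templates (templates.dist on my installs, but check yours) should be all it takes; re-running it is harmless since -f replaces an existing link.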