Home server migration ran into some cacti

04/21/13

10:52:00 pm, by The Dreamer, 2431 words
Categories: Software, Computer, Ubuntu, FreeBSD, CFEngine

I did notice some strangeness when trying to snmpwalk by hand. I had been fiddling with the snmp_access line in squid, because it didn't match what was suggested or commented...but it had been working. But the Ubuntu boxes are running 2.7, and I'm running 3.2.9 on FreeBSD...and I hadn't put any of the FreeBSD squid instances into the old cacti.

I then wondered if there were newer templates for squid graphing...when I came upon other tools. One of them was squidstats, which is in ports, so I tried building that. I then spent time trying to figure out how to have cfengine3 manage its configuration, and how to have cfengine3 create additional instances on demand to do collection and graphing for other squid instances.

I had trouble figuring out how to have it create edits from the stock files to lay down multiple times as needed. Decided that maybe I had bitten off more than I could chew in cfengine3.

And having already taken the side trip to graph my Zoom X3 5760 ADSL modem, I decided that I would just write a script to fetch the OIDs, create new graphs based on that, and get things going.

I wrote a script that got the 'core' values and that looked pretty good, so I moved on to doing the 'median' values. That's when things got interesting.

If I went directly to port 3401, I would only get the first set...if I went through the net-snmp proxy, I could get all 3 sets. So, the proxy was needed after all. Doesn't make sense though....

I then created the datasources for all my squid servers, but before setting to work on creating graphs to use them, I checked the update values. I saw some U's, so I looked to see why. I then noticed that I was missing a value in the output, which turned out to be from using '<' instead of '<=' in my for loop. And there were some other U's, because I had typo'd the name of a parameter or there was a mismatch between the Data Input Method and the Data Template.
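To make the '<' versus '<=' slip concrete, here is a minimal Python sketch of how a script like that might hand values to cacti. It is not the script from this post: `collect`, the `fetch` callable and the field names are all hypothetical, and it assumes cacti's convention of space-separated `name:value` pairs on one line, with `U` standing in for an unknown value.

```python
from typing import Callable, Dict, Optional


def collect(fetch: Callable[[int], Optional[str]],
            names: Dict[int, str],
            last_index: int) -> str:
    """Poll indices 1..last_index inclusive and format them for cacti."""
    fields = []
    # range() excludes its upper bound, so "+ 1" plays the role of '<='.
    # range(1, last_index) would be the '<' bug and drop the final value.
    for index in range(1, last_index + 1):
        value = fetch(index)  # hypothetical lookup, e.g. one SNMP get per index
        # A missing reading is emitted as "U" so the gap is visible.
        fields.append(f"{names[index]}:{value if value is not None else 'U'}")
    return " ".join(fields)


if __name__ == "__main__":
    readings = {1: "12", 2: "7", 3: "42"}              # made-up values
    names = {1: "http_all", 2: "http_miss", 3: "dns"}  # hypothetical field names
    print(collect(readings.get, names, 3))
```

The field names also have to match what the Data Input Method declares as output fields, which is the other way a U can sneak in.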

Finally, cacti seemed to think my 'Squid zen' device was down, while it was fine with the regular 'zen' device. The RRDs were kind of updating...so I wasn't sure what to make of it at first. But there were gaps in the RRD updates, so I needed to look. I turned up the verbosity, and saw that it was doing two UDP pings to zen, both failing...and two SNMP pings to zen, one succeeding and the other sometimes failing. Probably the responses were racing each other across spine threads.

Since I'm using "Ping or SNMP", I decided to look at fixing the PING problem....

I had noticed that cacti wasn't able to UDP Ping any of my FreeBSD hosts...and I couldn't seem to get spine to do ICMP or TCP Ping. I had started poking about in MySQL to see if spine was getting told to try or not...seems it has UDP Ping on port 23 as a fallback...but got distracted playing around in phpMyAdmin and looking at the performance of my MySQL servers. I had seen MySQL putting the hurt on my system...and saw that it was getting a lot more use by cacti than I had thought.

While working on how to get phpMyAdmin on zen to talk to mysql on cbox (not sure cbox and cacti should be together...and dbox getting nagios, though maybe it was the right way to go...), phpMyAdmin made some recommendations about my settings on zen, and I tried one or two of them before I went on to getting it to connect to my other mysql servers. (Box has phpMyAdmin installed for its own mysql server...but it's probably best to have phpMyAdmin on zen rather than on either cbox/dbox, which are DMZ hosts...though the phpMyAdmin would be on the local apache, rather than the outside-facing nginx reverse proxy servers.) I had done zen's mysql as localhost only, but since I was using the default config for cbox (and box), theirs was listening on any.

phpMyAdmin had lots to say about cbox's settings, considering that zen's was based on my-innodb-heavy-4G.cnf (which for some reason has 'default-storage-engine=MYISAM' set...and it's the only one of the sample configs that does). But the app had explicitly called for InnoDB for its tables.

cbox's mysql is allowed to run with InnoDB as its default storage engine, but cacti tables are explicitly created as MyISAM...and the default config is described as being for a system with little memory (32-64MB) where MySQL plays an important part, or systems up to 128MB where MySQL is used together with other programs (such as a web server).

I made some tweaks to let it use a bit more memory and run with more threads...which seems to have taken the hurt off of it a bit. Though later I'll be putting more hurt on the box.

I think I forgot to look inside the cacti tables...and went back to looking at why it thinks 'Squid zen' is down.

I eventually noticed that there was something on my FreeBSD systems listening as *:* on udp4, which I eventually tracked down to bsnmpd. Darn, having that prevents UDP ping? The discussion I found says that bsnmp_mibII is the culprit, but that it's harmless because it doesn't respond to anything. So it's just a sink, which means no UDP Ping of FreeBSD systems with it loaded.
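Why would a silent listener break UDP Ping? As far as I understand it, a UDP ping doesn't expect a real answer: it fires a datagram at a port that should be closed and counts the ICMP "port unreachable" that comes back as proof of life. If something swallows the datagram instead, nothing comes back and the host looks dead. Here is a rough Python sketch of that idea. It is not spine's code, `udp_ping` is a made-up name, and the exact errors raised vary by OS.

```python
import socket


def udp_ping(host: str, port: int = 23, timeout: float = 1.0) -> str:
    """Return 'up' on a reply or a port-unreachable error, else 'timeout'."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        # connect() on a UDP socket lets the kernel report ICMP errors to us.
        sock.connect((host, port))
        try:
            sock.send(b"ping")
            sock.recv(64)
            return "up"        # something actually answered
        except ConnectionRefusedError:
            return "up"        # port unreachable came back: host is alive
        except socket.timeout:
            return "timeout"   # datagram was swallowed or dropped


if __name__ == "__main__":
    # A bound socket that never replies behaves like the sink described above.
    sink = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sink.bind(("127.0.0.1", 0))
    print(udp_ping("127.0.0.1", sink.getsockname()[1], timeout=0.5))
    sink.close()
```

Against the local sink this reports a timeout even though the "host" is obviously up, which is the same symptom cacti was showing for the FreeBSD boxes.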

I tried commenting out bsnmp_mibII, but bsnmp_hostres requires it, and taking that out negates the reason I'm proxying net-snmp to bsnmpd.

Plus it didn't make UDP Ping work. Seems rpcbind is listening as *:* on udp6? That means I need to get spine to do ICMP or TCP Ping...I delved into the source code to try to figure out why TCP Ping wasn't working...when I saw the ICMP code...and eventually decided that I would just make spine setuid root and be done with that. Looks like it's written so that if ICMP isn't available it overrides to PING_UDP...without looking to see if I wanted PING_TCP? Or perhaps there's something that would prevent PING_TCP from being available in my case...?

Anyways...now it's ICMP Pinging hosts and all is good.

I then started working on creating graphs and such, looking at the old templates as reference. During this I noticed that it's using 'snmppublic' for the community, instead of just 'public', and the ignore box isn't checked. Though it worked in my old cacti...so either there's something weird in older squids that allowed it to work, or the export from 0.8.7e and import into 0.8.8a altered the template in a bad way.

Oh well, I had datasources...so I was working on graphs. As I worked through, I noticed some strange things about the CDEFs being used, but didn't think much about it until I was done. And when I was done, I set up the first host...and right away there were some broken graphs or blank ones. I checked the RRDs to see if there was anything there. Well, some of the graphs were always this way before the move, but I never thought much about them...until now. So, then I looked at the CDEFs in more detail.

Hmm, it has DEF:a and DEF:b, it has CDEFA make DEF:a zero, and graphs it with no color...it has CDEFB make DEF:a zero, and graphs it with no color. It has these sources after the formula, not that it matters...where the formula is supposed to produce the real graph based on CDEFA and CDEFB. :??:

How about using a & b instead? Presto...the graph is working. I make a few more tweaks here and there...and then set up graphs for the next host.... I then notice that available FDs is looking strange, and thought box shouldn't have that many available, but continue on to doing dbox. Which was really weird. Available FDs was ~10x lower. But both cbox and dbox have the same settings, so they should be closer together.
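For anyone wondering why a formula built on zeroed-out CDEFs draws nothing, here is a toy RPN evaluator in Python. It only mimics the comma-separated RPN style that rrdtool CDEFs use; the helper strings and the percentage formula are made up for illustration (they are not the ones from the old template), and mapping division by zero to NaN is my stand-in for an 'unknown' value.

```python
import math
from typing import Dict, List


def rpn(expr: str, env: Dict[str, float]) -> float:
    """Evaluate a comma-separated RPN expression supporting + - * / only."""
    stack: List[float] = []
    for token in expr.split(","):
        if token in ("+", "-", "*", "/"):
            right, left = stack.pop(), stack.pop()
            if token == "+":
                stack.append(left + right)
            elif token == "-":
                stack.append(left - right)
            elif token == "*":
                stack.append(left * right)
            else:
                # Toy choice: division by zero becomes NaN ("unknown").
                stack.append(left / right if right != 0 else math.nan)
        elif token in env:
            stack.append(env[token])     # named data source or helper
        else:
            stack.append(float(token))   # numeric literal
    return stack[0]


if __name__ == "__main__":
    env = {"a": 30.0, "b": 70.0}        # made-up samples for DEF:a and DEF:b
    env["cdefa"] = rpn("a,0,*", env)    # hypothetical helper that zeroes its input
    env["cdefb"] = rpn("b,0,*", env)    # hypothetical helper that zeroes its input
    print(rpn("cdefa,100,*,cdefa,cdefb,+,/", env))  # formula on the zeroed helpers
    print(rpn("a,100,*,a,b,+,/", env))              # same formula on a and b
```

The first formula only ever sees zeros, so there is nothing meaningful to plot; the second works from the real data sources.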

I delve into things, and discover that I had typo'd the OIDs for currentUnusedFDescrCnt and currentUnlinkReqs. I fix those and then get to thinking about the other FDescrCnts being exposed, since those would be more interesting. So, I add those 3 OIDs (reserved, in use and max in use) to the script...but forget to commit it, so it causes problems later when I redo the Data Input Methods/Data Templates...and I probably didn't need to delete the data sources after all (which required me to go through each graph and select new data sources).

Turns out cbox's available FDs was ~10x lower than it should be...making dbox ~100x lower than what it should've been. With box I was seeing about 2.5x more than I should've. User limits on Ubuntu limit FDs to 1024/proc...while on FreeBSD the default is unlimited, and I had raised the max to 250,000 in /boot/loader.conf, while zen has 32k for maxfiles. I probably meant to make cbox/dbox 32k, but had mixed something up when creating their loader.conf.
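A quick way to see the per-process side of this on any Unix box is Python's standard `resource` module. This is only a sketch: the function names are mine, and it shows the limit of the process running it, not the system-wide maximum set in loader.conf.

```python
import resource
from typing import Tuple


def fd_limits() -> Tuple[int, int]:
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(limit: int) -> str:
    """Render a limit, spelling out the 'no limit' sentinel."""
    return "unlimited" if limit == resource.RLIM_INFINITY else str(limit)


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft={describe(soft)} hard={describe(hard)}")
```

Running it as the same user squid runs as would show whether a 1024 limit applies there.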

But it's all working now...and I'm not going to share the ugly templates and code, because it's not how I would've done it now that I've done it. But I'm not going to go back and do it right/better.

Plus there are still some problems.

But, next up...I wondered if I could make graphs of my MySQLs....
