The Chuck Yerkes Award is presented annually in recognition of outstanding individual contributions in [system administration] online forums. It was created after Chuck Yerkes' untimely death in 2004 to memorialize the mentorship he provided countless systems administrators through his helpful and accurate posts to systems administration mailing lists.
Since 2009, LOPSA (The League of Professional System Administrators) determines and presents the award to someone who followed Chuck's example in their contributions to system administration online forums--whether mailing lists, web forums or chat rooms. 2005-2008 awards were presented by the USENIX Association.
According to alerts coming from IRC, I found that I got the award during the "Opening Remarks and Awards" portion at LISA '13 conference in Washington, DC.
I knew I had been nominated, but hadn't heard anything after that...so I'd be curious to see what get's written up on me officially, etc.
So, the announcement of FreeBSD 9.2 came out on Monday [September 30th], which I missed because I was focused on my UNMC thing. But, once it appeared, I knew that I was going to want to upgrade to it sooner than later.
From its highlights, the main items that caught my attention were:
But, I did start this upgrade on October 4th....where for an unknown reason, I launched the
freebsd-update process on cbox, the busier of the two headless servers. I suspect I went with doing the upgrade on my headless servers, because they are entirely running on SSD and would likely see the benefit of lz4 compression. And, perhaps I did cbox, because it was the system that could most gain from lz4.
It took a couple iterations through
freebsd-update, before I got an upgrade scenario that could proceed. And, it took a long time given the high load that is cbox.
That is cbox is an Atom D2700 (2.13GHz, dual core) processor. And, cacti (especially with the inefficient, processor/memory intensive percona monitoring scripts -- might help if only scrpt server support worked, and wasn't just a left over from what it was based on.) being the main source of load. That is usually in the 11.xx area, except during certain other events (like, since 3.5, when
cf-agent fires...cbox is set to run at a lower frequency than my other systems.) or when the majority of logs get rotated and bzip'd. And, there's also some impact when zen connects to
rsyncd each day for
backuppc. But, these spikes weren't that significant. Though the high load would cause
cf-agent runs to take orders of magnitude longer than other systems, including its 'twin' dbox.
Also ran into a problem (again?) where a lot of the differences that
freebsd-update needed resolved were differences in revision tags....some as silly as '9.2' vs '9.1', others had new time stamps or usernames, but seldom any changes to the contents of the file. Which I then discovered a problem from having some of these files under
cfengine would revert these files back to having '9.1' revision strings, which confused the
freebsd-update. I ended up updating all the files in
cfengine to have the 9.2 versioning, though I thought about just removing/replacing it with something else entirely, though wasn't sure the impact that would have on current/future
Though it did seem to cause problem with the other two upgrades, where it would say that some of these files were now removed and asked if I wanted to remove these. Which doesn't make sense, since it didn't say that with the first upgrade. It was probably just angry that these files already claimed to be from FreeBSD 9.2.
It also didn't like that I use
sendmail, therefore my sendmail configs are specific to my configuration, or that I use
printercap is the one auto-generated by cups, etc.
But, once it got to where it would let me run my first "
freebsd-update install". I ran it, rebooted, ran it again, rebooted, updated stuff (though it didn't complain as much, perhaps because some of the troublesome kernel mod ports had corrected the problem of installing into
/boot/kernel, or perhaps enough stayed the same between 9.1 and 9.2, that things didn't freak out like before. And, this includes the virtualbox kernel mod, when I did the upgrade on zen, and later mew. But, I re-installed these ports and lsof. I did a quick check of other services, and then upgraded the 'zroot' zpool to have feature flags (which now means it no longer has a version, apparently instead of jumping the numbers to distinguish from Sun/Oracle it has eliminated having version numbers (for beyond 28) and having flags for the features added since. Wonder if the flags capture all has changed since 28, since I thought there have been other improvements internal that aren't described by version numbers. Namely, I seem to recall that there have been improvements in recoverability....namely it had been suggested, when I was trying to recover a corrupt 'zroot' on
mew, to try finding a v5000 ZFS live CD. Which I don't think I ever found, and gave up anyways when I concluded the level of corruption was too great for any hope of recovery and that I needed to resort to a netbackup restore, before the last successful full get's expired. Though being that it was nearly 90 days old, the other two month fulls didn't exist due to system instability that eventually caused the corrupted zpool (eventually found to be a known bad revision of the Cougar Point chipset and a bad DIMM...things seem to finally be stable from using a SiI3132 SATA controller instead of the on board, and getting that bad DIMM replaced....was weird that it was a Dell Optiplex 990, purchased new over a year after the problem had been identified and a newer revision of the chipset was released. I did eventually convince Dell support to send me a new motherboard and replace the DIMM. The latter was good, since I had to use DIMMs from another Dell that had been upgraded, so I had less memory for a while. But, while at first I did use the onboard SATA again, eventually I started having problems that would result in losing a disk from the mirrored zpool, to eventually causing a reboot where they would both be present again [though gmirror would need manual intervention]....and moving back to the SiI3132 has finally gotten things stable again. Though the harddrives in mew are SATA-III, so it would've been desirable to have stayed on the SATA-III onboard ports, where it was these ports that were the main source of problems in the prior defective version. Perhaps the fact that the prior version had a heatsink and the new version didn't, wasn't because they didn't need it to try to compensate for the problems caused by over-driving the silicon for the SATA-III portion. But, an oversight with the newer revision motherboard. The problem did tend to occur in the early morning hours on the weekend, when not only is there a lot of daily disk activity, but there is also a lot of weekly disk activity, etc. Oh well.)
So, after upgrading the zpool, and reinstalling the boot block/code. I then rebooted the system again. I had already identified the zfs filesystems where I had 'compression=on', so had written a script to change all these to 'compression=lz4'. Which I now ran.
And, then I turned my attention to doing dbox.
Pages: 1· 2
Hopefully, this isn't a trend... but rather a strange problem that has now been treated (through the use of Muro 128 at night for the last few weeks and to taper off over the next couple...)
Nov 2003: OD -5.00 -1.00 075 OS -8.00 Apr 2005: OD -4.25 -1.00 080 OS -7.00 -0.75 045 May 2006: OD -4.50 -1.00 055 OS -6.50 -1.00 035 Apr 2008: OD -5.25 -1.00 060 OS -7.25 -1.00 025 Aug 2009: OD -5.00 -0.50 050 OS -7.00 -1.25 030 Oct 2010: OD -5.50 -0.75 050 +1.25 OS -7.00 -1.00 040 +1.25 Nov 2011: OD -5.50 -0.75 050 +1.50 OS -7.00 -1.00 040 +1.50 Feb 2013: OD -4.50 -0.75 040 +1.50 OS -6.75 -0.75 010 +1.50 Jun 2013: OD -5.25 -1.00 060 +1.75 OS -7.00 -0.75 040 +1.75 Oct 2013: OD -5.25 -0.50 060 +1.75 OS -7.25 -0.75 015 +1.75
Wonder if there's enough time to get a new pair of glasses online cheap, before my upcoming trip (Narcolepsy Network Conference). And, to use up the remainder of my limited use FSA, which is kind of surprising to have last this long this year. Probably because of the unstable nature of my eyeglass prescription this year, I didn't get what I really wanted in a second pair of glasses this year. Maybe I'll finally get around to it next year....
Just as long as I can see the best that I can for the 50th anniversary of Doctor Who!
This weekends project was to update the skins to 5.x.
From the early 0.8x days of this blog, I had settled on a customized version of the custom skin. Recustomizing it each upgrade was annoying, until I found that I could make my own version of it and it would likely work. Though if there were (bug/security) fixes, it was easier to find out what those were and apply them to my version of the skin.
So, I created an LKC skin for the blog.
This worked surprisingly well, when I upgraded from 4.1.7 to 5.0.5 last weekend. In that I made no changes to any of its files, and it pretty much worked. There was some breakage which I later found was due to some reorganization in global css files due to global css (which I could've fixed by copying the global css files from 4.1.7 down to the skin directory level. But, it was easy enough to fix up some html tags in
index.main.php and "free html widgets". Plus I also removed some other widgets in the process, such as no more Flash Tag Cloud, or the flash twitter widgets (which I guess were broken since the twimg.com incident anyways, and doesn't seem to be available anymore).
This single instance of b2evolution, is also home two a couple other sites now (I used run separate instances, of the heavily customized nature of the early days for this blog, but the work in maintaining them all was a pain, and since they're all with the same hosting provider...going multidomain seemed the better way to go, though it has its challenges.
So after I updated 'LKC', all the code I had changed to get around the css problem needed to be changed back now that it wasn't a problem anymore. Well, it didn't have to be, but the HTML tags I used had been deprecated for quite some time, so it was kind of strange using them again to make things work for a while.
The I turned to the other sites, first is the photoblog site, which is using the included photoblog theme directly...with minor tweaks. I should probably split that off someday. But, only one file changed between 4.1.7 and 5.0.5, though I had pulled up some files from global into it to make some customizations. Though in 5, there's back office means to do the same thing...so to update this skin, I removed those specific customizations and moved the information into the back office. In fact, I'm not sure what if anything I've changed to it for its current appearance. Though there's some things I think could be done better if I had some time to put into it.
Then the other was using 'emerald', which was a 3rdparty skin. I mainly wanted something simple with 3 columns, with the level of customizations that fit my desires at the time. It was originally released for 2.4, but somebody else had updated it for 3.0 or newer. And, while it suffered from similar problems to other old skins that I could work around, I had a desire to make it consistent with 5.x themes. I had checked the forums, and there was one post of somebody who was working on updating their theme which had been based on this to fit 5.x. Though looking at their site, I wouldn't have know it was emerald .... and, there were any details on what he had done to making 5.x...or not sure if it was the issues that I was having.
So, I looked around at other 3 column themes to try. Soon, I decided that I would just use 'evopress', an included theme...and make customizations to it. So I copied it into a different directory and changed
_skin.class.php appropriately. And, then made some code changes, namely to
Now its late, and I have road trip to UNMC tomorrow....
Well, I got the upgrade from b2evolution 4.1.7 to 5.0.5 done today. There had been a few failed starts over the previous few weekends.
I had a plan on how I was going to do it, which was aided the 3 way diffs between my site, the b2evolution-4.1.7 code and the b2evolution-5.0.5 code. Later I did a diff of just my site and the b2evolution-4.1.7 code.
Since it was easier to spot what I had done this way, since pretty much everything in the 5.0.5 side was changed... making it hard for the tool to show where my site differs from the 4.1.7 code.
I did that there was some cruft from previous updates or files that weren't part of the diffs. Perhaps diffs only contained files that had changed between point releases, and omitted files that were new. Or diffs and releases were different on how they handled reorgs. Hmm....
Anyways...in the end it was find what customizations I had done, and apply those changes to the 5.0.5 code. Though I later found that there is now a place in the 5.0.5 code to insert custom data instead of editing the _html_header.inc.php and _body_footer.inc.php. Wonder if I'll go back and try that. Currently, that only affects one skin. The other skins I use, I made copies of so I'll may need to see if they need to be brought up to 5.x. One of the custom skins is based on one that comes with b2evolution, but I've changed it so heavily that it was kind of painful patching it as part of every upgrade....until I went with making it separate. Don't know why I didn't do that with all of them. Though the other skin I may or may not need to update is not one that comes with b2evolution, so it may or may not have been updated for 5.x. Especially, since the current is for 3.x.
Kind of frustrating thing with b2evolution....the lack of current 3rdparty skins and plugins for it.
Well, not yet...though I've been having trouble seeing more and more the last few weeks. And, there was a evidentally measurable change in the right eye.
But, today's exam was to see if I might have Glaucoma. Had gotten a retina imaging test back in February, did another today and things look the same, so probably not Glaucoma. Though today my field of vision testing was bad all over, like a more general loss of vision.
Since the shape of my cornea has changed over the last 3 exams, he suspects its changed again though the test wasn't during the initial testing of the appointment. It got done before I left.
Gonna try some eye drops for 2-3 weeks to see if things get better, worse or not.
On my FreeBSD system, my
apache webserver would get angry whenever I update the
php & extensions ports. Requiring a bunch of other operations after the '
Since I've been playing around with CFEngine 3, I had started to add to my "
bundle agent apache", to do more than just promise config files current, process running, reloads, etc.
So, one of the first problems I had run into on FreeBSD, is that there are certain extensions that need to be in order in '
/usr/local/etc/php/extension.ini'. Which is solved by using
Well, fortunately when this script is run it results in a backup file of '
extensions.ini.old' which is the same age or newer than '
CFEngine3 can take care of it this way:
g.apache is "
apache22" currently on FreeBSD, and "
apache2" on Ubuntu. Someday it might become "
apache24" on FreeBSD.
Since I did FreeBSD first, and I'm still working on getting my one of 4 (or less) Ubuntu rolled in, I have:
g.rc_d as "
g.lrc_d as "
/usr/local/etc/rc.d" for FreeBSD. They are both set to "
/etc/init.d" for Ubuntu. I also have a
g.init_d for Ubuntu, but not FreeBSD. Not sure which I'll use where....I suppose if its an OS specific case,
g.init_d would get used and if its not...then which ever one is the correct one for FreeBSD will get used.
Pages: 1· 2
A couple months ago I asked if mosh could be made to work if the mosh-server IP changes when roaming between networks.
Years ago, I used to have routers that did 'loopback', but haven't had ones capable of it for sometime...or so I thought. Though I hadn't really had a major need for it. Except perhaps for
mosh, MObile SHell, is an ssh replacement that supports roaming and intermittent connectivity. Since I do my IRC using irssi in screen, running all the time on a server at home. This makes staying connected to IRC on my laptop much nicer. I can close my laptop, and later open it and it'll still be connected to my screen session.
The problem was when I came home, I'd be unable to recover the connection correctly and the client goes into an unrecoverable state, so that even if I later use my laptop on an outside network the mosh session won't resume.
But, today I opened my laptop (and I just realized that I didn't do what I had intended to do) and I just minimized the window out the of the way...even though it probably wouldn't recover on Monday at work. But, the dock icon showed that something wanted my attention....probably mosh-client giving up? No. Well, my nick had come up a couple of times yesterday, but it shouldn't have known that....but not really thinking, I switch to the channel. And, it does. I switch around and its working. Wait...it shouldn't be though!
So what changed? I do a tcpdump and see that it is connecting to my WAN IP and getting responses from my WAN IP....'loopback' never worked for me though....
Perhaps its 'loopback' of port forwards that has never worked....
I had moved irssi from box to dbox a while back. The router has two port forwards set related to this to box, a single port TCP forward and port range UDP forward.
But, because my other router is running stock firmware, it has a limited number of port forwards...so as I was migrating services to cbox (and using nginx to reverse proxy web services on other systems on my home network, where those that use a webserver are using apache, including local services...such as cacti on cbox and nagios on dbox), I decided that I would just make cbox the DMZ host...start running host based firewalls at home, especially on this host (it also uses an IP alias...kind of like how we do hosts behind the BigIP at work )
So that means no port forward(s) for my dd-wrt router for WAN to dbox....so I guess the NAT allows 'loopback'ng in this case.
Wonder if the same applies to my other router.
The only problem this causes is that I had plans to replace routers. I actually have a new router to replace the current stock router....though I haven't got anything that really needs to speed upgrade to 802.11ac yet in the room where I using wireless bridging. I also had plans to replace my dd-wrt router, which had started getting unreliable which they seem to do after a while....though it seems to have helped after I deleted old traffic data....
There are Unix servers at work that have uptimes in the >1000 days, there are even servers with updates in the >2000 days, in fact there are servers that have now exceeded 2500 days (I'm looking at one with 2562+ days.)
On one hand there are SAs that see this as a badge of honor or something to have had a system stay up this long. OTOH, its a system of great dread.
A while back this system was having problems....its Solaris and somebody had filled up /tmp....fortunately, I was able to clean things up and recover before another SA resorted to hard rebooting it.
The problem with these long running servers, especially in a ever changing, multi-admin shop, is that you can't be sure that the system will come back up correctly after a reboot.
We've lost a few systems at work due to a reboot. Some significant ones as simple as replacing a root disks under vxvm and forgetting to update the sun partition table, or a zpool upgrade and forgetting to reinstall the boot. To more significant ones, where a former SA had temporarily changed the purpose of an existing system all by command line and running out of /tmp...so that after its been up for 3+ years and he's been gone over a year....patching and rebooting makes it disappear.... the hardware that the system was supposed to be on needed repair, but he had never gotten around to it.
It'll be interesting to see what happens should the system ever get rebooted.
So, what brought this post one?
So, what started as take a week to set up a new nagios server at work ended up taking almost a month...because there were many days where I'd only have an hour or less to put some time into the side task. The other stumbling block was I had decided that the new nagios server configuration files would get managed under subversion, instead of RCS as it had been done in the previous two incarnations. New SA's don't seem to understand RCS and that the file is read-only for a reason...and its not to make them use
:w! ... which lately has resulted in a the sudden reappearance of monitors of systems that had been shutdown long ago.
Though now that I think of it, there used to be the documented procedure for editing zone files (back when it was done directly on the master nameserver and version controlled by RCS.) Which as I recall was to perform an
rcsdiff, and then use the appropriate workflow to edit the zone file.
% rcsdiff zonefile if differences % rcs -l zonefile % ci -l zonefile make rude comment that somebody made edits % vi zonefile % ci -u zonefile else % co -l zonefile % vi zonefile % ci -u zonefile fi
But, when I took over managing DNS servers, I switched to having cfengine manage them and the zone files now live under
masterfiles, so version control is now done using subversion. Had started butchering the DNS section in the wiki, probably should see about writing something up on all the not so simple things I've done to DNS since taking it over...like split, stealth, sed processing of master zone for different views, DNSSEC, the incomplete work to allow outside secondary to take over as master should we ever get a DR site, and other gotchas, like consistent naming of slave zone files now that they are binary.
Additionally work on the nagios at work was hampered by the fact that for Solaris and legacy provisioning is CF2, and the new chef based provisioning is still a work in progress...where I haven't had time to get into any of it yet. So, I had to recreate my CF3 promises for nagios in CF2.
But Friday before last weekend it finally reached the point where it was ready to go live. Though I've been rolling in other wishlist items and smashing bugs in its configuration, and still need to decide what the actual procedure will be for delegating sections of nagios to other groups.
One of the things I had done with new nagios at work, was set up PNP4Nagios...as I had done at home. And, while looking to see if I needed to apply performance tweaks to the work nagios, all the pointers were to have mrtg or cacti collect and plot data from nagiostats. Well, a new work cacti is probably not going to happen anytime soon, and the old cacti(s) are struggling to monitor what they have now (I spent some time a while back trying to tune one them...but its probably partly being hampered by the fact that its mysql can use double the memory that is allocated to the VM. though reducing it from running 2 spine's of 200 threads each...on the 2 CPU VM to a single spine with fewer threads has helped. Something like the boost plugin would probably help in this case, but the version of cacti is pre-PIA. But, it could be a long time before it get's replaced (not sure if upgrade is possible....) Our old cacti is running on a Dell poweredge server that has been out of service over 6 years... with the cacti instance over 8 years old (Jul 8, 2005)....and the OS is RHEL3.
Anyways, it occurs to me that there should be a way to get PNP4Nagios to generate the graphs, and I search around and find check_nagiostats. Though no template for it. Oh, there's a template nagiostats.php, if I create a link for check_nagiostats.php it should get me 'better' graphs. Which is what I have CF2 do at work.
Latest Poopli Updaters -- http://lkc.me/poop
|<< <||> >>|
«doctor who» «tivo hd» «tivo premiere» virtualbox dvd mdadm «instant streaming» «windows xp» progressive orac raid1 10.04lts dsl prescription tivo b2evolution «windows 7» usb lhaven ups «chicago tardis» «hd movie» netflix woot ubuntu boinc amazon.com freebsd voip cfengine3 upgrade box «air purifier» «amazon prime» «powersource 400» ebay tardis cpap twitter replaytv tv «sans digital» eyeglasses linux appletv zen backuppc raid «watch instantly» cox