Virtual Machines – taking the pain out of major upgrades

If your computers are physical machines, where each piece of hardware runs a single OS image, then upgrading that OS image puts your services at risk or makes them unavailable for a period of time.

Sure, you have a development and test environment where you can prove the process, but those machines cost money.  So processes develop either to ensure you have a good backout plan, or to restrict you to changes you know will work.

Virtual Machines have changed the game.  I have a couple of Linux (Debian) based VMs.  They’re piddly little things that run some websites and a news server.  They’re basically vanity VMs; I don’t need them.  I could get away with shared hosting, but I like having servers I can play with.  It keeps my UNIX skills sharp, and lets me learn new skills.

Debian have just released v6 (Squeeze).  Debian’s release schedule is slow but very controlled, and it leads, hopefully, to very stable servers.  Rather than constantly updating packages as some other Linux distributions do, Debian restricts updates to security patches only, and then every few years a new major release is made.

This is excellent, but it does introduce a lot of change in one go when you move from one release of Debian to the next.  A lot of new features arrive, configuration files change in significant ways and you have to be careful with the upgrade process as a result.

For matrix (the VM that runs my news server), I took the plunge and ran through the upgrade.  It mostly worked fine, although services were out for a couple of hours.  I had to recompile some additional stuff not included in Debian, and had to learn a little bit about new options and features in some applications.  Because the service is down, you’re doing that kind of thing in a reasonably pressured environment.  But in the end, the upgrade was a success.
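I won’t pretend to remember every command I ran on matrix, but the core of a Lenny-to-Squeeze upgrade is roughly the standard Debian procedure – point apt at the new release and dist-upgrade.  A sketch (package sources simplified, run as root):

# Point apt at Squeeze instead of Lenny in the package sources
sed -i 's/lenny/squeeze/g' /etc/apt/sources.list

# Refresh the package lists, do the minimal upgrade first (as the release
# notes suggest), then the full dist-upgrade that pulls in the new kernel,
# libraries and config file prompts
apt-get update
apt-get upgrade
apt-get dist-upgrade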

However, the tidy neat freak inside me knows that spread across that server are config files missing new default options, old copies of config files lying around that I need to clean up, and legacy stuff that is supported but deprecated, sitting there just waiting to bite me in obscure ways later on.
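At least Debian makes that debris easy to find: dpkg and ucf leave the superseded copies behind with predictable suffixes, so something like this (illustrative rather than exhaustive) lists most of the clutter:

# List leftover config file copies created during the upgrade
find /etc -name '*.dpkg-old' -o -name '*.dpkg-dist' -o -name '*.ucf-old' -o -name '*.ucf-dist'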

So I decided to take a different approach with yoda (the server that runs most of the websites).  I don’t need any additional hardware to run another server; it’s a VM.  Gandi can provision one in about 8 minutes.  So, I ordered a new clean Debian 6 VM.  I set about installing the packages I needed, and making the config changes to support my web sites.
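The package names below are illustrative rather than my exact list, but on a fresh Debian VM the bulk of that work is a handful of apt-get calls followed by dropping the site configs into place:

# Illustrative only - the real list depends on what the sites need
apt-get update
apt-get install apache2 libapache2-mod-php5 php5 mysql-server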

All told, that took about 4 hours.  That’s still less time and effort than an in-place upgrade would have required.

I structure the data on the web server in such a way that it’s easy to migrate (after lessons learned moving from Gradwell to 1and1 and then finally to Gandi), so I can migrate an entire website from one server to another in about 5 minutes, plus the time it takes for the DNS changes to propagate.
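The migration itself is nothing clever.  The paths and names below are made up for the example, but the shape of it is: copy the document root across, move the database if the site has one, then repoint DNS:

# On the old server: copy one site's files to the new one (paths illustrative)
rsync -a /var/www/example.org/ newserver:/var/www/example.org/

# If the site has a database behind it, dump and reload it (name illustrative)
mysqldump exampledb | ssh newserver 'mysql exampledb'

# Then update the DNS record to point at the new server and wait for it to propagate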

Now I have a nice clean server, running a fresh copy of Debian Squeeze without any of the confusion or trouble that can come from upgrades.  I can migrate services across at my leisure, in a controlled way, and learn anything I need to about new features as I go (for example, I’ve switched away from Apache’s worker MPM and back to the prefork MPM).
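Incidentally, on Debian the MPM is chosen by which Apache package you install rather than by a config directive, so (assuming I’ve remembered the package name correctly) that switch is a one-liner – installing one MPM package removes the other:

# Swap Apache 2.2 over to the prefork MPM
apt-get install apache2-mpm-prefork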

Once the migration is done, I can shut down the old VM.  I only pay Gandi for the hours or days that I have the extra VM running.  There’s no risk to the services, if they fail on the new server I can just revert to providing them from the old.

Virtual Machines mean I don’t have to do upgrades in place, but equally I don’t have to have a lot of hardware assets knocking around just to support infrequent upgrades like this.

There are issues of course.  One of the reasons I didn’t do this with matrix is that it has a lot of data present and no trivial way to migrate it.  Additionally, other servers using matrix are likely to have cached IP details beyond what’s in DNS, which makes it harder to move to a new image.  But overall, I think the flexibility of VMs certainly brings another aspect to major upgrades.

Cygwin and rsync and all things nice

I wrote a little while ago that I was running Linux (Ubuntu in this case) inside a VirtualBox virtual machine, and it was all good.  Before that I’d played with lots of methods of getting my favourite unix utilities (like rsync) working under Windows.  I’ve used Cygwin, pre-compiled Windows versions, stripped-down Cygwin builds, second machines running Linux, and VMs.

One of the main drivers for getting those things working is to back up my websites, held on my hosting account.  I can ssh into my hosting account, and that means if I can get rsync going locally, I can use it with ssh to copy all changes to my local machine.  It’s efficient (rsync only copies changes) and it’s easy.  The pain is always finding a decent, compatible version of rsync.

Anyway, I already said that when I started using the Linux VM I ported my backup script across to it, and with VirtualBox shared folders the backed-up websites were visible under XP.  It wasn’t pretty but it worked – the catch being that I had to start up the VM to run a backup.  At the start that wasn’t a problem because I was using it quite a bit, but as the days went on and I stopped launching it, backups became less frequent.

And then today – random disaster.  I crashed the VirtualBox VM image, and after a couple of restarts it eventually stopped booting.  This wasn’t a big problem, as I had snapshots of working images, so I just rolled back to one of those with two clicks.  Two clicks which took less time than the following thought took to get from one end of my brain to the other: ‘I made those snapshots weeks ago, and since then I’ve written a lot of scripts and downloaded a lot of files, and you’ve just erased them all, you idiot’.

So, I set about re-patching Ubuntu, restoring the various settings I’d lost, and making a few more snapshots.  But I needed a more permanent, reliable website backup solution.

Which means I’ve installed Cygwin again.  I know there are Windows binaries for rsync, and I know there are other apps which claim to do the same thing, but you can’t (in my view) beat the simplicity of Cygwin and the unix binaries.  Now I have a working cron daemon, ssh configured, rsync installed, and my little script which does all the work.  The rsync command is pretty simple:

rsync --recursive --links --safe-links --rsh=ssh --stats --human-readable me@mywebhost:/myhomedir/ /path/to/local/copy/
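One thing worth noting if you copy this: for the script to run unattended from cron, ssh needs key-based authentication to the web host, otherwise it will sit there waiting for a password.  The usual recipe (default key type and paths) is something like:

# Generate a key (accept an empty passphrase so cron can use it unattended)
ssh-keygen -t rsa

# Append the public half to the authorized keys on the web host
ssh me@mywebhost 'cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub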

Then I just tar up the resulting files, compress them, make sure the filename has a date in it, and I can be confident I’ve got copies of everything I need.  Since most of my sites rely on mysql for their data, I also run some jobs on my webhost to mysqldump all the data into files three times a week, and I then back those files up locally.  I could run mysqldump remotely over the network, but it’s a hell of a lot quicker to do the dumps on their system, compress them, and then rsync the compressed files.
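Stripped down (with made-up names for the paths and database), that step looks something like this:

# Bundle the synced copy into a dated, compressed archive
tar -czf /path/to/backups/websites-$(date +%Y%m%d).tar.gz /path/to/local/copy/

# Meanwhile, a scheduled job on the web host does roughly this, so rsync
# only ever has to pull down the small compressed dump:
mysqldump exampledb | gzip > ~/backups/exampledb.sql.gz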

Installing ssmtp lets me send mail from the Cygwin command line, so the script can send me a mail when it’s finished, and I’ll schedule it in cron to run once a week or something.  Much better.
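For the record, the scheduling is a single crontab line, and the ‘finished’ mail from the script is a one-liner through ssmtp.  The timing, script name and address below are placeholders:

# crontab entry: run the backup script at 3am every Sunday
0 3 * * 0 /home/me/bin/backup-websites.sh

# last line of the script: send the all-done mail via ssmtp
printf 'Subject: website backup finished\n\nAll synced and archived.\n' | ssmtp me@example.com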

Plus, I get all the fun of vi, grep and awk 🙂