This week I spent a little time patching my backup script for AWS. The major change is a new plugin called remove_old_snapshots.

This plugin removes the old snapshots of a specific volume. You set the volume in the conf file (VOLUME_ID).

As you may remember, that volume is the destination of the backups, so the circle is closed.

Also in the conf file, there’s a new variable, N_BACKUP, which holds the number of days you want to keep snapshots. In my case the server makes one backup every day, so the number of days and the number of backups are the same. The rule is: the cutoff date is today minus N_BACKUP days, and snapshots taken before it are deleted.
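
To make the rule concrete, here is a minimal sketch of the date arithmetic in Python. It is not the plugin's actual code: the function name, the (id, start time) pairs and the sample snapshot ids are all made up for illustration, and the real script reads its snapshot list from AWS rather than from a hard-coded list.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_delete(snapshots, n_backup, now=None):
    """Return the ids of snapshots started before (now - n_backup days).

    `snapshots` is a list of (snapshot_id, start_time) pairs; this shape
    is an assumption made for the example, not the plugin's data model.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    cutoff = now - timedelta(days=n_backup)
    # Strictly older than the cutoff: a snapshot taken exactly at the
    # cutoff instant is kept.
    return [snap_id for snap_id, started in snapshots if started < cutoff]


# Hypothetical sample data: three daily-style snapshots of one volume.
now = datetime(2011, 3, 10, tzinfo=timezone.utc)
snapshots = [
    ("snap-aaa", datetime(2011, 3, 1, tzinfo=timezone.utc)),
    ("snap-bbb", datetime(2011, 3, 5, tzinfo=timezone.utc)),
    ("snap-ccc", datetime(2011, 3, 9, tzinfo=timezone.utc)),
]

# With N_BACKUP = 7 the cutoff is 2011-03-03, so only snap-aaa is old enough.
print(snapshots_to_delete(snapshots, n_backup=7, now=now))  # ['snap-aaa']
```

Passing `now` explicitly is only there to make the example repeatable; in a daily cron job you would leave it out and let it default to the current time.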

Yesterday, a server of mine on AWS died without any apparent problem. It simply stopped responding on every port: panic!!

What happened? Why did a server that had been online since mid-2009 go down? How do I recover the data? From a backup, or something else? How long would it take?

To make it short: after three reboots from the panel nothing changed, so the only solution was to create a new server.

After detaching the IP and terminating the zombie server, I started a new server, attached the disk, updated the packages of the Linux release, remapped some paths… and the server is up and running!!!

This is possible because EBS volumes are persistent resources: by mapping persistent folders onto an EBS volume, you can start and stop any server without losing data.

In my post AWS: a simple backup suite, I spent a few words on what I mean by “persistent”. On AWS, if you use an instance-store image (AMI), the root fs is not persistent across a shutdown, so you need a secondary disk (from EBS) to store anything you want to keep.

In that previous post I used /srv as the example, but in reality I use parts of /etc, /var, /opt, and /home.

Unfortunately, you can forget some folder, or hope that a very stable system never goes down, but as this incident teaches, everything dies… servers too.
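
One way to catch a forgotten folder before the server dies is to check where each path really lives. The following is only a sketch under assumptions: it supposes the persistent folders are mapped with symlinks (or live directly) under the EBS mount point, and both the `/mnt/ebs` mount point and the folder list are hypothetical examples, not my real layout. Bind mounts would need a different check.

```python
import os


def not_on_volume(paths, mount_point):
    """Return the paths whose real location is outside mount_point.

    Symlinks are resolved first, so a folder that is a symlink into the
    volume counts as persistent. Paths that do not exist are simply
    reported as outside the volume.
    """
    root = os.path.realpath(mount_point)
    outside = []
    for path in paths:
        real = os.path.realpath(path)
        # commonpath equals root only when `real` is root itself
        # or something below it.
        if os.path.commonpath([real, root]) != root:
            outside.append(path)
    return outside


# Hypothetical folders and mount point, for illustration only.
folders = ["/etc/apache2", "/var/www", "/opt", "/home"]
print(not_on_volume(folders, "/mnt/ebs"))
```

Anything the function prints is a folder that would be lost with the instance, so it is a candidate for remapping.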

