Draft plan: online backup with a local buffer

Photo by godog – found on Flickr tagged “backup” :)

These are just some draft plans to use a lightweight local backup device to support off-site backups for my mom and dad’s MS Windows machines. I think it solves some typical problems with “online backup” (limited bandwidth), which is otherwise very cost-effective if you only have a few GBs to store (like mom and dad…).

But let’s start with some background blabla. Eons ago I said I was going to review Jungle Disk (a backup utility that stores your data at Amazon Simple Storage Service (S3)). It didn’t happen, but not because I forgot. I have been keeping a draft around, but I felt I hadn’t actually spent enough time with the tool to make any fair comments about it. By now Jungle Disk have moved on to version 2.0 of their software – so my notes have passed their expiry date :)

Learning the hard way

Let me summarise what happened: I installed the Jungle Disk utility on my mom’s Windows machine, then when I visited about a month later I looked at the results: much to my surprise, not a single backup session had finished successfully. And my mom uses her laptop every day.

It wasn’t Jungle Disk’s fault – in fact, Jungle Disk had done just what should be expected from it: it wrote a log entry for every backup that was incomplete. Looking through the logs, I found three problems:

  • Despite the use of an incremental backup scheme, daily backup volumes were very large. This was mainly due to the email-client data. The client stores all email in a single (huge) file, which thus changes every time you receive email. Simple incremental schemes just send all changed files, and so effectively your whole email archive ends up in the daily increments.
  • Combining the large backup volumes with a “slow” 256kbit/s ADSL upload rate was even more problematic. My mom didn’t have her laptop online long enough per day for the backups to finish.
  • The version of Jungle Disk used didn’t support resuming backup sessions after the laptop had been in stand-by mode. Thus, the unfinished backups were really just lost efforts. According to the release notes, the present version is capable of resuming backup sessions – a welcome improvement.
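To put a number on the second problem, here is a back-of-the-envelope sketch. The 256 kbit/s upload rate is the one mentioned above; the 1 GB mail file is purely an assumed figure for illustration (I never measured the real one), and protocol overhead is ignored.

```python
def upload_hours(size_bytes: float, uplink_bits_per_s: float) -> float:
    """Hours needed to push size_bytes through the uplink, ignoring protocol overhead."""
    return size_bytes * 8 / uplink_bits_per_s / 3600


UPLINK_BITS_PER_S = 256_000      # 256 kbit/s ADSL upload rate, as mentioned above
MAILBOX_BYTES = 1_000_000_000    # hypothetical 1 GB mail file (assumed, not measured)

hours = upload_hours(MAILBOX_BYTES, UPLINK_BITS_PER_S)
print(f"{hours:.1f} hours")      # time the laptop would have to stay online
```

Under those assumptions a single daily increment takes longer than a typical working day of laptop use, which fits with sessions that never finish.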

Essentially, I learned then that there’s no such thing as unattended backup. Don’t pay attention for a few days and you might just find a long list of failed sessions in the logs. So now I’m looking for a backup solution that provides the following:

  • Fast backups. If, say, you have to leave and catch a train, you want to be able to hit “backup” quickly without having to wait for the data to drip through the ADSL bottleneck.
  • Off-site backups. That’s what I thought was cool about Amazon S3. It used to be a feature that wasn’t for home users, but at Amazon S3 prices it’s suddenly become affordable (unless you have a serious amount of data to store).
  • Sending only the changed portions of files (a la rdiff/rsync). In the end, sending data costs time and money, so let’s minimise it. Alternatively I could find an email client that supports storing every email in a file of its own, but that’s a user-bullying alternative…
  • Freedom: as I mentioned before, I’d prefer to use e.g. duplicity, but I was worried about the surprises I might have when running it on MS Windows.
  • Remote manageability: having let go of the illusion of unattended backup now, I would like to be able to check on the backup system and perhaps do some remote troubleshooting.
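The “changed portions only” idea can be sketched in a few lines. This is a deliberately simplified illustration, not how rdiff or rsync actually work: they use rolling checksums so that they can also cope with data inserted in the middle of a file, whereas this sketch only compares blocks at fixed offsets. The block size and the function names are my own assumptions.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for illustration


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Indices of blocks in new that differ from, or do not exist in, old."""
    old_h = block_hashes(old, block_size)
    new_h = block_hashes(new, block_size)
    return [i for i, h in enumerate(new_h) if i >= len(old_h) or old_h[i] != h]


# A pretend mail file of 100 blocks that gets one new message appended.
old_mailbox = b"A" * (BLOCK_SIZE * 100)
new_mailbox = old_mailbox + b"new message"

to_send = changed_blocks(old_mailbox, new_mailbox)
print(to_send)  # only these block indices would need to be uploaded
```

For a mail file that mostly grows by appending, even this naive scheme would send a single block instead of the whole archive. Anything that shifts the existing data defeats it, and that is the case the real tools are designed to handle.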

Up for take two

The solution (I hope) is in a storage device named NSLU2. At about 55 GBP or 65 euros it’s inexpensive (it has to be, I won’t spend several hundred euros on a serious NAS just to back up 10 GB or so of data) and it’s good (unlike the typical cheap dodgy NASes). And it runs Debian.

So the little NSLU2 with an old and smallish hard disk will sit next to the home router, always ready for the Windows machines to connect. Backing up data over the local network is fast (I suppose we’ll just use Samba file sharing), and subsequent off-site backups no longer involve the Windows machines – so it’s fine if they take all day. Power consumption should be minimal.

I suppose it shouldn’t be a problem to use duplicity on the little Debian box. However, I’m not aware of its hardware requirements (maybe the encryption part will take forever with the XScale CPU?). Duplicity conveniently supports the Amazon S3 API, uses a standard and open file format, and does the rdiff magic, too.

I’m still asking myself if this is “overkill” for the task at hand, but actually I can hardly wait to order an NSLU2… wait, did you just point out it can also serve as a print server? Of course!… Maybe the decision has just been made…


8 Responses to “Draft plan: online backup with a local buffer”

  1. 1 Solomon 26 July 2008 at 4:53

    I like the picture of the soft papers. Hey, cover your anatomy!

    By the way, there is a very informative site for online backup:


  2. 2 yungchin 26 July 2008 at 9:22

    My anatomy?
    Thanks for the link, looks like that’s going to be a useful resource once they have a sufficient body of reviews.

  3. 3 wzzrd 28 July 2008 at 15:04

    I heard a(nother) buddy talk about giving his parents a big fat hard drive so he could use it to do offsite backups on. Hehe.

  4. 4 Jason Patterson 29 July 2008 at 4:56

    Block level backups can be used to overcome the problem of large, frequently changing email files. Basically, the file is cut up into small blocks, and only the ones that have changed get sent up to the server.

    I’m not sure if Jungle Disk supports this functionality, but I use Angel Backup – http://www.angelbackup.com – as my online backup service, and they have no problem with large Outlook files.

  5. 5 yungchin 29 July 2008 at 10:59

    wzzrd: :P Not a bad plan. I see an opportunity to learn more about OpenVPN…

    Jason: thanks for the link, I didn’t know about Angel Backup. But then you also get that kind of thing with rdiff-backup or duplicity – both of which tick the “freedom” box….

  6. 6 yungchin 10 February 2009 at 13:42

    It seems quite some search traffic is going through here looking for “duplicity on windows” – I just found a more interesting destination for y’all: Using duplicity on windows under cygwin.
