Snapback2 How-To

Not too long ago I created my own automated backup script. Shortly afterwards, helpful people sent me links to other, more robust scripts that have been written. One of those was called Snapback2.

Snapback2 is a backup script based on rsync and hard-links. I could explain what that means, but why reinvent an already well invented wheel? Again?

My original script was alright, but it worked by making an exact copy of everything each time it ran. For my 4 GB home directory, backing it up weekly over the course of a month would result in a backup directory that is 16-20 GB in size! That’s a lot of wasted space, especially when some files don’t change at all.

Snapback2 uses hard links and only stores the changes from one backup to the next, which means that if I changed only 30 MB worth of files, the next backup will take up about 30 MB as well. If no changes were made, then practically no space is wasted at all. Clearly this method is superior to what I had written.
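To make the hard-link idea concrete, here is a small sketch that builds two "snapshots" in a scratch directory. It is not Snapback2's code, and the directory and file names are made up for the illustration; Snapback2 relies on rsync and the GNU tools for the real thing. The loop only mimics the effect: an unchanged file is hard-linked to the previous snapshot, and a changed file is copied.

```shell
#!/bin/sh
# Illustration only: hard-link snapshots in a scratch directory.
# This is NOT Snapback2's code; all names below are made up.
set -eu

work=$(mktemp -d)
mkdir "$work/src" "$work/snap.old" "$work/snap.new"

# Two files to back up.
printf 'never changes\n' > "$work/src/a.txt"
printf 'version 1\n'     > "$work/src/b.txt"

# First backup: nothing to link against yet, so copy everything.
cp "$work/src/a.txt" "$work/src/b.txt" "$work/snap.old/"

# One file changes before the next backup.
printf 'version 2\n' > "$work/src/b.txt"

# Second backup: hard-link files that match the previous snapshot,
# and copy only the ones that differ.
for f in "$work"/src/*; do
    name=$(basename "$f")
    if cmp -s "$f" "$work/snap.old/$name"; then
        ln "$work/snap.old/$name" "$work/snap.new/$name"
    else
        cp "$f" "$work/snap.new/$name"
    fi
done

# Matching inode numbers in the first column mean shared storage.
ls -li "$work/snap.old" "$work/snap.new"
```

In the listing, a.txt should show the same inode number in both snapshots, so it is stored only once, while b.txt gets a separate inode in each. Both snapshots still look like complete copies when you browse them.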

Setting up Snapback2 is supposed to be very simple, but I found that the documentation assumes you know what you’re doing. The following is my Snapback2 How-To:

You can download Snapback2 at http://search.cpan.org/~mikeh/Snapback2-0.5/ for the latest version as of this writing. Technically you should be able to use Perl to download it from CPAN, but I didn’t. Most of the prerequisites should be on your Linux-based system already. According to the documentation, you’ll need:

GNU toolset, including cp, rm, and mv
rsync 2.5.7 or higher
ssh
Perl 5.8 or higher
Perl module Config::ApacheFormat

On my Debian Sarge system, I have rsync 2.6.4, so your distribution will likely have at least 2.5.7. Similarly, I have Perl 5.8.4. The one thing that you need to do is get and install Config::ApacheFormat. To do so, make sure you have root privileges and run:

# perl -MCPAN -e 'install Config::ApacheFormat'

If it is the first time you’ve used CPAN through Perl, you will be prompted to configure it. If you aren’t sure, you can simply cancel the configuration step and it apparently grabs some defaults just fine. Any and all dependencies will also be installed.
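If you want a quick way to see where you stand, something like the following sketch reports which of the listed prerequisites are on your PATH and whether Perl can load Config::ApacheFormat. It does not check version numbers, and the variable name module_status is just something I picked for the example.

```shell
#!/bin/sh
# Sketch: report which prerequisites are present on this machine.
# Version numbers are not checked here.
for tool in cp rm mv rsync ssh perl; do
    if command -v "$tool" >/dev/null 2>&1; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done

# "perl -MSome::Module -e 1" exits non-zero when the module cannot be loaded.
if perl -MConfig::ApacheFormat -e 1 >/dev/null 2>&1; then
    module_status=installed
else
    module_status=missing
fi
echo "Config::ApacheFormat: $module_status"
```

For the versions, rsync --version and perl -v print what you have installed.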

Once you have all of the prerequisites, you can install Snapback2. Again, you could probably do the same thing above to grab it from CPAN, and it will probably grab Config::ApacheFormat for you, but as I didn’t do that, I won’t cover it here.

If you grabbed the tar.gz file from the link I provided above, you should run the following:

# tar xzf Snapback2-0.5.tar.gz

It will create a directory called Snapback2-0.5. The README tells you what to do, but for completeness, here are the next steps:

# cd Snapback2-0.5
# perl Makefile.PL
# make
# make test
# make install

Snapback2 should now be installed on your system. If it isn’t, you should double-check that you have all the prerequisites. The fourth line in the previous list runs tests before installing. If something failed, you should know why from the test results. Even if you did install it successfully, it isn’t going to do anything yet. You now need to make a configuration file.

You can read the documentation on setting up the configuration file in the man page for snapback2, but you can also view it online.

Here is my file, snapback.conf:

Hourlies 4
Dailies 7
Weeklies 4
Monthlies 12
AutoTime Yes

AdminEmail gberardi

LogFile /var/log/snapback.log
ChargeFile /var/log/snapback.charges

Exclude core.*

SnapbackRoot /etc/snapback

DestinationList /home/gberardi/LauraGB

<Backup 192.168.2.41>
Directory /home
</Backup>

I didn’t change it much from what was in Snapback2-0.5/examples. I installed Snapback2 on the machine called MariaGB. MariaGB will connect to 192.168.2.41, which is the IP address of my main machine, LauraGB. This is why the DestinationList refers to LauraGB: the backups of that machine are stored in a directory named after it. If I wanted to back up another system, say BobGB, I would keep those backups separate in their own directory called BobGB.

Normally, the ssh/rsync request would ask for a password. Once I set up the backups to run automatically, that won’t be useful to me, because I won’t be present to log in. You can do the following to create a secure public/private key pair:

$ ssh-keygen -t rsa

The above line will create keys based on RSA encryption, although you could alternatively use DSA. You will be prompted for a passphrase, which is optional. Still, a good passphrase is much better than no phrase at all. Keep in mind, though, that a key protected by a passphrase will still prompt for that passphrase unless something like ssh-agent is holding the unlocked key, which matters for unattended backups. Using the defaults, you should now have two files in your .ssh directory: id_rsa and id_rsa.pub. The first one is your private key. DO NOT give it to anyone. The second one is your public key, which you could give to anyone. When setting up key-based authentication, you will append the contents of this file to the .ssh/authorized_keys file on the machine you are connecting to (LauraGB, in my case). The next time you log in, instead of being prompted for a password, you will find yourself at a prompt, ready to work. For more detailed information, read this document about using key-based authentication with SSH.
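The append step is easy to get slightly wrong (duplicate entries, permissions that are too loose), so here is a sketch of it as a small shell function. The function name add_key and the scratch paths are made up for the demonstration, and the key line is a placeholder rather than a real key. The demo works on scratch files so that nothing in your real .ssh directory is touched.

```shell
#!/bin/sh
# Hypothetical helper: add a public key to an authorized_keys file, once.
# Usage: add_key PUBLIC_KEY_FILE AUTHORIZED_KEYS_FILE
add_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -F: fixed string, -x: whole line. Skip the append if the key is already there.
    if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    # sshd is commonly configured to ignore key files with loose permissions.
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Demonstration on scratch files with a placeholder key line.
scratch=$(mktemp -d)
printf 'ssh-rsa AAAAplaceholderkey root@MariaGB\n' > "$scratch/id_rsa.pub"

add_key "$scratch/id_rsa.pub" "$scratch/dot-ssh/authorized_keys"
add_key "$scratch/id_rsa.pub" "$scratch/dot-ssh/authorized_keys"   # second call adds nothing

cat "$scratch/dot-ssh/authorized_keys"
```

On the real machine you would copy id_rsa.pub over and point the function at ~/.ssh/authorized_keys for the account you log in as. Many systems also ship ssh-copy-id, which does a similar job in one step.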

So now, if I run the following command on MariaGB:

# snapback2

It will back up any changes from LauraGB’s /home directory to MariaGB. If, however, it hasn’t been an hour since the last backup, it won’t do anything.

Still, manually running this command isn’t very useful, and while I could install a cron job to run snapback2, I will instead make sure that snapback_loop is running. It acts as a daemon, checking to see if a file gets created in /tmp/backups. Now I can create the following entry in my crontab:

# Create file for snapback_loop to run
0,30 * * * * touch /tmp/backups/snapback

So now, every 30 minutes, I create the file /tmp/backups/snapback, which snapback_loop will take as its cue to delete that file and run snapback2. Then snapback2 will make a backup if there has been enough time since the last backup was made.
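I haven’t read through snapback_loop itself, so treat the following as a sketch of the trigger-file idea rather than its actual implementation. The function name poll_once is made up, and the demo uses a scratch directory and echo in place of /tmp/backups and snapback2.

```shell
#!/bin/sh
# Hypothetical stand-in for one pass of a trigger-file loop.
# Usage: poll_once TRIGGER_FILE COMMAND [ARGS...]
poll_once() {
    trigger=$1
    shift
    if [ -e "$trigger" ]; then
        rm -f "$trigger"   # consume the cue so the job runs once per touch
        "$@"               # in real use this would be snapback2
    fi
}

demo_dir=$(mktemp -d)
touch "$demo_dir/snapback"      # this is what the cron entry does

first=$(poll_once "$demo_dir/snapback" echo "running backup")
second=$(poll_once "$demo_dir/snapback" echo "running backup")

echo "first pass:  ${first:-nothing to do}"
echo "second pass: ${second:-nothing to do}"
```

The first pass finds the file, removes it, and runs the command; the second pass finds nothing and does nothing. A daemon built on this idea would call the function in an endless loop with a sleep between passes.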

Now, I have automated backups that run regularly. Some caveats:

  • Verify that snapback2 is in /usr/local/bin. On my system, snapback2 would run manually, but snapback_loop would output errors to /tmp/backups/errors that weren’t too clear. I had to create a symlink to /usr/bin/snapback2 in /usr/local/bin in order to get it to run.
  • Make sure snapback_loop is running with root privileges. It has to call snapback2, which will need access to files in /var/log and other directories which will have restricted access. If you run it as a regular user, you may get errors. You could also change the location of the log file, but /var/log is a standard spot to keep such output.
  • Because you are running it with root privileges, you’ll need to make sure that root’s public key, rather than your user account’s, is the one in authorized_keys on the other machine. Otherwise, you’ll get errors like “permission denied” when rsync tries to connect to the other machine.

If you don’t use a second computer, you can always use a second hard drive instead. Either way, you now have an effortless system for automating your backups!

11 comments to Snapback2 How-To

  • If you have a second, I have a quick question for you. I am using a single machine with two drives. I ask myself, “Why use the network?” Let’s assume I’ve already evaluated the risks of backing up to the second drive rather than a remote location or another box. These are my config attempts to get Snapback2 to go from one drive to the next without using ssh across the network and without requiring authentication.

    <backup 127.0.0.1>
    Directory /myTest/
    </backup>

    <backup mainDir>
    Directory /myTest/
    </backup>

    <backup /mainDir/>
    Directory /myTest/
    </backup>

    <backup computerName>
    Directory /myTest/
    </backup>

    Nothing seems to go only local. I’m looking for the magic syntax and have googled the heck out of any related keywords. I came across your tutorial and hope you might have this knowledge. Thanks for any clues.

    Scott

  • Hello, Scott! I think the problem is that you don’t tell Snapback2 how often you want backups, nor do you tell it the destination directories.
    You probably need something like the following:

    Hourlies 4
    Dailies 7
    Weeklies 4
    Monthlies 12
    AutoTime Yes

    SnapbackRoot /etc/snapback

    DestinationList /mnt/Backup/Directory/on/second/drive

    Then again, maybe I misunderstood and you already have that? If so, then “backup localhost” or “backup 127.0.0.1” would probably be fine. I don’t believe it is case sensitive, so backup and Backup should be interchangeable. Also, do you get errors? What do you mean that nothing seems to go only local?

    I am actually considering making a redundant solution by using the second drive in my main machine as well as the drive on the second machine. I may post a follow-up once I accomplish that.

  • GOT IT!

    That wasn’t the issue, but not your fault. I didn’t post my whole file. The Snapback2 documentation doesn’t make this clear, but you need to tell rsync that you don’t want it to use ssh and identify a server. You want rsync to just look to a directory for the source. Add this line to your config file to accomplish this:

    <code>
    ## set to not use ssh since second drive in same box
    ##————————————————

    RsyncShell none
    </code>

    Pass the word on this because I had to dig it out of the Snapback2.pm file to find the ‘if’ statement where this process was decided. So, either I am too new to have assumed something I should have or they have a hole in their documentation. Feel free to let me know which. For reference, here is my whole config file:

    <code>

    ## set how many of each increment to save
    ##————————————————

    Hourlies 4
    Dailies 7
    Weeklies 4
    Monthlies 12

    ## set … not sure what this does
    ##————————————————

    AutoTime Yes

    ## set the admin email to send confirmation to
    ##————————————————

    #AdminEmail office@mine.com

    ## set the path to the log files
    ##————————————————

    LogFile /Users/me/Desktop/snapback.log
    ChargeFile /Users/me/Desktop/snapback.charges

    ## set to not use ssh since second drive in same box
    ##————————————————

    RsyncShell none

    ## set the directories to backup to.
    # this will alternate backups between two directories/drives/locations
    ##————————————————

    DestinationList /Volumes/BACKUP/Backup1 /Volumes/BACKUP/Backup2

    ## set the path to the GNU cp, rm and mv commands
    ##————————————————

    Cp /opt/local/var/db/dports/software/coreutils/5.2.1_3/bin/gcp
    Rm /opt/local/var/db/dports/software/coreutils/5.2.1_3/bin/grm
    Mv /opt/local/var/db/dports/software/coreutils/5.2.1_3/bin/gmv

    ## set the backups
    ##————————————————

    <backup MyDocs>
    Directory /MyDocuments/
    </backup>
    </code>

    Notes:
    - I am pretty sure that ‘MyDocs’ can be whatever you want the backup directory named.
    - I am running Snapback2 on Mac OS X (10.3.8, 10.4, 10.x), so I had to install the GNU cp command for Snapback2 to use. I first installed DarwinPorts 1.0 (http://www.darwinports.org/). After that was installed, I ported over the GNU Core Utilities (http://darwinports.opendarwin.org/darwinports/dports/sysutils/coreutils/Portfile). After all that was in place and working, I had to include the path to these tools in my config file for Snapback2. Note the lines beginning with Mv, Cp, and Rm that end in the gcp, gmv, and grm tool names. The installed tools are given the ‘g’ prefix by DarwinPorts.

    I hope this helps you and others. I just stumbled across your site while googling the crap out of my topic. Thanks again.

    Scott

  • Hi, great documentation.
    I was wondering, is it possible to change the layout of the backup directory from:
    /backup/dir/$TIME
    to:
    /backup/$TIME/dir
    ?

  • You know, I am not sure. I want to say that it isn’t that configurable, but I could be completely wrong. You might want to ask the author of Snapback2. I’ve found him to be generally responsive.

  • ak

    I’ve made a modification that does that:

    /backup/dir/$TIME
    to:
    /backup/$TIME/dir

    and some other additions… is it still needed?

  • AK, is what still needed? And are you saying that you got the different directory layout to work just fine?

  • ak

    I asked if the different directory layout is needed. The modified script is now being tested, and it has been OK for about one week (we hope to use it for backing up developers’ environments in our company).

  • Marcelo

    Hi,

    I’m trying to set up the script to back up a remote host that uses SSH port 2200 and not the default port 22.

    How can I change that port to a non default one?

    Thanks.

  • Marcelo, I’m not sure as I haven’t been using the script in a long time. Perhaps the original developer might be of some help?

  • Laurens

    This is still one of the better backup scripts! It’s fully automatic, incremental, and, once set up, totally maintenance-free!
    The one thing which would be really great, and which user ‘ak’ seems to have implemented, would be the other directory structure /$TIME/$DIR. Any way at all to get back to him and ask how he did that?

    Cheers
