Nuts and Bolts Amanda Lead image: Lead Image © Tommaso Lizzul, 123RF.com

Network backups with Amanda

Auntie Amanda

The free Amanda backup utility dates back to the days of tape drives, but it is still a powerful tool for centralized backup across the network. By Tim Schürmann

The Advanced Maryland Automatic Network Disk Archiver, also known as Amanda, backs up a computer across the network to a central backup server. The backups migrate to hard drives, network storage, optical media, or legacy tapes.

Amanda [1] was launched in 1991 by the University of Maryland's Department of Computer Science. Zmanda has handled development since 2007 and still hosts the Amanda forum today. Zmanda is now part of the Carbonite corporation, which has supported Amanda development since 2013. Amanda is available under the BSD license and the GPL, which means you can use it for free – even in a commercial environment. Support is available through Zmanda, along with the enhanced Amanda Enterprise edition, which offers a graphical user interface and other value-added perks. The Enterprise edition starts at $500 for the server and $300 for each client.

Backup via SSH

Amanda backs up all the systems over the network in parallel. Admins typically launch Amanda via a cron job. On request, the data transfer is encrypted through OpenSSH, which means you can even back up systems in the DMZ without having to worry about eavesdropping. Current Amanda versions also support IPv6 connections and authentication via Kerberos 5.

Amanda copes with a large number of clients and can easily adapt to changing conditions. Before the backup, Amanda can launch a test program that performs a sanity check on all participating computers in parallel. If the test finds an error, it notifies the administrator by email. On request, Amanda encrypts all data on the client or the server via GPG or another encryption program. Amanda can even compress the archives with gzip or any other compression program – either directly on the client or on the server. The backup tool relies on standard Linux tools for the backup, including the well-known tar and dump tools.

Once the data is backed up on the server, Amanda copies it to the target medium. The medium could be tape, a local hard disk, a NAS storage device, a SAN, or even a DVD. Thanks to caching on the server, Amanda can achieve a high working speed. If the backup does not fully fit on a destination medium, Amanda automatically distributes it to additional media. At the completion of the backup, Amanda sends a report to the backup administrator via email. Admins can even include their own scripts, which Amanda runs automatically before and after the backup.

Amanda determines the backup time and even dynamically adapts this time to match the network. The backup program also decides independently whether it is necessary to create a full backup or only to back up the changed files (incremental backup). Amanda always tries to optimize network bandwidth and available resources.

Prerequisites

Amanda can back up computers running Linux, Unix, or Mac OS, as well as Windows systems if you install a special client program. The Windows support extends to both 32- and 64-bit systems, but as of this writing, the developers have only tested the Windows client program up to Windows 7. The backup server must run Unix or Linux.

Amanda refers to the backup server as the "Backup Server Host." The systems you are backing up are "Backup Client Hosts." The backup server itself also can be a backup client. The backup server host must have at least one hard disk that can cache all backups that reach it. These "holding disks" are not a strict requirement, but without them, performance suffers because the server has to write the data directly to the target medium.

Amanda supports many tape systems [2] out of the box. The backup software takes care of tape management and makes sure it does not accidentally overwrite the wrong tape. Amanda also logs which file is on which medium.

Installation

In addition to support, Zmanda.com also offers free Amanda packages for different distributions.

Most distributions provide Amanda packages in their repositories. Usually, you will find one package for the client and another for the server. In the case of Ubuntu 16.04 LTS, you install the amanda-server package on the backup server and the amanda-client on the client machines. The client tools are useful on the server, so you will also want to install them on the server.

Many distributions only offer somewhat old versions of Amanda. If you want to install on Ubuntu 16.04 LTS, for example, you get Amanda 3.3.6, which was released in mid-2014. However, if you install Amanda manually, you need to take care of regular updates yourself. If you want to use the latest stable version of Amanda, you will find pre-built binary packages for selected Linux distributions at Zmanda, although these Zmanda packages are all for older distributions. For example, users of Ubuntu 16.04 LTS will come away empty handed. If you can't find a package that supports your system, you need to turn to the source code archive on SourceForge [3]. On a Linux system, unpack the archive and then install Amanda with the well-known rule of three:

./configure
make
make install

This series of commands requires Make, GCC, and the developer packages for Glib (libglib2.0-dev on Ubuntu systems).

On Windows machines, Zmanda offers a ready-to-install Amanda client, which is free of charge in the Community Edition. The client version was 3.3.6 when this issue went to press. To download the client, open the drop-down list on the Amanda download page, select version number 3.3.6, and download the matching ZIP archive at the bottom of the table. You need to unpack this archive on the Windows client and then run the setup.exe installation program. To avoid problems, the same version of Amanda should run on all computers.

Many Files and Directories

Amanda runs with a user account specially set up for the purpose. Before you can assign a backup job to Amanda, you need to know this user's name. By default, the name will be amanda and the group will be backup. (Some distributions use different names. On Debian and Ubuntu, e.g., the user is named backup.) The examples in this article use amanda as the user name.

Amanda versions before 3.3.9 have a vulnerability that lets this user execute arbitrary code with root privileges. A (manual) upgrade to version 3.3.9 is only necessary if you don't trust the Amanda user.

Amanda can be difficult to set up because of its numerous settings and configuration files. This article uses the simple example of backing up the /etc directory on the client machine client.example.com. The backup server is called server.example.com.

To launch Amanda, you need several directories on the server. First, make sure the directory /etc/amanda exists there. This directory must be owned by the Amanda user, which is the case with Ubuntu by default. If you need to create the directory, enter:

$ sudo mkdir -p /etc/amanda
$ sudo chown amanda /etc/amanda

If you want to store all the backups in the /amanda subdirectory on the server, let Amanda know by editing several configuration files, which are best grouped in a separate subdirectory below /etc/amanda. The following example uses /etc/amanda/ADMINExample.

One configuration could look after backing up the home directories, and another could handle the less frequently performed backup of /etc. The name of the subdirectory is also the name of the configuration it holds. Both the /amanda and /etc/amanda/ADMINExample directories must also belong to amanda:

$ sudo mkdir -p /amanda /etc/amanda/ADMINExample
$ sudo chown amanda /amanda /etc/amanda/ADMINExample

Amanda creates multiple files during the backup. In addition to the actual backups, you have, for example, the logs, all of which you could group directly below /amanda. However, it is better to create a separate subdirectory for each kind of data. The log data would then end up in /amanda/state/log. Because these directories also need to belong to the amanda account, you might want to run the following command directly as the amanda user:

$ sudo -u amanda mkdir -p /amanda/vtapes/slot{1,2,3,4} /amanda/holding /amanda/state/{curinfo,log,index}

The backups end up in the directories /amanda/vtapes/slot1 to /amanda/vtapes/slot4 later on; /amanda/holding is the cache, /amanda/state/log is a dump for the logfiles, and the other two directories hold data created by Amanda.
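As a quick sanity check of this layout, a small script along the following lines could confirm that every directory exists and belongs to the user running it. This is only a sketch: check_layout is a helper name made up for this example, not an Amanda tool, and the script assumes you run it as the amanda user (for example, via sudo -u amanda sh).

```shell
#!/bin/sh
# Sketch: verify the directory layout created above.
# check_layout is a hypothetical helper, not part of Amanda.
check_layout() {
    base=$1
    status=0
    for dir in vtapes/slot1 vtapes/slot2 vtapes/slot3 vtapes/slot4 \
               holding state/curinfo state/log state/index; do
        if [ ! -d "$base/$dir" ]; then
            echo "missing: $base/$dir"
            status=1
        elif [ ! -O "$base/$dir" ]; then
            # -O is true only if the current user owns the directory
            echo "not owned by current user: $base/$dir"
            status=1
        fi
    done
    return "$status"
}

# Check the layout used in this article; prints one line per problem.
check_layout /amanda && echo "layout OK"
```

The function returns a nonzero status as soon as one directory is missing or owned by someone else, so you could also call it at the top of your own backup scripts.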

Configuring the Server

Once all the directories exist, it is time to create a configuration. Listing 1 shows a simple example inspired by a tutorial in the Zmanda wiki [4]. The settings from Listing 1 belong in the /etc/amanda/ADMINExample/amanda.conf file. Amanda needs access to this file, as it does to all other configuration files; ideally, you will want to create them directly while working as the amanda user.

Listing 1: amanda.conf Example

01 org "An example"
02 infofile "/amanda/state/curinfo"
03 indexdir "/amanda/state/index"
04 logdir "/amanda/state/log"
05 dumpuser "amanda"
06 labelstr "MyData[0-9][0-9]"
07 autolabel "MyData%%" EMPTY VOLUME_ERROR
08 tpchanger "chg-disk:/amanda/vtapes"
09 tapecycle 4
10 dumpcycle 3 days
11 amrecover_changer "changer"
12 tapetype "EXAMPLE-TAPE"
13 define tapetype EXAMPLE-TAPE {
14     length 100 mbytes
15     filemark 4 kbytes
16 }
17 define dumptype simple-gnutar-tcp {
18     auth "bsdtcp"
19     program "GNUTAR"
20     compress client fast
21 }
22 holdingdisk hd1 {
23     directory "/amanda/holding"
24     use 50 mbytes
25     chunksize 1 mbyte
26 }

The first line beginning with org gives a description of the configuration. This description appears later in the subject line of any email notifications. Amanda stores its own data in infofile; the index for all the backups is stored in the indexdir directory, and all the logfiles are found in logdir. Amanda also creates the backups with the user account stated after dumpuser.

The following settings specify the location of the backups. Here is where you notice that Amanda was originally created for backups on tapes: Amanda refers to an archive containing the backup as the dumpfile, which is usually a simple TAR archive. Amanda saves the dumpfiles on volumes. A volume can be a tape, a DVD, or a subdirectory on the server.

Each volume stores a portion of the dumpfile and is given a label and a consecutive number. Based on the number, Amanda can reassemble the parts. Amanda automatically assigns the label based on the template stored in autolabel (line 7). Amanda later replaces the % characters with a consecutive number, which, in this case, has two digits. The EMPTY and VOLUME_ERROR keywords tell Amanda to label a volume automatically if it is empty or if reading it returns an error. In general, the label must match the specification in labelstr (line 6); only then will Amanda use the volume. This requirement prevents a tape inserted by mistake from causing problems.
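To get a feel for which labels the labelstr pattern from Listing 1 accepts, you could try it out with grep. The following is only an illustration: the label_matches function is made up for this example, and matching the whole string (grep -x) is an assumption of the sketch; Amanda's own matching rules may differ in detail.

```shell
#!/bin/sh
# Illustration only: mimic the labelstr check from Listing 1 with grep.
# Whole-string matching (-x) is an assumption made for this sketch.
labelstr='MyData[0-9][0-9]'

label_matches() {
    # Succeeds (exit status 0) if the label fits the pattern.
    printf '%s\n' "$1" | grep -Eqx "$labelstr"
}

for label in MyData01 MyData42 OtherTape07; do
    if label_matches "$label"; then
        echo "$label: accepted"
    else
        echo "$label: rejected"
    fi
done
```

In this sketch, MyData01 and MyData42 fit the pattern, whereas a volume labeled OtherTape07 would be rejected.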

The long string after tpchanger tells Amanda not to store the data on tapes but in the /amanda/vtapes directory on the hard disk (chg-disk). Amanda automatically distributes the backups across the subdirectories named slot1, slot2, and so on. From Amanda's perspective, each of the subdirectories is a separate (tape) drive or volume. tapecycle sets the number of active volumes. In Listing 1, the number of volumes is four (slot1 to slot4). Amanda only overwrites a volume if at least tapecycle - 1 volumes have been created and written to.

The dumpcycle setting specifies the number of days after which Amanda creates a complete backup. In Listing 1, Amanda therefore creates a full backup at least once every three days and only incremental backups in between. If you set the value to 0, Amanda always creates a full backup. The next section sets the type and parameters of the tapes or the properties of the backup device. In Listing 1, each volume can hold 100MB of data, so a maximum of 100MB is written to each of the slot1 to slot4 directories. If the backup is larger, Amanda distributes it across multiple volumes or tapes. Each type of tape is also given a unique name (in Listing 1, EXAMPLE-TAPE).
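The volume arithmetic can be sketched in a few lines of shell. The example below is a rough estimate under simplifying assumptions: the volumes_needed function is made up for illustration, the backup sizes are invented, and filemarks and other per-volume overhead are ignored.

```shell
#!/bin/sh
# Sketch: how many volumes does a backup of a given size occupy?
# Ignores filemarks and other per-volume overhead (a simplification).
volumes_needed() {
    size_mb=$1
    per_volume_mb=$2
    # Integer ceiling division
    echo $(( (size_mb + per_volume_mb - 1) / per_volume_mb ))
}

length_mb=100   # length from the EXAMPLE-TAPE tapetype in Listing 1
tapecycle=4     # tapecycle from Listing 1

# The backup sizes below are invented sample values.
for size in 80 250 450; do
    n=$(volumes_needed "$size" "$length_mb")
    if [ "$n" -le "$tapecycle" ]; then
        echo "${size}MB backup: $n volume(s)"
    else
        echo "${size}MB backup: $n volumes, more than the $tapecycle in the cycle"
    fi
done
```

With the values from Listing 1, a 250MB backup would occupy three of the four volumes, which leaves little room for the following runs; in practice, you would choose a larger length or more slots.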

The section of Listing 1 following define dumptype (lines 17 to 21) determines how Amanda will create the backup. According to Listing 1, Amanda bundles all files with the tar program (see line 19 with program "GNUTAR") and compresses them on the client using a quick method (compress client fast). Then, the data flows via a TCP connection to the server (auth "bsdtcp"). Finally, these settings are saved as simple-gnutar-tcp. Because multiple clients can require different backup procedures (tar, for instance, is not available on some computers), you need to define several dumptype sections. The only important thing is that the names of the sections are different. A separate configuration file tells Amanda what dumptype to use for which client.

Amanda uses one or several holding disks as intermediate storage during the backup. The holdingdisk section (lines 22 to 26) specifies the key data for such a holding disk. In Listing 1, Amanda is allowed to cache data in the /amanda/holding directory, where it can use a maximum of 50MB of space; no single file in the /amanda/holding directory may exceed 1MB (chunksize). You can add as many holdingdisk sections as you need to add more holding disks. Each holding disk must be given a unique name; the one in Listing 1 goes by the not very original name hd1. The Amanda developers recommend making the holding disks larger than the backup of the largest partition of a client. If a terabyte partition weighs in at 500GB after compression, the holding disks need at least this amount of free space.
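Whether a holding directory has enough free space for the largest expected dump can be checked roughly with df. The following sketch assumes POSIX df output; the free_kb and holding_has_room helpers and the size in the usage example are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the free space below a holding directory with the
# size of the largest expected dump (both in kilobytes).
free_kb() {
    # df -P prints POSIX-format output; column 4 is the available space.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

holding_has_room() {
    dir=$1
    needed_kb=$2
    [ "$(free_kb "$dir")" -ge "$needed_kb" ]
}

# Usage example (500GB expressed in KB; adjust to your largest partition):
#   holding_has_room /amanda/holding 524288000 || echo "holding disk too small"
```

Keep in mind that the use setting in the holdingdisk section caps the space Amanda takes, so free space on the filesystem is a necessary condition, not a sufficient one.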

The amanda.conf file (Listing 1) defines where Amanda creates and stores backups. What Amanda actually saves is revealed in a second configuration file, /etc/amanda/ADMINExample/disklist. In this example, the disklist file contains only one line:

client.example.com /etc simple-gnutar-tcp

This line tells Amanda to back up all the files from the /etc subdirectory on the computer named client.example.com. Amanda simply uses the tar command installed there (with the procedure defined in amanda.conf as simple-gnutar-tcp) to back up these files.
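If you back up the same directory on several clients, the disklist grows by one line per client. A small loop can generate these lines in the host, directory, dumptype format shown above. This is just a sketch: disklist_entry is a made-up helper, and client2.example.com is a hypothetical second client.

```shell
#!/bin/sh
# Sketch: print disklist entries in the "host directory dumptype" format.
# client2.example.com is a made-up host for illustration.
disklist_entry() {
    printf '%s %s %s\n' "$1" "$2" "$3"
}

for host in client.example.com client2.example.com; do
    disklist_entry "$host" /etc simple-gnutar-tcp
done
```

Working as the amanda user, you could redirect the output to /etc/amanda/ADMINExample/disklist.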

Establishing the Connection

In this example, the backup will use a TCP connection. For this connection to work, the amandad daemon must be running on both the client and the server. The daemon on the client fields the request for the backup, creates the backup, and then sends it back to the server via TCP. Conversely, the daemon on the server waits for restore requests from a client. Launching amandad is the responsibility of the Internet super server, inetd or xinetd. If you have installed Amanda using your distribution's package manager, everything should be appropriately set up already. Otherwise, you just need to add the following line to both the client's and the server's /etc/inetd.conf file:

amanda stream tcp nowait backup /usr/lib/amanda/amandad amandad -auth=bsdtcp amdump amindexd amidxtaped

If you use xinetd, add the settings from Listing 2 to the /etc/xinetd.d/amanda configuration file. In either case, you now only need to ensure that inetd or xinetd is running in the background. For more information and configuration examples for setting up inetd and xinetd, see the man page for amanda-auth.

Listing 2: Settings for xinetd

service amanda
{
    disable     = no
    flags       = IPv4
    socket_type = stream
    protocol    = tcp
    wait        = no
    user        = backup
    group       = disk
    groups      = yes
    server      = /usr/lib/amanda/amandad
    server_args = -auth=bsdtcp amdump amindexd amidxtaped
}

Setting up Authorization

The amandad daemon only accepts instructions from computers that you previously added to a list of trusted systems. This list is stored in the .amandahosts text file, which is located in the home directory of Amanda user amanda. On Ubuntu, the /var/backups/.amandahosts file is only a symbolic link to the /etc/amanda-hosts file. Other distributions use the same or a similar pattern.

First check to see whether .amandahosts exists somewhere on the backup server. If not, create a new copy in the home directory of the amanda user. Then, open .amandahosts as the amanda user with a text editor – on Ubuntu, type

sudo -u amanda vi /var/backups/.amandahosts

The content should now look like the following:

localhost amanda
client.example.com root amindexd amidxtaped

The first line lets you restore backups on the server. The second line gives the root user on the client machine access to the services needed for the restore: amindexd and amidxtaped. In the same way, working on the client machine, you need to allow access from the server. Open the appropriate .amandahosts file. The contents should include the following two lines, the second of which gives the server access to the client:

localhost amanda
server.example.com amanda

Checking the Backup

After creating all the configuration files, amcheck checks for content problems and typos (Figure 1). You must call this program as the amanda user, as in sudo -u amanda:

$ sudo -u amanda amcheck ADMINExample

Figure 1: The NOTEs are not errors, but only references to data generated during the first backup run. In this case, Amanda needs to back up the computer with the IP address 192.168.1.102.

The actual backup runs the amdump program, which – like amcheck – simply expects the configuration name as a parameter, and which you also run as the amanda user:

$ sudo -u amanda amdump ADMINExample

The tool does not output any information to the console. Only the return value tells you whether the backup was successful. The amreport tool (Figure 2) provides a detailed report:

$ sudo -u amanda amreport ADMINExample

Figure 2: The backup of the /etc directory on the computer at 192.168.1.102 was successful on Ubuntu 16.04.

You can use Cron to activate Amanda at regular intervals. Add amdump to your crontab, along with its co-worker amcheck, which will check the available disk space.

If necessary, amcheck and amdump send email to the administrator. amcheck reports errors, and amdump delivers a report. To send an email message, you only need to add the mailto setting to the amanda.conf configuration file:

mailto "admin@example.com"

For amcheck, you should also specify the -m option:

amcheck -m ADMINExample
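The cron setup described above could be wrapped in a small script that only starts amdump if amcheck finds no problems. The following is a minimal sketch, not an official Amanda script: the run_backup function, the script path, and the schedule in the comments are placeholders, and the sketch assumes that amcheck exits with a nonzero status when it detects a problem.

```shell
#!/bin/sh
# Hypothetical wrapper for cron: check first, dump only if the check passes.
# Assumes amcheck exits with a nonzero status when it finds a problem.
run_backup() {
    config=$1
    if amcheck -m "$config"; then
        amdump "$config"
    else
        echo "amcheck reported problems for $config; skipping amdump" >&2
        return 1
    fi
}

# Run only when a configuration name is passed on the command line.
if [ "$#" -gt 0 ]; then
    run_backup "$1"
fi

# Example crontab entry for the amanda user (path and time are placeholders):
#   15 1 * * * /usr/local/bin/amanda-nightly.sh ADMINExample
```

Because amcheck is called with -m, the administrator receives an email message if the check fails, and the backup run is skipped for that night.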

Connecting via SSH

Instead of the TCP connection, the backup server can log in to the client using SSH, use SSH to create a dump file, and finally copy it to the server. This series of steps removes the need to configure inetd or xinetd. In amanda.conf, create a dumptype section as follows:

define dumptype simple-gnutar-ssh {
  auth "ssh"
  ssh_keys "/etc/amanda/ADMINExample/ssh-key"
  client-username "amanda"
  compress none
  program "GNUTAR"
}

With these settings, the backup server logs on to the client via SSH (auth "ssh") with the user name amanda. Amanda then creates a backup on the client using tar and without compression (compress none). This approach only works if the amanda user on the server can log on to the client via SSH with a key pair whose private key is not protected by a passphrase.

Doing without a passphrase can lead to security problems. For instructions on how to set up SSH appropriately, see the Zmanda wiki [5].

In the dumptype section, ssh_keys points to the file with the private key. Finally, Amanda also needs SSH support. If you install the backup program via your distribution's package manager, the default configuration should already support SSH. As an alternative to the SSH connection, you can use a VPN tunnel.

Performing a Recovery

You trigger the recovery on the client with amrecover. The amrecover tool retrieves the latest backup from the server. amrecover also requires a small configuration file that mainly tells it the name of the backup server. On the client, working as the amanda user, create the file /etc/amanda/ADMINExample/amanda-client.conf with the contents from Listing 3. If necessary, you should first create the required directory /etc/amanda and assign it to the amanda user.

Listing 3: amanda-client.conf

index_server "server.example.com"
tape_server "server.example.com"
tapedev "changer"
auth "bsdtcp"

The amrecover program always restores the backup to the current working directory. To restore the user rights correctly, you need to launch the tool as the root user. As a parameter, amrecover expects the name of the configuration:

amrecover ADMINExample

The tool connects to the backup server and then switches to a separate command line (Figure 3). First select the computer with the backup you want to restore with the following command:

sethost client.example.com

Figure 3: The amrecover tool restores a backup or parts of it. In this case, the restore uses the /tmp subdirectory as a precaution.

Now you can use setdisk to switch to the last backup, where /etc is the name defined in the disklist file:

setdisk /etc

The ls command lets you view the contents of the backup. You first need to select all the files and directories you want to retrieve. Define the files and directories using the add command,

add hostname

which tells Amanda to restore the hostname file.

After you select all the files and directories with add, the extract command restores them. Typing exit quits the tool. Even though amrecover is actually a client tool, in production you will typically use it to restore files on the backup server and then use scp or rsync to transfer the files to the clients.

Conclusions

Setting up Amanda can take several hours, especially in large heterogeneous environments. Other than articles like this one, the only sources of information for administrators are the official Amanda wiki and the countless man pages. When you read the documentation, you should always bear in mind that Amanda was originally designed for tape drives.

Once you have Amanda running, you can expect a reliable and proven partner for your backup and restore needs. Development work on Amanda has slowed recently, with new releases appearing only once a year. On the other hand, Amanda is very stable, and it offers very stable interfaces.