Network backups with Amanda
Auntie Amanda
The Advanced Maryland Automatic Network Disk Archiver, also known as Amanda, backs up a computer across the network to a central backup server. The backups migrate to hard drives, network storage, optical media, or legacy tapes.
Amanda [1] was launched in 1991 by the University of Maryland's Department of Computer Science. Zmanda has handled development since 2007 and still hosts the Amanda forum today. Zmanda is now part of the Carbonite corporation, which has supported Amanda development since 2013. Amanda is available under the BSD license and the GPL, which means you can use it for free – even in a commercial environment. Support is available through Zmanda, along with the enhanced Amanda Enterprise edition, which offers a graphical user interface and other value-added perks. The Enterprise edition starts at $500 for the server and $300 for each client.
Backup via SSH
Amanda backs up all the systems over the network in parallel. Admins typically launch Amanda via a cron job. On request, the data transfer is encrypted via OpenSSH, which means you can even back up systems in the DMZ without having to worry about eavesdropping. Current Amanda versions also support IPv6 connections and authentication via Kerberos 5.
Amanda copes with a large number of clients and can easily adapt to changing conditions. Before the backup, Amanda can launch a test program that performs a sanity check on all participating computers in parallel. If the test finds an error, it notifies the administrator by email. On request, Amanda encrypts all data on the client or the server via GPG or another encryption program. Amanda can even compress the archives with gzip or any other compression program – either directly on the client or on the server. The backup tool relies on standard Linux tools for the backup, including the well-known tar and dump tools.
Once the data is backed up on the server, Amanda copies it to the target medium. The medium could be tape, a local hard disk, a NAS storage device, a SAN, or even a DVD. Thanks to caching on the server, Amanda can achieve a high working speed. If the backup does not fully fit on a destination medium, Amanda automatically distributes it to additional media. At the completion of the backup, Amanda sends a report to the backup administrator via email. Admins can even include their own scripts, which Amanda runs automatically before and after the backup.
Amanda determines the backup time and even dynamically adapts this time to match the network. The backup program also decides independently whether it is necessary to create a full backup or only to back up the changed files (incremental backup). Amanda always tries to optimize network bandwidth and available resources.
Prerequisites
Amanda can back up computers running Linux, Unix, or Mac OS, as well as Windows systems if you install a special client program. The Windows support extends to both 32- and 64-bit systems, but as of this writing, the developers have only tested the Windows client program up to Windows 7. The backup server must run Unix or Linux.
Amanda refers to the backup server as the "Backup Server Host." The systems you are backing up are "Backup Client Hosts." The backup server itself also can be a backup client. The backup server host must have at least one hard disk that can cache all backups that reach it. These "holding disks" are not a strict requirement, but without them, performance suffers because the server has to write the data directly to the target medium.
Amanda supports many tape systems [2] out of the box. The backup software takes care of tape management and makes sure it does not accidentally overwrite the wrong tape. Amanda also logs which file is on which medium.
Installation
In addition to support, Zmanda.com offers free Amanda packages for different distributions.
Most distributions provide Amanda packages in their repositories. Usually, you will find one package for the client and another for the server. In the case of Ubuntu 16.04 LTS, you install the amanda-server package on the backup server and the amanda-client package on the client machines. The client tools are useful on the server, so you will also want to install them on the server.
Many distributions only offer somewhat old versions of Amanda. If you want to install on Ubuntu 16.04 LTS, for example, you get Amanda 3.3.6, which was released in mid-2014. However, if you install Amanda manually, you need to take care of regular updates yourself. If you want to use the latest stable version of Amanda, you will find pre-built binary packages for selected Linux distributions at Zmanda, although these Zmanda packages are all for older distributions. For example, users of Ubuntu 16.04 LTS will come away empty handed. If you can't find a package that supports your system, you need to turn to the source code archive on SourceForge [3]. On a Linux system, unpack the archive and then install Amanda with the well-known rule of three:
./configure
make
make install
This series of commands requires Make, GCC, and the developer packages for Glib (libglib2.0-dev on Ubuntu systems).
On Windows machines, Zmanda offers a ready-to-install Amanda client, which is free of charge in the Community Edition. The client version was 3.3.6 when this issue went to press. To download the client, open the drop-down list on the Amanda download page, select version number 3.3.6, and download the matching ZIP archive at the bottom of the table. You need to unpack this archive on the Windows client and then run the setup.exe installation program. To avoid problems, the same version of Amanda should run on all computers.
Many Files and Directories
Amanda runs with a user account specially set up for the purpose. Before you can assign a backup job to Amanda, you need to know this user's name. By default, the name will be amanda and the group will be backup. (Some distributions use different names. On Debian and Ubuntu, e.g., the user is named backup.) The examples in this article use amanda as the user name.
Amanda versions before 3.3.9 have a vulnerability that lets this user execute arbitrary code with root privileges. A (manual) upgrade to version 3.3.9 is only necessary if you don't trust the Amanda user.
Amanda can be difficult to set up because of its numerous settings and configuration files. This article uses the simple example of backing up the /etc directory on the client machine client.example.com. The backup server is called server.example.com.
To launch Amanda, you need several directories on the server. First, make sure the directory /etc/amanda exists there. This directory must be owned by the Amanda user, which is the case with Ubuntu by default. If you need to create the directory, enter:
$ mkdir -p /etc/amanda
$ chown amanda /etc/amanda
If you want to store all the backups in the /amanda directory on the server, let Amanda know by editing several configuration files, which are best grouped in a separate subdirectory below /etc/amanda. The following example uses /etc/amanda/ADMINExample.
One configuration could look after backing up the home directories, and another could handle the less frequently performed backup of /etc. The name of the subdirectory is also the name of the configuration it holds. Both these directories must also belong to amanda:
$ mkdir -p /amanda /etc/amanda/ADMINExample
$ chown amanda /amanda /etc/amanda/ADMINExample
Amanda creates multiple files during the backup. In addition to the actual backups, you have, for example, the logs, all of which you could group below /amanda. However, it is better to create a separate subdirectory in each case. The log data would end up in /amanda/state/log. Because these directories also need to belong to the amanda account, you might want to run the following command directly as the amanda user:
$ sudo -u amanda mkdir -p /amanda/vtapes/slot{1,2,3,4} /amanda/holding /amanda/state/{curinfo,log,index}
The backups end up in the directories /amanda/vtapes/slot1 to /amanda/vtapes/slot4 later on; /amanda/holding is the cache, /amanda/state/log is a dump for the logfiles, and the other two directories hold data created by Amanda.
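Before moving on to the configuration, it can help to confirm that the directory tree is complete. The following is a minimal sketch and not part of Amanda: the function name check_amanda_dirs is hypothetical, and the list of subdirectories simply mirrors the mkdir command above.

```shell
#!/bin/sh
# check_amanda_dirs ROOT: hypothetical helper, not shipped with Amanda.
# Reports every expected subdirectory that is missing below ROOT and
# returns 0 only if all of them exist.
check_amanda_dirs() {
    root="$1"
    status=0
    for sub in vtapes/slot1 vtapes/slot2 vtapes/slot3 vtapes/slot4 \
               holding state/curinfo state/log state/index; do
        if [ ! -d "$root/$sub" ]; then
            echo "missing: $root/$sub"
            status=1
        fi
    done
    return "$status"
}

# Example call (prints nothing if the layout is complete):
# check_amanda_dirs /amanda
```

The helper only tests for existence; ownership by the amanda user still needs to be checked separately, for example with ls -ld.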
Configuring the Server
Once all the directories exist, it is time to create a configuration. Listing 1 shows a simple example inspired by a tutorial in the Zmanda wiki [4]. The settings from Listing 1 belong in the /etc/amanda/ADMINExample/amanda.conf file. Amanda needs access to this file, as it does to all other configuration files – ideally, you will want to create them directly while working as the amanda user.
Listing 1: amanda.conf Example
01 org "An example"
02 infofile "/amanda/state/curinfo"
03 indexdir "/amanda/state/index"
04 logdir "/amanda/state/log"
05 dumpuser "amanda"
06 labelstr "MyData[0-9][0-9]"
07 autolabel "MyData%%" EMPTY VOLUME_ERROR
08 tpchanger "chg-disk:/amanda/vtapes"
09 tapecycle 4
10 dumpcycle 3 days
11 amrecover_changer "changer"
12 tapetype "EXAMPLE-TAPE"
13 define tapetype EXAMPLE-TAPE {
14   length 100 mbytes
15   filemark 4 kbytes
16 }
17 define dumptype simple-gnutar-tcp {
18   auth "bsdtcp"
19   program "GNUTAR"
20   compress client fast
21 }
22 holdingdisk hd1 {
23   directory "/amanda/holding"
24   use 50 mbytes
25   chunksize 1 mbyte
26 }
The first line beginning with org gives a description of the configuration. This description appears later in the subject line of any email notifications. Amanda stores its own data in infofile; the index for all the backups is stored in the indexdir directory, and all the logfiles are found in logdir. Amanda also creates the backups with the user account stated after dumpuser.
The following settings specify the location of the backups. Here is where you notice that Amanda was originally created for backups on tapes: Amanda refers to an archive containing the backup as the dumpfile, which is usually a simple TAR archive. Amanda saves the dumpfiles on volumes. A volume can be a tape, a DVD, or a subdirectory on the server.
Each volume stores a portion of the dumpfile and is given a label and a consecutive number. Based on the number, Amanda can reassemble the parts. Amanda automatically assigns the label based on the template stored below autolabel (line 7). Amanda later replaces the % characters with a consecutive number, which, in this case, has two digits. The EMPTY and VOLUME_ERROR keywords tell Amanda to label volumes automatically if they are empty or if reading their label causes an error. Generally the label must match the specifications in labelstr; only then will Amanda use the volume. This requirement prevents a tape inserted by mistake from causing problems.
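To see how labelstr and autolabel fit together, the pattern can be tried out with standard shell tools. This is only a sketch: the helper name label_matches is hypothetical, and anchoring the pattern to the whole label with ^ and $ is an assumption made for the illustration, not a statement about how Amanda applies the expression internally.

```shell
#!/bin/sh
# label_matches LABEL: hypothetical helper for illustration only.
# Tests a candidate label against the labelstr pattern from Listing 1.
label_matches() {
    printf '%s\n' "$1" | grep -Eq '^MyData[0-9][0-9]$'
}

for label in MyData01 MyData42 MyData7 OtherSet01; do
    if label_matches "$label"; then
        echo "$label: accepted"
    else
        echo "$label: rejected"
    fi
done
```

Labels such as MyData01, which the autolabel template MyData%% would generate, pass the test, whereas a label with a single digit or a different prefix does not.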
The long string after tpchanger tells Amanda not to store the data on tapes but in the /amanda/vtapes directory on the hard disk (chg-disk). Amanda automatically distributes the backups across the subdirectories named slot1, slot2, and so on. From Amanda's perspective, each of the subdirectories is a separate (tape) drive or volume. The tapecycle parameter sets the number of active volumes. In Listing 1, the number of volumes is four (slot1 to slot4). Amanda only overwrites a volume if at least tapecycle - 1 volumes have been created and written to.
The number of days after which Amanda creates a complete backup follows dumpcycle. Thus, only incremental backups would be generated for three days in Listing 1. If you set the value to 0, Amanda always creates a full backup. The next section sets the type and parameters of the tapes or the properties of the backup device. In Listing 1, each volume can hold 100MB of data; that is, a maximum of 100MB is written to each of the slot1 to slot4 directories. If the backup is larger, Amanda distributes it across multiple volumes or tapes. Each type of tape is also given a unique name (in Listing 1, EXAMPLE-TAPE).
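The volume settings also determine how much data the virtual tapes can hold in total. A quick worked calculation with the values from Listing 1 (the variable names are only illustrative):

```shell
#!/bin/sh
# Values taken from Listing 1; the variable names are illustrative only.
tapecycle=4        # number of active volumes (slot1 to slot4)
length_mb=100      # capacity of one volume as set in the tapetype

total_mb=$((tapecycle * length_mb))
echo "Total vtape capacity: ${total_mb}MB"
```

With these values, the four slots hold 400MB altogether, so the filesystem below /amanda/vtapes needs at least that much room.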
The section of Listing 1 following define dumptype (lines 17 to 21) determines how Amanda will create the backup. According to Listing 1, Amanda bundles all files with the tar program (see line 19 with program "GNUTAR") and compresses them on the client using a quick method (compress client fast). Then, the data flows via a TCP connection to the server (auth "bsdtcp"). These settings are saved under the name simple-gnutar-tcp. Because different clients can require different backup procedures (tar, for instance, is not available on some computers), you can define several dumptype sections. The only important thing is that the names of the sections are different. A separate configuration file tells Amanda what dumptype to use for which client.
Amanda uses one or several holding disks as intermediate storage during the backup. The holdingdisk section (lines 22 to 26) specifies the key data for such a holding disk. In Listing 1, Amanda is allowed to cache data in the /amanda/holding directory, where it can use a maximum of 50MB space; any file in the /amanda/holding directory must not exceed a maximum of 1MB. You can add as many holdingdisk sections as you need to add more holding disks. Each holding disk must be given a unique name; the one in Listing 1 goes by the not very original name hd1. The Amanda developers recommend choosing the size of the holding disks to be larger than the backup of the largest partition of a client. If a terabyte partition weighs in at 500GB after compression, the holding disks need at least this amount of free space.
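Whether a holding disk directory has enough room can be checked before the first run. The following sketch relies only on the POSIX df and awk tools; the function name has_free_kb, the HOLDING_DIR variable, and the sample threshold are assumptions for illustration and not part of Amanda.

```shell
#!/bin/sh
# has_free_kb DIR REQUIRED_KB: hypothetical helper, not part of Amanda.
# Succeeds if the filesystem holding DIR has at least REQUIRED_KB
# kilobytes available, according to POSIX-format df output.
has_free_kb() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$2" ]
}

# 51200KB = 50MB, matching the "use 50 mbytes" line in Listing 1.
# HOLDING_DIR is a stand-in; on the server it would be /amanda/holding.
if has_free_kb "${HOLDING_DIR:-.}" 51200; then
    echo "enough space for the holding disk"
else
    echo "holding disk filesystem is too small"
fi
```

For a production setup, the threshold would be the compressed size of the largest client partition rather than the 50MB used in this small example.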
The amanda.conf file (Listing 1) defines where Amanda creates and stores backups. What Amanda actually saves is revealed in a second configuration file, /etc/amanda/ADMINExample/disklist. In this example, the disklist file contains only one line:
client.example.com /etc simple-gnutar-tcp
This line tells Amanda to back up all the files from the /etc directory on the computer named client.example.com. Amanda simply uses the tar command installed there (with the procedure defined in amanda.conf as simple-gnutar-tcp) to back up these files.
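Each disklist entry in this example consists of three fields: host name, directory, and dumptype. A rough sanity check for such simple entries can be sketched with awk. The function name check_disklist is hypothetical, and the check deliberately ignores the more elaborate entry forms that Amanda also accepts.

```shell
#!/bin/sh
# check_disklist FILE: hypothetical helper for simple three-field entries
# (host, directory, dumptype). Blank lines and # comments are skipped.
# Returns non-zero if any other line does not have exactly three fields.
check_disklist() {
    awk '
        /^[ \t]*#/ { next }
        NF == 0    { next }
        NF != 3    { printf "line %d: expected 3 fields, found %d\n", NR, NF
                     bad = 1
                     next }
                   { printf "host=%s dir=%s dumptype=%s\n", $1, $2, $3 }
        END        { exit bad }
    ' "$1"
}

# Example with a temporary copy of the entry from this article
list=$(mktemp)
echo "client.example.com /etc simple-gnutar-tcp" > "$list"
check_disklist "$list"
```

The amcheck tool described later performs the authoritative validation; a helper like this only catches obvious slips while editing.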
Establishing the Connection
In this example, the backup will use a TCP connection. For this connection to work, the amandad daemon must be running on both the client and the server. The daemon on the client fields the request for the backup, creates the backup, and then sends it back to the server via TCP. Conversely, the daemon on the server waits for restore requests from a client. Launching amandad is the responsibility of the Internet super server, inetd or xinetd. If you have installed Amanda using your distribution's package manager, everything should be appropriately set up already. Otherwise, you just need to add the following line to both the client's and the server's /etc/inetd.conf file:
amanda stream tcp nowait backup /usr/lib/amanda/amandad amandad -auth=bsdtcp amdump amindexd amidxtaped
If you use xinetd, add the settings from Listing 2 to the /etc/xinetd.d/amanda configuration file. In any case, you now only need to ensure that inetd or xinetd is running in the background. For more information and configuration examples for setting up inetd and xinetd, see the man page for amanda-auth.
Listing 2: Settings for xinetd
service amanda
{
  disable = no
  flags = IPv4
  socket_type = stream
  protocol = tcp
  wait = no
  user = backup
  group = disk
  groups = yes
  server = /usr/lib/amanda/amandad
  server_args = -auth=bsdtcp amdump amindexd amidxtaped
}
Setting up Authorization
The amandad daemon only accepts instructions from computers that you previously added to a list of trusted systems. This list is stored in the .amandahosts text file, which is located in the home directory of the Amanda user amanda. On Ubuntu, the /var/backups/.amandahosts file is only a symbolic link to the /etc/amanda-hosts file. Other distributions use the same or a similar pattern.
First check to see whether .amandahosts exists somewhere on the backup server. If not, create a new copy in the home directory of the amanda user. Then, open .amandahosts as the amanda user with a text editor – on Ubuntu, type
sudo -u amanda vi /var/backups/.amandahosts
The content should now look like the following:
localhost amanda
client.example.com root amindexd amidxtaped
The first line lets you restore backups on the server. The second line gives the root user on the client machine access to the services needed for the restore: amindexd and amidxtaped. In the same way, working on the client machine, you need to allow access from the server. Open the appropriate .amandahosts file. The contents should include the following two lines, the second of which gives the server access to the client:
localhost amanda
server.example.com amanda
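The server-side file can also be written from a script. The sketch below assumes that Amanda expects .amandahosts to be accessible only by its owner; treat that as an assumption and check the amanda-auth man page for your version. AMANDA_HOME is a stand-in for the amanda user's home directory, and the temporary default merely keeps the example harmless to run.

```shell
#!/bin/sh
# Sketch: create the server-side .amandahosts with owner-only permissions.
# AMANDA_HOME is a stand-in for the amanda user's home directory.
AMANDA_HOME="${AMANDA_HOME:-$(mktemp -d)}"
hosts_file="$AMANDA_HOME/.amandahosts"

umask 077
cat > "$hosts_file" <<'EOF'
localhost amanda
client.example.com root amindexd amidxtaped
EOF
chmod 600 "$hosts_file"

# Show the resulting permissions and owner
ls -l "$hosts_file"
```

On a real server, the script would run as the amanda user so that the file ends up with the right owner.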
Checking the Backup
After creating all the configuration files, amcheck checks for content problems and typos (Figure 1). You must call this program as the amanda user, for example, with sudo -u amanda:
$ sudo -u amanda amcheck ADMINExample
The amdump program runs the actual backup. Like amcheck, it simply expects the configuration name as a parameter, and you also run it as the amanda user:
$ sudo -u amanda amdump ADMINExample
The tool does not output any information to the console. Only the return value tells you whether the backup was successful. The amreport tool (Figure 2) provides a detailed report:
$ sudo -u amanda amreport ADMINExample
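Because amdump reports success only through its return value, a small wrapper can make the result visible. The following is a minimal sketch: the function name run_backup is hypothetical, and it takes the command to run as arguments so that it can be tried out without Amanda installed.

```shell
#!/bin/sh
# run_backup CONFIG COMMAND...: hypothetical wrapper, not part of Amanda.
# Runs COMMAND with CONFIG appended as the last argument and reports the
# result based on the exit status alone.
run_backup() {
    config="$1"
    shift
    if "$@" "$config"; then
        echo "backup run for $config finished successfully"
    else
        status=$?
        echo "backup run for $config failed (exit status $status)" >&2
        return "$status"
    fi
}

# Intended use on the backup server:
# run_backup ADMINExample sudo -u amanda amdump && \
#     sudo -u amanda amreport ADMINExample
```

Passing the command as arguments keeps the wrapper independent of how amdump is invoked on a given system, for example, with or without sudo.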
You can use Cron to activate Amanda at regular intervals. Add amdump to your crontab, along with its co-worker amcheck, which will check the available disk space.
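A crontab for the amanda user might look like the fragment written by the following sketch. The schedule is an assumption chosen for illustration (a check in the afternoon and the dump shortly after midnight), and the CRON_FILE variable is hypothetical; the -m option makes amcheck send its findings by email.

```shell
#!/bin/sh
# Sketch: write a crontab fragment for the amanda user to a file.
# The schedule (check at 16:00, dump at 00:45) is an assumption;
# adjust it to your site.
cron_file="${CRON_FILE:-$(mktemp)}"
cat > "$cron_file" <<'EOF'
# min hour day-of-month month day-of-week command
0  16 * * 1-5 amcheck -m ADMINExample
45 0  * * 2-6 amdump ADMINExample
EOF
cat "$cron_file"

# To install it for the amanda user (replaces that user's existing crontab):
# sudo crontab -u amanda "$cron_file"
```

Depending on the distribution, cron may not have the Amanda tools in its search path, in which case the entries need the full path to amcheck and amdump.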
If necessary, amcheck and amdump send email to the administrator: amcheck reports errors, and amdump delivers a report. To send an email message, you only need to add the mailto setting to the amanda.conf configuration file:
mailto "admin@example.com"
For amcheck, you should also specify the -m option:
amcheck -m ADMINExample
Connecting via SSH
Instead of the TCP connection, the backup server can log in to the client using SSH, use SSH to create a dump file, and finally copy it to the server. This series of steps removes the need to configure inetd or xinetd. In amanda.conf, create a dumptype section as follows:
define dumptype simple-gnutar-ssh {
  auth "ssh"
  ssh_keys "/etc/amanda/Example/ssh-key"
  client-username "amanda"
  compress none
  program "GNUTAR"
}
With this dumptype, Amanda logs on to the client via SSH (auth "ssh") with the user name amanda. Amanda then creates a backup on the client using tar and without compression (compress none). This approach only works under the following conditions:
- The Amanda user amanda must be able to log in to the client and must therefore have a login shell.
- Amanda cannot prompt for a passphrase. The login to the client must thus rely on either the SSH agent or an exchange of public keys.
Doing without a passphrase can lead to security problems. For instructions on how to set up SSH appropriately, see the Zmanda wiki [5].
In the dumptype section, ssh_keys then points to the file with the private key. Finally, Amanda also needs SSH support. If you install the backup program via your distribution's package manager, the default configuration should already support SSH. As an alternative to the SSH connection, you can use a VPN tunnel.
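Creating a key pair without a passphrase could look like the following sketch. It uses the standard OpenSSH ssh-keygen tool; the key type, the comment text, and the KEY_FILE variable are assumptions for illustration. On the server, KEY_FILE would point to the path named in ssh_keys, whereas the temporary default merely keeps the example harmless to run.

```shell
#!/bin/sh
# Sketch: generate a passphrase-less key pair for the amanda user.
# KEY_FILE is a stand-in; the dumptype in this article expects the
# private key at /etc/amanda/Example/ssh-key.
KEY_FILE="${KEY_FILE:-$(mktemp -d)/ssh-key}"

if command -v ssh-keygen >/dev/null 2>&1; then
    # -N "" sets an empty passphrase, -C adds a comment, -q keeps it quiet
    ssh-keygen -q -t ed25519 -N "" -C "amanda backup key" -f "$KEY_FILE"
    chmod 600 "$KEY_FILE"

    # The public half belongs in ~/.ssh/authorized_keys of the amanda
    # user on each client:
    cat "$KEY_FILE.pub"
else
    echo "ssh-keygen not found; install the OpenSSH client first" >&2
fi
```

Because the private key is not protected by a passphrase, it should be readable only by the amanda user on the backup server.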
Performing a Recovery
You trigger the recovery on the client with amrecover. The amrecover tool retrieves the latest backup from the server. The tool also requires a small configuration file that mainly tells it the name of the backup server. On the client, working as the amanda user, create the file /etc/amanda/ADMINExample/amanda-client.conf with the contents from Listing 3. If necessary, you should create the required directory /etc/amanda and assign it to the amanda user.
Listing 3: amanda-client.conf
index_server "server.example.com"
tape_server "server.example.com"
tapedev "changer"
auth "bsdtcp"
The amrecover program always restores the backup to the current working directory. To restore the user rights correctly, you need to launch the tool as the root user. As a parameter, amrecover expects the name of the configuration:
amrecover ADMINExample
The tool connects to the backup server and then switches to a separate command line (Figure 3). First select the computer with the backup you want to restore with the following command:
sethost client.example.com
Now you can use setdisk to switch to the last backup, where /etc is the name defined in the disklist file:
setdisk /etc
The ls command lets you view the contents of the backup. You first need to select all the files and directories you want to retrieve. Define the files and directories using the add command. For example,
add hostname
tells Amanda to restore the hostname file.
After you select all the files and directories with add, the extract command restores them. Typing exit quits the tool. Even though amrecover is actually a client tool, you will typically use it to restore files on the backup server in production use, and then use scp or rsync to transfer the files to the clients.
Conclusions
Setting up Amanda can take several hours, especially in large heterogeneous environments. Other than articles like this one, the only sources of information for administrators are the official Amanda wiki and the countless man pages. When you read the documentation, you should always bear in mind that Amanda was originally designed for tape drives.
Once you have Amanda running, you can expect a reliable and proven partner for your backup and restore needs. Development work on Amanda has slowed recently, with new releases appearing only once a year. On the other hand, Amanda is very stable, and it offers very stable interfaces.