Duplicity packages one or more directories into a tar archive, encrypts the result with GnuPG, and automatically uploads the resulting backup to a backup server. Signatures help reveal tampering or disk failure, which means backups can be stored on insecure servers or in the cloud. Duplicity even offers native functions for talking to some well-known cloud services.
Additionally, Duplicity can create incremental backups, in which the transferred archive contains only the delta to the previously created backup. This not only saves disk space on the server but also means individual backups are created faster. Duplicity is licensed under the GNU GPL and thus can be used free of charge.
Duplicity is tailored for Linux and other Unix operating systems, such as BSD or OS X. Most major Linux distributions have it in their repositories. Users of OS X can install it via Fink, for example. For Ubuntu-based distributions, there is also a PPA with the current Duplicity version. Alternatively, Duplicity can be built quickly from the source code (see the box "Self-Build"). Duplicity works on Windows in the Cygwin environment but is unable to handle the specific features of the Windows filesystem. Administrators should back up Windows systems with some other software if possible.
Duplicity is very simple to operate. At the command line, you pass the tool the directory to be backed up and the storage location. The following example packages the complete /etc directory in a tar archive, encrypts it, and uses secure copy (scp) to store the results on the server at example.com below a directory named /var/backup (Figure 1). Note the double slashes after the domain name:
duplicity --progress /etc scp://firstname.lastname@example.org//var/backup
During the backup, Duplicity accounts for deleted files, all file permissions, subdirectories, FIFOs, device files, and symbolic links, but not hard links. The --progress parameter tells Duplicity to report its progress continuously. Note that the tool always expects parameters in front of the directory information. Furthermore, you must ensure that Duplicity has the correct permissions. In the above case, it must therefore be allowed to access /etc and all its contents.
Duplicity automatically compresses the archive with gzip, which can be switched off with the --no-compression option. Additionally, Duplicity creates some temporary files in the appropriate directory; on Linux, this is usually /tmp. If you have insufficient free space, you can use --tempdir /<path/to>/tmp to define another directory. In previous Duplicity versions, users had to define the temporary directory in the TMPDIR environment variable. The developers have made this method obsolete, however.
Duplicity encrypts the resulting archive with GnuPG. For this reason, you need to create and type a password (the GnuPG key) after calling Duplicity; you will need the password to restore the backup later. Accordingly, you will want to make the password as long and cryptic as possible – but not so long that you forget it; otherwise, you can say goodbye to your data.
Transferring Passwords for SSH and FTP
The previous command assumes that you log in to the SSH server using private and public keys. If you want to authenticate via password, specify the Duplicity --ssh-askpass parameter. The tool then prompts you for the required SSH password when connecting. If the SSH server is not listening on the default port, you also need to specify the port in the usual way, separated by a colon after the domain name:
duplicity /etc scp://email@example.com:2222//var/backup
If you want to store the backup on an FTP server, you need to enter the password for the server in the FTP_PASSWORD environment variable. In the following example, it is 123. For the FTP transmission method, the domain name is followed by a single slash:
FTP_PASSWORD=123 duplicity /home/tim ftp://firstname.lastname@example.org/var/backup
Incidentally, Duplicity also evaluates the FTP_PASSWORD environment variable for an SSH connection. You can thus omit the --ssh-askpass parameter and define the SSH password in the FTP_PASSWORD environment variable. This is especially useful if you want to include Duplicity in a script. If you want Duplicity to create the backup archive on a local storage medium, use a file:// URL:
duplicity /etc file:///mnt/backup
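For scripted use, the password can be read from a protected file instead of being typed into the command line. The following is a minimal sketch under stated assumptions, not a definitive implementation: the password file path, the user@example.com login, and the DUPLICITY override variable (handy for dry runs) are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: pass the transfer password to Duplicity via FTP_PASSWORD.
# PASSWORD_FILE is a hypothetical path; the file should be readable
# only by the backup user (for example, chmod 600).
PASSWORD_FILE="${PASSWORD_FILE:-/root/.backup-password}"

backup_home() {
    # The assignment prefix sets FTP_PASSWORD for this one command only,
    # so the password does not linger in the script's environment.
    FTP_PASSWORD="$(cat "$PASSWORD_FILE")" \
        ${DUPLICITY:-duplicity} /home/tim "ftp://user@example.com/var/backup"
}
```

Calling backup_home at the end of the script starts the transfer; setting DUPLICITY to a stand-in command lets you check the arguments without touching the server.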
Duplicity can transmit the backup archive with many other protocols such as Rsync and WebDAV. Additionally, Duplicity can store the backups in various cloud services, including Dropbox, Azure, OpenStack Swift, and Amazon S3, along with a couple of quirky storage options such as sending email. Almost every new release of Duplicity adds new protocols. For a complete and quite long list, check the Duplicity man page. To access this, type man duplicity and look for the "URL format" section. The man page for the current Duplicity version is also available online.
For some protocols and services, Duplicity requires additional libraries and tools (see the box "Modules Used"). The backup program reports any missing helpers when called. On Linux, the standard protocols need no manual attention, but this is not true for many cloud services. For example, to access Amazon S3, you need the Boto software, version 2.0 or newer. For a complete list of all dependencies for all supported services, see the "Requirements" section of the Duplicity man page.
Storing Encrypted Backups
Duplicity uses symmetric encryption by default; that is, the same password is used to encrypt and decrypt the backup. Alternatively, the tool can use GnuPG public key encryption. Here, each user has two keys: An archive locked with the public key can only be unlocked again with the private key.
If you want to use a new key pair for the backup, create the key before the first backup using gpg --gen-key. To do this, answer the questions posed; if in doubt, leave the fields blank or accept the default settings by pressing Enter (Figure 2). You will need to type the passphrase each time for encryption and decryption. At the end, GPG outputs a key ID, which you will want to remember.
Because the key pair secures your backup, you will want to save it on an external medium. Use the following two commands to create a copy of the public and private keys in the files /mnt/key_pub.gpg and /mnt/key_sec.gpg:
gpg --output /mnt/key_pub.gpg --armor --export Key-ID
gpg --output /mnt/key_sec.gpg --armor --export-secret-keys Key-ID
On another system, or after system recovery, the key can then be reloaded using gpg --import. When creating a new backup, you need to tell Duplicity the key ID of the public key using the --encrypt-key parameter:
duplicity --ssh-askpass --encrypt-key 12345678 /etc scp://email@example.com//var/backup
Normally, you need to type the passphrase after calling the command. If Duplicity starts without prompting, the GPG agent has probably cached the passphrase in the background. Furthermore, Duplicity buffers some metadata in the ~/.cache/duplicity/ directory, which it reads whenever it is called.
When you call Duplicity from a script, you can enter the passphrase in the PASSPHRASE environment variable. This method, however, poses a security risk: Anyone who can read the script automatically discovers the passphrase. If you set the environment variable in a script, you should at least explicitly clear it afterward using unset. You can also completely disable encryption with the corresponding Duplicity option; see the man page.
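The PASSPHRASE handling described above can be sketched as follows. This is an illustration under assumptions rather than a recommended setup: the passphrase file path, the user@example.com login, and the DUPLICITY override variable are hypothetical, and the key ID is passed in as an argument.

```shell
#!/bin/sh
# Sketch: supply the GnuPG passphrase via PASSPHRASE and clear it afterward.
# PASSPHRASE_FILE is a hypothetical path; keep the file readable only by
# the backup user.
PASSPHRASE_FILE="${PASSPHRASE_FILE:-/root/.backup-passphrase}"

encrypted_backup() {
    key_id="$1"    # GnuPG key ID, such as 12345678 in the text
    PASSPHRASE="$(cat "$PASSPHRASE_FILE")"
    export PASSPHRASE
    ${DUPLICITY:-duplicity} --encrypt-key "$key_id" /etc \
        "scp://user@example.com//var/backup"
    status=$?
    # Remove the passphrase from the environment before returning.
    unset PASSPHRASE
    return "$status"
}
```

Reading the passphrase from a separate file keeps it out of the script itself, so the script can be shared or versioned without exposing the secret.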
When first called, Duplicity always completely backs up the source directory. When you invoke the command a second time, Duplicity only backs up the data that has been added or changed since the previous backup (the delta). This approach has the advantage that you can include the Duplicity call in a cron job or startup script, thus ensuring that Duplicity runs regularly and automatically. To compute the delta, Duplicity uses the librsync library, which implements the well-known Rsync algorithm.
Incremental backups save space on the server and can be created much faster. However, if a read error occurs in one of the parts, the subsequent backups will very likely be useless. Moreover, recovery will take longer because Duplicity may first need to look at all the incremental backups. For this reason, you should perform a full backup at regular intervals. You can enforce this by specifying the full action:
duplicity full /home/tim scp://firstname.lastname@example.org//var/backup
In this case, full is not a parameter but an action that needs to follow the program name directly. The --full-if-older-than parameter tells Duplicity to create a full backup if the last full backup was created more than a predetermined period ago (Figure 3), in this example more than one month:
duplicity --full-if-older-than 1M /home/tim scp://email@example.com//var/backup
You need to leave out the full action in this case; otherwise, it would overrule the --full-if-older-than parameter. Instead of 1M for a month, you can also specify other periods; for example, 14D is 14 days. The appropriate value depends on your organization's backup strategy.
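A scheduled backup that relies on this parameter might look like the sketch below. The script location, the cron schedule, the user@example.com login, and the DUPLICITY override variable are all assumptions for illustration; the sketch also assumes key-based SSH login, because a cron job cannot answer a password prompt.

```shell
#!/bin/sh
# Hypothetical wrapper, for example saved as /usr/local/bin/nightly-backup.sh
# and called from a crontab entry such as:
#   30 2 * * * /usr/local/bin/nightly-backup.sh
nightly_backup() {
    max_age="${1:-1M}"    # 1M = one month, 14D = 14 days
    # No "full" action here: Duplicity chooses between a full and an
    # incremental backup based on the age of the last full backup.
    ${DUPLICITY:-duplicity} --full-if-older-than "$max_age" \
        /home/tim "scp://user@example.com//var/backup"
}
```

Calling nightly_backup 14D at the end of the script would switch to a two-week cycle without changing anything else.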
Duplicity does not pack the data to be backed up into a single huge archive; instead, it distributes the data to several smaller archives. Because these volumes can only grow to a maximum of 25MB by default, numerous small files accumulate over time on the server (Figure 4).
You can change this behavior using the --volsize parameter, which lets you define the maximum size of each volume in megabytes. For example, --volsize 125 increases the size to 125MB. As the volume size increases, however, Duplicity also needs more RAM. You might want to exercise caution when increasing this value.
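To get a feel for the effect of the volume size, the number of volume files can be estimated with simple ceiling division. The helper below is a rough sketch with a hypothetical function name; it ignores compression as well as the additional signature and manifest files that Duplicity stores, so real counts will differ.

```shell
#!/bin/sh
# Rough estimate of how many volume files a backup is split into.
volume_count() {
    total_mb="$1"            # size of the data to back up, in MB
    volsize_mb="${2:-25}"    # --volsize value; 25 is the default from the text
    # Integer ceiling division
    echo $(( (total_mb + volsize_mb - 1) / volsize_mb ))
}

volume_count 10000        # prints 400 (default 25MB volumes)
volume_count 10000 125    # prints 80  (with --volsize 125)
```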
Including and Excluding Data
The --exclude parameter lets you specifically leave out a subdirectory from the backup. In the following example, the tool would not back up the subdirectory /home/klaus/Videos:
duplicity --exclude /home/klaus/Videos /home scp://firstname.lastname@example.org//var/backup
If you want to back up the entire system via the root directory (/), you should at least always exclude /proc, the dynamic filesystem that provides a window into the running kernel. Otherwise, you are in danger of Duplicity tripping over its contents. For each directory to exclude, you must specify the --exclude parameter again. The --include parameter lets you specifically include certain subdirectories. This example command
duplicity --include /home --include /etc --exclude / / scp://email@example.com//var/backup
exclusively backs up the /home and /etc directories.
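Because every excluded directory needs its own --exclude parameter, a script can build the option list from a single variable. The following is a sketch under assumptions: /proc comes from the advice above, whereas /sys and /tmp, the user@example.com login, and the DUPLICITY override variable are additions made for illustration only.

```shell
#!/bin/sh
# Sketch: full-system backup with one --exclude per directory.
# /proc is named in the text; /sys and /tmp are assumed additions.
EXCLUDES="/proc /sys /tmp"

backup_root() {
    set --    # start with an empty argument list
    for dir in $EXCLUDES; do    # intentionally unquoted: split on whitespace
        set -- "$@" --exclude "$dir"
    done
    # Options come first, then the source directory and the target URL.
    ${DUPLICITY:-duplicity} "$@" / "scp://user@example.com//var/backup"
}
```

The word splitting on EXCLUDES only works for paths without spaces, which is usually fine for top-level system directories.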
You can restore a backup by swapping the source and destination arguments. The following example restores the backup stored in /var/backup on the server example.com to the /home/tim/restore directory:
duplicity scp://firstname.lastname@example.org//var/backup /home/tim/restore
On request, Duplicity even restores a single file. The parameter responsible for this, --file-to-restore, expects the relative path to the file in which you are interested. For example, if you are backing up the /home/klaus directory, you can restore the letter.txt file originally stored in /home/klaus/Documents with the following command:
duplicity --file-to-restore Documents/letter.txt scp://email@example.com//var/backup letter_alt.txt
At the end of the call, Duplicity does not expect the directory in which to restore the file but rather a file name. In the preceding example, the tool retrieves the file letter.txt from the backup and stores it in the current directory as letter_alt.txt.
The list-current-files action lists all the files in a backup:
duplicity list-current-files scp://firstname.lastname@example.org//var/backup
With the --time parameter, you can even revert to a certain file version. The following example retrieves exactly the version of the letter.txt file that was stored in the backup seven days earlier. This assumes that Duplicity created a backup seven days ago:
duplicity --time 7D --file-to-restore Documents/letter.txt scp://email@example.com//var/backup letter_alt.txt
Alternatively, you can also specify a specific date; for example, --time 2015/9/10 accesses the backup from September 10, 2015.
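The single-file restore, with or without a point in time, can be wrapped in a small helper. This is a minimal sketch, assuming the --file-to-restore option and a key-based SSH login; the function name, the user@example.com login, and the DUPLICITY override variable are hypothetical.

```shell
#!/bin/sh
# Sketch: restore one file, optionally as it was at an earlier point in time.
restore_file() {
    rel_path="$1"    # path relative to the backed-up directory
    target="$2"      # file name to write the restored copy to
    age="${3:-}"     # optional --time value, such as 7D or 2015/9/10
    if [ -n "$age" ]; then
        set -- --time "$age"
    else
        set --
    fi
    ${DUPLICITY:-duplicity} "$@" --file-to-restore "$rel_path" \
        "scp://user@example.com//var/backup" "$target"
}
```

For example, restore_file Documents/letter.txt letter_alt.txt 7D corresponds to the command shown above, and leaving out the third argument restores the most recent version.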
Ensuring Data Integrity
The verify action compares the data on the hard drive with the last backup. This tells you which data has changed and whether an unauthorized third party has modified the backup:
duplicity verify --compare-data --ssh-askpass scp://firstname.lastname@example.org//var/backup /home/tim
The --compare-data parameter ensures that Duplicity also compares the contents of the files (Figure 5). This takes longer, but without the option, Duplicity 0.7 did not always deliver reliable information in my tests. You can write the report generated by verify to the duplog.txt text file using the --log-file duplog.txt parameter. In this way, you can include the check in a cron job and then selectively analyze the logfile with a monitoring tool or have it sent via email.
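A check suitable for cron could be sketched like this. It assumes key-based SSH login (so --ssh-askpass is left out) and treats any non-zero exit status as a problem worth reporting, which is an assumption rather than documented behavior; the function name, the user@example.com login, and the DUPLICITY override variable are hypothetical.

```shell
#!/bin/sh
# Sketch: run a verify pass, keep the report in a logfile, and print a
# one-line result that cron can mail to the administrator.
LOG_FILE="${LOG_FILE:-duplog.txt}"

verify_backup() {
    if ${DUPLICITY:-duplicity} verify --compare-data --log-file "$LOG_FILE" \
        "scp://user@example.com//var/backup" /home/tim; then
        echo "verify: OK"
    else
        # Assumption: any non-zero exit status counts as a failed check.
        echo "verify: FAILED, see $LOG_FILE" >&2
        return 1
    fi
}
```

A monitoring tool can then watch either the exit status of the script or the contents of the logfile.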
On closer inspection, Duplicity turns out to be a full-featured backup tool. Although you can quickly and easily back up a single directory, more complex situations require more parameters, which can lead to a very complex command line. The documentation is also restricted to the man page, although it is extremely detailed. On a positive note, Duplicity can be integrated into your own shell scripts. Thanks to automatic encryption, you can also store backups in the cloud, even if this is sensitive data for which privacy policies need to be observed.