New features in the Bareos Bacula fork
Better Backups
For years, an open source version of Bacula has been a popular solution for managing "backup, recovery, and verification of computer data" [1] on a network of diverse computers, operating systems, and storage media. Using the client-server model, Bacula scales from single computers to enterprise installations of hundreds of entities.
The open source version of Bacula was first published in 2002 and quickly found support in the community. Recently, less and less work has been put into the free Bacula, and new commits into the public Git project now occur only once every few months, with the developers seemingly focusing on the commercial Bacula Enterprise Edition, which is not publicly developed.
In 2010, long-standing Bacula developer Marco van Wieringen thus started to maintain, in a separate Git repository, enhancements and code cleanups that either were not accepted or were only proposed for integration into the commercial version. From this seed grew the decision by some former members of the Bacula community to continue development of an independent fork named Bareos.
The first stable release was Bareos 12.4 in April 2013 (the version number stands for the year and the quarter of the feature freeze). The current beta is version 13.2. On September 25, 2013, at the Open Source Backup Conference, formerly known as the Bacula Conference [2], the Bareos project was introduced to an interested audience.
Before you start working with Bacula or Bareos or start planning a test installation, you should take a look at how the tools function (Figure 1). The basic structure always consists of a control unit, the Backup Director, one or more Storage daemons, and the File daemons on the clients to be backed up.
The File daemons are responsible for backing up the data from the client or restoring the data on the client again. This daemon runs permanently on the clients and carries out the Director's instructions.
The Director is the controller: It contains all the logic and accounts for most of the settings. Its configuration file describes the following:
- The database configuration
- All client systems and how they are addressed
- Which files should be backed up (a FileSet)
- The plugin configuration
- The before and after jobs (i.e., programs that are started before or after a backup job, e.g., to start and stop services)
- The storage and media pool with its properties and retention times
- The backup schedules
- Addresses for messages
- Jobs and JobDefs (job defaults)
Defining storage, a FileSet, and a client is not enough. These components are brought together by jobs, which define what is where and when to back it up.
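As a rough sketch of how a job ties these pieces together, a Job resource in the Director configuration might look like the following. The resource names (BackupClient2, client2-fd, Full Set, WeeklyCycle, File, Standard) and the script paths are illustrative assumptions; substitute whatever is defined in your own configuration.

```conf
# Sketch of a Job resource in /etc/bareos/bareos-dir.conf.
# All names and script paths below are assumptions for illustration.
Job {
  Name = "BackupClient2"
  Type = Backup
  Level = Incremental          # default level; a Schedule can override it
  Client = client2-fd          # which client to back up
  FileSet = "Full Set"         # what to back up
  Schedule = "WeeklyCycle"     # when to back it up
  Storage = File               # where the data goes
  Pool = File                  # which media pool to use
  Messages = Standard          # where reports are sent
  # Optional before and after jobs, run on the client:
  Client Run Before Job = "/usr/local/bin/stop-service.sh"
  Client Run After Job = "/usr/local/bin/start-service.sh"
}
```

Directives shared by many jobs can be moved into a JobDefs resource and referenced from each Job, which keeps the individual job definitions short.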
The retention period for the backup data is controlled by File Retention, Job Retention, and Volume Retention periods. It makes sense to use only Volume Retention to control the retention times, because if several retention options overlap, you might experience surprising effects.
Volume Retention is defined per pool. By defining several pools, you can also work with different retention periods, such as for different systems or different backup types (e.g., full, differential, or incremental). The specified periods are the minimum retention periods.
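A minimal sketch of this approach, assuming two pools named Full and Incremental with example retention periods (the names and periods are not taken from any default configuration):

```conf
# Sketch: one pool per backup level, each with its own minimum retention.
# Pool names and periods are assumptions for illustration.
Pool {
  Name = Full
  Pool Type = Backup
  Volume Retention = 365 days  # volumes may be recycled after this period
  Recycle = yes
  AutoPrune = yes
}
Pool {
  Name = Incremental
  Pool Type = Backup
  Volume Retention = 30 days
  Recycle = yes
  AutoPrune = yes
}
# In the Job (or JobDefs) resource, each level is then routed to its pool:
#   Full Backup Pool = Full
#   Incremental Backup Pool = Incremental
```

Because the periods are minimums, a volume is only eligible for recycling once its retention time has passed; it is not necessarily overwritten right away.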
Improved Usability
One focus in Bareos's development is keeping the obstacles for newcomers as low as possible. Because newcomers are usually overwhelmed by configuration options, the Bareos project offers package repositories for popular Linux distributions and Windows [3]. For Windows, additional packages for the OPSI [4] software management solution are also offered. All versions are built automatically by the project's own instance of the Open Build Service (OBS) [5]. In comparison, Bacula.org offers only the source code, and Windows binaries are only available for cash.
To install a Bareos server on Linux, you just need to add the appropriate repository and then install the Bareos packages. Bareos supports three database back ends: MySQL, PostgreSQL, and SQLite. SQLite should only be used for test installations.
Most optimization effort in the future will flow into the PostgreSQL connection. To ensure that the desired back end really is installed, you need to select the packages bareos and bareos-database-postgresql (or bareos-database-mysql, if you prefer).
The database must be installed separately; Bareos only contains dependencies on database clients. This makes it possible for the database to run on a computer other than the Bareos server itself.
Unlike Bacula, Bareos defines the database to be used in the configuration file. In Bacula, you must build a version specifically for the respective database.
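As a sketch of what this looks like in practice, the back end is selected in the Catalog resource of the Director configuration. The catalog name and credentials below are placeholders, and the commented-out address line is only an assumption for the case in which the database runs on a separate host:

```conf
# Sketch of a Catalog resource in /etc/bareos/bareos-dir.conf.
# Name, user, and password are placeholders.
Catalog {
  Name = MyCatalog
  dbdriver = "postgresql"      # back end is chosen here, not at build time
  dbname = "bareos"
  dbuser = "bareos"
  dbpassword = ""
  # dbaddress = "db.example.com"   # hypothetical host, if the database is remote
}
```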
When you first install Bareos, it populates the configuration files in the /etc/bareos directory with meaningful values. After the installation, the admin needs to initialize the database and start the services (Listing 1).
Listing 1: Starting Services
su postgres -c /usr/lib/bareos/scripts/create_bareos_database
su postgres -c /usr/lib/bareos/scripts/make_bareos_tables
su postgres -c /usr/lib/bareos/scripts/grant_bareos_privileges
service bareos-dir start
service bareos-sd start
service bareos-fd start
In the automatic configuration, the backup is to disk by default (in /var/lib/bareos/storage). Bareos backs up to disk in exactly the same way as it backs up to a tape library. That is, files are created below /var/lib/bareos/storage, each corresponding to a tape. The advantage of this method is that uniform rules apply and retention times are handled in the same way for tapes and disks. The maximum file size and the maximum number of files are defined in the pool resource of the Director daemon's configuration (i.e., the /etc/bareos/bareos-dir.conf file).
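A minimal sketch of such limits in a pool resource follows. The sizes are example values chosen for illustration, not the shipped defaults:

```conf
# Sketch: limiting disk usage of a file-based pool.
# The values are assumptions; adjust them to the available disk space.
Pool {
  Name = File
  Pool Type = Backup
  Maximum Volume Bytes = 50G   # maximum size of each "virtual tape" file
  Maximum Volumes = 100        # maximum number of such files in this pool
}
```

With these example values, the pool can grow to at most 100 files of 50GB each, or roughly 5TB in total.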
To create a virtual tape, you need to start the bconsole program, which welcomes you with an asterisk prompt. After running label and assigning a name (in this example, file1), press 2 for the defined File pool (Listing 2). With status director, you can view the next scheduled jobs (Listing 3).
Listing 2: Labeling the Virtual Tape
*label
Automatically selected Storage: File
Enter new Volume name: file1
Defined Pools:
     1: Default
     2: File
     3: Scratch
Select the Pool (1-3): 2
Connecting to Storage daemon File at bareos:9103 ...
Sending label command for Volume "file1" Slot 0 ...
3000 OK label. VolBytes=186 Volume="file1" Device="FileStorage" (/var/lib/bareos/storage)
Catalog record for Volume "file1", Slot 0 successfully created.
Requesting to mount FileStorage ...
3001 OK mount requested. Device="FileStorage" (/var/lib/bareos/storage)
*
Listing 3: Status Display
*status director
Scheduled Jobs:
Level        Type    Pri  Scheduled        Name           Volume
=====================================================
Incremental  Backup  10   18-Jul-13 23:05  BackupClient1  file1
Full         Backup  11   18-Jul-13 23:10  BackupCatalog  file1
...
The backups are set in the configuration file to 23:05 hours (BackupClient1: filesystem) and 23:10 hours (BackupCatalog: backup of the database itself). To perform a test backup, you can launch it with the run command, specifying only which client you want to back up. The results are displayed by calling the status director command (Listing 4).
Listing 4: Status Director
*status director
...
Terminated Jobs:
JobId  Level  Files  Bytes    Status  Finished         Name
=====================================================
1      Full   135    6.679 M  OK      18-Jul-13 16:00  BackupClient1
2      Incr   0      0        OK      18-Jul-13 16:01  BackupClient1
...
The status scheduler command shows when jobs are scheduled, and status scheduler days = 365 does this for an entire year in advance.
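The start times of 23:05 and 23:10 seen in the scheduled jobs come from Schedule resources in the Director configuration. The following sketch is modeled on the kind of schedules a default configuration typically ships with; the schedule names and the exact Run lines are assumptions and may differ in your installation:

```conf
# Sketch of Schedule resources in /etc/bareos/bareos-dir.conf.
# Names and Run lines are assumptions for illustration.
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Differential 2nd-5th sun at 23:05
  Run = Incremental mon-sat at 23:05
}
# Catalog backup, started shortly after the client backup:
Schedule {
  Name = "WeeklyCycleAfterBackup"
  Run = Full sun-sat at 23:10
}
```

A job picks up one of these schedules through its Schedule directive, and the level given in each Run line overrides the job's default level for that run.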
Improvements
Apart from the installation, a number of other improvements make life easier for the Bareos administrator: Anyone who has ever worked with Bacula configuration files will be glad that, with Bareos, almost everything is predefined with sensible default values. In contrast to Bacula, Bareos also supports presets for string values, which means no more worrying about entering the Pid Directory and Working Directory directives in the File daemon configuration on the client. Bareos sets meaningful values for the appropriate platform when it creates the packages.
On Windows systems, you can now easily back up not just one, but all connected drives (Windows Drive Discovery). Bacula only supports this in the commercial version. The Volume Shadow Copy Service (VSS) call now discovers Windows drives automatically.
The use of tape libraries has been simplified. Tapes can now be moved from one slot to another within bconsole. Also, any existing Import/Export slots can be addressed conveniently using the import or export commands.
The tray monitor (a small icon in the system tray of the taskbar) runs on Windows and on Linux systems. The icon flashes to indicate that a backup is currently running on the system.
If a backup job fails, you can easily restart it with exactly the same parameters:
*rerun jobid=id
The backup administrator must ensure that all relevant data are retained for a specific period of time. For example, tax-related data might require a retention period of up to 10 years; you must plan carefully.
If you want to separate the data according to various properties, you can use pools in Bareos to do so. Sizes and retention times can be defined for the pools.
Complex Environments
Sometimes, calculating how big a backup will be is difficult. A first approach is to exclude certain directories and data types in the file lists that describe the backup. Alternatively, you can exclude files above a certain size. However, exclusion does not guarantee that a client will not accumulate large amounts of data that need to be backed up.
Bareos has a client quota that lets you determine the total amount of data to back up for a client. Additionally, you can use soft quotas and grace periods to learn at an early stage when a quota is nearly exhausted.
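A sketch of how such quotas might be set in a client entry of the Director configuration follows. The directive names reflect the Bareos Director's Client resource as far as known here, and the sizes and grace period are example values; check the documentation for your version before relying on them:

```conf
# Sketch: client quotas in /etc/bareos/bareos-dir.conf.
# Sizes and the grace period are assumptions for illustration.
Client {
  Name = client2-fd
  Address = client2
  Password = "secret"
  Soft Quota = 40G                  # exceeding this triggers a warning
  Soft Quota Grace Period = 7 days  # time allowed above the soft quota
  Hard Quota = 50G                  # jobs are aborted beyond this amount
}
```

The gap between the soft and hard values, together with the grace period, is what gives the administrator time to react before backups for that client start failing.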
Keep in mind that large amounts of data might be transported across the network, especially during a full backup. Therefore, Bareos's ability to limit the maximum network bandwidth used per client is useful. The directive Maximum Bandwidth Per Job needs to be added to the corresponding client entry in /etc/bareos/bareos-dir.conf:
Client {
  Name = client2-fd
  Address = client2
  Password = "secret"
  Maximum Bandwidth Per Job = 512 k/s
}
A key innovation is direct support for NDMP (Network Data Management Protocol), the native backup protocol of large NAS devices such as NetApp. Bareos version 12.4 supports full backup and restore, although restoring individual files is still in the testing phase.
A new plugin for backing up Microsoft SQL Server databases has been written that supports full, incremental, and differential backups; it also is in the testing phase.
The next project in the pipeline is backing up virtual machines via the VMware vStorage API. The first steps have already been taken.
Copy Jobs
Backup tapes are still the media of choice for backing up data, but backups on disk also have advantages. Thus, the approaches are often combined: Disk-to-disk-to-tape (D2D2T) backups are common. With this method, the data is first saved to disk, then transferred to a tape by a Migration or Copy job.
Before Bareos v13.2, Migration and Copy jobs were only supported within a Storage daemon (Figure 2). This restriction has been lifted in Bareos v13.2 – data can now be transported between Storage daemons (Figure 3). Thus, you can back up data from different firewall compartments, for example.
A corresponding Copy job can also copy data periodically to another Storage daemon. The data properties can be modified here to store the data without compression on the first Storage daemon but with compression on the second, making it possible to design scenarios such as backup-to-disk-to-cloud.
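The following is a minimal sketch of such a Copy job, assuming a source pool named File, a destination pool named Offsite, a Storage resource named RemoteStorage that points at the second Storage daemon, and a JobDefs resource named DefaultJob. All of these names are hypothetical, and the compression change mentioned above is not shown:

```conf
# Sketch: copying jobs from one pool to a pool on another Storage daemon.
# All resource names are assumptions for illustration.
Pool {
  Name = File
  Pool Type = Backup
  Next Pool = Offsite          # destination pool for Copy and Migration jobs
}
Pool {
  Name = Offsite
  Pool Type = Backup
  Storage = RemoteStorage      # Storage resource for the second Storage daemon
}
Job {
  Name = "CopyToOffsite"
  Type = Copy
  Pool = File                  # source pool
  Selection Type = PoolUncopiedJobs   # copy every job not yet copied
  Schedule = "WeeklyCycle"     # run periodically
  JobDefs = "DefaultJob"       # supplies the remaining required directives
}
```

A Migration job is set up the same way with Type = Migrate; the difference is that a migration moves the data, whereas a copy leaves the original job in the source pool.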
Passive Clients
Firewalls commonly cause problems when setting up the backup environment. In a normal connection in a Bareos/Bacula environment, the Backup Director would establish a connection to the client and tell it what to save and where. It also connects to the backup Storage daemon and tells it to accept and store the data from the client. Finally, the client establishes the actual data connection to the Storage daemon and sends its data to it.
If the client is behind a firewall, then packet filtering and network address translation (NAT) on the firewall can make a connection from the client to the Storage daemon difficult or impossible. The problematic connection is thus the actual data connection between the client and Storage daemon (Figure 4).
As of Bareos 13.2, this behavior is configurable. With the Passive client option, all connections are initiated by the server components. The client then only needs to accept connections. The process of opening connections between the Director and client and between the Director and the Storage daemon remains the same, but the actual data connection is now initiated not by the client, but by the Storage daemon. After the connection has been established, the data is, of course, sent from the client to the Storage daemon (Figure 5).
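As a sketch, the option is set in the client entry of the Director configuration. The client name, address, and password are placeholders, and depending on the Bareos version, additional settings in the File daemon's own configuration may be required, so treat this as a starting point rather than a complete recipe:

```conf
# Sketch: marking a client as passive in /etc/bareos/bareos-dir.conf.
# Name, address, and password are placeholders.
Client {
  Name = client2-fd
  Address = client2
  Password = "secret"
  Passive = yes                # Storage daemon connects to the client
}
```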
Besides its firewall friendliness, this approach offers another advantage: Because the passive client does not establish any data connections, it does not need working name resolution. In practical terms, name resolution often has been a problem with the conventional method.
Security
In terms of security, Bareos continues using the familiar safety features of Bacula, such as:
- checksum computation for each backed up file and verification during the restore, and
- the ability to encrypt connections between the daemons with TLS.
Additionally, Bareos adds some more interesting security features; for example, you can now choose the encryption method for software encryption. Previously, only AES128 was used. Now, the following methods are available: AES128, AES192, AES256, CAMELLIA128, CAMELLIA192, CAMELLIA256, AES128HMACSHA1, AES256HMACSHA1, and Blowfish.
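A sketch of how the cipher might be selected in the File daemon configuration on the client is shown below. The directive names follow the PKI settings of the Bareos File daemon as far as known here; the client name and the key file paths are hypothetical:

```conf
# Sketch: software encryption settings in the client's bareos-fd.conf.
# Name and key file paths are assumptions for illustration.
FileDaemon {
  Name = client2-fd
  PKI Signatures = yes
  PKI Encryption = yes
  PKI Keypair = "/etc/bareos/client2-fd.pem"   # client key and certificate
  PKI Master Key = "/etc/bareos/master.cert"   # master public key for recovery
  PKI Cipher = aes256          # one of the methods listed above
}
```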
In addition to the encryption options in the software, you can now directly use LTO tape drive hardware encryption. Encryption has been part of the LTO standard since LTO4, so all drives offer this option. Tape drive encryption relies on hardware support and thus has virtually no effect on the speed of your backup.
Whether you use LTO hardware encryption depends on your requirements. It is a particularly efficient option for those who want to outsource their tapes and, in doing so, prevent unauthorized persons from reading them. The passive client option I mentioned earlier also offers safety benefits: Because the client no longer needs to open a connection to the Storage daemon, the firewalls can block all connections into the backup network.
Previously, you could send arbitrary commands to the client through the Director. These commands were backup (run a backup), restore (perform a restore), verify (run a scan job to sync between system data and backed up data), estimate (estimate the amount of backup data), and runscript (run a script on the client system).
Now, you can use the Allowed JobCommand directive to filter these commands on the client. Commands that are not allowed are then not accepted by the client and not executed.
Running scripts on the system to be backed up poses a special security threat. If you cannot completely prohibit this scenario with Allowed JobCommand, you at least have the option of setting the directory in which scripts and commands must be located through Allowed ScriptDir. Commands that do not reside in this directory are not executed.
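The two directives can be sketched together in the Director entry of the client's File daemon configuration. Because case and spaces are ignored in directive names, Allowed Job Command below is the same directive as Allowed JobCommand. The Director name, password, and script directory are assumptions for illustration:

```conf
# Sketch: restricting what a Director may do, in the client's bareos-fd.conf.
# Director name, password, and directory are assumptions.
Director {
  Name = bareos-dir
  Password = "secret"
  # Only the listed commands are accepted; verify and estimate are rejected.
  Allowed Job Command = backup
  Allowed Job Command = restore
  Allowed Job Command = runscript
  # Scripts are run only if they reside in this directory.
  Allowed Script Dir = "/etc/bareos/scripts"
}
```

Leaving runscript out of the list disables script execution for that Director entirely, in which case the script directory restriction is no longer needed.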
Integration
It must be possible to distribute a backup client as efficiently as possible to the client systems and to run it there with as little maintenance as possible, especially if many different platforms are connected. Thus, Bareos also works with old client versions and supports Bacula File daemon versions from 2.0 (from 2007).
For Univention Corporate Server (UCS) environments, a Bareos version is also available in the Univention App Center. Through the UCS interface, you can specify whether or not to back up a computer. The Bareos server configuration is automatically generated from this and then prepares the client configuration.
Bareos also directly provides packages for the open source Windows software management solution OPSI. These packages can be installed on the OPSI server, assigned the appropriate settings, and then distributed to all connected Windows systems. A script then uses the OPSI JSON-RPC interface to create the appropriate Bareos Director configuration.
To complete the basic configuration of the software, Bareos offers a native installer for Windows that sets passwords and even opens the Windows Firewall. The File daemon and tray monitor are configured so that they work immediately.
A very good system for disaster recovery of Linux machines is available from the Relax-and-Recover (REAR) [6] project. This project's approach is twofold. Installed on the system to be backed up, the command
sudo /usr/sbin/rear -v mkrescue
creates a rescue system ISO file of about 60MB, including the active kernel, required driver modules, information on the hard disk setup, and network configuration. In the second step, the complete system is backed up using
sudo /usr/sbin/rear -v mkbackup
(e.g., to a shared NFS directory).
You could do without this second step if you use Bareos for your backups. Instead, a Bareos recovery module is built into the Rescue System so that, after booting the recovery system, you see an option for completely deleting the system and replacing it with the backup.
Quality Assurance
The entire development of Bareos occurs openly on GitHub [7]. Communication is handled via mailing lists. Feature requests and bugs can be posted on the bug tracking system [8]. You can find more information at the Bareos Community page [9].
Three different systems are used for automated quality assurance:
- Build tests based on Travis [10]
- Regression tests based on CDASH
- Tests of the various platforms based on Jenkins and virtual machines
Every commit in GitHub automatically triggers a build process on the Travis CI Bareos repository [11]. There, the source code is compiled, the daemons are started, and a backup and restore are performed; in other words, each commit is checked to see whether Bareos is still functional. Further tests are carried out on a CDASH-based regression test system [12]. Currently, about 130 different tests check specific Bareos functions.
The development workflow in Bareos envisages that a ticket should not be closed until a regression test has been created for a new property. This is then noted on the ticket.
A new release is only created when packages built for Bareos on the Open Build Service have also successfully passed a test based on Jenkins. In this test, the packages for the various platforms are tested on the corresponding virtual machines. On each platform, the package installation, data backup, and restore are checked automatically.
The Windows packages are built using OBS and cross-compilation. The results are the Windows installer and the OPSI packages.
Future
The path taken thus far has earned the Bareos project much encouragement. The decision to build the infrastructure for largely automated package generation and testing at the start of the project has proven successful. More platforms can now be added with little effort, with the certainty that problems are detected very quickly by continuous testing.
Another positive aspect is that Bareos is developed in a fully open environment. Although Bareos GmbH & Co KG offers commercial subscriptions and support, all additions and new features are developed in an open GitHub project.
The roadmap envisions keeping to the present course for future developments to provide easy access and improved usability for administrators, integration with other projects and distributions, and functional enhancements. Plans to improve the default configuration should make it even easier to get started, and whitepapers will better illuminate certain issues.
A subproject is working hard on developing a configuration API to ensure that certain configuration changes can be carried out at run time without problems (e.g., adding clients). Front ends like Webacula [13] will then be able to expand their functionality easily.