Free backup tool for data centers
Backup with Bacula
One of Bacula's biggest advantages is that it doesn't continually vie for the administrator's attention. Instead, it goes about its work quietly and reliably in the background. Once configured correctly, Bacula will run for a long time without any administrative overhead. And, even in a worst-case scenario where the backup server crashes, you don't lose anything. Instead, you can restore directly from the backup media. In other words, if your priorities are stability, reliability, and robustness, Bacula is the tool you've been looking for.
If you work with Bacula, one thing will immediately grab your attention: In the Bacula system, each task is handled by a separate program. Tasks can include reading the data to be backed up and transferring them across the network, writing to the backup media, or updating the catalogs. A separate daemon exists for each of these tasks in Bacula:
- The file daemon runs on the computer you want to back up and reads the files in order to transfer them across the network. In a restore scenario, it receives the data and writes them out to disk.
- The storage daemon addresses the storage media in the Bacula system. It receives the data read by the file daemon and stores it on the backup media. In a restore scenario, the process is reversed: It reads the data from the media and sends it to the file daemon.
- The director daemon manages the information in the catalog database. For each backup, the catalog database stores the details of which file daemon stored which backup on which medium. The director daemon is also responsible for scheduling and starting the backups on time. It also receives and processes messages, statistics, and reports.
- The Bacula console is a simple user interface for Bacula and gives the administrator access to the director daemon. This gives administrators an interactive option for launching backups and restores, and for querying the status of all the daemons in the Bacula system.
Figure 1 shows the four components of the Bacula system. Communication between the various Bacula daemons is handled via three ports registered with IANA (Table 1).
Table 1: Bacula Ports
Name | Port | Description |
---|---|---|
bacula-dir | 9101/tcp | Bacula Director |
bacula-dir | 9101/udp | Bacula Director |
bacula-fd | 9102/tcp | Bacula File Daemon |
bacula-fd | 9102/udp | Bacula File Daemon |
bacula-sd | 9103/tcp | Bacula Storage Daemon |
bacula-sd | 9103/udp | Bacula Storage Daemon |
Configuration
Each element on a Bacula system has its own configuration file. The configuration file syntax is the same in each case. Each Bacula configuration file comprises one or multiple resources. Each resource comprises configuration directives in keyword = value pairs and can contain sub-resources. Here is the generic configuration format:

    ResourceType {
      keyword = value
      keyword = value
      Sub-ResourceType {
        Keyword = value
        Keyword = value
      }
    }
Each resource definition starts with the resource type, followed by the resource content as keyword = value pairs; some resources can have additional sub-resources. Depending on the keyword, the value part can be text, a numeric value, or a pointer to another resource; in the latter case, the name of the referenced resource is entered.
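To make the nesting rules concrete, the following Python sketch reads text in the generic format into nested dictionaries. It is only an illustration and not Bacula's own parser: the function names are hypothetical, and it assumes one directive per line, an opening brace at the end of the resource line, and a closing brace on a line of its own.

```python
def unquote(value):
    """Strip one pair of surrounding double quotes, if that is all there is."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"' and '"' not in value[1:-1]:
        return value[1:-1]
    return value


def parse_resources(text):
    """Parse 'Type { keyword = value ... }' blocks into nested dicts (line-based sketch)."""
    root = {"type": None, "directives": [], "children": []}
    stack = [root]  # the innermost open resource is always stack[-1]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        if line.endswith("{"):
            node = {"type": line[:-1].strip(), "directives": [], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("closing brace without an open resource")
            stack.pop()
        else:
            # Split at the first '=' only; the value may itself contain '='.
            key, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected keyword = value, got: {line!r}")
            stack[-1]["directives"].append((key.strip(), unquote(value.strip())))
    if len(stack) != 1:
        raise ValueError("missing closing brace")
    return root["children"]


SAMPLE = """
Fileset {
  Name = "Full Set"
  Include {
    File = /usr/sbin
    Options {
      Signature = MD5
    }
  }
}
"""

for resource in parse_resources(SAMPLE):
    print(resource["type"], resource["directives"])  # Fileset [('Name', 'Full Set')]
```

Directives are kept as a list of pairs rather than a dictionary because a keyword such as File can legitimately occur several times in one resource.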
Listing 1 shows a couple of genuine resources from a director daemon configuration file. The director resource configures the daemon itself. The fileset resource is a good example of a resource with sub-resources. The director resource points to a messages resource by the name of "Daemon" in the Messages = Daemon line.
Listing 1: Excerpt from a Director Configuration

    Director {
      Name = bacula-dir
      Messages = Daemon
      Password = "YdhKKoy2Huq1CVHwIR"
      Pid Directory = "/var/run"
      Query File = "/usr/lib/bacula/query.sql"
      Working Directory = "/var/lib/bacula"
    }

    Fileset {
      Name = "Full Set"
      Include {
        File = /usr/sbin
        Options {
          Signature = MD5
        }
      }
      Exclude {
        File = /var/lib/bacula
        File = /tmp
      }
    }

    Messages {
      Name = Daemon
      Append = "/var/lib/bacula/log" = all, !skipped
      Console = all, !skipped, !saved
      Mail = root@localhost = all, !skipped
      Mail Command = "/usr/sbin/bsmtp -h localhost -f \"\(Bacula\) \<%r\>\" -s \"Bacula daemon message\" %r"
    }
After completing the initial configuration for communications with the director, no further changes are typically needed. Manual attention to the file daemon or console configuration is not necessary, and changes to the storage daemon are very rare once the Bacula system is up and running. The most frequent changes relate to the Bacula director settings, which are essentially the configuration settings for the overall system. The following list explains the resources in the director configuration file and their tasks:
- Director – Configuration parameters for the director itself are set in the director resource. The central directives are the name and password used by the Bacula console to communicate with the director. The director resource occurs only once in the configuration.
- Storage – The director's storage resource configures the address and the password the director uses to reach a storage daemon. It also specifies the device and media type configuration for this storage location.
- Catalog – The catalog resource describes the director's access to the catalog database: database name, database server name, user name, and password. The database stores information on all the backed up data.
- Messages – Bacula has a sophisticated message distribution system that lets you manage the messages generated by the system flexibly. Each message belongs to a message class, and messages can be filtered by class. The messages resource controls this processing. The sample resource by the name of Daemon defines the following: All messages with the exception of the skipped class are written to the /var/lib/bacula/log file. All messages except those belonging to the skipped or saved classes are output at the console. Everything apart from skipped is mailed to root@localhost. The Mail Command defines how to do this.
- Pool – Bacula's media management groups identical media into pools. The target of any backup is a pool; Bacula then chooses a suitable medium from the pool.
- Console – A console resource is used to define console access in addition to full access to the director. ACLs can be used to restrict access to certain clients, pools, commands, and other Bacula resources.
- Client – The client resource tells the director the name, password, and address at which it can reach a client daemon. Instead of Client, the synonym FileDaemon can be used. When another computer is added to the backup, this action is done by creating a client resource.
- Fileset – A fileset states what to back up. The resource contains the name of the directories to be backed up and patterns for files and directories to exclude from the backup.
- Schedule – A schedule resource supports time planning and describes when a backup should occur. The scheduler in Bacula is extremely powerful and can handle a variety of interval types (Table 2). If the time parameter is missing, the system assumes always. The maximum time is 1 year.
- Job – The job resource defines the actual backup and groups the resources needed to perform it. For example, which computer needs to be backed up (Client)? What needs to be backed up on this computer (Fileset)? When should the backup take place (Schedule)? What media should the backup use (Pool)? And what device should be used to write the media (Storage Name)? Besides the references pointing to other resources, a job can define many other configuration directives.
- JobDefs – To reduce the configuration overhead, Bacula can define JobDefs resources. These act as templates for job resources and can contain all the configuration options for a job. Directives that are specified directly in the job override the defaults in the JobDefs resource. JobDefs can be used in production to configure centralized job groups.
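The class-based routing described for the Messages resource can be modeled in a few lines of Python. This is a simplified sketch for illustration only, not Bacula's implementation: the function names are hypothetical, the selectors are copied from the Daemon resource in Listing 1, and the sketch assumes that selectors are evaluated left to right, with a leading exclamation mark removing a class again.

```python
def selects(selector, message_class):
    """Return True if a selector such as 'all, !skipped' covers the given class."""
    selected = False
    for token in (part.strip() for part in selector.split(",")):
        if token == "all" or token == message_class:
            selected = True
        elif token == "!" + message_class:
            selected = False  # an explicit exclusion overrides an earlier 'all'
    return selected


# Destinations and selectors taken from the Messages resource in Listing 1.
DESTINATIONS = {
    "append /var/lib/bacula/log": "all, !skipped",
    "console": "all, !skipped, !saved",
    "mail root@localhost": "all, !skipped",
}


def route(message_class):
    """List every destination that receives messages of this class."""
    return [dest for dest, selector in DESTINATIONS.items() if selects(selector, message_class)]


for message_class in ("error", "saved", "skipped"):
    print(message_class, "->", route(message_class))
```

In this model an error message reaches all three destinations, a saved message reaches the log file and the mail recipient but not the console, and a skipped message is dropped everywhere.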
Table 2: Scheduler Intervals
Time Unit | Example |
---|---|
Hour, minute | at 23:05 |
Day of month | 12 |
Day of week | Mon |
Week of month | 2nd |
Week of year | w04 |
Monday through Saturday | mon-sat at 23:10 |
First Monday in month | 1st mon at 23:10 |
Monday in the 1st week of the year | mon w01 at 23:10 |
Besides the three main Bacula components (director, storage daemon, and file daemon), the Bacula package also contains other tools that run as standalone programs (Table 3).
Table 3: Utilities
Program | Purpose |
---|---|
bcopy | Copies Bacula media |
bextract | Can open Bacula media and extract files |
bscan | Can reconstruct the CatalogDB from Bacula media |
bsmtp | Bacula SMTP client |
btape | Program for testing tape drives with Bacula |
btraceback | Program for collecting information in case of a crash |
bregex, bwild | Programs for testing regular expressions or wild cards |
dbcheck | Program for managing and plausibility checking the catalog database |
Daemon Details
The Bacula file daemon is available for many operating systems – just about any flavor of Linux, Unix, Windows, and macOS. Besides backing up files, the file daemon can also back up filesystem ACLs. On Windows, it uses the Volume Shadow Copy Service (VSS) so that consistent backups of all VSS-capable applications can be created automatically.
Of course, the Bacula file daemon can run scripts before and after the backup. The script output and return values are taken into consideration and added to the backup report. The Bacula file daemon can also compress and encrypt the backup data before sending it to the storage daemon.
Encryption occurs transparently; however, only a file daemon with the right key can restore the data. A master key kept in secure storage avoids the risk of being unable to access the stored data if the client's key is lost.
The Bacula storage daemon writes data to backup media, which can include hard disks, single tape drives, and tape libraries. To ensure high throughput, the Bacula storage daemon can cache the data in a spool directory and then transfer it to tape in a single pass at high speed. A script is used to control tape libraries. The script can talk to basically any device that supports command-line based controls.
Additionally, the storage daemon offers the option of copying data from one storage medium to another, thus supporting migration and virtual backups.
The Director Daemon
The Bacula director supports several databases for its catalog: SQLite, MySQL, or PostgreSQL. Access control lists (ACLs) are useful for restricting access to resources, especially in large environments (Table 4). They make sure that certain administrators work only on a certain group of servers and, if necessary, only back up and restore data to these specific servers.
Table 4: Bacula ACLs
ACL name | Meaning |
---|---|
Catalog ACL | Restriction to specific catalogs |
Client ACL | Restriction to specific clients |
Command ACL | Restriction to specific console commands, e.g. restore only |
Fileset ACL | Restriction to specific filesets |
Job ACL | Restriction to specific jobs |
Plugin Options ACL | Restriction to specific plugin options |
Pool ACL | Restriction to specific pools |
Storage ACL | Restriction to specific devices |
Where ACL | Restriction to specific restore paths |
Restarting the Bacula director while a backup is running is a delicate matter; however, the director can read in most changes to the Bacula configuration at runtime without interrupting the backup.
The Bacula Console
Although the Bacula console is a command-line interface (CLI), it is very convenient. The help command gives the administrator an overview of the available commands at any time. When you execute a command, the required parameters are prompted for interactively in a menu.
If you compile the Bacula Console with readline support, the Bacula console has the same level of convenience as the Bash shell. Both tab completion for commands and parameters and a searchable history are available.
You can easily automate the Bacula console: Simply pipe the sequence of commands to STDIN, and the output occurs on STDOUT.
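As a sketch of this kind of automation, the following Python snippet feeds a sequence of commands to the console on STDIN and collects whatever it prints on STDOUT. Several things here are assumptions rather than facts from this article: the console binary is taken to be installed as bconsole and to find its configuration on its own, the helper names are hypothetical, and the two commands merely stand in for whatever you want to run.

```python
import shutil
import subprocess


def build_script(commands):
    """Join console commands into the newline-terminated text sent to STDIN."""
    return "\n".join(commands) + "\n"


def run_console(commands, binary="bconsole"):
    """Pipe the commands to the console program and return its STDOUT as text."""
    path = shutil.which(binary)
    if path is None:
        raise FileNotFoundError(f"{binary} not found in PATH")
    result = subprocess.run(
        [path],
        input=build_script(commands),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


if __name__ == "__main__" and shutil.which("bconsole"):
    # Only attempted on hosts where the console program is actually installed.
    print(run_console(["status director", "quit"]))
```

Because the output is returned as plain text, a wrapper like this can be called from monitoring checks or cron jobs and the result searched for the lines of interest.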
Accurate Backup
Besides simple full, incremental and differential backups, Bacula also has some interesting non-standard options. In normal operations, the number of backup jobs is typically far greater than the number of restores. In high-security environments, the file daemon can be launched in a mode that doesn't support write operations and only allows reads. This prevents manipulation of the target system even in case of a backup system compromise. The file daemon obviously has to be restarted with write capability before you can restore data.
Like most other backup solutions, Bacula compares the timestamp of the last backup with the timestamps of the files for incremental and differential backups to decide whether to back up specific files. This is a tried and trusted principle, but it can cause problems in some cases. For example, if you create files with an out-of-date timestamp, they will never be backed up. Also, Bacula will never notice that files have been deleted; in other words, a restore operation will always recreate files that no longer existed at backup time.
The Accurate Backup mode is Bacula's solution for avoiding all of these issues. Before creating a backup, a list of all known files and their sizes, permissions, and checksums on the system is transferred to the file daemon. The file daemon compares this list with the filesystem and backs up the difference. On the downside, Accurate Backup needs far more resources than the conventional backup on both server and client side.
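The difference between the two selection strategies can be illustrated with a small Python model. It is a sketch under simplifying assumptions and not Bacula's actual algorithm: the data structure and function names are hypothetical, and the file list is reduced to size, permissions, and checksum per path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    mtime: int      # modification time as seen on the client
    size: int
    mode: int       # permission bits
    checksum: str


def timestamp_selection(current, last_backup_time):
    """Conventional approach: pick files modified after the last backup."""
    return sorted(p for p, s in current.items() if s.mtime > last_backup_time)


def fingerprint(state):
    """The attributes an accurate comparison looks at in this sketch."""
    return (state.size, state.mode, state.checksum)


def accurate_selection(current, known):
    """Compare the known file list with the filesystem; report changes and deletions."""
    changed = sorted(
        path for path, state in current.items()
        if path not in known or fingerprint(known[path]) != fingerprint(state)
    )
    deleted = sorted(path for path in known if path not in current)
    return changed, deleted


LAST_BACKUP = 200
known = {
    "/etc/hosts": FileState(100, 120, 0o644, "a1"),
    "/etc/motd": FileState(100, 10, 0o644, "b2"),
}
current = {
    "/etc/hosts": FileState(100, 120, 0o644, "a1"),
    "/srv/old.tar": FileState(50, 999, 0o644, "c3"),   # new file, old timestamp
    "/etc/passwd": FileState(300, 80, 0o644, "d4"),    # new file, new timestamp
}

print(timestamp_selection(current, LAST_BACKUP))
print(accurate_selection(current, known))
```

In this example the timestamp comparison picks up only /etc/passwd, whereas the accurate comparison also finds /srv/old.tar and records that /etc/motd has been deleted.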
Virtual Backups
To improve redundancy and assure compliance with legal requirements, storing backups externally is often necessary. Copy jobs make it easy to copy backups onto external media such as tapes. As long as the data are available locally, Bacula will always access the original backups; if the data are no longer available locally, Bacula will request the external backups. You can also migrate backups onto external media for long-term archiving purposes. The source media are then released after the migration.
Full backups are normally performed at regular intervals. They are very large and consume correspondingly large amounts of time and network capacity. Most of the data in a full backup will typically already be available on the media from previous backups.
Virtual Full Backup accesses existing full, incremental, or differential backups and combines them with the changes since the last differential backup to create a new full backup. This approach reduces the load on the network and the client to the load created by an incremental backup, which will typically be less than 10% of the load caused by a full backup. If you decide to use Virtual Full Backups, it is a very good idea to use Accurate Backup also.
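The consolidation idea can be sketched in Python as follows. This is a conceptual model with hypothetical names, not the way Bacula stores or merges jobs: each backup is reduced to a mapping from path to file version, and each later backup also carries the list of deletions that only an accurate backup would know about.

```python
def consolidate(full, later_backups):
    """Build a new full backup from an old full plus later backups, oldest first."""
    merged = dict(full)
    for backup in later_backups:
        merged.update(backup["files"])          # newer versions replace older ones
        for path in backup.get("deleted", ()):
            merged.pop(path, None)              # drop files known to be deleted
    return merged


full = {"/etc/hosts": "v1", "/etc/motd": "v1", "/srv/data.db": "v1"}
incrementals = [
    {"files": {"/srv/data.db": "v2"}, "deleted": ["/etc/motd"]},
    {"files": {"/etc/passwd": "v1", "/srv/data.db": "v3"}, "deleted": []},
]

virtual_full = consolidate(full, incrementals)
for path in sorted(virtual_full):
    print(path, virtual_full[path])
```

Nothing in this step touches the client: the new full backup is assembled entirely from data the backup system already holds. Without the deletion lists, /etc/motd would survive into the consolidated backup, which is why combining Virtual Full Backups with Accurate Backup makes sense.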
Deduplication is also a topic for Bacula. Base jobs let you save huge volumes of data if you have many identical systems. The base job defines a common basis on which other backups are built. The existing base job data is only stored in the base job and is not backed up again.
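A minimal Python model shows where the savings come from. The names are hypothetical and the comparison is reduced to one checksum per path, so this is an illustration of the principle rather than a description of how Bacula matches files against a base job.

```python
def split_against_base(client_files, base_files):
    """Separate files that must be stored from files already covered by the base job."""
    stored = {
        path: checksum
        for path, checksum in client_files.items()
        if base_files.get(path) != checksum
    }
    referenced = sorted(path for path in client_files if path not in stored)
    return stored, referenced


base = {"/bin/ls": "aa", "/bin/cp": "bb", "/etc/hostname": "cc"}
client = {
    "/bin/ls": "aa",
    "/bin/cp": "bb",
    "/etc/hostname": "dd",      # differs from the base system
    "/home/a/notes": "ee",      # exists only on this client
}

stored, referenced = split_against_base(client, base)
print("stored:", sorted(stored))
print("taken from base job:", referenced)
```

With many identical systems, the referenced list covers most of each client's files, so only the small per-client remainder is written to the backup media.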
Plugins
Because it supports VSS on Windows, and because of its use of scripting, Bacula can create consistent backups of many programs without any additional tools. However, some programs require special attention to create a consistent backup and provide proprietary interfaces for the purpose.
To address an interface of this kind, the Bacula file daemon has the ability to integrate backup plugins. These plugins communicate with the backup interface provided by the program vendor.
For example, the NDMP plugin backs up and restores NAS servers through a standardized interface. For Windows operating systems, Bacula offers the option of backing up the system state and can also back up MS Exchange, MS SQL Server, and SharePoint with corresponding plugins.
Another interesting plugin supports deduplication at the block level, thus making it possible to back up very large files far more efficiently. Backups of applications such as databases and virtualization solutions will benefit greatly from this capability. Additionally, interacting with software from giants such as SAP and Oracle is no problem for Bacula, thanks to its plugins.
GUIs
Besides the Bacula Console, various other programs are available for managing Bacula. Bacula-Web, Webacula, and Bweb are three web-based programs. Bacula-Web [3] is easy to install and gives the administrator an overview of the system state; however, it cannot interact with the system.
Webacula [4] is far more powerful and, in addition to offering detailed system information, also has the ability to initiate backups and restores. Bweb can be downloaded from the Bacula website at [5] in the form of the bacula-gui archive. Bweb is developed by Bacula Systems and, like Webacula, is very powerful. In the enterprise version, Bweb also supports multiple director daemons.
The Bacula project also offers a native Qt program under the name of Bat (Bacula Admin Tool). Bat runs on various operating systems and is very powerful; it is extremely well suited to restore operations.
The version browser in Bat is a very practical tool. You can use it to view the files stored in a virtual filesystem tree (Figure 2). Additionally, all stored versions of this file are displayed to allow for a selective restore. You can tell from the file checksum whether or not the file has changed.
Conclusions
Although Bacula is powerful, it could be improved in some areas. For example, the number of plugins for commercial software is still fairly low. Having said this, Bacula Systems has already identified this issue and is planning to release a large number of new plugins.
Also, the process of configuring the system and text files is complex and can be error-prone on large systems. I am currently developing a program called dassModus that addresses this issue by supporting modifications to the configuration files for the Bacula system in a graphical interface.
Despite these challenges, Bacula is a free, stable, mature backup solution that has proved its value in large-scale environments. It is reliable and virtually maintenance-free.
If you require more details, visit the project website or attend the annual Bacula conference, which is being held along with the Bacula developer conference this year [6]. Additionally, Open Source Press intends to release a Bacula book this year.