The NDMP protocol and alternatives for filer backup
Freeway
However you do it, when you back up a server, you need to run some kind of software on the system that sends data directly to a medium or to a centralized backup server. Preconfigured servers – including NAS filers – typically prevent the user from installing any kind of additional software. This makes it difficult for the administrator to enforce a backup solution.
One approach is to use the backup agent provided by the NAS vendor. Some NAS systems come with a backup utility; however, this approach binds you to whatever backup solution the vendor decides to include. Even integrated agents could never hope to cover the scope of backup applications available on the market.
Another solution to this problem could be to mount the volume you want to back up on a backup server and then run an agent there, but this approach has its downsides. For example, all the data in the backup needs to cross the network to the backup server, and if you don't have a dedicated connection, the sheer volume of data is bound to slow down other traffic. Also, you might be infringing on security policies by mounting volumes on a server.
To cut this Gordian knot, Intelliguard (later acquired by Legato, which in turn belongs to EMC today) and Network Appliance (NetApp) joined forces in 1996 to develop the Network Data Management Protocol (NDMP). The NDMP standard enables protocol-compliant data servers that read from or write to disks to send a data stream to a tape server, which manages the backup medium.
Today, many storage systems by NetApp, HP, EMC, and others understand the protocol, and the NDMP website [1] lists a dozen or more vendors of NDMP-compatible backup software, including major players such as IBM, EMC, HP, Symantec, CA, and Fujitsu-Siemens. The NDMP module in the backup program thus serves as a standardized counterpart to the disk array and the tape library and can communicate with both directly. It is the array agent's responsibility to prepare or restore the data efficiently, whereas the backup software handles scheduling, provides a GUI to the user, and manages a catalog of backed-up files. A control module triggers the process and stores the logs and history.
NDMP Details
When a backup is handled by NDMP, the following happens (Figure 1):
1. The NDMP control program first contacts the tape server via TCP port 10000, authenticates, and ensures that the required tape is loaded in a drive. If necessary, the tape is initialized and labeled.
2. The control program talks to the data server, proves its identity, and configures some connection data.
3. The data server now opens a direct connection to the tape server and sends a stream of backup data via the connection. The logs from both sides and the file history of the transferred data are sent to the control program.
4. Where needed, the control program initiates tape changes in the course of the backup, relying on the tape library's media changer support.
5. The data server tells the control program that the backup has been completed. The connections are closed.
A restore utility follows a similar procedure.
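The division of labor in the five backup steps can be sketched in a few lines of Python. This is a toy model, not an NDMP implementation: the class names TapeServer and DataServer, the function run_backup, and the tape label are made up for illustration, authentication appears only as log entries, and the tape change in step 4 is left out. What the sketch is meant to show is that the backup data flows from the data server straight to the tape server, while the control program only sees the logs and the file history.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

NDMP_PORT = 10000  # TCP port named in step 1


@dataclass
class TapeServer:
    """Toy stand-in for the tape server (hypothetical, in-memory only)."""
    tape: Optional[str] = None
    blocks: List[bytes] = field(default_factory=list)

    def load_tape(self, label: str) -> None:
        self.tape = label

    def write(self, block: bytes) -> None:
        if self.tape is None:
            raise RuntimeError("no tape loaded")
        self.blocks.append(block)


@dataclass
class DataServer:
    """Toy stand-in for the filer's data server (hypothetical)."""
    files: Dict[str, bytes]

    def stream_to(self, tape: TapeServer) -> List[str]:
        # Step 3: the data goes directly to the tape server;
        # only the file history is handed back to the caller.
        history = []
        for name in sorted(self.files):
            tape.write(self.files[name])
            history.append(name)
        return history


def run_backup(data: DataServer, tape: TapeServer, label: str) -> dict:
    """Control program: orchestrates the backup without touching the data."""
    log = []
    # Step 1: contact the tape server and make sure a tape is loaded.
    log.append(f"tape server: connect on port {NDMP_PORT}, authenticate")
    tape.load_tape(label)
    log.append(f"tape server: tape {label} loaded")
    # Step 2: contact the data server (authentication is only logged here).
    log.append("data server: connect, authenticate, configure connection")
    # Step 3: the data server streams to the tape server.
    history = data.stream_to(tape)
    log.append(f"data server: streamed {len(history)} files to tape server")
    # Step 5: completion is reported and the connections are closed.
    log.append("backup complete, connections closed")
    return {"log": log, "history": history}


filer = DataServer(files={"/vol/b.txt": b"beta", "/vol/a.txt": b"alpha"})
library = TapeServer()
result = run_backup(filer, library, label="WEEKLY-01")
for line in result["log"]:
    print(line)
```

Note that run_backup never reads the file contents itself; in the sketch, as in the protocol description, the control program ends up with metadata only.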
Alternatives
The current NDMP version is version 4. Various extensions are in the pipeline for the successor version, including the ability to handle multiple sources and targets, backup and recovery checkpoints that support a restart after an interruption, snapshot management, improved authentication, and improved compatibility with firewalls and NAT environments. However, development seems to be dormant right now; the last standardization documents for version 5 are about 10 years old and look very much like drafts.
When we inquired about this, the IETF replied that the standardization drive had terminated in 2003. Perhaps talking to NDMP.org would reveal more, they suggested. But questions we sent to the contact address at the ndmp.org site were not answered at all.
This is reason enough to look around for alternatives, and one alternative comes from backup vendor SEP [2]. It uses an API provided by NetApp [3] to control its filers remotely. The backup software can use this API to mount the storage and show its directories in your browser (Figure 2). In the course of a backup, a snapshot is first created via the API, and the snapshot is then backed up. SEP recommends using a separate LAN connection for this to avoid the backup data interfering with the rest of the network.
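The snapshot-first sequence can be illustrated with a small in-memory model. The sketch below does not use the NetApp API; ToyFiler and its methods are hypothetical names, and the assumption that a snapshot is a frozen point-in-time copy is general snapshot behavior rather than something taken from the SEP documentation. It shows why backing up the snapshot rather than the live volume gives a consistent result even if files change while the backup runs.

```python
class ToyFiler:
    """In-memory stand-in for a filer (hypothetical, not the NetApp API)."""

    def __init__(self, files):
        self.files = dict(files)   # live volume contents
        self.snapshots = {}        # snapshot name -> frozen contents

    def create_snapshot(self, name):
        # Freeze a point-in-time copy of the live volume.
        self.snapshots[name] = dict(self.files)

    def read_snapshot(self, name):
        return self.snapshots[name]


toy_filer = ToyFiler({"report.txt": "v1"})

# First the snapshot is created ...
toy_filer.create_snapshot("nightly")

# ... the live volume keeps changing while the backup runs ...
toy_filer.files["report.txt"] = "v2"

# ... and the backup copies the snapshot, not the live data.
backup = dict(toy_filer.read_snapshot("nightly"))
print(backup)
```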
This approach also has some disadvantages: for example, it only works with NetApp or compatible storage, whereas NDMP is vendor independent. It also forces you to mount all the volumes on the backup server, and it puts some load on the local network if you don't have a separate connection between the filer and the backup server.
Wherever it is possible, and assuming you only need to back up NetApp products, the SEP solution is an interesting and highly functional alternative.