Lead Image © Stephen Coburn, fotolia.com

Four rescue systems compared

Emergency Response

Sys admins turn to rescue systems when faced with difficulties. Four alternatives – Grml, Rescatux, Knoppix, and SystemRescueCd – show what they can do. By Martin Loschwitz

One unpleasant experience for any system administrator is a server that does not respond as expected. Difficult questions follow: Why did the reboot fail? How can the server be resuscitated? The first question is especially difficult to answer, because if the computer won't even start, it is impossible to log in to diagnose the problem. Enter rescue systems: They come in many flavors, each with its own particular strengths. In this article, I sound out four flavors of rescue Linux: Grml, Knoppix, Rescatux, and SystemRescueCd.

Bootstrapping for an Emergency

A couple of questions need to be clarified before a test can produce meaningful results: What functions do rescue systems need to fulfill? What workflow should the administrator set up in advance so that the rescue system is ready in case of an emergency?

In the past, systems like Grml or Knoppix served as useful companions to many administrators when it came to bringing computers back to life that could not be booted. The systems resided on a CD or, in the case of Knoppix, a DVD. However, the rules of the game have changed in recent years: Five years ago, it was quite common for servers to be delivered with CD drives, but you often search in vain to find them on today's systems.

An optical disk drive is now irrelevant in practice: The media rarely play a role when administrators try to revive their systems in an emergency. Current servers can boot from flash media, such as a USB flash drive or SD card, although admins rarely boot from portable media now; they normally use one of the management frameworks from the major manufacturers (e.g., HP iLO, Dell DRAC, IBM RSA) instead of hotfooting it around a data center with a USB flash drive.

These systems work independently of the operating system on the host and boot the server on demand from any medium. In case of an emergency, a reboot can also be performed using the generic IPMI protocol; a combination of PXE and TFTP servers then ensures that the computer boots to the rescue system and not to the broken OS. Administrators would do well to set up such an infrastructure.
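As a rough illustration of the IPMI step, the following Python sketch only assembles the ipmitool command lines that would set the next boot device to PXE and then power-cycle the machine. The BMC hostname and user name are placeholders, and the sketch assumes that ipmitool is installed on the admin workstation and that the password is supplied in the IPMI_PASSWORD environment variable (the -E option). Nothing is executed unless you pass the lists to subprocess.run yourself.

```python
"""Sketch: assemble ipmitool calls that PXE-boot a server into a rescue system."""
import shlex


def ipmi_base(host, user):
    # -I lanplus: IPMI v2.0 over LAN; -E: read the password from $IPMI_PASSWORD
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E"]


def pxe_rescue_commands(host, user):
    """Return two argv lists: select PXE for the next boot, then power-cycle."""
    base = ipmi_base(host, user)
    return [
        base + ["chassis", "bootdev", "pxe"],
        base + ["chassis", "power", "cycle"],
    ]


if __name__ == "__main__":
    # bmc.example.com and "admin" are hypothetical placeholders
    for cmd in pxe_rescue_commands("bmc.example.com", "admin"):
        print(shlex.join(cmd))
```

Which image the server then receives depends entirely on the PXE and TFTP configuration on the network, which this sketch does not cover.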

Requirements

The sense and purpose of a rescue system is always to allow access to the broken system for repairs. However, a few requirements must be fulfilled for this to work. First, the rescue system should support current hardware in the best possible way. After all, a booted emergency system will not be a big help if it does not have drivers for the RAID controller and therefore does not recognize the existing disks.

Most test subjects therefore regularly publish new versions with updated kernels. However, that is only half the battle: Current servers sometimes require special additional drivers or firmware that might not have made their way into the rescue system for licensing reasons.

In a worst-case scenario, it might be necessary for administrators to build a corresponding kernel themselves based on the rescue system. Rescue tools therefore must provide the option to download additional components or to build and distribute a modified version of the original image. Finally, rescue systems need to support as many technologies as possible: Encrypted software RAIDs or LVM are the rule rather than the exception on servers.

Knoppix – Oldie but Goldie

Knoppix [1] was the first rescue system to be tested (Figure 1). The system has been around for more than 13 years, and anyone who has visited the annual CeBIT lectures by developer Klaus Knopper should have one or two Knoppix DVDs in their collection. Interestingly, Knoppix was not initially designed to serve as a rescue system for server administrators. Instead, Knopper's intention was to familiarize inexperienced users with Linux and Debian without having to install an operating system.

Figure 1: Knoppix is a complete 3D desktop with all the trimmings; however, the system's rescue capabilities should not be underestimated.

In the days before Knoppix, distributions did not have the perfectly functioning graphical installation routines that are common today. Anyone who wanted to install Debian had to fight their way through a number of text dialogs. Newcomers were therefore faced with almost insurmountable obstacles.

Knoppix became successful very quickly, because the user only had to boot from the CD to get a functioning Linux system in the blink of an eye – without even touching a local hard disk or running the risk of bricking the Windows installation that often resided on the disk. Many observers are now convinced that Knoppix made the Live system principle socially acceptable.

However, Knopper was later confronted with requests to make Knoppix permanently installable on hard drives, and such installations often caused greater problems. Later, major Linux manufacturers (e.g., SUSE and Canonical) jumped onto the Live CD bandwagon. Anyone installing Ubuntu or SUSE today can boot from a Live system and start the installation routine for the operating system from a CD, DVD, or flash drive.

Everything You Need

The reputation of being just a desktop system still clings unfairly to Knoppix. Although it starts the LXDE graphical user interface in standard mode out of the box, a text mode without X11 is also available. This is especially important because it is not much fun operating a graphical interface through the VNC client of IPMI, DRAC, iLO, or RSA.

Once you have logged into Knoppix, several options present themselves. You will usually want to access the rescue system via SSH to get rid of the VNC console. This is no problem with Knoppix – an SSH server is included. The network configuration can also be changed on the fly, as long as you are logged in locally via VNC console.

Knoppix impresses with its cornucopia of tools for system administration. The distribution handles LVM confidently, so administrators have access to all logical volumes (LVs). The same applies to software RAID configurations with the MD/RAID driver. As soon as the logical drive (be it an LVM LV or an MD/RAID device) is working, you can mount it and chroot into it. You can then tinker with the broken system as if it had booted.
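The typical order of operations can be sketched as follows. This Python snippet does not run anything; it merely returns the usual command sequence as argument lists. The root device /dev/vg0/root and the mountpoint /mnt are assumptions chosen for illustration, and the commands themselves would need root privileges on the booted rescue system.

```python
"""Sketch: command sequence for chrooting into a system on MD RAID plus LVM."""


def chroot_plan(root_device, mountpoint="/mnt"):
    """Return the commands, in order, as argv lists (nothing is executed)."""
    plan = [
        ["mdadm", "--assemble", "--scan"],   # start all detectable MD arrays
        ["vgchange", "-ay"],                 # activate every LVM volume group
        ["mount", root_device, mountpoint],  # mount the broken root filesystem
    ]
    # Bind-mount the kernel interfaces so tools inside the chroot work
    for pseudo in ("dev", "proc", "sys"):
        plan.append(["mount", "--bind", f"/{pseudo}", f"{mountpoint}/{pseudo}"])
    plan.append(["chroot", mountpoint, "/bin/bash"])
    return plan


if __name__ == "__main__":
    # /dev/vg0/root is a hypothetical logical volume name
    for cmd in chroot_plan("/dev/vg0/root"):
        print(" ".join(cmd))
```

The order matters: the RAID arrays have to exist before LVM can find its physical volumes, and the root filesystem has to be mounted before the bind mounts can be placed inside it.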

Having 32- and 64-bit binaries bundled on the Knoppix DVD image is also practical. You can be sure that you will be able to process any system with the same DVD and not need separate recovery media. However, the versatility has a snag: Version 7.4.2 of Knoppix is 4GB and, as you will see, clearly exceeds the size of its competitors.

Knoppix is made of a mixture of Debian "stable" topped off with packages from the testing branch. The kernel for Knoppix 7.4.2 is the no longer totally new 3.16; Knoppix version 7.5 takes it up to the more recent 3.18.6.

Modified Versions

Knopper provides a detailed guide on the Knoppix website for creating a locally modified version of the DVD [2]. This process might take a while, but it ultimately gives you results that are adapted to your circumstances.

Anyone who needs a special kernel module in Knoppix should follow his instructions: You build the module inside one unpacked copy of Knoppix and then install it in a second unpacked Knoppix hierarchy. This second copy serves as the golden master for the new DVD, which is available for use at the end of the process.

Grml: All-Purpose Weapon

Just mentioning Grml [3] (Figure 2) can cause administrators to nod sympathetically. Little wonder, because its inventor Michael Prokop designed Grml as an all-purpose tool for administrators and has stayed faithful to this line for years. Grml was created in response to a specific need: to give administrators exactly the tools they require for everyday use, from debugging systems to reviving failed servers and performing other maintenance work that is impossible from within the production system.

Figure 2: The classic: Grml is the most comprehensive tool for administrators wanting to resuscitate an ailing server.

Grml is virtually the counter-concept of Knoppix: Whereas everything was colorful and graphical with Knoppix in the beginning, and the emergency options were only added gradually, Grml was designed for professionals from the beginning.

To this day, Grml only contains a rudimentary graphical interface. If you insist on X11, you might be able to use Fluxbox as a window manager, but ultimately X11 is more of an annoyance when you are trying to save servers. Grml might have reduced graphics because large flash memories were rare in its early days.

Today, Grml can be found in three forms: Grml32 and Grml64, each about 350MB and intended for 32- and 64-bit systems, respectively, and Grml96, which combines both images, yet still fits on a CD. You will never have to download more than 700MB to fill up your toolbox.

Grml does not show any weaknesses in terms of functionality. Almost any conceivable sys admin task can be performed using the system, including work on LVM and software RAID drives, as well as on devices encrypted with dm-crypt. An SSH server is on board to avoid the need for remote VNC. The Z shell, which is the default in Grml, might be an unusual choice for many administrators, but it really doesn't cause a problem in everyday life. Anyone who is accustomed to Bash can cope with Zsh.
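When LVM sits on top of an encrypted device, the container has to be unlocked before the volume groups become visible. The following Python sketch only lists the commands in that order; the partition /dev/sda2 and the mapper name rescue_crypt are assumptions for illustration, and cryptsetup will prompt for the passphrase when the command is actually run.

```python
"""Sketch: unlock a LUKS container, then activate the LVM volumes inside it."""


def unlock_plan(luks_device, mapper_name="rescue_crypt"):
    """Return the commands, in order, as argv lists (nothing is executed)."""
    return [
        # Creates /dev/mapper/<mapper_name>; prompts for the passphrase
        ["cryptsetup", "luksOpen", luks_device, mapper_name],
        # Volume groups on the unlocked device can now be activated
        ["vgchange", "-ay"],
        # List the logical volumes that became available
        ["lvs"],
    ]


if __name__ == "__main__":
    # /dev/sda2 is a hypothetical encrypted partition
    for cmd in unlock_plan("/dev/sda2"):
        print(" ".join(cmd))
```

Newer cryptsetup releases also accept the equivalent form cryptsetup open; the older luksOpen spelling is used here because it works on both.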

Bootstrapping with Grml

The extras in Grml are almost as useful as the basic feature set. Fully Automatic Installation (FAI) is a Debian-specific bootstrapping system for bare metal deployment. A running Grml instance can become a complete FAI setup, as required. The grml-live tool, from the team led by Michael Prokop, builds on FAI and serves as the base for Debian-based Live distributions.

The Grml-Live framework makes remastering Grml CDs easier. Anyone needing a locally modified Grml version for specific hardware or kernel packages will be able to achieve their goal quickly using Grml-Live. Grml mainly comprises packages that users can find in the official Debian archive. Debian's cornucopia is available if you want to build a local Grml variant. For example, you can easily build a Grml image with a newer kernel. Anyone who needs special modules can load them, either in the form of a Debian package into the Grml run time or integrated into a local Grml image.
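To give an idea of what a local build might look like, the following Python sketch assembles a grml-live command line. It assumes the option letters and class names documented in the grml-live man page (-a for the architecture, -c for the class list, -o for the output directory, -g for the image name; classes such as GRMLBASE, GRML_FULL, and AMD64). The output directory, the image name, and any extra class are hypothetical, so check the documentation of your grml-live version before relying on them.

```python
"""Sketch: build a grml-live command line for a locally modified image."""

ARCH_CLASSES = {"amd64": "AMD64", "i386": "I386"}


def grml_live_command(output_dir, arch="amd64", flavour="GRML_FULL",
                      extra_classes=(), name="grml64-local"):
    """Return the argv list for one grml-live run (nothing is executed)."""
    # GRMLBASE is the common base class; the flavour and architecture follow
    classes = ["GRMLBASE", flavour, ARCH_CLASSES[arch], *extra_classes]
    return [
        "grml-live",
        "-a", arch,               # target architecture
        "-c", ",".join(classes),  # comma-separated class list
        "-o", output_dir,         # where the build tree and ISO end up
        "-g", name,               # name of the resulting image (hypothetical)
    ]


if __name__ == "__main__":
    # /srv/grml-build is a hypothetical output directory
    print(" ".join(grml_live_command("/srv/grml-build")))
```

A site-specific class passed via extra_classes would be the natural place to pull in an additional kernel or driver package, provided a matching class definition exists in the local grml-live configuration.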

Grml is a jack of all trades that, if necessary, acts as a terminal server and distributes Grml to all other computers on the network via PXE. Instructions for various topics related to Grml can be found on the Grml wiki [4]. Prokop has remained true to form: the Grml website does not provide information regarding how to install Grml on a disk.

Because Grml is not intended to be an everyday desktop system, anyone who wants to go that way should, according to Prokop, use a real Debian system. The only other thing you have to get used to is the ironic release names that Prokop assigns to Grml: Version 2014.11 was current at the time the magazine went to press and bore the name "Gschistigschasti," which means "clutter" in the Austrian flavor of German.

SystemRescueCd: Back to the Roots

SystemRescueCd [5] is an appropriate name for the next project; it is a succinct description of the tasks its authors supply on a Live CD or USB stick. Thematically, SystemRescueCd is more in line with Grml than Knoppix. The content of the image, at about 430MB, is limited to the essentials. SystemRescueCd does not aim to be a Live desktop system; rather, it is specifically a system for dealing with emergencies. The diversity provided by the solution is exciting.

The CD boot screen (Figure 3) shows that SystemRescueCd pays attention to more than just the Linux universe. Anyone who needs a DOS environment for updating firmware on a server device using Stone Age tools can simply boot to FreeDOS. The MemTest86 memory-testing software is also an option.

Figure 3: SystemRescueCd boots Linux in the default configuration, but FreeDOS and MemTest86 are also options.

SystemRescueCd is based on the Gentoo Linux distribution and contains an X11 environment that includes Firefox and a number of graphical tools that add to the appeal of SystemRescueCd. The graphical user interface can also be switched off. Tasks such as changing partition tables, reinstalling GRUB, or checking hard drives work directly from the booted system.

Filesystem backups are no problem either – users may even choose whether they would prefer to export the images to a connected storage device, burn them to an optical medium, or store them via the network. If you want to know whether your hard drive is still sane, you will appreciate the TestDisk tool.

Better Desktop than Server

SystemRescueCd seems to be aimed more at desktop users than at server administrators, in that it uses kernel 3.14, which is no longer quite up to date. Common desktop controllers typically still work well with older versions of the kernel, but that's not always the case with new storage controllers on servers. However, SystemRescueCd boasts basic server features; for example, it understands LVM very well.

One nifty feature is the ability to build your own SystemRescueCd [6]. Knowledge in dealing with Gentoo is useful, because additional software is installed using the Emerge tool, the command-line interface to the Gentoo Portage package management system.

Rescatux for Complex Tasks

The final rescue system to discuss is Rescatux (Figure 4) [7], which, unlike Grml and SystemRescueCd, caters to beginners. Rescatux is therefore not meant to be a comprehensive recovery tool for server administrators facing an emergency. Instead, it aims to make it as easy as possible for users to get their broken Linux desktop installations up and running again.

Figure 4: Rescatux is the surprise in the test: Rescapp is oriented toward inexperienced users and performs many tasks with just a few clicks.

Whereas Knoppix essentially provides all the tools you need to resuscitate a broken desktop Linux, albeit with a bit of sysadmin experience to use the tools appropriately, Rescatux boots into Rescapp, a GUI that displays various buttons for diagnosing and fixing common system problems. For example, you can reinstall the GRUB bootloader (Figure 5) or enforce a filesystem check.

Figure 5: Rescapp formulates clear questions for users who are not professional administrators and sets up GRUB based on the answers.

The list of functions that Rescatux combines in Rescapp is impressive. Above all, however, it is a tool for repairing broken GRUB installations. The development of Rescapp was initiated by the change from GRUB 1, which let you switch to a command line if the originally installed GRUB was faulty, to GRUB 2, which does not: You must have a running Linux system to repair a broken GRUB 2. With Rescapp, you retain the capability to restore GRUB.

After Rescatux boots, Rescapp automatically opens in the LXDE desktop; clicking the Grub (+) line unfolds the restore menu for the Linux bootloader.

Clicking Restore Grub takes the user directly to a wizard: In the first step, the wizard asks which operating system you want to display as an entry in the GRUB configuration. The tool is based on Debian and uses the os-prober application, which is part of the Debian installer, in the background.

In the next step of the wizard, the user specifies the disk on which the bootloader should be installed before specifying the order of the disks in the final step (a new device.map is created from this information in the background). A final click sets the Rescapp mechanism in motion. GRUB should then execute on the system as usual.
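To make the wizard's last two steps more concrete, the following Python sketch shows how a device.map could be generated from an ordered list of disks and which commands a manual GRUB restore on a Debian-style system would typically involve. This is not Rescapp's actual code; the disk names and the /mnt mountpoint are assumptions, and the sketch presumes the broken root filesystem is already mounted there with /dev, /proc, and /sys bind-mounted.

```python
"""Sketch: device.map content and commands for a manual GRUB restore."""


def build_device_map(disks):
    """Map the ordered disk list to GRUB's (hdN) names, one entry per line."""
    lines = [f"(hd{index})\t{disk}" for index, disk in enumerate(disks)]
    return "\n".join(lines) + "\n"


def grub_restore_commands(target_disk, root="/mnt"):
    """Return argv lists that reinstall GRUB from inside the chroot."""
    return [
        # Write the bootloader to the chosen disk
        ["chroot", root, "grub-install", target_disk],
        # Regenerate grub.cfg; on Debian this also runs os-prober if installed
        ["chroot", root, "update-grub"],
    ]


if __name__ == "__main__":
    # /dev/sda and /dev/sdb are hypothetical disks in the desired order
    print(build_device_map(["/dev/sda", "/dev/sdb"]), end="")
    for cmd in grub_restore_commands("/dev/sda"):
        print(" ".join(cmd))
```

The position of a disk in the list determines its (hdN) number, which is why the wizard asks for the disk order before it writes anything.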

Linux, but Not Just

In addition to handling all sorts of GRUB commands, current versions of Rescatux can force a filesystem check (including ext4 and XFS), change the root password, and regenerate an accidentally deleted sudoers file. Various tools for experts include GParted for disk partitioning or extundelete at the command line for recovering accidentally deleted files.
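For readers who prefer the command line, a forced check boils down to picking the right tool for the filesystem type. The Python sketch below maps a filesystem type to a commonly used check command; it is an illustration under the assumption that the filesystem is unmounted, not a description of what Rescapp runs internally. The device name is a placeholder.

```python
"""Sketch: choose a check command based on the filesystem type."""

_CHECKERS = {
    "ext2": ["e2fsck", "-f"],     # -f: check even if marked clean
    "ext3": ["e2fsck", "-f"],
    "ext4": ["e2fsck", "-f"],
    "xfs": ["xfs_repair", "-n"],  # -n: report problems, change nothing
}


def fsck_command(fstype, device):
    """Return the argv list for checking an unmounted filesystem."""
    try:
        base = _CHECKERS[fstype.lower()]
    except KeyError:
        raise ValueError(f"no checker known for filesystem type {fstype!r}") from None
    return base + [device]


if __name__ == "__main__":
    # /dev/sda1 is a hypothetical partition
    print(" ".join(fsck_command("ext4", "/dev/sda1")))
```

For XFS, the sketch deliberately uses the no-modify mode; dropping -n lets xfs_repair actually fix what it finds.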

Some important commands cater to Windows systems. For example, anyone who wipes out the Windows MBR while configuring a dual-boot setup can restore it using the Rescatux CD. Rescapp also changes the administrator password on Windows systems, and it is even possible to add a Windows user to the administrator role. If you accidentally lock your Windows account, you can solve that problem in the Windows section. The most recent Windows functions are still listed as beta in the tool's documentation, which means the developers cannot guarantee that they will work properly. Although the Rescatux approach might seem unusual at first glance, on closer inspection, the rescue system demonstrates its usefulness.

Rescatux is not suitable for administrators working on servers in the data center. Grml, Knoppix, or SystemRescueCd is the better choice for this group. Rescatux is more oriented toward inexperienced users who just want to use their broken Linux installations again.

Conclusions

A review of current rescue systems reveals great diversity. Grml clearly positions itself in the server section; Rescatux, on the other hand, is ideal for desktop system users. SystemRescueCd holds the middle ground, offering graphics, but it also works satisfactorily on servers. Knoppix is a special, but useful, case if the 4.7GB download doesn't pose a problem: The comprehensive desktop system has all the server recovery trimmings you could imagine.