When it comes to kernel version or security updates in Linux, most admins rely on a time-honored two-step procedure: They install the updated kernel packages provided by their distributor of choice, or they build a new kernel, and then they restart the system.
Anyone who has followed kernel updates of the various distributions in recent months and years will come to the conclusion that the legendary Linux uptime is only feasible if you do not install kernel patches and thus accept the associated vulnerabilities and other risks.
No Way! Rebooting a Cluster
To provide new kernel functions or security fixes, you need to reboot, but although this process is performed in the background thousands of times a day all over the world, it can create havoc that any administrator would prefer to avoid.
If the server you need to restart belongs to a cluster, for example, you need to take great care to avoid Pacemaker or some other cluster manager unintentionally identifying a failure and initiating an emergency response. Cluster admins will usually want to migrate running services manually to other systems before the reboot.
The reboot not only means more work but often downtime as well, and admins always need to mitigate the effect of service downtime. For this reason and others, IT professionals around the world seek to avoid reboots, even if they "only installed a new kernel."
Other groups would also be happy to avoid reboots. Kernel and driver developers could work more efficiently if they did not have to reboot after each code update, so hot patches are at the top of their wishlist.
Until now, hot patching was a fantasy on Linux. Recently, though, both SUSE and Red Hat launched solutions that will make kernel patching possible during operation. However, neither SUSE nor Red Hat invented the principle: Oracle has offered Ksplice, a kernel patching solution, for some time, but it has various shortcomings, and its license conditions do not exactly inspire confidence. Additionally, Oracle now reserves the tool exclusively for its own business customers; the open source variant of Ksplice has not been under development for some time.
Both Red Hat and SUSE offer alternatives to Ksplice that claim to be equivalent in terms of functionality: Kpatch and kGraft. Technically, the two approaches differ considerably, and their functionality diverges widely. Perhaps this divergence is why Red Hat and SUSE continue to press ahead with their own approaches, rather than agreeing on a single development path.
kGraft by SUSE
kGraft comes from SUSE's own development department. The concept has achieved production maturity, and the vendor has therefore gone public. kGraft leverages a number of functions that modern versions of the Linux kernel support, including int 3 trap calls, read-copy update (RCU), and function profiling by mcount. At the end of the day, this potpourri of approaches gives kGraft the skills it takes to replace code in the kernel with other code on the fly.
What then follows is technically highly complex. For example, kGraft draws massively on the fact that the profiling code from GCC leaves some space at the beginning of each function when compiling. This area is initially used by calls to the ftrace() function via an __fentry__ call. However, these are replaced by NOP entries at boot time so that each function starts with a deliberate space.
kGraft uses precisely this space, replacing it with int 3 handlers that can immediately jump to a different part of the kernel code when calling a specific function without taking any intermediate steps. That is, calling a function in a kernel with support for kGraft always means that kGraft itself is called, so kGraft is omnipresent.
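The redirection idea can be modeled in a few lines of user-space Python. This is only a sketch of the principle, not kGraft code: the class PatchableFunction, its apply_patch method, and the two sample functions are hypothetical names invented for illustration, and the "reserved slot" is an ordinary attribute rather than NOP bytes in machine code.

```python
from typing import Callable, Optional


class PatchableFunction:
    """User-space stand-in for a kernel function with a reserved entry slot."""

    def __init__(self, name: str, body: Callable[..., object]) -> None:
        self.name = name
        self._original = body
        # The reserved slot is empty (the NOP case) until a patch is applied.
        self._redirect: Optional[Callable[..., object]] = None

    def apply_patch(self, replacement: Callable[..., object]) -> None:
        """Fill the entry slot so that later calls jump to the replacement."""
        self._redirect = replacement

    def __call__(self, *args: object) -> object:
        # Every call passes through the entry slot first.
        if self._redirect is not None:
            return self._redirect(*args)
        return self._original(*args)


def label_v1() -> str:
    """Hypothetical original function."""
    return "VmallocChunk"


def label_v2() -> str:
    """Hypothetical patched function."""
    return "VMALLOCCHUNK"


label = PatchableFunction("label", label_v1)
before = label()              # slot is empty, original code runs
label.apply_patch(label_v2)
after = label()               # slot is filled, call is redirected
```

Callers keep using the same entry point before and after the patch; only the contents of the slot change.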
Patches as Kernel Modules
In kGraft's case, then, how is the code to which the handler needs to jump compiled into the kernel? The SUSE developers decided to provide patched kernel code in the form of ordinary kernel modules. For users to replace code on the fly, they need a matching kernel module with a newer version of the desired function. Only functions in the kernel can be replaced by kGraft; you cannot modify the internal data structures of the kernel with kGraft.
You load the kGraft kernel module just like any other module. In this way, kGraft updates can be integrated into the existing package architecture of many distributions. For SUSE, Red Hat, Debian, and Ubuntu, it is ultimately nothing new to deliver kernel functions in the form of additional module packages. Until now, the process has affected only drivers that were not previously in the kernel, but if you look at the kGraft module interfaces, they are identical to normal kernel modules.
This principle makes it possible to distribute kGraft patches in the form of packages via any installer system and load them into the running kernel with insmod. If you use Puppet or Chef, the entire process can even be automated. It's hard to imagine a more convenient way to implement in-place updates of kernel functions. Thus, you gradually collect modules until you reboot. Whether or not this ever-growing stack of modules causes any problems has yet to be seen.
Red Hat's Kpatch
Red Hat also presents a solution that lets you replace kernel code on the fly. Red Hat calls it Kpatch, and just as SUSE developed kGraft on its own, Red Hat developed Kpatch from scratch. Neither kGraft nor Kpatch borrows directly from Oracle's Ksplice, although in some respects Kpatch follows the same fundamental decisions as kGraft.
First, the similarities: Like SUSE's kGraft, Kpatch latches into the kernel via the Ftrace subsystem that GCC automatically includes at compile time. Ultimately, Kpatch thus fields function calls in a very similar way. Unlike kGraft, however, Kpatch is a kernel module itself; admins do not need to patch the host kernel.
This hot patch mechanism provides benefits in scenarios in which admins want to replace the patch mechanism itself with a newer version. With SUSE, this implies a reboot, but not necessarily for Red Hat.
Kpatch injects two modules into the kernel: The hot patch module contains the Kpatch functionality, and the kpatch core module provides the user interface by which patches are then injected into the kernel later on.
Et tu, Kpatch?
Kpatch, too, expects updates for functions in the form of kernel modules that are loaded into the host kernel. When you call a function for which a Kpatch patch provides a more recent version, the function call is forwarded to the "new" version of the function. The Kpatch core module serves as a central patch registry and knows whether Kpatch has installed a patch for a specific function; depending on this knowledge, it executes the old or new code.
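The registry idea can be sketched in user-space Python as follows. The class PatchRegistry and its methods are hypothetical and exist only for illustration; the real core module works on function addresses inside the kernel, not on names in a dictionary.

```python
from typing import Callable, Dict


class PatchRegistry:
    """Toy stand-in for a central registry of patched functions."""

    def __init__(self) -> None:
        # Maps a function name to its replacement.
        self._replacements: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, replacement: Callable[..., object]) -> None:
        """Record that a newer version of the named function is available."""
        self._replacements[name] = replacement

    def unregister(self, name: str) -> None:
        """Forget the replacement; later calls run the original again."""
        self._replacements.pop(name, None)

    def call(self, name: str, original: Callable[..., object], *args: object) -> object:
        # Run the new code if a patch is registered, else the old code.
        target = self._replacements.get(name, original)
        return target(*args)


def greet_old(who: str) -> str:
    return "hello " + who


def greet_new(who: str) -> str:
    return "HELLO " + who


registry = PatchRegistry()
unpatched = registry.call("greet", greet_old, "kernel")
registry.register("greet", greet_new)
patched = registry.call("greet", greet_old, "kernel")
registry.unregister("greet")
reverted = registry.call("greet", greet_old, "kernel")
```

The lookup on every call is what lets a single registry decide between old and new code without the callers knowing that a patch exists.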
Incidentally, Red Hat delivers a tool called kpatch-build that generates an updated module from a patch for Linux, and Kpatch also includes a utility that conveniently manages the existing hot patches. You can even use this tool to define patches to be reloaded after a system reboot.
At first, this method sounds as if it were contrary to the Kpatch approach as a whole, but in reality, it is very smart: If an admin has a working combination built from a stable kernel and additional patches, and the power then fails – making a reboot unavoidable – they might not want to face an additional problem caused by a kernel update. The better way might be just to recreate the known environment and load the update modules with the Kpatch utility.
In practice, further differences emerge when you compare the two solutions: Kpatch works more thoroughly than kGraft. With kGraft, the patch for a function only fields calls to the function that occur after the patch is applied. Calls that were already running remain unaffected, which means that several versions of the same function can exist in kernel space. This will probably be a problem if a kernel patch fixes a critical problem that affects, for example, central drivers for the network hardware. You cannot unload and reload these drivers easily to benefit from the corrected functions.
Red Hat follows a different philosophy; once a Kpatch patch is loaded, it replaces all active function calls in the kernel with the new version of the function, so that multiple versions do not exist.
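The kGraft behavior described above, in which a call that started before the patch finishes with the old code, can be imitated with Python generators. This is a loose analogy under stated assumptions: the dictionary functions and the two handler versions are hypothetical, and a paused generator merely plays the role of a kernel function that is still executing.

```python
from typing import Callable, Dict, Iterator, List


def handler_v1() -> Iterator[str]:
    """Hypothetical old version of a function, paused midway by yield."""
    yield "v1: started"
    yield "v1: finished"


def handler_v2() -> Iterator[str]:
    """Hypothetical patched version of the same function."""
    yield "v2: started"
    yield "v2: finished"


# The table plays the role of the redirect that new calls pass through.
functions: Dict[str, Callable[[], Iterator[str]]] = {"handler": handler_v1}

in_flight = functions["handler"]()      # a call begins before the patch
first_step = next(in_flight)

functions["handler"] = handler_v2       # the patch is applied

rest_of_old_call: List[str] = list(in_flight)        # still runs old code
new_call: List[str] = list(functions["handler"]())   # runs new code
```

For a short time, both versions are live, which is exactly the situation the text identifies as a possible problem for long-running code paths.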
The following instructions show how to run kGraft on openSUSE 13.1 and Kpatch on Fedora 20 with quite new kernels (Figure 1). Although you could follow these instructions for other distributions, you might find that the installation overhead is considerably greater.
Kpatch on Fedora
On Fedora 20, you first need to install some additional packages. The following instructions assume that all packages are up to date across the system. The kernel I used was version 3.14.3-200.fc20, the latest at the cut-off date for this issue.
For a functioning Kpatch, the following packages are required:
After installing these packages with yum install, the yum-builddep command makes sure the appropriate kernel headers find their way onto the system. If you want to use ccache, you should install this package as well. Next, you should download Kpatch by mirroring the repository on your local disk and putting the modules in place:
git clone https://github.com/dynup/kpatch
cd kpatch && make && make install
To test whether Kpatch is working as desired, just apply a small patch. The Red Hat example suggests modifying the code behind /proc/meminfo so that the kernel displays the VmallocChunk line in uppercase characters from now on. For Fedora 20, this is done with the patch in Listing 1.
Listing 1: Fedora 20 Patch
$ diff -ruN orig/fs/proc/meminfo.c new/fs/proc/meminfo.c
--- orig/fs/proc/meminfo.c 2014-03-31 05:40:15.000000000 +0200
+++ new/fs/proc/meminfo.c 2014-05-11 16:33:19.148771809 +0200
@@ -131,7 +131,7 @@
 "Committed_AS: %8lu kB\n"
 "VmallocTotal: %8lu kB\n"
 "VmallocUsed: %8lu kB\n"
- "VmallocChunk: %8lu kB\n"
+ "VMALLOCCHUNK: %8lu kB\n"
 #ifdef CONFIG_MEMORY_FAILURE
 "HardwareCorrupted: %5lu kB\n"
 #endif
After editing, the kpatch-build helper tool mentioned earlier initiates the process to build the patch. Before it can build a Kpatch module, the helper tool first builds the original kernel. In the second step, it applies the patch and builds the kernel again. Finally, the Kpatch module is created on the basis of the delta of the two trees. This process explains why the use of ccache makes sense.
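The "delta of the two trees" step can be pictured with a toy comparison in Python. This sketch assumes that each build is available as a mapping from function name to compiled bytes; the function changed_functions, the sample names, and the byte strings are all made up for illustration, and the real tool compares object files rather than dictionaries.

```python
from typing import Dict, List


def changed_functions(original: Dict[str, bytes], patched: Dict[str, bytes]) -> List[str]:
    """Return the names of functions whose compiled code differs or is new."""
    return sorted(
        name for name, code in patched.items() if original.get(name) != code
    )


# Hypothetical build results: only show_meminfo differs between the trees.
original_build = {"show_meminfo": b"\x01\x02\x03", "show_uptime": b"\x0a\x0b"}
patched_build = {"show_meminfo": b"\x01\xff\x03", "show_uptime": b"\x0a\x0b"}

delta = changed_functions(original_build, patched_build)
```

Only the functions in the delta need to go into the patch module; everything else in the second build is thrown away, which is why caching compiler output saves so much time.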
Working with Kpatch for the first time, you might be annoyed at how long the procedure takes. Compared with compiling a kernel and then rebooting dozens of computers, however, the loss in time is quickly put into perspective.
At the end of the build process you will see a kpatch-meminfo-string.ko file. To load this into the kernel, run
sudo kpatch load kpatch-meminfo-string.ko
A look at /proc/meminfo then reveals that the change worked (Figure 2). Following the same pattern, changes to other functions in the kernel should now also work.
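If you prefer to check the result with a script rather than by eye, a few lines of Python suffice. The helper meminfo_labels is a hypothetical name, and the sample text below is made up so that the sketch also runs on machines without the patch; on a patched system you would pass in the contents of /proc/meminfo instead.

```python
from typing import List


def meminfo_labels(text: str) -> List[str]:
    """Extract the label before the colon from each meminfo-style line."""
    return [line.split(":", 1)[0].strip() for line in text.splitlines() if ":" in line]


# Made-up excerpt in the format the patched kernel would print.
sample = (
    "VmallocTotal:   34359738367 kB\n"
    "VmallocUsed:         123456 kB\n"
    "VMALLOCCHUNK:   34359600000 kB\n"
)

labels = meminfo_labels(sample)
patch_active = "VMALLOCCHUNK" in labels
```

On a live system, something like meminfo_labels(open("/proc/meminfo").read()) would supply the real data.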
kGraft on openSUSE
As expected, our lab revealed that the barriers to installing kGraft on openSUSE are somewhat higher (Figure 3), because kGraft cannot be built as a module for Linux but requires a specially patched kernel. Before you can use kGraft, you always need to build a kernel.
Whereas Red Hat provides detailed Kpatch installation guides in the README.md file in the Kpatch GitHub repository, documentation for kGraft is comparatively sparse. Although SUSE provides a link you can follow to the kGraft-patched version of the currently active SUSE kernel, this only gives you the source code; detailed instructions are not to be found (Figure 4). In the quite non-trivial process of performing openSUSE brain surgery, admins are initially left on their own.
If you do find documentation, you can look forward to an intensive bout of tinkering: After a Git checkout of developer Jiri Slaby's repository, you can start by compiling the source code. In a separate blog entry, SUSE describes how a kernel can be produced SUSE-style, but replacing the kernel is nevertheless not trivial.
Even after the successful launch of a DIY kernel, your tribulations are not over yet. As with Kpatch, it is then up to the users to build the appropriate module files with the aid of various scripts by Slaby. At the end of the process, you then need to load the results into the kernel.
It is beyond the scope of this article to deliver a detailed explanation of the steps required to build a ready-to-use kGraft system, but if you are interested in the details, check out Slaby's repository.
The idea of applying kernel patches on the fly and not needing to boot a new kernel is basically great. Whereas the overhead for a few systems is manageable, the coordinated reboot of hundreds or thousands of nodes in large deployments is a tour de force. The ability to apply remedies on the fly without much ado is a good thing, but it can take a while for the technology to reach the end user. Currently, it is still completely unclear where the journey is heading.
Advantage Red Hat
The Red Hat solution has some identifiable advantages over the SUSE idea. The fact that you do not have to patch the kernel to support live patches is certainly helpful. Also, the ability to field errors currently seems to work better in Kpatch than in kGraft.
At first glance, both solutions seem to do the same thing. The differences only become apparent in use or during installation, which is much easier with Kpatch than with kGraft, not to mention Kpatch's superior documentation.
The question always arises in kernel-related features as to where the solutions are headed. Both SUSE and Red Hat have announced that they want to see their live patch solution become part of the official kernel in the foreseeable future. For this to work, however, they must have the approval of Linus Torvalds, who has repeatedly made a name for himself in the past as a kernel Cerberus.
Torvalds is also known for not being amused about having multiple implementations in Linux for the same or similar features, so he is unlikely to wave both kGraft and Kpatch through. Conceivably, one of the two solutions will prevail and the other will be ousted, or Torvalds will push for a joint venture, thus spawning a third solution combining the advantages of both systems.
If you want to try out Kpatch or kGraft today, you will not find matching packages listed in the openSUSE community version or in Fedora, which means a massive amount of manual work at the moment – for SUSE significantly more so than for Fedora.