Features Interview: Dan Rosenberg

From Out of the Blue

Bug Hunter

We sat down with Dan Rosenberg, security consultant and kernel bug hunter, to talk about code auditing, kernel fuzzing, and getting started in the security business.

By Kurt Seifried

Kurt Seifried: Who are you? I mean this in the sense that you seem to have come out of nowhere and started submitting a lot of good Linux kernel security bugs.

Dan Rosenberg: I'm currently employed as a security consultant at Virtual Security Research [1] in Boston, Massachusetts. I came to VSR immediately after completing my Master's degree in computer science. I've been involved in security for several years now, but it's only been in the past two or so years that I really turned my attention to vulnerability research and exploit development. Now that I've found an area that I'm passionate about, I think I'll stay awhile. :)

Kurt Seifried: Why did you choose the Linux kernel to be the apparent focus of your efforts?

DR: I started auditing the Linux kernel because I use it for both personal use and during penetration testing. It's rewarding to be able to contribute to a project that I use every day. Also, I don't think enough attention is given to the importance of kernel security – in many environments, the whole security model topples over if you're running a vulnerable kernel.

Kurt Seifried: When auditing the Linux kernel, what tools or techniques are you using specifically?

DR: The vast majority of Linux kernel vulnerabilities I've reported have been found with manual code auditing. Although there might be a slight initial learning curve in becoming familiar with the various subsystems and how they fit together, once an auditor has that basic knowledge, finding bugs becomes as simple as understanding which components are particularly exposed to unprivileged input and knowing what sorts of coding and design practices tend to introduce vulnerabilities.

Areas such as the networking and filesystem subsystems and various system call interfaces are always going to be worthwhile targets, because they provide so many interfaces with which unprivileged users can interact and potentially break things. Most of my auditing sessions don't require any fancy tools – I usually just stick to a text editor, grep, and a cross-referencer (either cscope [2] or LXR [3]).
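To make the grep-style workflow concrete, here is a minimal Python sketch (an editorial illustration, not something from the interview) that lists call sites of a few kernel helpers that copy data across the user/kernel boundary. The list of function names is only an assumed starting point, and `find_call_sites` is a hypothetical helper name; a real audit would still mean reading each hit by hand.

```python
import re

# Assumed starting list: helpers that move data across the user/kernel boundary.
USER_COPY_CALLS = ("copy_from_user", "copy_to_user", "get_user", "put_user")

# Match the plain name or its double-underscore variant, followed by "(".
PATTERN = re.compile(r"\b(?:__)?(" + "|".join(USER_COPY_CALLS) + r")\s*\(")


def find_call_sites(source: str, filename: str = "<memory>"):
    """Return (filename, line number, helper name, stripped line) for each hit."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        match = PATTERN.search(line)
        if match:
            hits.append((filename, lineno, match.group(1), line.strip()))
    return hits


if __name__ == "__main__":
    sample = (
        "int f(void __user *p)\n"
        "{\n"
        "\tif (copy_from_user(&v, p, len))\n"
        "\t\treturn -EFAULT;\n"
        "\treturn __get_user(x, p);\n"
        "}\n"
    )
    for hit in find_call_sites(sample, "toy.c"):
        print(hit)
```

A list like this only narrows down where to look; deciding whether a length or pointer at a given call site is attacker-controlled remains manual work.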

Although automated source code vulnerability scanners have their place, especially in auditing very large code bases, I haven't found them to be particularly useful when going after the Linux kernel. They tend to be most effective at finding the low-hanging fruit, so to speak, or highlighting code paths that might be poorly written and may warrant further manual auditing.

I know Coverity has been used on the Linux kernel before (prompting Linus Torvalds' often-mocked announcement with the release of the 2.6.11 kernel that "it's now officially all bug-free" [4]), but in my opinion, automated scanners aren't currently capable of understanding the semantics of complex code like the Linux kernel and, as such, aren't able to identify vulnerabilities more esoteric than common programming mistakes such as buffer overflows. Also, there's the issue of having to triage huge numbers of false positives to identify a much smaller handful of real bugs.

Kurt Seifried: Which tools or techniques do you find give you the most bang for the buck or allow you to find the largest volume of serious vulnerabilities?

DR: Code auditing currently gives me the best bang for the buck in terms of both volume and severity, but perhaps this will change as static analysis and fuzzing technologies improve.

Kurt Seifried: What do you think of fuzzing tools? Are they applicable to the Linux kernel in general, or are they primarily going to be limited to testing things like filesystems for now?

DR: Kernel fuzzing is an area where I hope to see more development in the future. I do a lot of fuzzing on userland targets, and in that space, it's been very useful in generating high-volume, high-severity vulnerabilities. The challenge with fuzzing the kernel, however, is that the interfaces aren't nearly as well-defined, and basic sanity checks tend to be performed very early in most call paths. As a result, blindly sending malformed input via system calls is unlikely to reach deeper, potentially vulnerable code paths, because obvious checks will fail early on.

To address this, some people have begun creating fuzzers that are a bit more intelligent about understanding what format is expected as input. Tavis Ormandy has done some nice work in this area with his "iknowthis" [5] fuzzer. I think these techniques will ultimately yield much better results in identifying vulnerabilities, but I'd consider this work to still be in its infancy.
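The difference between blind and format-aware fuzzing can be shown with a toy model. The Python sketch below is an editorial illustration under stated assumptions: `toy_handler` stands in for a kernel entry point that checks a magic number and a length field before doing any real work, and the magic value and layout are invented for this example. It does not reflect how "iknowthis" or any real kernel interface works.

```python
import random
import struct

MAGIC = 0x4B46555A  # hypothetical magic value for this toy interface


def toy_handler(buf: bytes) -> str:
    """Stand-in for an entry point: cheap sanity checks first, deeper code after."""
    if len(buf) < 8:
        return "rejected: too short"
    magic, length = struct.unpack_from("<II", buf)
    if magic != MAGIC:
        return "rejected: bad magic"
    if length != len(buf) - 8:
        return "rejected: bad length"
    return "reached deep path"


def blind_input(rng: random.Random) -> bytes:
    """Purely random bytes, generated with no knowledge of the layout."""
    return bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 64)))


def aware_input(rng: random.Random) -> bytes:
    """Well-formed header with a random payload: only the payload is fuzzed."""
    payload = bytes(rng.getrandbits(8) for _ in range(rng.randint(0, 56)))
    return struct.pack("<II", MAGIC, len(payload)) + payload


def count_deep(make_input, trials: int = 10_000, seed: int = 0) -> int:
    """Count how many generated inputs get past the early checks."""
    rng = random.Random(seed)
    return sum(
        toy_handler(make_input(rng)) == "reached deep path" for _ in range(trials)
    )


if __name__ == "__main__":
    print("blind:", count_deep(blind_input))
    print("aware:", count_deep(aware_input))
```

In this model, every format-aware input passes the header checks, whereas a blind input would have to guess a 32-bit magic value by chance, which is the effect described above.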

Kurt Seifried: Speaking of filesystems, would it be fair to say that they don't appear very robust in the face of an attacker? Why or why not?

DR: It all depends on your threat model. Most non-experimental filesystems are fairly robust in handling anything thrown at them during usage. However, one area that definitely needs improvement is handling the mounting of corrupt filesystem images. Many filesystem implementations immediately crash or behave erratically in these scenarios. Fortunately, this isn't really a feasible attack vector in most cases, because it would require a victim to download a corrupted filesystem and manually mount it, but the tendency of Linux distributions to enable things like automatic mounting of USB devices may make this slightly more interesting to attackers with physical access.

Kurt Seifried: Do you think people will start building "kernel" fuzzers – that is, fuzzing internal kernel structures in memory or signal handling?

DR: The problem I see with fuzzing kernel structures in memory is the difficulty of guaranteeing that you're actually causing a state that can be reproduced in normal circumstances by unprivileged users. In other words, even if you can cause the kernel to crash when you mutate a certain kernel structure, that isn't worth anything unless you can show how that mutation could be caused by an attacker. As a result, I think improving fuzzers that actually rely on existing interfaces would be a more useful immediate goal.

Kurt Seifried: Let's talk about "full-nelson.c" [6] for a moment. How did this exploit come about?

DR: I follow LKML (the Linux kernel mailing list [7]), so I saw Nelson Elhage's original report for the primary issue (the failure to revert an address limit override in an OOPSed thread's exit path). I thought it was a great find, and the fact that it would allow chaining several vulnerabilities to gain privileges was appealing to me.

Over the next few days, I chatted with Nelson about it, and we both started developing exploits independently. Two other issues he had recently reported – a null pointer dereference in the Econet protocol and a missing permissions check in Econet – were perfect candidates for providing a necessary trigger to exploit the address limit override vulnerability.

In general, I think publishing exploits is a good way to raise awareness of the consequences of vulnerabilities that may have been underreported, as well as keep the public up to date on some of the techniques real attackers are using when developing these kinds of exploits.

I used Nelson's bugs as an opportunity to test a new approach toward exploit publication, where I published a somewhat neutered version of the exploit that clearly demonstrated the impact and details of the issue but wouldn't be immediately usable to so-called "script kiddies" who use these exploits illegally. I received some positive feedback on this approach, so perhaps I'll use it again in the future.

Kurt Seifried: What advice would you give someone who wants to start in the security industry, specifically in the bug hunting business? Are there any specific books (like Gray Hat Python) or websites (like Packet Storm or Exploit Database) that you would recommend? It seems like most people who do source code auditing are largely self-taught beyond the basics.

DR: Even when you narrow things down to bug hunting, this is a huge topic. A basic background in programming and networking fundamentals is essential, regardless of your preferred techniques of finding vulnerabilities. Beyond that, the most important things are persistence and an eagerness to learn. Practicing bug hunting and exploitation in war games, such as SmashTheStack [8], may provide a good introduction to the field before you start to look in real software. Subscribing to various industry mailing lists (Full Disclosure and Bugtraq come to mind) is a good idea.

Whenever an interesting vulnerability or exploit becomes public, study it until you understand the techniques that were used to find or exploit the issue. By learning to recognize the kinds of practices that lead to vulnerabilities, you'll be better equipped to apply this knowledge in your own searches. Also, keeping track of the latest research in vulnerability discovery and exploit development is crucial, because the field changes very rapidly. Twitter is a great resource for this – you'd be surprised how much top-quality research is shared this way. Once you have an idea of the current state of the field, practice makes perfect: You'll never find vulnerabilities if you don't look.

In terms of reading, I really enjoyed The Art of Software Security Assessment [9], which is an excellent introduction to code auditing. As far as fuzzing goes, I highly recommend starting with simple random mutation fuzzers such as zzuf [10], which can yield a surprising number of vulnerabilities in a short amount of time. Many of the best "smart" (protocol- or format-aware) fuzzers, such as Sulley [11] or Peach [12], are well documented and can be learned by downloading and going through the provided reading materials. If you're interested in kernel exploitation in particular, A Guide to Kernel Exploitation [13] is an awesome read.
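For readers who have not seen one, the core of a random mutation fuzzer fits in a few lines. The Python sketch below is an editorial illustration of the general bit-flipping idea, not a reimplementation of zzuf; the function names and the default flip ratio are assumptions chosen for this example.

```python
import random


def mutate(data: bytes, ratio: float = 0.004, seed: int = 0) -> bytes:
    """Flip each bit of `data` independently with probability `ratio`."""
    rng = random.Random(seed)  # seeded so that any crashing case can be replayed
    out = bytearray(data)
    for i in range(len(out)):
        for bit in range(8):
            if rng.random() < ratio:
                out[i] ^= 1 << bit
    return bytes(out)


def fuzz_cases(sample: bytes, count: int, ratio: float = 0.004):
    """Yield `count` mutated variants of `sample`, one per seed."""
    for seed in range(count):
        yield mutate(sample, ratio, seed)


if __name__ == "__main__":
    sample = b"GIF89a" + bytes(32)  # stand-in for a valid input file
    for case in fuzz_cases(sample, 3, ratio=0.05):
        print(case.hex())
```

In practice, each variant would be fed to the target program while watching for crashes; recording the seed is what makes a crash reproducible.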

Kurt Seifried: If you could change any one thing about the Linux kernel, ignoring the cost to implement, backward compatibility, and so on, what would it be and why?

DR: If I had my way, the Linux kernel would include the PaX [14] project's KERNEXEC and UDEREF features. KERNEXEC implements proper page protections, enforcing read-only properties for critical sections such as the interrupt descriptor table (IDT), global descriptor table (GDT), and system call table, which are popular targets for kernel-write vulnerabilities. UDEREF solves the general problem of kernel-to-userland data accesses, including the well-known subset of null pointer dereferences.

Linux on x86 features a memory model in which kernel and userspace addresses both reside in every process's address space for performance reasons. So, if an attacker has a kernel vulnerability, it's possible to simply map an executable payload in userland and redirect the control flow of the kernel to execute the code in the process's userland mapping. By leveraging the segmentation features of the x86 architecture, UDEREF prevents this type of access and all other attacks in which kernel code improperly attempts to access data residing in userland. This feature has not been accepted (or even submitted) upstream because it's unlikely the performance penalties would be acceptable to the kernel maintainers; however, the security benefits would be huge.

Kurt Seifried: Do you think security techniques like disabling the loading of Linux kernel modules have any significant benefits in terms of security?

DR: Absolutely. Most distributions compile modules for hundreds of drivers, filesystems, and networking protocols. In many cases, unprivileged users can cause the automatic loading of these modules, for example, by opening a socket using an obscure networking protocol. Especially in the networking case, the ability to load these modules creates a huge attack surface. Attackers now can find a vulnerability in the most poorly written, obscure networking protocol, and it will affect nearly every distribution, despite the fact that 99.9% of users will never use that protocol.
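How little it takes to reach this code can be seen in a short Python sketch (an editorial illustration, not from the interview). On Linux, a socket() call for a protocol family with no registered handler causes the kernel to ask the module loader for an alias of the form net-pf-N, so an ordinary unprivileged call like the one below is enough to request a module. `probe_family` is a hypothetical helper name, and the value 19 for AF_ECONET is taken from the Linux socket headers; on systems without that module, the call simply fails with an error.

```python
import errno
import socket


def probe_family(family: int, sock_type: int = socket.SOCK_DGRAM):
    """Try to open a socket in `family`; return None on success, else the errno."""
    try:
        sock = socket.socket(family, sock_type)
    except OSError as exc:
        return exc.errno
    sock.close()
    return None


if __name__ == "__main__":
    AF_ECONET = 19  # protocol family number from the Linux headers
    result = probe_family(AF_ECONET)
    if result is None:
        print("socket created: the protocol is available to unprivileged users")
    else:
        print("socket refused:", errno.errorcode.get(result, result))
```

A refused socket is the desirable outcome here: it suggests the protocol is either not built or has been blocked from automatic loading.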

I've submitted patches to prevent automatic loading of modules in this way, but they were not accepted because of some implementation issues. I hope this area will see some improvement in the future. Distributions such as Ubuntu and Debian are finally starting to see the risk posed here and are explicitly disabling support for certain obscure networking protocols that have had vulnerabilities in the past.