Nuts and Bolts Performance Tuning Dojo 

Even benchmarks can be easy to handle

Stress Test

We show how to perform quick and easy benchmarks using stress. By Federico Lucifredi

Benchmarks are finely tuned mechanisms, fraught with complexity and exceptions that users must be aware of, lest the results be completely and utterly meaningless. There is one exception: pushing a device to 100% of its capacity on a single metric. This simple, easily understood measurement is remarkably portable across machines and nearly foolproof. Used correctly, single-metric benchmarks are a quick tool for debugging and tuning, providing objective and easy-to-obtain numbers that put an upper limit on what a system can accomplish under ideal conditions; real-world performance cannot possibly exceed that limit. It's an ideal test, even when more sophisticated approaches are needed to forecast the performance of an actual workload.

My original aim was to measure the power consumption of a Plug Computer [1] in its idle state (as defined by the manufacturer) and when running at full throttle. Many embedded devices run Linux these days, so this is not an uncommon scenario. When I was on the verge of reinventing the wheel, a colleague in the Ubuntu QA team pointed out a highly portable tool suitable for my aims: stress [2].

Stress (found in the package of the same name in the Ubuntu universe repository) is designed to put load on a system to stress-test it. Stress is very portable: A single C file is all you need to compile when your distribution's packaging fails you, which is not a small consideration when you are looking at different architectures in your tests, as I am. Stress is also portable across the *nix family, being equally at home on Linux, BSD, Mac, and a variety of Unix platforms.

As you go through stress's options, you can visualize their effect on the system via Byobu's [3] status line, the Gnome System Monitor [4], the handy top [5], or Apple Activity Monitor on a Mac [6], among others.

Starting with the CPU, you can task multiple process forks to compute square roots endlessly. By launching one fork for each core of your system, you can ensure 100% system load:

$ time stress --cpu 2
stress: info: [3855] dispatching hogs: 2 cpu, 0 io, 0 vm, 0 hdd
real    0m53.820s
user    0m47.143s
sys     0m0.000s

This load is enough to take the program's share of the CPU to 98%. Because the system was otherwise idle, the two CPU hog tasks take up all available capacity on both processor cores (Figure 1).

Figure 1: CPU load.
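
The one-hog-per-core rule can be scripted instead of hard-coded. The following is a minimal sketch, assuming a POSIX shell and a getconf that understands _NPROCESSORS_ONLN (common on Linux and Mac, but not guaranteed everywhere). The function names cpu_count and stress_cpu_cmd are made up for illustration, and the script only prints the command line so you can inspect it before running it:

```shell
#!/bin/sh
# Print the number of online CPU cores; fall back to 1 if it cannot be determined.
cpu_count() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric answer: assume one core
    esac
    echo "$n"
}

# Print a stress command line with one CPU hog per core.
# $1: run time in seconds, passed to --timeout (defaults to 60)
stress_cpu_cmd() {
    echo "stress --cpu $(cpu_count) --timeout ${1:-60}"
}

stress_cpu_cmd 60
```

On the dual-core machine used above, this should print a command equivalent to the one shown earlier, with a 60-second timeout added as a safety net.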

You can also target specific load averages for the system. Because these processes will always be in either running or runnable state, just add the desired number of hogs, account for the number of system cores, and wait for /proc/loadavg to converge. Of course, the CPU governor and power-saving settings will go on holiday.
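
Waiting for the load average to converge can also be automated. The sketch below assumes Linux (it reads /proc/loadavg, whose first field is the 1-minute average) and a stress instance already running in another terminal. The helper names load_1min, load_near, and wait_for_load, the 0.2 tolerance, and the five-minute cutoff are arbitrary choices for illustration, not part of stress:

```shell
#!/bin/sh
# Print the 1-minute load average (first field of a loadavg-format file).
# $1: file to read, defaults to /proc/loadavg (Linux only)
load_1min() {
    cut -d ' ' -f 1 "${1:-/proc/loadavg}"
}

# Succeed if load $1 is within tolerance $3 of target $2.
# awk handles the floating-point comparison, which plain sh cannot do.
load_near() {
    awk -v l="$1" -v t="$2" -v tol="$3" \
        'BEGIN { d = l - t; if (d < 0) d = -d; exit !(d <= tol) }'
}

# Poll every 5 seconds until the 1-minute average settles near the target,
# giving up after 60 tries (about 5 minutes).
# $1: target load average, e.g., 4 when "stress --cpu 4" is running
wait_for_load() {
    target=$1
    tries=0
    while [ "$tries" -lt 60 ]; do
        if load_near "$(load_1min)" "$target" 0.2; then
            echo "load average reached $target"
            return 0
        fi
        sleep 5
        tries=$((tries + 1))
    done
    echo "load average did not reach $target" >&2
    return 1
}
```

Calling wait_for_load 4 while four CPU hogs are running should return after a few minutes on an otherwise idle machine; background activity can push the average past the target, so treat the tolerance as a starting point.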

Multiple load-generating tasks ("hogs") can also be combined in the same command. You can create memory pressure in the system by launching eight RAM hogs:

$ stress -m 8 --verbose

Each of these tasks mallocs 256MB and touches a byte every 4096 bytes, which dirties each memory page and forces the kernel to allocate physical resources to it. The resources are then freed and allocated again, generating quite a bit of pressure on the memory allocator. To hold the allocation and keep the memory away from the rest of the system instead of cycling it, use --vm-hang SECONDS; a value of 0 holds the allocation indefinitely:

$ stress -m 8 --vm-hang 0

Free RAM drops by 2GB (eight hogs at 256MB each) as soon as you run this. You might want to use the --timeout option to ensure that the program quits automatically should your system become unresponsive.
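
The arithmetic behind these numbers is easy to check before committing memory on a small device. The following sketch assumes the 256MB allocation and 4096-byte stride described above; the function names are hypothetical, and the script prints the stress command rather than running it. The --vm-bytes option is spelled out here only to make the per-hog size explicit:

```shell
#!/bin/sh
# Total memory in MB claimed by a number of vm hogs.
# $1: number of hogs, $2: MB allocated per hog (defaults to 256)
vm_total_mb() {
    echo $(( $1 * ${2:-256} ))
}

# Number of bytes touched (one per stride) in a single allocation.
# $1: allocation size in MB, $2: stride in bytes (defaults to 4096)
vm_touches() {
    echo $(( $1 * 1024 * 1024 / ${2:-4096} ))
}

# Print a stress command that holds the allocation but stops after a timeout.
# $1: number of hogs, $2: MB per hog, $3: timeout in seconds
stress_vm_cmd() {
    echo "stress --vm $1 --vm-bytes ${2:-256}M --vm-hang 0 --timeout ${3:-60}"
}

echo "8 hogs x 256MB = $(vm_total_mb 8 256)MB"
echo "touches per hog: $(vm_touches 256 4096)"
stress_vm_cmd 8 256 60
```

Eight hogs work out to 2048MB, which matches the 2GB drop in free RAM, and each hog touches 65536 locations per pass. On a machine with less memory than that, lower the hog count or the per-hog size before running the printed command.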

I hope you are inspired to try a quick and easy stress test to determine whether your project is possible – before all the complex interrelationships and more sophisticated benchmarks complicate the picture. You may save considerable time and frustration in the process.