Nuts and Bolts Matlab-Like Tools

Matlab-like tools for high-performance computing

Numbers Game

The Matlab numerical computing environment is a good candidate for HPC systems applications, but a number of free and open source Matlab-like tools are available as well: Scilab, GNU Octave, and FreeMat have a large number of built-in computational routines and are easily programmed. By Jeff Layton

A common question from people who build both large and small HPC clusters is, "What applications can I run on my HPC system?"

One of the most popular applications is Matlab [1], which many people use in their everyday work and research – either Matlab or Matlab-like tools. For example, a fairly recent blog posting from Harvard University's Faculty of Arts and Sciences, Research Computing Group [2] showed that the second most popular Environment Module [3] was Matlab.

People are using Matlab for a variety of tasks that range from the humanities, to science, to engineering, to games, and more. Some researchers use it for parameter sweeps by launching 25,000 or more individual Matlab runs at the same time. Needless to say, Matlab is used very heavily at a number of places, so it is a very good candidate for running on an HPC system.
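To make the parameter-sweep idea concrete, here is a minimal shell sketch that starts one independent run per parameter value and caps how many run at once. It uses Octave (one of the tools discussed later) as a stand-in. The RUNNER and MAX_JOBS variables, the sweep and sweep_case helpers, and the toy expression being evaluated are my own illustrative assumptions, not something from a particular site's setup.

```shell
#!/bin/sh
# Sketch of a parameter sweep with a serial tool: one independent run per value.
# RUNNER, MAX_JOBS, sweep, and sweep_case are illustrative assumptions.

# Command used for one run; defaults to a quiet, non-interactive Octave.
RUNNER="${RUNNER:-octave --quiet --eval}"

# Run a single case. The expression is a stand-in for real work.
sweep_case() {
    p="$1"
    # $RUNNER is intentionally unquoted so that its options split into words.
    $RUNNER "p = $p; printf('%g %g\n', p, p^2);"
}

# Run every value given as an argument, at most MAX_JOBS at a time.
sweep() {
    max_jobs="${MAX_JOBS:-4}"
    running=0
    for p in "$@"; do
        sweep_case "$p" &
        running=$((running + 1))
        if [ "$running" -ge "$max_jobs" ]; then
            wait            # crude batching: wait for the whole batch to finish
            running=0
        fi
    done
    wait                    # wait for the final, possibly partial, batch
}
```

A call such as `MAX_JOBS=8 sweep 0.1 0.2 0.5 1.0` would then run four cases, up to eight at a time. On a real cluster, a job scheduler's array jobs would usually take the place of this loop, but the principle of many independent serial runs is the same.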

I don't want to take anything away from MathWorks, the creator of Matlab, because their product is a wonderful application, but for a number of reasons, Matlab might not be the answer for some people (e.g., they can't afford Matlab or can't afford 25,000 licenses, they just want to try a few Matlab features, or they want or need access to the source code). This brings up the category of tools typically called "Matlab-like"; that is, they try to emulate the concept of Matlab with compatible syntax so that moving back and forth is relatively easy. When people ask what tools or applications they can try on their shiny new cluster, I tend to recommend one of these Matlab-like tools, even though they aren't strictly parallel right out of the box (so to speak).

A few tools that are somewhat Matlab-like – some still surviving and some defunct – include RLaB, RLaB+, JMathlab, and O-Matrix (commercial). A whole host of other tools exists if you want to stray from Matlab compatibility even further, such as R or SciPy; however, in this article, I will talk about the open source tools Scilab, GNU Octave, and FreeMat. These tools try to be as close as possible to Matlab syntax so that Matlab code will transfer over easily, with the possible exception of Simulink [4] and Matlab GUI code. They have varying degrees of success with Matlab compatibility, and all are inherently serial applications. Serial in this case means that the vast majority of the code is executed on a single core, although some of the programs can do a small amount of parallel execution. For parallel code execution, you usually need an add-on, such as the Message Passing Interface (MPI) [5], and a code rewrite so that multiple instances of the tool on different nodes can communicate over a network.

I won't be comparing or contrasting the tools; rather, I'll briefly present them with some pointers on how to install and use the tool, and I'll leave the final determination of which tool is "better" for your case up to you.

Scilab

Scilab is one of the oldest Matlab-like tools. It was started in 1990 in France, and in May 2003, a Scilab Consortium was formed to better promote the tool. In June 2012, the Consortium created Scilab Enterprises, which provides a comprehensive set of services around Scilab. Currently, it also develops and maintains the software. Scilab is released under a GPL-compatible license called CeCILL (see Table 1 for Scilab resources).

Table 1: Scilab Resources

Resource	Location
Scilab	http://www.scilab.org/scilab/about
Scilab Enterprises	http://www.scilab-enterprises.com/
Xcos	http://www.scilab.org/scilab/features/xcos
GUI API	http://www.scilab.org/scilab/features/scilab/application_development
ATOMS	http://atoms.scilab.org/
sciGPGPU	http://atoms.scilab.org/toolboxes/sciGPGPU
OpenCL code	http://forge.scilab.org/index.php/p/sciCuda/
Wiki	http://wiki.scilab.org/
Matlab to Scilab	http://wiki.scilab.org/MatlabToScilab
Intro to Scilab (PPT)	http://www.heikell.fi/downloads/scilab.ppt
Linalg performance	http://wiki.scilab.org/Linalg%20performances
Compiling	http://wiki.scilab.org/Compiling%20Scilab%205.x%20under%20GNU-Linux%20Unix
Parallel computing	http://wiki.scilab.org/Documentation/ParallelComputingInScilab
`parallel_run`	http://help.scilab.org/docs/5.4.0/en_US/parallel_run.html
Parallel programming	http://my.opera.com/muksitsyahlan/blog/2011/01/05/parallel-programming-with-scilab-2
MPI code	http://gitweb.scilab.org/?p=scilab.git;a=shortlog;h=refs/heads/MPI

Prepackaged versions of Scilab exist for Linux (32-bit and 64-bit); Mac OS X; and Windows XP, Vista, and Windows 7, along with, of course, the source code. These packages include all of Scilab, including something called Xcos, which corresponds to Simulink from MathWorks. Scilab is the only open source Matlab-like tool to include something akin to Simulink. Scilab also comes with both 2D and 3D visualization, extensive optimization capability, statistics, control system design and analysis, signal processing, and the ability to create GUIs by writing code in Scilab. You can also interface Fortran, C, C++, Java, or .NET code to Scilab.

Installing Scilab on Linux is easy with either of the precompiled binaries (32- or 64-bit). I downloaded the 64-bit binary (a tar.gz file) and untarred it into /opt. This produces a subdirectory /opt/scilab-5.4.0 (the latest version as I wrote this). To run Scilab, I used:

/opt/scilab-5.4.0/bin/scilab

which brought up the Scilab GUI tool (Figure 1). The console in the middle of the figure accepts commands; a file browser is on the left, a variable browser at top right, and a command history on the bottom right. It also has a very nice built-in text editor called "SciNotes" (Figure 2), which can be used to write code.
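The tarball install described above amounts to a single tar command. The following is a minimal sketch under a few assumptions: install_tarball is a hypothetical helper, and the archive name and the job.sce script in the comments are illustrative, so check them against what you actually downloaded. The -nwni option (no window, no Java) is the one I would expect to matter on cluster nodes that have no display.

```shell
# Unpack a Scilab binary tarball under a prefix (the article used /opt).
# install_tarball is a hypothetical helper, not part of Scilab.
install_tarball() {
    archive="$1"
    prefix="${2:-/opt}"
    mkdir -p "$prefix" && tar -xzf "$archive" -C "$prefix"
}

# Example (not run here; writing to /opt normally needs root, and the
# archive and script names are assumptions):
#   install_tarball scilab-5.4.0.bin.linux-x86_64.tar.gz /opt
#   /opt/scilab-5.4.0/bin/scilab                     # GUI, as in the article
#   /opt/scilab-5.4.0/bin/scilab -nwni -f job.sce    # no GUI, no Java
```

A script started with -f should end with a quit or exit statement; otherwise, Scilab drops to its interactive prompt after the script finishes, which is rarely what you want in a batch job.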

Scilab's innovative Variable Browser lets you edit variables, including those in matrices, using something like a spreadsheet tool. When you first bring up the editor, it displays a list of the variables in the current workspace (Figure 3).

Figure 3: Scilab Variable Browser window.

When you double-click on a variable, you call up the variable editor to edit the values. For example, double-clicking on variable A brought up the spreadsheet-like view shown in Figure 4. At this point, I can edit any value for any entry of A.

Figure 4: Scilab Variable Editor – editing variable A.

A "Modules" capability adds extra functionality to Scilab. Much like the "toolboxes" of Matlab, Scilab keeps modules at a website called ATOMS (AuTomatic mOdules Management for Scilab). One of the most critical modules for HPC is probably sciGPGPU, which provides GPU computing capabilities. Using sciGPGPU within Scilab is relatively straightforward, but you need to know something about GPUs and CUDA [6] or OpenCL [7] to use it effectively. Listing 1 shows a code snippet taken from the main sciGPGPU site that illustrates how to use the cuBLAS library [8]. (You can also use the cuFFT library [9], but sample code for it is not shown.)

Listing 1: Scilab GPU Code Using sciGPGPU

stacksize('max');
// Init host data (CPU)
A = rand(1000,1000);
B = rand(1000,1000);
C = rand(1000,1000);

// Set host data on the Device (GPU)
dA = gpuSetData(A);
dC = gpuSetData(C);

d1 = gpuMult(A,B);
d2 = gpuMult(dA,dC);
d3 = gpuMult(d1,d2);
result = gpuGetData(d3); // Get result on host

// Free device memory
dA = gpuFree(dA);
dC = gpuFree(dC);
d1 = gpuFree(d1);
d2 = gpuFree(d2);
d3 = gpuFree(d3);

Scilab has a vibrant community, and the excellent Scilab wiki has a very good section on migrating from Matlab to Scilab. At this site, an extensive PDF discusses differences between Matlab and Scilab and how to change your Matlab code, if it needs to be changed, to run on Scilab.

An additional excellent Scilab resource is a PowerPoint presentation by Johnny Heikell that runs to 504 slides (at last count) and introduces Scilab and how to use it. Heikell also shows how to convert Matlab files to Scilab files. Keep in mind that the downloadable Scilab binaries are built to be as fast as possible while remaining portable across systems.

Because performance is extremely important in HPC, you might want to build Scilab yourself. This process would allow you to link against Intel's MKL library [10] to get the fastest possible BLAS and FFT operations on Intel processors, or against ACML (AMD Core Math Library) [11], which is tuned for AMD processors. Be sure to read all of the details on building Scilab at the wiki site; the GUI portion of Scilab requires Java.

GNU Octave

The GNU Octave project was conceived by John W. Eaton at the University of Wisconsin-Madison as a companion to a chemical reactor course he taught. Serious design of Octave, as it was first called, began in 1992, with the first alpha release on January 4, 1993, and the 1.0 release on February 17, 1994. In 1997, Octave became GNU Octave (starting with version 2.0.6). From the beginning, it was published under the GNU GPL: initially version 2 and later version 3.

For the rest of this article, I will refer to GNU Octave as just Octave. Like Scilab and Matlab, Octave is a high-level interactive language for numerical computations. Its language is very similar to, but slightly different from, Matlab. It comes with a large number of functions and packages and uses Gnuplot for plotting and visualization.

Octave is popular and widely used, perhaps partly because it is part of GNU, so it is commonly built for Linux distributions. However, I also think it is widely used because the basic syntax is close to Matlab, and it is open source. Some differences between Octave and Matlab are explained in the Octave wiki, a FAQ on porting, a table of key differences, and a wikibook (see Table 2 for Octave resources).

Table 2: GNU Octave Resources

Resource	Location
GNU Octave	http://www.gnu.org/software/octave/
Gnuplot	http://www.gnuplot.info/
Wiki	http://en.wikipedia.org/wiki/GNU_Octave
FAQ	http://wiki.octave.org/FAQ
Matlab/Octave differences	http://www.ece.ucdavis.edu/%7Ebbaas/6/notes/notes.diffs.octave.matlab.html
Programming differences between Matlab and Octave	http://en.wikibooks.org/wiki/MATLAB_Programming/Differences_between_Octave_and_MATLAB
SourceForge	http://octave.sourceforge.net/
Toolkits	http://octave.sourceforge.net/packages.php
HDF5	http://www.hdfgroup.org/HDF5/
Introduction	http://www-mdp.eng.cam.ac.uk/web/CD/engapps/octave/octavetut.pdf
JIT	http://jit-octave.blogspot.com/
Build with MKL	http://software.intel.com/en-us/articles/using-intel-mkl-in-gnu-octave
Build with ACML	http://luiseth.wordpress.com/2012/04/08/accelerate-your-matrix-computations-with-acml-on-kubuntu-11-10/
Parallel toolbox	http://octave.sourceforge.net/parallel/
`parcellfun`	http://octave.sourceforge.net/general/function/parcellfun.html
`openmpi_ext`	http://octave.sourceforge.net/openmpi_ext/index.html

A huge number of additional toolkits for Octave (the same concept as a Matlab toolbox) are available at Octave-Forge. One thing you do need to note is that files from Matlab Central's File Exchange [12] cannot be used in Octave, as explained in the Octave FAQ.

Octave is easy to install because your favorite distribution probably has it available. In my case, I use Scientific Linux 6.2, and Listing 2 shows an excerpt of the yum install. After installing Octave, I had one small problem to solve: The HDF5 libraries couldn't be found. I fixed it by adding the export line that follows Listing 2 to my .bashrc file, so that the library directory is in LD_LIBRARY_PATH.

Listing 2: Excerpt of Octave Install on SL6.2

[root@test1 laytonjb]# yum install octave
...
Dependencies Resolved
=====================================================================================
 Package                         Arch      Version              Repository     Size
=====================================================================================
Installing:
 octave                          x86_64    6:3.4.3-1.el6        epel           9.1 M
Installing for dependencies:
 GraphicsMagick                  x86_64    1.3.17-1.el6         epel           2.2 M
 GraphicsMagick-c++              x86_64    1.3.17-1.el6         epel           103 k
 blas                            x86_64    3.2.1-4.el6          sl             320 k
 environment-modules             x86_64    3.2.7b-6.el6         sl              95 k
 fftw                            x86_64    3.2.2-14.el6         atrpms         1.6 M
 fltk                            x86_64    1.1.10-1.el6         atrpms         375 k
 glpk                            x86_64    4.40-1.1.el6         sl             358 k
 hdf5-mpich2                     x86_64    1.8.5.patch1-7.el6   epel           1.4 M
 mpich2                          x86_64    1.2.1-2.3.el6        sl             3.7 M
 qhull                           x86_64    2010.1-1.el6         atrpms         346 k
 qrupdate                        x86_64    1.1.2-1.el6          epel            79 k
 suitesparse                     x86_64    3.4.0-2.el6          epel           782 k
 texinfo                         x86_64    4.13a-8.el6          sl             667 k
Transaction Summary
=====================================================================================
Install      14 Package(s)
Total download size: 21 M
Installed size: 81 M
Is this ok [y/N]: y
...
Installed:
  octave.x86_64 6:3.4.3-1.el6
Dependency Installed:
  GraphicsMagick.x86_64 0:1.3.17-1.el6       GraphicsMagick-c++.x86_64 0:1.3.17-1.el6
  blas.x86_64 0:3.2.1-4.el6                  environment-modules.x86_64 0:3.2.7b-6.el6
  fftw.x86_64 0:3.2.2-14.el6                 fltk.x86_64 0:1.1.10-1.el6
  glpk.x86_64 0:4.40-1.1.el6                 hdf5-mpich2.x86_64 0:1.8.5.patch1-7.el6
  mpich2.x86_64 0:1.2.1-2.3.el6              qhull.x86_64 0:2010.1-1.el6
  qrupdate.x86_64 0:1.1.2-1.el6              suitesparse.x86_64 0:3.4.0-2.el6
  texinfo.x86_64 0:4.13a-8.el6
Complete!

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib64/mpich2/lib/"

To run Octave, I simply enter octave at the command prompt.
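A slightly more defensive variant of the export line above is sketched here. It appends the directory only if it is not already listed, so sourcing .bashrc repeatedly does not grow the variable, and it avoids a leading colon when LD_LIBRARY_PATH starts out empty (the dynamic loader treats an empty entry as the current directory). The append_ld_path helper is my own hypothetical name; the mpich2 path is the one used above.

```shell
# Append a directory to LD_LIBRARY_PATH only if it is not already listed.
# append_ld_path is a hypothetical helper, not a standard command.
append_ld_path() {
    dir="${1%/}"                       # drop a trailing slash for comparison
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*|*":$dir/:"*) ;;     # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
    esac
    export LD_LIBRARY_PATH
}

append_ld_path /usr/lib64/mpich2/lib/

# To list libraries that are still unresolved (assumes octave is on the PATH
# and is a real binary rather than a wrapper script):
#   ldd "$(command -v octave)" | grep 'not found'
```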

Right now, Octave is a command-line-driven tool without a standard GUI. Several attempts have been made at a GUI, but none have been successful enough to be included with Octave. You can read more about it in the Octave FAQ. Octave also can use gnuplot to plot results and visualize data. Figure 5 is an example of a 3D plot borrowed from "Introduction to GNU Octave" [13] that shows the commands used to create a plot. Octave creates a new window with the resulting plot, as shown in Figure 6.

Figure 5: Octave console window for a 3D plot.

A number of sites have introductions and examples of Octave, and a good place to start is the Octave wiki or a slightly dated Introduction to Octave PDF (see Table 2), which is nevertheless still a valuable resource for help getting started with Octave.

Recently, an effort has been made to create a JIT (Just In Time) compiler for Octave. It is a work in progress and not quite ready for production, but you can read about the goals and possibly experiment with it. Be warned that work on the JIT has not progressed for a few months; I hope it doesn't become another dead Octave project.

As with Scilab, the downloadable binaries for Octave that come with your distribution are likely to be the least common denominator in terms of performance, but building Octave is fairly easy.

Intel provides a set of instructions on how to build Octave using MKL, and a blog post tells you how to build Octave with ACML for AMD processors (it's for Ubuntu, but the principles are the same). To make things a little more generic, you can also use OpenBLAS [14] to build Octave. Some efforts have been made to run some Octave functions on GPUs; however, adding GPU capability to Octave is not likely to happen any time soon.

To be honest, I don't completely understand the issues, but they involve licensing: the GNU GPLv3 license is not compatible with the licenses for various GPU tools and languages (CUDA in particular). I hope this will be resolved in the future, but in my opinion, it really hurts Octave's applicability in HPC.

FreeMat

A more recent development effort for a Matlab-like tool is called FreeMat. The intention is to develop an interactive numerical environment that is similar to both Matlab and IDL. FreeMat has prebuilt binaries for Windows, Mac OS X, and Linux and is released under the GPL.

FreeMat follows the same lines as Scilab and Octave, and the language is fairly close to Matlab. The FreeMat FAQ has a short section on the differences between FreeMat and Matlab that should help you take Matlab code and run it with FreeMat (see Table 3 for FreeMat resources).

Table 3: FreeMat Resources

Resource	Location
FreeMat	http://freemat.sourceforge.net/
FAQ	http://freemat.sourceforge.net/#faq
Primer	http://www.floss4science.com/new-freemat-4-user-guide/
Numerical methods	http://www.ohio.edu/people/tc285202/nmethods.pdf
Parallelization plans	http://code.google.com/p/freemat/wiki/ParallelizationPlans
Threads	http://freemat.sourceforge.net/help/sec_thread.html

I tried installing an FC14 (Fedora 14) build of FreeMat 4.x on my Scientific Linux 6.2 system, using rpm to install it and yum to help resolve dependencies, but the installation failed with errors that I could not resolve. Instead, I tested FreeMat on a Windows 7 system. Figure 7 shows the FreeMat console with a few commands. The window looks similar to Scilab and, to some degree, Matlab.

A console appears on the right, and the stacked windows on the left are the file browser, history, variable list, and debug windows. The figure shows that the simple AC=B works just the same as in Matlab, Scilab, and Octave. FreeMat can also do some reasonable graphics. Figure 8 shows a plot of the simple 3D plot example taken from the FreeMat help site.

The FreeMat site has a good introduction to the software, and you can find a FreeMat Primer on the FLOSS for Science website. Another good introduction to FreeMat comes combined with a discussion of basic numerical methods (the "Numerical methods" entry in Table 3). The PDF is incomplete by a few pages, but it does get you started with FreeMat.

Going Parallel

Matlab-like tools are extremely useful in HPC, even though they are serial applications. As I mentioned previously in this article, Matlab and Matlab-like tools can be used for tasks such as parameter sweeps by running something like 25,000 simultaneous instances of the application. However, in other situations, you might want to run the underlying functions in parallel.

For example, you might want to perform a large FFT or a large SVD (singular value decomposition) as quickly as possible by running the application using all of the cores in the node, or even by running the computations across several distributed nodes.

Several parallel processing options for Scilab are summarized in the Scilab parallel computing documentation. The first option is to use the inherent multicore capabilities in the functions used in Scilab.

For example, certain libraries perform the linear algebra computations in Scilab, and these libraries could perform the computations using all of the cores in the system. Intel's MKL library can use all of the cores for performing matrix multiplications or other functions. Typically, this is done using OpenMP but not necessarily. However, these computations are limited to intrinsic functions, so you can't parallelize Scilab code such as a for loop.

Scilab also has the capability of running more explicit parallel applications on multicore systems (i.e., cores on the same node). A function called parallel_run allows parallel calls to a function. This allows you to parallelize function calls on the system, but remember that execution is confined to a single node (although with four-socket AMD systems, you can get 64 cores in a single system).

For parallel distributed applications on Scilab, you can also use PVM (Parallel Virtual Machine). PVM is a rather old approach to parallel programming and has given way to MPI (Message Passing Interface) for the most part, but it is still used in some areas. A good blog post discusses how to use PVM within Scilab (but it is two years old by now). A Git repository holds some early code developed by Scilab Enterprises to create MPI capability for Scilab.

In a manner similar to Scilab, Octave can also use numerical libraries that have been parallelized to run on a single node, such as Intel's MKL or something similar, perhaps using OpenMP. You just have to build Octave yourself and use the appropriate libraries.
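With a threaded BLAS in place, whether for Scilab or Octave, the number of cores it uses is commonly controlled from the environment rather than from within the tool. The sketch below sets the usual variables before the tool starts: OMP_NUM_THREADS is read by OpenMP runtimes, MKL_NUM_THREADS by Intel MKL, and OPENBLAS_NUM_THREADS by OpenBLAS. The set_blas_threads helper is a hypothetical name of mine, and which variable actually takes effect depends on the library your build is linked against.

```shell
# Choose how many threads a threaded BLAS/LAPACK may use, then start the tool.
# set_blas_threads is a hypothetical helper, not part of Octave or Scilab.
set_blas_threads() {
    # Default to the number of online cores reported by the system.
    n="${1:-$(getconf _NPROCESSORS_ONLN)}"
    export OMP_NUM_THREADS="$n"        # OpenMP runtime
    export MKL_NUM_THREADS="$n"        # Intel MKL
    export OPENBLAS_NUM_THREADS="$n"   # OpenBLAS
}

# Example (not run here; the matrix size is arbitrary):
#   set_blas_threads 8
#   octave --quiet --eval "A = rand(4000); tic; B = A*A; toc"
```

Timing the same matrix multiply with one thread and then with all cores is a quick way to confirm that your build really is using a threaded library; if the times do not change, it is probably still linked against a serial reference BLAS.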

Octave also has a parallel toolbox to use for running applications on a cluster or a distributed system, and with the parcellfun command, you can execute parallel function calls on the same node. This is very similar to Scilab's parallel_run command.

The openmpi_ext toolbox uses MPI to allow Octave instances on different nodes to communicate and share data. It requires the use of Open MPI [15], but if you have experience in HPC, it isn't difficult to build and install.

Parallel coding in FreeMat is a little more difficult. Evidently, early versions of FreeMat could use MPI for parallel coding; however, it appears this work has not been continued in the current versions of FreeMat.

One interesting FreeMat feature is the use of threads within the language. FreeMat threads can communicate with each other through the use of global variables. Although I have not tested this feature, it appears to be in the current versions.

Summary

In this article, I briefly reviewed three Matlab-like tools: Scilab, Octave, and FreeMat. All three have pluses and minuses that can be debated, but the one you choose ultimately depends on your requirements. For further comparison of these tools, check out the technical report from the University of Maryland [16].

If you need a general-purpose numerical tool for HPC, any one of these tools is a good candidate. If you are willing to stray further from Matlab compatibility, other candidates could work as well, but that is the subject of another article and likely another series of debates. In the meantime, give one of these applications a whirl – I think you'll like what you see.