Creating SmartOS zones using UCARP
In the Zone
SmartOS [1] is an open source and free Unix-like operating system based on the Illumos kernel, a fork of the discontinued OpenSolaris kernel. The main purpose of SmartOS is to run virtual machines, so it offers KVM or zones (also known as Solaris containers) as virtualization technologies.
In the philosophy of SmartOS, which is defined as a cloud platform or a virtualization platform, concepts like high availability (HA), data distribution, filesystem replication, high reliability, and so on, should all be delegated to the application layer.
This approach means that, except through laborious tricks, objects like shared storage for virtual machines (VMs) are not supported – the VMs' disk images cannot reside on iSCSI or NFS disks. Additionally, there are no distributed filesystems (e.g., GlusterFS or OCFS) or distributed block devices (e.g., DRBD), and you can forget about VMs' virtual IP addresses or load balancers (e.g., Keepalived or Linux IPVS) managed by the hypervisor.
Thus, no physical server shares anything with any other. In other words, the hypervisors provide no mechanism (such as Heartbeat or Pacemaker) for starting a VM on another hypervisor if the server hosting it fails.
If you want high availability or high reliability, you have to be modern. You should learn concepts and technologies like Node.js, cloud storage (DS3), Lucene, MongoDB, Riak, REST, SaaS, and so on. Additionally, you have to forget the old-fashioned IP address bound to a specific service or welded to a specific machine providing a particular service. You probably have to rethink your applications.
I don't want to say that if you need a virtual machine running an HTTP server or you need a MySQL database or any other service, you cannot use SmartOS – go ahead! SmartOS is great. I just want to say that SmartOS was born as a modern operating system that tries to break away from old concepts and old habits. However, it is not always easy to abandon the old ways of doing things.
How to Achieve "Old-Fashioned" HA
That said, inside a KVM virtual machine, you can do what you want: You can install Linux and then DRBD, Pacemaker, virtual IP (VIP) address management systems, GFS, and so on. Of course, you must take performance into account.
In the same way, if you use SmartOS zones, it is possible to achieve old-fashioned IP high availability. By the way, SmartOS zones have more than 12,000 packages available, coming from the pkgsrc framework [2].
On the SmartOS community wiki, you can find instructions on how to use the Wackamole package [3]. However, I had many issues using Wackamole, so in this article, I'm going to explain how to use UCARP [4], instead. Learning how to configure and use UCARP is useful, because it is available and widely used on a number of operating systems (e.g., Linux, FreeBSD, OpenBSD, Mac OS X, NetBSD), so you can reuse elsewhere what you learn on SmartOS.
What Is UCARP?
From the UCARP man page: "UCARP allows a pair of hosts to share common IP addresses in order to provide automatic failover of an address from one machine to another. It is a portable userland implementation of the secure and patent-free Common Address Redundancy Protocol, (CARP), OpenBSD's alternative to VRRP."
The project website states: "Strong points of the CARP protocol are: very low overhead, cryptographically signed messages, interoperability between different operating systems, and no need for any dedicated extra network link between redundant hosts."
Getting Started
Suppose you want to configure two highly available zones (containers) on two different physical servers, that is, on two hypervisors, or Global Zones (GZs) in SmartOS/Illumos/Solaris terminology. The IP assignment for this example is:
zoneA real IP: 192.168.0.201
zoneB real IP: 192.168.0.202
virtual IP (VIP): 192.168.0.200
On each GZ, you need to install the latest SmartOS standard64 dataset (the UCARP package was buggy in some old releases of pkgsrc; releases 2014Q2 and 2014Q1 are fine):
imgadm install d34c301e-10c3-11e4-9b79-5f67ca448df0
To better understand, take a look at an article published in a previous issue of ADMIN [5].
First, I'll create a JSON file called ucarptest1.json to configure the zoneA node (Listing 1). In the other GZ, I'll create a JSON file called ucarptest2.json for the configuration of the zoneB node (Listing 2). It is identical to the first file, except for the alias, hostname, and IP.
Listing 1: ucarptest1.json
{
  "alias": "zoneA",
  "hostname": "zoneA",
  "brand": "joyent",
  "dns_domain": "yourdomain.com",
  "dataset_uuid": "d34c301e-10c3-11e4-9b79-5f67ca448df0",
  "resolvers": [
    "192.128.0.9",
    "192.128.0.10"
  ],
  "max_physical_memory": 4096,
  "nics": [
    {
      "ip": "192.168.0.201",
      "netmask": "255.255.255.0",
      "gateway": "192.168.0.201",
      "primary": true,
      "allow_ip_spoofing": true,
      "allow_mac_spoofing": true
    }
  ]
}
Listing 2: ucarptest2.json
{
  ...
  "alias": "zoneB",
  "hostname": "zoneB",
  ...
  "nics": [
    {
      ...
      "ip": "192.168.0.202",
      ...
      "allow_ip_spoofing": true,
      "allow_mac_spoofing": true
    }
  ]
}
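Because the two files differ only in the alias, hostname, and IP, the second file can be derived mechanically from the first. The following is a minimal sketch, assuming ucarptest1.json uses exactly the values from Listing 1; the function name make_zone_b is hypothetical and not part of any SmartOS tool.

```shell
#!/bin/bash
# Hypothetical helper: derive the zoneB definition from the zoneA one.
# Assumes the values used in this article (zoneA, 192.168.0.201).
# Only the "ip" line is rewritten, so the gateway entry is left untouched.
make_zone_b() {
    sed -e 's/"zoneA"/"zoneB"/g' \
        -e '/"ip":/s/192\.168\.0\.201/192.168.0.202/' "$1"
}

# Usage: make_zone_b ucarptest1.json > ucarptest2.json
```

Restricting the second substitution to the "ip" line keeps the edit from touching any other field that happens to contain the same address.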
Note that it is very important to include the following two options: allow_ip_spoofing and allow_mac_spoofing. These options are mandatory if you want to use systems like UCARP. By default, SmartOS blocks IP and MAC addresses that are not specified in the zone configuration, which prevents users from configuring illegitimate addresses from inside a zone.
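Because a missing spoofing option only shows up later as a VIP that silently does not work, it can help to check the JSON file before creating the zone. The sketch below is a simple text match, not a JSON parser, and the function name check_spoofing is hypothetical.

```shell
#!/bin/bash
# Hypothetical pre-flight check: make sure a zone definition enables both
# spoofing options before it is passed to vmadm create.
check_spoofing() {
    for opt in allow_ip_spoofing allow_mac_spoofing; do
        # Accept any whitespace between the key, the colon, and the value.
        grep -q "\"$opt\"[[:space:]]*:[[:space:]]*true" "$1" || {
            echo "$1: $opt is not set to true" >&2
            return 1
        }
    done
}

# Usage: check_spoofing ucarptest1.json && vmadm create -f ucarptest1.json
```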
Next, you can create the two zones – one on each GZ. (Note that you can configure the highly available zones on the same GZ if you like.)
[root@gz1 ~]# vmadm create -f ucarptest1.json
[root@gz2 ~]# vmadm create -f ucarptest2.json
Log in (using the zlogin command) on each of the two zones and install the ucarp package; libpcap will be installed to satisfy dependencies.
pkgin update
pkgin install ucarp
Once the package is installed, you will notice that the UCARP package lacks some files. In fact, when using the SmartOS operating system and zones, you should take into account that, unlike Linux, you will not see a lot of package maintainers. Thus, you often need to employ various tricks during configuration; you are welcome to report package-related issues [6].
As you can see from the result of
svcs -a | grep -i ucarp
no Service Management Facility (SMF) service exists for the UCARP daemon. Thus, the first thing to do is configure SMF by creating and importing an XML manifest. Some tools, such as manifold [7], are available to help you create service manifest files from scratch. However, you can use a ready-made manifest: Just download it [8] and import it using the command:
svccfg import ucarp.xml
Now, if you issue svcs ucarp, it reports the information shown in Listing 3, and that's OK.
Listing 3: Status Information
STATE          STIME    FMRI
disabled       10:56:31 svc:/network/ucarp:default
As I mentioned previously, there are no configuration files or startup/shutdown scripts, so I will create them, starting with the configuration file /opt/local/etc/ucarp.cfg (see Listing 4 for nodeA). The nodeB configuration file (Listing 5) should be identical to the nodeA configuration, except for the RIP (real IP of the zone), which is the IP used to communicate on the network.
Listing 4: NodeA Configuration File
--interface=net0 \
--srcip=192.168.0.201 \
--vhid=101 \
--pass=changeme \
--addr=192.168.0.200 \
--upscript=/opt/local/etc/vip-up.sh \
--downscript=/opt/local/etc/vip-down.sh \
-z
Listing 5: NodeB Configuration File
...
--srcip=192.168.0.202 \
...
Note the --vhid (virtual host ID) and --pass (password) options. They must be identical on every node of the same cluster. On the other hand, the VHID must be unique to each UCARP cluster working on the same network (i.e., you need a different VHID if other nodes are sharing another virtual IP).
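Since a mismatch in these shared options is easy to introduce when editing two files by hand, a quick comparison can catch it. The following sketch assumes copies of both nodes' ucarp.cfg files are available on one machine; the function names are hypothetical.

```shell
#!/bin/bash
# Print a node's ucarp.cfg without the per-node --srcip line.
without_srcip() {
    grep -v -- '--srcip=' "$1"
}

# Succeed only if the two files agree on everything except --srcip,
# which covers --vhid, --pass, and --addr.
same_cluster_cfg() {
    [ "$(without_srcip "$1")" = "$(without_srcip "$2")" ]
}

# Usage: same_cluster_cfg nodeA-ucarp.cfg nodeB-ucarp.cfg || echo "configs differ"
```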
The --addr option is used to specify the VIP address, that is, the IP that must float in case of failover. Specifically, the configuration works as follows: Suppose you have nodeA as the MASTER node and nodeB as the STANDBY or BACKUP node. At some point, you shut down nodeA, so nodeB becomes the MASTER. Once you start nodeA again, this node assumes the STANDBY role; it does not reclaim its previous MASTER role. In this way, you prevent continuous flapping of the service.
There are, of course, other kinds of configurations, such as preemption (-P option) or preferred master topology. Additionally, you can configure more than two nodes and use some sort of weighting for one node to be the master preferentially over another (-k option). Finally, the -z option tells the daemon to invoke downscript on service shutdown.
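As an illustration of a preferred master setup, the sketch below generates a configuration in the format of Listing 4 with the long forms of -P (--preempt) and -k (--advskew) added. The function name write_cfg and the skew values are assumptions; a lower skew makes a node more likely to become the MASTER.

```shell
#!/bin/bash
# Hypothetical generator for a ucarp.cfg in the format of Listing 4.
# $1 = real IP of the zone
# $2 = advertisement skew, 0-255 (the node with the lower value is preferred)
write_cfg() {
    cat <<EOF
--interface=net0 \\
--srcip=$1 \\
--vhid=101 \\
--pass=changeme \\
--addr=192.168.0.200 \\
--upscript=/opt/local/etc/vip-up.sh \\
--downscript=/opt/local/etc/vip-down.sh \\
--preempt \\
--advskew=$2 \\
-z
EOF
}

# Usage on nodeA (preferred): write_cfg 192.168.0.201 0  > /opt/local/etc/ucarp.cfg
# Usage on nodeB:             write_cfg 192.168.0.202 50 > /opt/local/etc/ucarp.cfg
```

Keep in mind that preemption trades the anti-flapping behavior described above for a predictable master: when nodeA comes back, it takes the VIP away from nodeB again.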
Now, as you can see from the configuration file, to really configure the VIP you need to create the upscript and the downscript scripts. As you can guess, upscript will be invoked when the node successfully becomes the MASTER, and downscript will be invoked when the node has transitioned to the STANDBY state, or when you stop the UCARP service (as well as when the zone is halted).
So, upscript must contain commands that will configure the VIP on a virtual network interface, and by contrast, downscript will unconfigure the VIP and delete the network interface. In the configuration file, the upscript option points to /opt/local/etc/vip-up.sh (Listing 6) and downscript points to /opt/local/etc/vip-down.sh (Listing 7).
Listing 6: /opt/local/etc/vip-up.sh
#!/bin/bash
# $1 = interface UCARP runs on, $2 = virtual IP address

ifconfig "$1":1 plumb
ifconfig "$1":1 "$2" netmask 255.255.255.255
ifconfig "$1":1 up
Listing 7: /opt/local/etc/vip-down.sh
#!/bin/bash
# $1 = interface UCARP runs on

ifconfig "$1":1 down
ifconfig "$1":1 unplumb
As you can see, and as you can read in the man page, the upscript and the downscript options will implicitly pass some arguments to the related scripts. More precisely, $1 is the network interface on top of which the UCARP daemon works (it sends its own messages). Thus, you can create a virtual interface like net0:1. Additionally, $2 is the virtual IP address (the VIP or the floating IP) that the script must configure. All of these values are defined in the ucarp.cfg file.
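Because the scripts depend entirely on these two positional arguments, a slightly more defensive variant can be useful while testing. The following sketch checks the arguments and adds a dry-run mode; DRY_RUN and the function names are hypothetical additions, not UCARP features.

```shell
#!/bin/bash
# Sketch of vip-up.sh with argument checks. UCARP passes the interface as $1
# and the VIP as $2. When the hypothetical DRY_RUN variable is set to 1, the
# ifconfig commands are printed instead of executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

vip_up() {
    [ $# -ge 2 ] || { echo "usage: vip-up.sh <interface> <vip>" >&2; return 1; }
    run ifconfig "$1:1" plumb &&
    run ifconfig "$1:1" "$2" netmask 255.255.255.255 &&
    run ifconfig "$1:1" up
}

# As a standalone script, the last line would be: vip_up "$@"
```

Running the function by hand with DRY_RUN=1 shows exactly which commands UCARP would trigger, without touching the network configuration.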
Moreover, at the end of these scripts, you can add commands that run when the takeover occurs or when a node becomes the MASTER node or STANDBY node. For example, you can send email, start or stop one or more services, send a Nagios passive check, and so on. You also can start Apache as soon as the node assumes the MASTER role (and stop it when the node becomes the STANDBY node).
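For example, starting and stopping Apache on a role change could look like the sketch below. The service name apache is an assumption (check the real FMRI with svcs -a | grep -i apache), and the SVCADM variable is a hypothetical hook that lets you try the function safely.

```shell
#!/bin/bash
# Hypothetical hook, called at the end of vip-up.sh (state "master") or
# vip-down.sh (state "backup").
APACHE_FMRI=${APACHE_FMRI:-apache}   # assumed service name, adjust as needed
SVCADM=${SVCADM:-svcadm}             # overridable, e.g. SVCADM=echo for a dry run

on_state_change() {
    case "$1" in
        master) "$SVCADM" enable "$APACHE_FMRI" ;;
        backup) "$SVCADM" disable "$APACHE_FMRI" ;;
        *) echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# In vip-up.sh:   on_state_change master
# In vip-down.sh: on_state_change backup
```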
Now, it's time to start the UCARP service on each zone with the command:
svcadm enable ucarp
You can see who is the MASTER simply by using the ifconfig command. The result on the master zone will be:
net0:1: flags=201000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4,CoS> mtu 1500 index 2
        inet 192.168.0.200 netmask ffffffff broadcast 192.168.0.255
To see what's going on, you can use the dmesg command or look in the logfiles /var/adm/messages and /var/svc/log/network-ucarp:default.log.
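If you want to check the role from a script rather than by eye, the ifconfig output can be scanned for the VIP. This is a minimal sketch in plain shell; the function name has_vip is hypothetical, and it assumes the output format shown above.

```shell
#!/bin/bash
# Succeed if the given VIP appears as an "inet" address in ifconfig output
# read from standard input.
has_vip() {
    while read -r word addr rest; do
        [ "$word" = inet ] && [ "$addr" = "$1" ] && return 0
    done
    return 1
}

# Usage inside a zone:
#   ifconfig -a | has_vip 192.168.0.200 && echo MASTER || echo STANDBY
```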
Conclusion
That's it. Please note that by using UCARP as is, the VIP will not be bound to any service running on the zone, meaning that UCARP is not aware of whether the service that you want to be highly available (e.g., Apache) is up and running. You have to work on some manual solution to achieve that.
This approach is nothing like a full-featured system such as Pacemaker. However, by using UCARP, you can achieve the goal of having two or more highly available SmartOS zones.
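One possible starting point for the manual solution mentioned above is a small watchdog that gives up the VIP when the service it protects is no longer online. The sketch below is an assumption on my part, not part of UCARP: the function names, the SVCADM variable, and the watched service name apache are all hypothetical.

```shell
#!/bin/bash
# Hypothetical watchdog: if the watched service is not online, temporarily
# disable UCARP. Because of -z, stopping UCARP runs the downscript, so the
# other node can take over the VIP.
SVCADM=${SVCADM:-svcadm}   # overridable, e.g. SVCADM=echo for a dry run

service_online() {
    # svcs -H -o state prints only the state column, e.g. "online".
    [ "$(svcs -H -o state "$1" 2>/dev/null)" = online ]
}

watch_once() {   # $1 = name or FMRI of the service to watch
    if service_online "$1"; then
        return 0
    fi
    echo "$1 is not online, releasing the VIP" >&2
    "$SVCADM" disable -t ucarp
}

# Usage, for example from a loop: while sleep 5; do watch_once apache; done
```

After the watchdog fires, UCARP stays disabled on that node until you repair the service and run svcadm enable ucarp again, which keeps a broken node from reclaiming the VIP.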