Lead image: Aleksandr Fomenkov, 123RF.com
 

Storage system with OpenSolaris and Comstar

Storage Star

What used to be the domain of expensive storage solutions is now available on a tight budget with the free OpenSolaris operating system and Comstar. You can set up your own storage system that supports both iSCSI and Fibre Channel. By Djordje Gladovic

The advanced ZFS filesystem by Sun, and now Oracle, heralded a major change in the storage sector. ZFS's qualities make it an obvious choice for a storage system, but Sun initially failed to provide a viable network-attached storage product around it. Sun's engineers knew that the next step was a modern framework that serves up block devices across the network. The result was Comstar (Common Multi-Protocol SCSI Target) [1], a subsystem that uses current and future media and protocols to provide storage in the form of block devices. In combination with OpenSolaris [2], ZFS, and the NFS and CIFS implementations, Comstar anchors a fully unified storage system.

ZFS isn't actually a requirement: Comstar is capable of providing files and partitions (known as slices in Solaris-speak) to arbitrary consumers in the form of logical units.

But ZFS adds value in the form of snapshots, clones, and thin provisioning, and it is thus very much worth using. The similarities with a NetApp storage system are hard to overlook [3].

You can take your first steps toward OpenSolaris and Comstar on a virtual machine, although a virtual configuration restricts the choice of transport protocols to iSCSI. This approach is comparable to the installation of the popular Openfiler system on Linux. Incidentally, Oracle has previously used Openfiler in the Oracle RAC and Oracle Grid HowTos. After acquiring Sun, Oracle might want to change this and replace Openfiler with OpenSolaris and Comstar.

The deployment of an OpenSolaris/Comstar system involves the following steps: installing OpenSolaris (in a virtual machine for initial testing), configuring the network, installing and enabling the Comstar packages, creating a ZFS pool and ZFS volumes, registering the volumes as logical units, and publishing the logical units via iSCSI or Fibre Channel.

The requirements for the virtual machine you use with your initial tests are quite reasonable: 1GB memory, a virtual CPU, about 20GB of disk space, and a bridged network adapter with an Internet connection will do for now. Sun xVM VirtualBox is a good choice for a virtualization product, but VMware Workstation is fine, too. Both solutions include templates for OpenSolaris and Solaris 10 installations.

The OpenSolaris install shouldn't pose any problems. Afterward, you will probably want to install the vendor-specific tools (such as the VMware tools) to improve the manageability of the virtual guest.

iSCSI Target

The first step is to configure OpenSolaris with Comstar as an iSCSI target. You will need a static IP address. To avoid making the setup too complex, I assume you have a network connection with Internet access and simply need to assign an IP address to the virtual machine. If you operate a DHCP server, you can assign a static address for the machine via DHCP.

However, it is probably easier to assign a static IP address as root in a terminal window on the OpenSolaris guest:

$ pfexec su
# echo 192.168.1.200 > /etc/hostname.e1000g0

In this example, e1000g0 is the primary network interface. The following command lists the network interfaces:

# ifconfig -a

The file name must contain the device name of the network adapter you configured (/etc/hostname.<interface>) – compare the MAC address with the setting in the virtualization software. The commands in Listing 1 complete the network configuration for the OpenSolaris system.

Listing 1: Network Configuration

01 # echo myiscsiserver > /etc/nodename (set the hostname)
02 # echo 192.168.1.0 255.255.255.0 >> /etc/netmasks
03 # echo 192.168.1.1 > /etc/defaultrouter (set the default gateway)
04 # vi /etc/resolv.conf (configure name resolution)
05 search zuhause.local
06 nameserver 192.168.1.1
07 # vi /etc/nsswitch.conf (configure name resolution)
08 ...
09 hosts: files dns
10 ipnodes: files dns
11 ...
12 # vi /etc/hosts
13 ::1 localhost
14 127.0.0.1 localhost
15 192.168.1.200 myiscsiserver myiscsiserver.local loghost
16 # svcadm disable network/physical:nwam
17 # svcadm enable network/physical:default

Now reboot OpenSolaris; the system should be listening on the IP address 192.168.1.200 now. If all of the previous commands worked as intended, and DNS and the standard gateway are correctly configured, the new server now has Internet access, which you will need for the Comstar installation. Now, become the system administrator, install the required package, and reboot:

$ pfexec su
# pkg install storage-server SUNWiscsit
# reboot

After rebooting the system, enable Comstar as an OpenSolaris service:

$ pfexec svcadm enable stmf

From now on, you have a full-fledged storage system that supports both iSCSI and Fibre Channel – assuming you have the right hardware.
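A quick way to verify that the framework is really running is to query the service status; the output below is from my lab system, so the timestamp will differ:

$ svcs stmf
STATE          STIME    FMRI
online         10:32:11 svc:/system/stmf:default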

To add another ZFS pool for the future block devices, you additionally need a virtual disk. To do this, you first need to shut down the OpenSolaris system as follows:

$ pfexec shutdown -y -g0 -i5

In the virtual machine settings, you can assign a new disk with sufficient space to the guest. Then restart the OpenSolaris guest and, after logging on, create the ZFS pool with the disk you added. To discover the device name, use the format command.

The command for creating the ZFS pool will be something like:

zpool create mypool c8t1d0

Listing 2 recapitulates the whole procedure. From now on, the disk space is available to the Comstar framework. As mentioned previously, this implementation uses ZFS volumes. If you prefer to use a container file or a partition, the documentation describes the required steps. The benefits of ZFS are quite significant: ZFS volumes support snapshots and clones.

Listing 2: New ZFS Pool

01 $ pfexec su
02 # format
03 AVAILABLE DISK SELECTIONS:
04        0. c8t0d0 <DEFAULT cyl 2607 alt 2 hd 255 sec 63>
05           /pci@0,0/pci15ad,1976@10/sd@0,0
06        1. c8t1d0 <VMware,-VMware Virtual S-1.0-10.00GB>
07           /pci@0,0/pci15ad,1976@10/sd@1,0
08 #
09 # zpool create mypool c8t1d0
10 # zpool status -v
11   pool: mypool
12  state: ONLINE
13  scrub: none requested
14 config:
15         NAME        STATE     READ WRITE CKSUM
16         mypool      ONLINE       0     0     0
17           c8t1d0    ONLINE       0     0     0
18 errors: No known data errors

On top of this, you can expect performance gains thanks to the ZFS filesystem. In a production environment, you can improve this even more by introducing read and write cache devices on a solid-state disk (SSD).
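Adding the cache devices is a matter of two zpool commands. The following sketch assumes two additional SSDs that the format command reports as c8t2d0 and c8t3d0 – the device names on your system will differ:

# zpool add mypool cache c8t2d0 (read cache, L2ARC)
# zpool add mypool log c8t3d0 (separate ZFS intent log for writes)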

The following command creates the first ZFS volume:

$ pfexec su
# zfs create -V 2G mypool/vol1

Now you have a 2GB volume in the mypool ZFS pool. The matching raw device is located in /dev/zvol/rdsk/mypool/vol1. Next, you need to introduce the raw device to the Comstar framework as a new logical unit:

$ pfexec su
# sbdadm create-lu /dev/zvol/rdsk/mypool/vol1
Created the following LU:
GUID ...
-------------------------------- ...
600144f05f71070000004bc981ec0001 ...
/dev/zvol/rdsk/mypool/vol1

This command creates the logical unit, which is assigned a unique identifier, or GUID, by the Comstar framework. You can use the GUID to reference the logical unit uniquely in the following steps. The GUID of a logical unit can be ascertained by typing sbdadm list-lu, which lists all the logical units.
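The output looks something like the following; the GUID and the data size naturally reflect my lab setup:

# sbdadm list-lu
Found 1 LU(s)

              GUID                     DATA SIZE      SOURCE
--------------------------------  --------------  ---------------------------
600144f05f71070000004bc981ec0001      2147418112  /dev/zvol/rdsk/mypool/vol1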

Be Seen

Now you need to publish the logical unit on the SAN; otherwise, it will be invisible to the clients. For an initial test, you can do this:

# stmfadm add-view 600144f05f71070000004bc981ec0001

This is the GUID of the logical unit that was created earlier. From now on, you can use any protocol to address the block device.

You simply need to configure and enable the corresponding Comstar target – initially, iSCSI. The Comstar iSCSI target implementation is fairly simple. Two steps are all it takes to make the logical unit accessible on the IP SAN:

# svcadm enable iscsi/target
# itadm create-target
Target iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262 successfully created

When the iSCSI target is created, an iSCSI Qualified Name (IQN) for the target is generated automatically; you need this name in the steps that follow. The iSCSI client (or initiator in iSCSI-speak) in our lab is Ubuntu 9.10. Listing 3 shows the steps to mount a Comstar block device with the Linux Open iSCSI initiator.

Listing 3: Block Device and Initiator

01 $ sudo -s
02 # aptitude install open-iscsi
03 # vi /etc/iscsi/iscsid.conf
04 node.startup = automatic
05 # /etc/init.d/open-iscsi restart
06 # iscsiadm -m discovery -t st -p 192.168.1.200
07 192.168.1.200:3260,1 iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262
08 # iscsiadm -m node
09 # /etc/init.d/open-iscsi restart
10 # fdisk -l
11 Disk /dev/sdb: 2147 MB, 2147418112 bytes
12 67 heads, 62 sectors/track, 1009 cylinders
13 Units = cylinders of 4154 * 512 = 2126848 bytes
14 Disk identifier: 0x00000000
15
16 Disk /dev/sdb doesn't contain a valid partition table
17 # fdisk /dev/sdb
18 # mkfs.ext3 /dev/sdb1
19 # mkdir /vol1
20 # mount /dev/sdb1 /vol1

From now on, the block device published by Comstar is available as /vol1. To add more block devices, create a new ZFS volume, advertise it as described earlier, and mount the device on the client system. However, the options that ZFS gives you are more interesting. If you start to run out of space on a volume, you can use ZFS to increase the size (Listing 4).

Listing 4: Expanding the ZFS Volume

01 # zfs list
02 NAME          USED  AVAIL  REFER  MOUNTPOINT
03 mypool       2.00G  7.78G    19K  /mypool
04 mypool/vol1     2G  9.72G  65.3M  -
05 # zfs get volsize mypool/vol1
06 NAME         PROPERTY  VALUE  SOURCE
07 mypool/vol1  volsize   2G     -

The volume is currently 2GB. The following commands expand it to 3GB.

08 # zfs set volsize=3g mypool/vol1
09 # zfs get volsize mypool/vol1
10 NAME         PROPERTY  VALUE  SOURCE
11 mypool/vol1  volsize   3G     -
12 # sbdadm modify-lu -s 3g 600144f05f71070000004bc981ec0001

The GUID must match the logical unit in your installation; sbdadm list-lu gives an overview of the existing logical units. Reboot the Linux client and use Gparted, for example, to extend the ext3 partition. The results should look like this:

# mount /dev/sdb1 /vol1
# cd /vol1
# df -h .
Filesystem      Size  Used Avail Use% Mounted on
/dev/sdb1       3.0G   36M  2.8G   2% /vol1

The ZFS volume now provides the preset 3GB of space. On OpenSolaris, you can actually do this on the fly. The ability to enlarge ZFS volumes is just one of the many features of the ZFS filesystem. It can also, for example, create a snapshot of the volume in a couple of milliseconds:

# zfs snapshot mypool/vol1@2010-04-17-1

If the data is consistent at this point in time, you can copy the snapshot to tape as a backup. Systems such as Oracle databases provide options for creating consistent snapshots during operation, thus supporting backups without downtime.
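One possible approach is zfs send, which serializes a snapshot into a stream that you can write to a file or pipe to a tape device; the backup path here is just an example:

# zfs send mypool/vol1@2010-04-17-1 > /backup/vol1-2010-04-17-1.zfs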

Writable Snapshots

Another of ZFS's features is clones – that is, writable copies of a snapshot. Just like a ZFS volume, you can use Comstar to export the clone and then mount the clone on a test system. This gives you a complete copy of a production database and the ability to run tests without endangering your live data. This method is a typical approach for creating complete test systems. The following command:

# zfs clone mypool/vol1@2010-04-17-1 mypool/vol1-test

creates a clone from the snapshot captured previously, assigning the name of vol1-test.

Now you can publish the clone on the IP SAN, just as you did the previous volume vol1:

# sbdadm create-lu /dev/zvol/rdsk/mypool/vol1-test
Created the following LU:
GUID ...
-------------------------------- ...
600144f05f71070000004bc999020002 /dev/zvol/rdsk/mypool/vol1-test
# stmfadm add-view 600144f05f71070000004bc999020002

After a short wait, you should see another block device appear on the client system. Because the new volume is a clone of the original vol1, the partition table and filesystem are available. Rebooting the Ubuntu system and entering the following two commands

# mkdir /vol1-test
# mount /dev/sdd1 /vol1-test

will mount the clone. In a production environment, you would have a test system for this. The iSCSI demo shows how easy it is to create a professional storage system with OpenSolaris, ZFS, and Comstar; an equivalent storage system from a premium provider such as NetApp, EMC, or HP would require significantly more expense.

LUN Masking and iSCSI Portal

Thus far, this guide has focused on one target (server) and one initiator (client). In a production environment, multiple clients will need disk space. Logical Unit Number (LUN) masking prevents a client from accessing a block device not intended for it: a logical unit is published only to the intended client systems. All other potential clients do not see the logical unit, even if they are on the same IP subnet and can connect to the server.

LUN masking per logical unit is typically applied to just one initiator, but if you use a cluster-capable filesystem, you can assign multiple clients to the same logical unit. LUN masking is very simple: You explicitly assign the target you want to publish to the logical unit and the consuming initiator (host). Addressing uses typical iSCSI-style IQNs. In Comstar, you create a target group (TG) and a host group (HG) and add the required IQNs.

The final step is to publish the logical unit and point to the host group and target group you created above (Listing 5). On an Ubuntu 9.10 Linux system, the following command identifies the iSCSI initiator IQN

# cat /etc/iscsi/initiatorname.iscsi
InitiatorName=iqn.1993-08.org.debian:01:e8643eb578bf

and this next command discovers the IQN for the OpenSolaris Comstar iSCSI target:

# itadm list-target
TARGET NAME                                                  STATE   SESSIONS
iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262  online  1

Listing 5: LUN Masking

01 # stmfadm create-hg myiscsiclient
02 # stmfadm add-hg-member -g myiscsiclient iqn.1993-08.org.debian:01:e8643eb578bf
03 # stmfadm create-tg myiscsiserver
04 # stmfadm add-tg-member -g myiscsiserver iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262
05 # stmfadm add-view -h myiscsiclient -t myiscsiserver -n 1 600144f05f71070000004bc999020002

From now on, the logical unit is only reachable via the target with the IQN iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262, and it is only accessible to the initiator with the IQN iqn.1993-08.org.debian:01:e8643eb578bf.

The logical unit has a LUN of 1. This number is comparable to the typical SCSI ID on a SCSI bus system. Each additional block device you publish via this target must have a unique LUN.
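For example, a second logical unit published through the same target and host group would simply get the next free LUN; the GUID shown here is a placeholder for whatever sbdadm create-lu returns for the new volume:

# stmfadm add-view -h myiscsiclient -t myiscsiserver -n 2 <GUID-of-second-LU>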

Of course, you can add more systems to the host or target groups created above. On a cluster system, you would add multiple hosts to the host group to allow all the systems in the cluster to see the logical unit. Changes to the target group are unusual, however.
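Registering a second cluster node, for example, is just one more host group entry; the IQN below is a placeholder for the second node's initiator name:

# stmfadm add-hg-member -g myiscsiclient <IQN-of-second-node>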

Portal Groups

Comstar can manage multiple targets. You can assign any number of block devices or logical units (uniquely addressed by the LUN) to a target. In contrast, a target is always assigned to a Target Portal Group (TPG). By default, Comstar assigns a new target to the default TPG.

The itadm list-target -v command outputs the current target assignments. In the case here, you can see that the target is assigned to the default TPG (see Listing 6).

Listing 6: itadm list-target

01 # itadm list-target -v
02 TARGET NAME STATE SESSIONS
03 iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262 online 1
04 alias: -
05 auth: none (defaults)
06 targetchapuser: -
07 targetchapsecret: unset
08 tpg-tags: default

A TPG describes the IP connection data in the form of an IP address and a port – in iSCSI terms, this is an iSCSI portal – that provide access to the assigned targets. The default setting means that a target is reachable via an active IP interface and the default iSCSI port of 3260.

If the target system has, say, two Ethernet interfaces with assigned IP addresses, the published logical units can be reached on port 3260 via both interfaces. Figure 1 demonstrates the relationships in the target portal group.

Figure 1: Comstar organizes components into target portal groups.

These tools let you publish multiple targets over multiple IP networks (IP-SANs). This helps, for example, to group systems that belong together logically in a style similar to zoning on Fibre Channel SANs.

As an added benefit, you can also distribute the load over all of the installed Ethernet interfaces. Currently, iSCSI is typically restricted to Gigabit Ethernet; however, as 10 Gigabit Ethernet continues to make inroads, iSCSI portals will have sufficient bandwidth.

If you would like to route the target via a second Ethernet interface, you just need to add another network adapter to the virtual machine. Then, go on to assign a static address to the network interface using the same approach as for the primary network card:

# ifconfig e1000g1 plumb up
# ifconfig e1000g1 10.20.30.1 netmask 255.255.255.0
# echo 10.20.30.1 > /etc/hostname.e1000g1
# echo 10.20.30.0 255.255.255.0 >> /etc/netmasks

Also, you need to add a corresponding network adapter to the Linux client system; it has to reside in the same LAN segment as the new network interface on the OpenSolaris system. Make sure the interfaces are configured so that the two systems can communicate with each other via IP.
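On an Ubuntu client, for example, a quick test could look like the following; the interface name and the address are assumptions that depend on your setup:

# ifconfig eth1 10.20.30.2 netmask 255.255.255.0 up
# ping -c 3 10.20.30.1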

Masked

The next step is to introduce a new iSCSI portal to the Comstar framework in the form of a target portal group, which you then assign to the required target. From now on, the logical units are only accessible via the IP network on which the iSCSI portal resides – this assumes that you set up LUN masking (host group and target group) previously to support access to the logical units. In addition to the IP addresses, you still need an IQN and LUN masking.

In the iSCSI environment, now execute the commands in Listing 7. After running the commands, the target with the IQN in line 2 is only accessible via the iSCSI portal at IP 10.20.30.1, port 5000. The logical unit set up previously is no longer accessible via the local network 192.168.1.0/24.

Listing 7: iSCSI Portal

01 # itadm create-tpg myportal 10.20.30.1:5000
02 # itadm modify-target -t myportal iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262
03 Target iqn.1986-03.com.sun:02:8f4cd1fa-b81d-c42b-c008-a70649501262 successfully modified
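The Linux initiator then has to discover the target through the new portal address and port before it can log in again – for example:

# iscsiadm -m discovery -t st -p 10.20.30.1:5000
# /etc/init.d/open-iscsi restart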

For smaller installations, it is fine to use just one iSCSI portal and thus one target portal group. LACP and IP multipathing (IPMP on (Open)Solaris) allow the system to use redundant Ethernet connections as fail-safes. Anything beyond this adds a huge amount of complexity and requires good documentation to keep track of the configuration.

Fibre Channel

Thus far, I have only looked into iSCSI and ZFS. I mentioned earlier that Comstar can use any protocol to provide storage in the form of block devices. One of the protocols currently supported is the popular Fibre Channel, typically used by large SAN installations. With a transmission rate of 8Gbps and a low protocol overhead, Fibre Channel is capable of impressive performance. To add Fibre Channel capabilities to OpenSolaris and Comstar, you need a Fibre Channel host bus adapter (HBA). Also, you need to switch the HBA to target mode, and this requires a matching adapter.

As of this writing, Comstar only supports QLogic adapters with 4 and 8Gbps ports and enterprise-class host bus adapters from Emulex. The Emulex driver implements target mode, which vastly simplifies the configuration. If you use a QLogic adapter, you will need to delve deeper into your bag of tricks.

The following setup is not possible with virtual machines and thus requires at least two physical computers with matching host bus adapters. Ideally, you will have a couple of fast disks in your OpenSolaris system and have them grouped to form a ZFS pool. It is a good idea to restrict initial experiments to a single server and a single client. Point-to-point wiring means you can do without a SAN switch, at least initially; however, if you will be adding multiple systems to your OpenSolaris storage, there is no alternative to investing in a SAN switch.

The following example uses a Sun Fire X4100 server with just a bunch of disks (JBOD) storage as an OpenSolaris target system. The Emulex LP9002L host bus adapter is connected to a Brocade 3850 SAN switch. The client is an HP ProLiant BL465c blade server running CentOS Linux 5.4 and also connected to the SAN switch. Zoning on the SAN switch is configured to allow the client and the server to communicate. Figure 2 shows the demo system.

Figure 2: Test system setup with Fibre Channel.

The Comstar setup for Fibre Channel is fairly similar to the previous iSCSI example with virtual machines; addressing is the major difference.

Address Multiplicity

In Fibre Channel, devices are uniquely referenced by their World Wide Node Name (WWNN) and World Wide Port Name (WWPN). Each device will always have a WWNN. Because Fibre Channel adapters often have more than one connection port, a unique WWPN is additionally assigned to each port. Figure 3 shows this principle with a dual-port host bus adapter with a WWNN and its WWPNs. WWPNs are basically similar to MAC addresses on Ethernet interfaces.

Figure 3: The World Wide Node Name (WWNN) and the two World Wide Port Names (WWPNs) on a dual port Fibre Channel adapter.

For Comstar, you only need the WWPNs of the ports that are connected to each other. If you are using a SAN switch, you only need the WWPNs that belong to a zone on the switch. To switch an enterprise-class Emulex adapter (e.g., the Emulex LP9002L) to target mode, modify the driver configuration as follows:

$ pfexec su
# vi /kernel/drv/emlxs.conf
target-mode=1;
emlxs0-target-mode=0;
emlxs1-target-mode=1;
# reboot

Make sure you set the target-mode parameter to 1; only then is target mode active. In this example, I use two host bus adapters. Setting the emlxs0-target-mode parameter to 0 disables target mode on the first of the installed Emulex HBAs, because a Fibre Channel JBOD with a disk array is attached to it.

The second Emulex adapter is configured as a target, which I will use to publish a ZFS volume on the SAN. To do so, I create a ZFS volume, as described in the previous iSCSI example, configure the volume as a Comstar block device, then publish the logical unit (Listing 8).

Listing 8: Setup for Fibre Channel

01 # zfs create -V 10g pool-0/vol1
02 # sbdadm create-lu /dev/zvol/rdsk/pool-0/vol1
03 Created the following LU:
04
05 GUID DATA SIZE SOURCE
06 -----------------------------------------------------------------------
07 600144f021ef4c0000004bcb3fb70001 10737352704 /dev/zvol/rdsk/pool-0/vol1
08 # svcadm disable stmf
09 # fcinfo hba-port
10 ...
11 HBA Port WWN: 10000000c93805a6
12 Port Mode: Target
13 Port ID: 10200
14 OS Device Name: Not Applicable
15 Manufacturer: Emulex
16 Model: LP9002L
17 ...
18 # stmfadm create-tg myfcserver
19 # stmfadm add-tg-member -g myfcserver wwn.10000000c93805a6
20 # stmfadm create-hg myfcclient
21 # stmfadm add-hg-member -g myfcclient wwn.50060b0000c26204
22 # svcadm enable stmf
23 # stmfadm add-view -t myfcserver -h myfcclient -n 1 600144f021ef4c0000004bcb3fb70001

This process provides a logical unit with a LUN of 1 for the client with a WWPN of 50:06:0b:00:00:c2:62:04. The fcinfo hba-port command discovers the OpenSolaris system's WWPN; it lists all the data for the installed Fibre Channel host bus adapters. From this output, you can identify the WWPN and then configure the target group to reflect it. Listing 9 shows how to discover the WWPN of the CentOS Linux client system. For other flavors of Linux, you might need other commands.

Listing 9: Fibre Channel WWPN on CentOS

01 # systool -c fc_host -v
02 Class = "fc_host"
03
04 Class Device = "host0"
05 Class Device path = "/sys/class/fc_host/host0"
06 fabric_name = "0x100000051e3447e3"
07 issue_lip = <store method only>
08 node_name = "0x50060b0000c26205"
09 port_id = "0x010901"
10 port_name = "0x50060b0000c26204"
11 port_state = "Online"
12 port_type = "NPort (fabric via point-to-point)"
13 speed = "4 Gbit"
14 supported_classes = "Class 3"
15 supported_speeds = "1 Gbit, 2 Gbit, 4 Gbit"
16 symbolic_name = "QMH2462 FW:v4.04.09 DVR:v8.03.00.10.05.04-k"
17 system_hostname = ""
18 tgtid_bind_type = "wwpn (World Wide Port Name)"
19 uevent = <store method only>

If all of this works and you have the right WWPNs, you should see a new block device appear on your Linux system after a short while:

# fdisk -l
Disk /dev/sda: 10.7 GB, 10737352704 bytes
64 heads, 32 sectors/track, 10239 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Disk /dev/sda doesn't contain a valid partition table

Just as in the iSCSI example, you can create a filesystem on the block device and leverage the benefits of ZFS. In combination with a ZFS pool made up of high-performance disks, this gives you a powerful Fibre Channel storage system. With some TLC, the system should also be multipathing-capable, although you will not be able to configure a fail-safe storage cluster right now.
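To complete the picture, the final steps on the CentOS client mirror the iSCSI example; the mountpoint is an arbitrary choice:

# fdisk /dev/sda (create a single partition spanning the disk)
# mkfs.ext3 /dev/sda1
# mkdir /fcvol1
# mount /dev/sda1 /fcvol1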

Conclusions

OpenSolaris, ZFS, and Comstar are powerful components for implementing storage systems, and the setup is easy, although there is no GUI. Comstar's particular appeal is its modular structure, which lets you export virtually any kind of block device via any supported protocol. A logical unit can be accessible via iSCSI and Fibre Channel at the same time. The community is working on more protocols – for example, an InfiniBand implementation is available for high-performance applications. The ZFS filesystem, cheap and big SATA disks, and solid-state disks as a fast cache combine to create a powerful, low-budget storage system.