The need for high-availability features with Microsoft's SQL Server is growing fast, especially in the age of the cloud. However, instead of providing the much-needed high-availability features to a wide user base, Microsoft has scrapped database mirroring and restricted its successor, the AlwaysOn high-availability groups introduced with SQL Server 2012 and upgraded in SQL Server 2014, to the Enterprise Edition.
Better Safe than Sorry
High availability (HA) maximizes the accessibility of database servers by mitigating the effect of hardware damage, compensating for the failure of virtualized IaaS services or buffering failure of the network infrastructure. Thanks to high-availability features, downtime can be minimized from the perspective of the end user, and you can largely rule out the loss of transaction data.
High availability of a database, however, is difficult to achieve because you must always ensure the integrity of transactional data. If this integrity is lost, things can quickly become expensive. In the 2012 version, MS SQL Server met these requirements with up to three high-availability features, depending on the configuration:
- Database mirroring
- AlwaysOn Failover Cluster Instances
- AlwaysOn Availability Groups (only available in the Enterprise Edition)
Additionally, features like database replication and log forwarding guarantee data backup in an emergency.
Database Mirroring Discontinued
Synchronous database mirroring in MS SQL Server is a high-availability solution, whereas asynchronous database mirroring (only available in the Enterprise Edition) is a solution for restoring data in an emergency. In any case, the database mirroring environment consists of two instances with local storage: a principal server with the principal database, and a mirror server with the mirror database. In high-security mode, automatic failover can take place. This configuration requires a third server, called the witness. The witness ensures that, in case of a failure of the principal server, the mirror server can take over its tasks seamlessly.
In synchronous database mirroring, the principal server passes the commit of each transaction directly to the mirror server and only reports the process as complete after receiving confirmation. The secondary server can take over the tasks of the primary server at any time. Information from outside of the database (e.g., jobs, joins, and transactions across multiple databases) is either lost or cannot be correctly taken over. Moreover, this solution is slow.
In contrast, asynchronous database mirroring occurs with a considerable time delay and is thus not suitable for automatic failover. It is, however, an option for data backup and disaster recovery.
Microsoft has now declared the database mirroring feature to be deprecated. Users of SQL Server are advised not to use this feature and not to develop new applications for it. As a substitute, Microsoft recommends AlwaysOn High Availability Groups. The only flaw: Each server instance in a high-availability group requires the Enterprise Edition of SQL Server. Just the license fees for minimum equipment for the smallest version of a high-availability group total more than US$ 100,000 for a period of three years.
Database replication can improve availability of the database and also enable disaster recovery (DR) of the data. Replication in SQL Server is based on the publish/subscribe model. In this model, a primary server (the "publisher") passes its data, or a subset of the data, to at least one secondary server (the "subscriber"). In SQL Server 2012, the publication database can belong to an AlwaysOn availability group.
The biggest advantage of database replication is its flexibility. The SQL Server can also publish a subset of a database if necessary. Additionally, different server instances can be provided with separate indexes. This approach allows you to instruct the replica instance to create detailed reports, for example and relieve the load on production servers. The main disadvantage of database replication is the considerable complexity, which also entails a serious potential for error.
Log forwarding involves the primary database server writing a backup of its transaction log to a shared volume. Other database servers can then access this volume from a remote location to collect the logs and, from this, restore the database locally. Users can send queries to these restored databases. However, recovery and the queries must not occur simultaneously; before restoring, all running queries must be terminated forcibly. Log forwarding provides a robust method for database recovery in case of emergency. Log forwarding is supported by the Enterprise, Business Intelligence (BI), Standard, and Web SQL Server editions. (See the "Standard and Enterprise" box for more information on supported features.)
AlwaysOn Failover Cluster
AlwaysOn Failover Cluster instances provide failover from a node that is currently not available to another node using Windows Server Failover Clustering (WSFC). Local high availability is provided in this case by redundant server instances; these failover cluster instances (FCIs) share SAN storage.
An FCI is a single instance of SQL Server distributed over several WSFC nodes (possibly across multiple subnets). On the network, this installation looks like a single computer. Microsoft's SQL Server Management Studio connects to the cluster as to an ordinary host.
The BI and Standard editions of SQL Server 2012 and 2014 each support exactly two nodes. The Enterprise version pushes this to the operating system's limits. If the active node fails, failover occurs fully automatically: The second node takes over the tasks of the first node. For the end users, nothing has changed; they can continue with their work. However, incomplete transactions are mandatorily discarded. Completed transactions are kept.
AlwaysOn Failover Cluster instances protect the environment against the failure of a server in the cluster but not against the failure of the SAN storage. The data is therefore not automatically saved redundantly; on the contrary, all nodes simply access the same shared SAN storage. A hardware defect here can thus easily lead to the loss of all data. To mitigate the risk of data loss, SAN replication can be used. In the case of a primary SAN storage failure, you also have the option of manually sharing the SAN replica in the cluster. Unfortunately, this solution entails considerable additional costs and requires the involvement of a SAN administrator.
AlwaysOn HA in SQL Server
AlwaysOn High Availability Groups replace the functionality of database mirroring and log forwarding through improvements such as log synchronization. AlwaysOn High Availability Groups rely on WSFC to ensure redundancy at the database level and allow the use of the secondary server for read requests. Unlike the case of database replication, indices and schemas must be identical on all instances.
An availability group is a failover environment for user databases (known as availability databases). Failover between synchronous replicas in an AlwaysOn High Availability Group is fully automatic. Asynchronous replicas are only suitable for manual failover, but they can be located in different data centers and run on completely different hardware.
Microsoft introduced AlwaysOn High Availability Groups in SQL Server 2012 and has expanded this technology in version 2014. SQL Server 2014 now supports a total of eight instead of four previously supported secondary replicas. In multisite environments, these initially remain readable, even if a connection to the primary server cannot be established. Admins who set up high-availability groups in various, geographically distributed data centers will benefit most from this approach. Asynchronous replicas can relieve the primary instances by handling some of the read requests. SQL Server 2014 introduces the ability to manage the load on the high-availability group with a load balancer. (See the "Test Drive" box for information on trying out SQL Server 2014.)
SQL Server 2014 integrates seamlessly with Windows Azure. Users of SQL Server in the Microsoft cloud can use SQL Server 2014 as a synchronous secondary replica for the Azure services. The reverse scenario is also possible: If you use SQL Server internally, you can asynchronously replicate its databases to Windows Azure for manual failover in case of emergency.
Unfortunately, AlwaysOn High Availability Groups require the Enterprise Edition in both SQL Server 2012 and 2014. Database mirroring, which you can expect to be removed from the next version of SQL Server 2014 was even available in the Standard Edition. Thus, it is unfortunate that Microsoft has not given the Standard Edition of SQL Server 2012 and 2014 at least the lowest level of AlwaysOn High Availability Groups. The absence of high-availability technologies in the Standard Edition therefore practically means a gradual price increase. Many users of SQL Server thus see themselves forced to consider alternative solutions.
Hybrid Deployment Scenarios
Migration "into the cloud" long seemed to be a clear trend. With industrial espionage and the NSA scandal making the headlines, however, data security has increasingly gained relevance. On-premises hosting and co-location in a data center still remain the preferred route for many companies.
On-premises use is exactly the opposite of a public cloud: It offers ultimate control over data security while generating higher costs (hardware and software, personnel costs, Internet connection, etc.). The available performance needs to be scaled up; companies concerned cannot avoid investing. Additionally, companies have no viable way to make excess capacity available to others and help the investment pay back faster. In an on-premises scenario, AlwaysOn Failover Cluster Instances in the Standard Edition of SQL Server with SAN storage are a viable option. In the cloud, things look different.
High Availability in the Cloud
Running SQL Server in a virtualized cloud environment can achieve cost reductions of up to 80% compared with on-premise use. However, virtualization introduces the additional risk of failure of the infrastructure itself, which, in turn, makes high-availability solutions in the cloud all the more essential. Unfortunately, these are not as easy to implement – especially with SQL Server.
A good example is Amazon Web Services (AWS, the leading IaaS provider). AWS offers two solutions for SQL Server: EC2/VPC (Elastic Compute Cloud/Virtual Private Cloud) and a turnkey service known as RDS (Relational Database Service). Both services also support other databases, including MySQL and Oracle.
Companies that want to migrate their MS SQL server environment with failover cluster instances into the AWS cloud and implement in EC2/VPC are – unfortunately – in for a nasty surprise: The use of AlwaysOn Failover Cluster Instances requires the configuration of two EC2 instances on the same subnet or the use of shared SAN storage. Neither of these scenarios can be currently implemented on AWS. The only way of implementing high-availability with SQL Server that is supported in EC2/VPC is the use of AlwaysOn High Availability Groups with the Enterprise Edition.
SQL Server EC2/VPC
Amazon's infrastructure in each region is divided into availability zones (AZs). The failure of one zone does not affect the availability of the infrastructure in another zone. The latency between the different zones within a region is very small. This architecture is thus suitable for building an AlwaysOn Availability Group with SQL Server Enterprise.
A successful installation of SQL Server with AlwaysOn High Availability Groups typically requires the following components (using AWS in a VPC with subnets in two different availability zones as an example):
- At least two instances of SQL Server – that is, 4x2 licenses of SQL Server Enterprise (SQL Server licenses are available for two virtual cores; the smallest supported instance thus needs two SQL Server license packages for four cores)
- A Windows Server Failover Cluster (WSFC) with two nodes (one node per availability zone), each hosting one of the two installations of SQL Server
- Two instances of Windows Server with Active Directory (one per availability zone)
- Two instances as an intermediary for administrative access using Remote Desktop Gateway (one each per availability zone)
- Two NAT instances running Linux for administrative access to the infrastructure of the VPC in each availability zone.
This configuration supports automatic failover between two WSFC nodes in two different availability zones in a region.
Incidentally, Microsoft offers its own alternative to Amazon EC2: Windows Azure Virtual Machines . As with Amazon, users can also set up their own SQL Server licenses on an instance of Windows Server or rent one of the factory preinstalled instances with a license in place at an hourly rate.
High Availability as a Service
Another solution is available for AWS users with SQL server high-availability requirements who need to watch their budgets: Amazon RDS services for SQL Server.
The astronomical costs of high availability and horizontal scalability of SQL Server have led some vendors to provide their own cloud services with these capabilities. Amazon is one of the pioneers with RDS.
RDS sees Amazon assume responsibility for data integrity with automatic backups of the entire RDS instance (daily) and automatic backups of transaction logs (every five minutes). If the instance fails because of a hardware fault, an automatic failover occurs. Users also can create their own snapshots of the database as needed and restore manually.
In RDS, however, Amazon prevents access to the filesystem of your RDS instances, access via a remote desktop connection, and the use of tools from a third party – all of which are good reasons for experienced SQL Server administrators to steer clear of Amazon RDS. Functions such as log forwarding or Windows authentication with Active Directory are also ruled out. Users will, however, tend to appreciate the intuitive scalability of RDS.
VMware tries to compensate for the shortcomings of RDS with the vFabric Data Director , which lets users set up a Windows instance autonomously. Microsoft also offers a custom service: Windows Azure SQL Database (WASD) . Unfortunately, the service does not offer the full functionality of SQL Server and lacks, among other things, database partitioning, the Resource Governor, the Service Broker, common language runtime (CLR), and many other features. A severely restricted feature scope like this will be of little use to many users.
In SQL Server 2014, all of the state-of-the-art high-availability features – that is, those that have a future in Redmond – are served to users of the Enterprise Edition. It seems Microsoft is increasingly curtailing the standard functionality of SQL Server with each new edition. Standard Edition users get less CPU power and less memory and need not even bother asking about much needed features such as a successor to database mirroring.
Users of the Standard Edition are thus out on a limb; after all, the licensing costs of the Enterprise Edition cannot be justified in smaller businesses. Third-party suppliers such as Amazon with RDS and VMware with vFabric Data Director are trying to fill this gap, but their solutions still leave a lot to be desired.
Meanwhile, Microsoft seems to be trying to spoil users of the Enterprise Edition with new features. Robust AlwaysOn High Availability with up to eight readable secondary replicas for hybrid production use between on-premises environments and the cloud is clearly one the most desirable features of SQL Server 2014, and it provides users with the Enterprise Edition a clear competitive advantage.