Archive for the ‘Clustering’ Category
Introduction to Clustering
Written by Kendall Miller on February 27, 2008 – 12:58 am
Clustering brings a group of like devices (often servers, but it applies equally to appliances) together so they act, at least in some respects, like one device. Generally clusters are created to provide greater scalability at a lower price point, better availability, or both. To simplify matters, we’re going to restrict our discussion to clustering for network appliances (like firewalls) and common IT uses such as web servers and database servers. In particular, we’re going to exclude grid computing (also known as compute clusters) and some other boundary cases. If you’re working in one of those areas, you’re probably not reading this introduction to clustering.
First a little lingo…
To make the discussion below easier, let’s introduce a few terms and define how they’ll be used in the rest of this article.
The general term for each computer or appliance that is a member of a cluster is a node. In general, each node is identical with respect to the service being clustered (e.g. if a web site is being clustered, all nodes have the same opinion of what that web site is).
The two main types of clustering are high-availability (HA) or failover clusters and load-balancing clusters. In both cases more than one system can handle a given service, but they differ in whether multiple systems can be active at the same time (they can for load-balancing clusters, they can’t for high-availability clusters). Because both types provide high availability, I prefer the terms failover and load-balancing, which name the real distinction. In broad strokes, load-balancing clusters are generally preferable to failover clusters because you get value all of the time from your investment in high availability (additional throughput) and there is generally little or no delay in moving resources away from a system that fails.
Failover Clusters
Failover clusters…
- Provide high availability only. At best they do not improve performance, and there may even be a slight drop in performance depending on how the clustering is done.
- Often have a short delay in transitioning resources from one active node to another. Requests that come during that time can fail.
- Often require each node in the cluster to be absolutely identical for reliable operation.
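As a rough illustration of the behaviour listed above, here is a minimal Python sketch of a failover cluster. The class and node names are hypothetical, and it models only the "one active node at a time" rule, not any particular clustering product.

```python
class FailoverCluster:
    """Toy model: exactly one node is active, the others wait as standbys."""

    def __init__(self, nodes):
        self.nodes = list(nodes)        # preference order (an assumption)
        self.healthy = set(self.nodes)
        self.active = self.nodes[0]

    def mark_failed(self, node):
        """Record a failure and, if it hit the active node, promote a standby."""
        self.healthy.discard(node)
        if node == self.active:
            self.active = next(
                (n for n in self.nodes if n in self.healthy), None
            )

    def handle(self, request):
        """Only the active node ever serves a request."""
        if self.active is None:
            raise RuntimeError("no healthy node left in the cluster")
        return f"{self.active} handled {request}"


cluster = FailoverCluster(["node-a", "node-b"])
first = cluster.handle("query-1")    # served by node-a
cluster.mark_failed("node-a")        # node-b takes over
second = cluster.handle("query-2")   # served by node-b
```

Note that the second node adds no throughput in this model: it only matters once the first one fails. The sketch also ignores the transition delay, during which real requests can fail.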
Common Examples
Failover clustering is your best bet for resources that, due to technology constraints, can’t run in a load-balanced cluster. This is usually anything that rapidly writes data (like databases) or anything with tight network-level performance constraints (because of how TCP/IP works, it’s very hard to make very low-level load balancing work). Most companies implement failover clustering primarily for their firewall and their database server.
- Microsoft Cluster Service (MSCS): This is the built-in Windows method of creating failover clusters. It supports Microsoft SQL Server, Exchange Server, file shares, and a range of other systems out of the box. It generally uses shared storage (a SAN is highly recommended, but it can be done with direct attach storage or anything else where you can replicate the storage absolutely) to keep each node’s data synchronized. For more information, see Why You Should Use MSCS.
- Firewalls and Hardware Load Balancers: Most network-layer devices use this for high availability, such as firewalls from companies like Watchguard and Cisco and hardware load balancers from companies like Foundry and F5. Note that in this case we’re talking about the appliances themselves, even though they may be what performs load balancing for a cluster (see below).
Application Compatibility
It is generally easier to ensure application compatibility with failover clustering than with load balancing because failover preserves the general characteristics of running without clustering: the application is only running in one place at a time, it has exclusive access to its storage, etc. For example, Microsoft Cluster Service (MSCS) can generally be used to cluster anything that’s a Windows service without the service being specifically designed for it. Validation is also generally simpler for custom applications because the result tends to be binary: either the application works and fails back and forth correctly, or it fails pretty early in testing. Load-balanced clusters conceptually have a much larger number of scenarios to test to exhaustively prove they work.
Load-balancing Clusters (aka server farms)
Load-balancing clusters:
- Provide high availability and improve scalability. Each node is processing requests so you can process more requests at the same time.
- Can be transparent or nearly so when a node fails.
- Usually accommodate diverse nodes with different performance capabilities, software load, etc.
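For contrast with failover, here is a small Python sketch of a load-balancing cluster in which every healthy node serves requests and more capable nodes receive a larger share. The weighted round-robin scheme, the class name, and the node names are illustrative assumptions, not how NLB or any specific appliance works.

```python
import itertools


class LoadBalancer:
    """Toy weighted round-robin over the healthy nodes."""

    def __init__(self, weights):
        self.weights = dict(weights)    # node name -> relative capacity
        self.healthy = set(self.weights)
        self._rebuild()

    def _rebuild(self):
        # Repeat each healthy node according to its weight, then cycle.
        schedule = [
            node
            for node, weight in self.weights.items()
            if node in self.healthy
            for _ in range(weight)
        ]
        self._cycle = itertools.cycle(schedule) if schedule else None

    def mark_failed(self, node):
        """Drop a node from rotation; the remaining nodes absorb its share."""
        self.healthy.discard(node)
        self._rebuild()

    def pick(self):
        if self._cycle is None:
            raise RuntimeError("no healthy nodes")
        return next(self._cycle)


lb = LoadBalancer({"web1": 2, "web2": 1})   # web1 is the bigger box
picks = [lb.pick() for _ in range(6)]
lb.mark_failed("web1")
after_failure = [lb.pick() for _ in range(3)]
```

Here both nodes do useful work all the time, and losing one simply shifts traffic to the survivor.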
Common Examples
The most common load balanced cluster is a front-end web server. This is because of the natural tendency to separate state management (storage) from the web application (often into a database) removing the first, largest hurdle to load balancing. Additionally, web applications are often developed very quickly using technologies that are not optimized for performance. This tends to make them processor & memory intensive under load which can be very cost-effectively addressed with hardware instead of custom development.
- Microsoft Windows Network Load Balancing (NLB): This performs basic load-balancing, typically for web servers but it can be used for other systems in certain cases. There are significant limitations in network scalability and management tools. The network scalability limitations depend highly on how sophisticated your network switching hardware is.
- Load Balancing Appliance: F5 Networks BIG-IP appliances have long been considered the gold standard in hardware load balancing, but they are difficult to spec and administer unless you’re used to old-school UNIX administration. They are also very expensive when all you need is web site load balancing. There is a range of options that generally fall into two price classes: vendors that believe they can accomplish anything for anyone (like Cisco, F5 Networks, etc.) and vendors that focus just on web server requirements, whose products generally cost substantially less and are easier to configure. If you don’t have experience with the particular hardware appliance you’ve selected, you should get some expert assistance to select and set up your solution. Be sure to get sufficient knowledge transfer to perform routine support on your own.
Application Compatibility
Ideally, each application you want to cluster will document its compatibility with load-balanced clustering. It is typical to need slight configuration changes for clustering. For example, a clustered web application may need to be configured to store state within a database instead of the normal in-memory storage. If no such information is available, some basic validation can be done to see if it’s even worth attempting. If the application looks like it can plausibly be clustered, then a plan for carefully validating the clustering should be drawn up and carried out before it is put into production.
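The state-storage point can be shown with a short Python sketch. Everything here is hypothetical: `WebNode` stands in for a web server process, and a plain dictionary stands in for the shared database.

```python
class WebNode:
    """Toy web node that keeps a shopping cart per session."""

    def __init__(self, name, session_store=None):
        self.name = name
        # Without a shared store, each node keeps sessions in its own memory.
        self.sessions = session_store if session_store is not None else {}

    def add_to_cart(self, session_id, item):
        cart = self.sessions.setdefault(session_id, [])
        cart.append(item)
        return list(cart)


# In-memory state: the second request lands on another node and the cart
# built on the first node is invisible there.
a, b = WebNode("web1"), WebNode("web2")
a.add_to_cart("s1", "book")
lost = b.add_to_cart("s1", "pen")

# Shared state (a dict standing in for a database): any node sees the cart.
shared = {}
c, d = WebNode("web1", shared), WebNode("web2", shared)
c.add_to_cart("s1", "book")
kept = d.add_to_cart("s1", "pen")
```

A quick check like this (does a session survive being served by a different node?) is the kind of basic validation worth doing before investing in a full test plan.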
Testing Clusters
The Wire Never Lies
First, if you are not using an absolutely off-the-rack clustering scenario, you will need to get ready to inspect network traffic. While Microsoft has included a free tool to do so with Windows, I highly recommend Wireshark (formerly Ethereal) as the gold standard. It’s been said that “the wire never lies”, meaning that the physical network represents the real truth of what’s going on. Any senior server administrator should be able to do a network trace and understand what is communicating and why from the perspective of each server. The reason this is particularly important with clustering is that it will give you absolute proof of where traffic is going between each layer of your infrastructure, and can reveal surprises such as redirects you didn’t believe were happening. Web browsers, particularly IE, are designed for end users, so they tend to hide the true underlying network details or simplify what’s going on. Don’t trust what they present when validating a cluster or diagnosing an issue. Trust the actual packets on the wire. For more on how to do this, see The Wire Never Lies.
Failover Clusters
The big test whenever you change the configuration of your cluster is that it can successfully fail over, work, and fail back. You want to be sure this works on command so that it’s ready to take over when called upon due to a real problem. It’s not good to discover, when the active node has failed, that your redundant node won’t run the software correctly and automatically.
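The fail over, verify, fail back routine can be written down as a repeatable drill. The Python sketch below assumes you can supply two callables of your own: one that moves the clustered service to a named node and one that checks the service. `FakeCluster` is a stand-in used only to make the example runnable; it is not a real cluster API.

```python
def failover_drill(move_service_to, service_is_healthy, active, standby):
    """Fail over to the standby, check, fail back, check again.

    Returns a list of (step, passed) tuples.
    """
    report = []
    move_service_to(standby)
    report.append((f"failover to {standby}", service_is_healthy()))
    move_service_to(active)
    report.append((f"failback to {active}", service_is_healthy()))
    return report


class FakeCluster:
    """Hypothetical stand-in for a real cluster's management interface."""

    def __init__(self, broken_nodes=()):
        self.owner = "node-a"
        self.broken = set(broken_nodes)

    def move_to(self, node):
        self.owner = node

    def healthy(self):
        # The service only works if its current owner is not misconfigured.
        return self.owner not in self.broken


fake = FakeCluster(broken_nodes={"node-b"})
report = failover_drill(fake.move_to, fake.healthy, "node-a", "node-b")
```

In this run the drill reports that the standby cannot run the service, which is exactly the problem you want to find on command rather than during a real failure.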
Network Test Points
Because clustering will tend to play some interesting tricks at the physical network layer, you should test your clustering installation from at least two places: on the same routed network segment as the clustered IP address and on another segment. It’s also useful to test on the same physical switch and on a different switch. The reason for this is that you want to know how quickly the transition will be considered effective by clients on the network, and this will vary depending on exactly how the clustering is done. For example, if the IP address is transferred but the MAC address isn’t, it can take a while before clients on the same network segment (that may have the MAC address cached) drop their cache and ARP again for the new address. Windows NLB requires a switch that correctly supports IGMP. If the switch doesn’t, you will tend to get alternating failures and successes as the switch incorrectly sends traffic to just one NLB node. This is just an example, but it highlights that you want to think about how your traffic travels from the client to the server and which devices along the way have to know about the clustered node. Typically this is limited to routers and switches on the same routed segment.
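One way to measure how long a transition takes from a given test point is to probe the clustered address repeatedly during a failover and look at the longest run of failures. The Python sketch below is a rough approach under stated assumptions: the host, port, interval, and function names are all placeholders, and a TCP connect is only a coarse signal. A packet capture remains the authoritative record.

```python
import socket
import time


def probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def watch(host, port, duration=60.0, interval=0.5):
    """Probe repeatedly and return a list of (timestamp, ok) samples."""
    samples = []
    end = time.monotonic() + duration
    while time.monotonic() < end:
        samples.append((time.monotonic(), probe(host, port)))
        time.sleep(interval)
    return samples


def longest_outage(samples):
    """Longest time from the first failed probe in a run to the next success.

    An outage still in progress at the last sample is not counted.
    """
    longest = 0.0
    outage_start = None
    for timestamp, ok in samples:
        if not ok and outage_start is None:
            outage_start = timestamp
        elif ok and outage_start is not None:
            longest = max(longest, timestamp - outage_start)
            outage_start = None
    return longest


# Hand-written samples standing in for a real run of watch().
example = [(0, True), (1, False), (2, False), (3, True), (4, False), (5, True)]
worst = longest_outage(example)
```

Running `watch()` from a machine on the same segment and from one on another segment, then comparing `longest_outage()` for each, shows whether cached MAC addresses or switch behaviour delay the cutover for some clients.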
How has clustering benefited you?
What types of clustering do you use? Has it made a material difference in your reliability? Post your comments or drop me a line to continue the conversation.
Tags: Clustering, failover, NLB, Wireshark
Posted in Clustering
Why you should use Microsoft Cluster Service (MSCS)
Written by Kendall Miller on February 18, 2008 – 2:15 am
If you go through the web and do as much research as you can, you’ll find very polarized opinions about MSCS. I’ve been using it since 2002 and have found it to be outstanding, but I can see some pitfalls that could create a bad rap for it.
Why are you clustering?
First, I think Microsoft does it a disservice in how they market it. Instinctively, most people focus on using MSCS in case a given computer’s hardware or operating system spontaneously fails. In operating a number of clusters over six years, this was a very rare event for us. In fact, it only happened when we had some brand new hardware fail within its burn-in period. Instead, we’ve found that its great value is in reducing downtime due to maintenance activities.
Example Server Update
Consider the scenario of needing to install the latest patches from Windows Update on your database server. Below are the steps you could go through without clustering:
1. Wait until your maintenance window (let’s assume it’s 1:00 AM on Sunday morning, the low time of your load profile).
2. Take the applications that use your database server offline (to be nice to your users and ensure everything closes).
3. Install the patches on your database server.
4. Reboot your database server.
5. Verify that the server works (that the patches haven’t introduced a problem).
6. Bring all applications back online.
What’s noteworthy in the list above are the items that have a variable duration (they may take a different amount of time each time you do maintenance and may not be particularly predictable) vs. a fixed amount of time. In particular, #3 and #5 are variable (and #4 may be).
Now let’s play that again if you have MSCS installed:
1. Install patches on the offline database server node.
2. Reboot the offline server.
3. Wait until your maintenance window.
4. Take the applications that use your database server offline (to be nice to your users and ensure everything closes).
5. Fail over to the offline server.
6. Verify that the server works (that the patches haven’t introduced a problem).
7. Bring all applications back online.
8. Wait a reasonable period of time (like a few days) and install patches on the server that’s now offline.
9. Reboot the offline server.
It is more steps (because there are two servers involved), but what we’ve done is move the things that take variable time outside of the critical window when the system is in maintenance mode. Everything that happens during maintenance mode (steps 4-7) is predictable. Additionally, I consider any server reboot to be risky. Problems tend to show up during a reboot that show up at no other time: hardware problems, and, even in a reasonably tight environment, a configuration change that hasn’t taken effect yet and causes a problem once the server reboots. With an MSCS cluster, this risky event happens while the server is offline and won’t affect the production use of your application. You’ve also verified the basic integrity of the patches (after all, the server booted and you can monitor its event log to know it’s basically healthy) before even scheduling your maintenance period.
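To make the comparison concrete, here is a small Python calculation of how long applications are offline in each scenario. The step durations are invented purely for illustration; only the split between "inside the maintenance window" and "outside it" comes from the two lists above.

```python
# (step, minutes, inside_maintenance_window) -- all durations are made up.
WITHOUT_CLUSTER = [
    ("take applications offline", 5, True),
    ("install patches", 30, True),
    ("reboot server", 10, True),
    ("verify server", 15, True),
    ("bring applications online", 5, True),
]

WITH_CLUSTER = [
    ("install patches on offline node", 30, False),
    ("reboot offline node", 10, False),
    ("take applications offline", 5, True),
    ("fail over", 2, True),
    ("verify server", 15, True),
    ("bring applications online", 5, True),
    ("install patches on the other node", 30, False),
    ("reboot the other node", 10, False),
]


def window_minutes(steps):
    """Total time spent inside the maintenance window."""
    return sum(minutes for _, minutes, inside in steps if inside)


def total_minutes(steps):
    """Total administrator effort, inside or outside the window."""
    return sum(minutes for _, minutes, _ in steps)


offline_without = window_minutes(WITHOUT_CLUSTER)
offline_with = window_minutes(WITH_CLUSTER)
```

With these assumed numbers the clustered procedure takes more total effort but keeps applications offline for less than half as long, and the slow, unpredictable patch and reboot steps no longer sit inside the window.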
The comparison gets even better when you consider what happens in the first scenario above if you need to roll back a patch. With a cluster, you just fail back to the original node and you’re good to go. Without a cluster, you have to uninstall the patch, reboot, and re-certify.
Benefits Summary
- Clustering makes system maintenance predictable and short.
- Clustering lets you do risky things during main business hours instead of the middle of the night
- Clustering lets you roll back a change very quickly and easily
If you’re clustering for these reasons, you’ll get great value out of it.
How are you clustering?
Shared Storage - The Traditional Approach
To their credit, Microsoft has worked to make MSCS work with a pretty broad range of hardware. Traditionally, MSCS depends on being able to expose disks to more than one server at the same time. This can be done with traditional server direct attach storage (DAS) technology such as SCSI (and now SAS). However, it relies on a set of very intricate hardware: RAID controllers in each server, special cutover terminators in the storage enclosure, etc. There is a lot that can go wrong, and when it does you may lose all of your data. For example, the configuration in the RAID controllers has to agree on what the virtual disks are. The shared storage is used at least for a special drive (called the Quorum drive) that stores central cluster configuration data and defines which node is currently active. Additionally, any clustered service (like Microsoft SQL Server or Exchange) typically has its disks shared between the nodes in the cluster as well. If you don’t need to split your clustered nodes into different data centers (to create a geodiverse or “stretch” cluster) then shared storage is a solid and straightforward way to go.
What I recommend is that you use a storage technology that encapsulates all of the RAID technology separate from the servers and is based on a technology that is fundamentally oriented towards sharing disks with multiple servers. This way you minimize the configuration on each server and the probability that a difference between servers will lose data. The traditional way of doing that is with a Storage Area Network (SAN). If you consider the two primary SAN technologies (Fibre Channel and iSCSI) both are fundamentally about sharing storage with multiple servers.
If you are only installing a shared storage array for one cluster, you can technically do without the hardware that makes a SAN a SAN - you can have a shared array directly attached to two servers. Most storage arrays support this, and it’s a very cost effective way to get started with separate storage arrays and be able to build later on this foundation to make a full size SAN down the road to optimize your operating costs. You’ll realize another benefit which is that these arrays are almost universally much faster and more scalable than direct attach storage is, for a range of reasons. You’ll be amazed at how much scalability it adds to your database server.
Shared Nothing Approach
The ability to set up a cluster that doesn’t rely on the quorum drive being a single physical resource is possible in Windows Server 2003 R2 Enterprise and significantly improved in Windows Server 2008. Instead of a shared drive, the cluster employs a third server (called the Witness server, which can’t actually host the clustered processes) that each node in the cluster can talk to across the network, or, when the cluster itself has three or more nodes, voting between the servers. Because the quorum no longer has to be physically accessible to every node in the cluster, services that don’t rely on shared storage (such as a simple Windows service) can be easily implemented. This can even extend to Microsoft SQL Server and Microsoft Exchange in their latest versions because they are capable of replicating their own content through log shipping. The sheer number of options here can be a lot to sift through the first time, but the results are worth it.
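The idea behind a witness or voting quorum can be sketched as a simple majority rule. This Python snippet is a simplified model, not the actual MSCS algorithm: the function name and node names are hypothetical, and it assumes every voter (node or witness) carries exactly one vote.

```python
def partition_has_quorum(partition, voters):
    """True when the partition can reach strictly more than half of all voters."""
    all_voters = set(voters)
    reachable = len(set(partition) & all_voters)
    return reachable > len(all_voters) // 2


# Two cluster nodes plus a witness server: three votes in total.
voters = ["node-a", "node-b", "witness"]

# The network splits: node-a can still reach the witness, node-b is isolated.
a_side = partition_has_quorum({"node-a", "witness"}, voters)
b_side = partition_has_quorum({"node-b"}, voters)
```

Only the side that holds a majority keeps running the clustered service, so two isolated nodes can never both believe they are the active one. The witness exists to break what would otherwise be a one-against-one tie.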
My Personal Experience
I’ve always used a SAN from a major vendor that certified the SAN for use with MSCS, and I have never experienced problems with MSCS. Use a certified SAN, or don’t use MSCS based on shared storage.
The most important factor to being successful with failover clustering is to use high quality hardware for the server and storage system. Look for vendors that have certified their systems for use as part of an MSCS cluster to ensure they got all of the little details right.
Where should you use MSCS?
MSCS is a failover cluster system. Use it when you can’t use a load-balanced clustering option. In general, this is when there’s a natural requirement to have just one of something at a time, most commonly databases (because to be performant they need exclusive access to their files). If you have a load-balanced clustering option, it’s probably going to be less expensive to set up and maintain than MSCS.
If your organization is a solid user of Microsoft SQL Server, I highly recommend investing in at least one MSCS cluster to host your SQL database servers. You can use a single physical cluster to host multiple SQL database servers, an option that makes it particularly cost effective. You can set server affinity so that two instances of SQL Server prefer to run on different physical servers within the cluster, giving you the best utilization of hardware while preserving redundancy. It is somewhat more complicated to set up because you have to use logical servers from the start with SQL Server, which you don’t have to do if there is just one. However, the cost savings can help justify clustering. You might, for example, have both a certification and a production SQL Server on one pair of physical servers in an MSCS cluster. This makes it somewhat easier to ensure that your certification and production environments are absolutely identical, and it generally keeps certification and production from interfering with each other without having to purchase two separate clusters.
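The affinity idea can be illustrated with a short Python sketch. The instance and node names are hypothetical, and the function is a simplified model of "prefer this node, fall back to the other", not the MSCS configuration itself.

```python
def place_instances(preferred, healthy):
    """Assign each instance to the first healthy node in its preference list."""
    return {
        instance: next((node for node in order if node in healthy), None)
        for instance, order in preferred.items()
    }


# Each SQL Server instance prefers a different physical server.
preferred = {
    "SQL-PROD": ["node1", "node2"],
    "SQL-CERT": ["node2", "node1"],
}

normal = place_instances(preferred, healthy={"node1", "node2"})
node1_down = place_instances(preferred, healthy={"node2"})
```

In normal operation both physical servers do useful work, one instance each. When a server fails, both instances run on the survivor, so each server needs enough capacity to carry the combined load for a while.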
Advanced clustering scenarios
Remember that while most articles and documentation talk about the basic clustering case of two servers & a SAN or other shared storage, as of Windows Server 2003 you can have more than two nodes and can have them use separate shared storage, provided that you have a means to synchronize it. This can be used in a few great scenarios:
- Geodiversity: You can have two separate facilities, each with one or more servers and fail over between the facilities.
- Upgrades and Maintenance: You can use the ability to have additional nodes and separate storage to allow you to take the shared storage system entirely offline in the event of disruptive maintenance or upgrades. I’ve actually used this method to incrementally upgrade and replace cluster systems before where taking the risk of a complete switchover was considered too high.
Moving from basic clustering with a single shared storage array to separate storage arrays is a significant jump in complexity and typically cost because you have to have a highly reliable means to keep the arrays in sync. High end storage vendors typically have this capability for their arrays, and there are third party options that can work with anyone’s SAN. Remember that you will need significant network capacity between your sites. Suffice it to say that if you’re going to go down this road, you’ll want help from someone that’s done it before. I recommend engaging storage professionals because this tends to be the most difficult part of the process.
What’s your experience?
Have you used MSCS? How has it worked out for you? Post your comments or drop me a line to continue the conversation.
Tags: Clustering, High Availability, Infrastructure, MSCS, SAN
Posted in Clustering
Top three things to improve reliability
Written by Kendall Miller on February 9, 2008 – 2:03 am
Quick: what are the three things you should do to make the greatest improvement in the reliability and availability of the systems you’re responsible for?
Marketing for IT products and the general media tend to emphasize opportunities to purchase reliability. This makes sense because they’re in the business of selling things. A classic example is the emphasis on extraordinarily redundant server hardware. A modern server can be purchased with redundant disks, redundant power supplies, redundant memory, and even in some extraordinary cases redundant processors. This is designed to let vendors prove that their server hardware has a staggeringly high mean time between failures, and who wants to be the IT manager that takes an outage because they didn’t purchase a reliability option they could have?
Before charging down the road of buying ever more elaborate hardware redundancy, let’s sit back and look at the big picture of where failures are coming from.
Tags: High Availability, IT Management, Process
Posted in Clustering, Infrastructure, Software Development