Webfarms II: Balancing The Load

Okay, so you understand webfarms now. What's the magic that actually distributes the load, and how does it determine how the distribution is handled?

At ORCS Web we use a pair of Foundry Server Iron switches to perform our webfarm
load-balancing. If one of them fails, the other takes over almost instantly (in
our testing, fail-over took less than a second).

So what is this "Server Iron" thing? In simplest terms, it's a layer
4-7 switch. It has multiple network ports on it and can be used literally like
other types of switches. But it can also perform load balancing and traffic distribution.
A VIP (virtual IP) can be assigned to the SI (Server Iron) and it then handles
all traffic sent to that address/VIP. Further configuration is done to tell
the SI what to actually do with the traffic sent to the VIP address.

The traffic that hits the VIP on the Server Iron is of course redistributed
to a number of server nodes so the client request can be satisfied - that's
the whole point of a webfarm. If one or more server nodes are not responding,
the switches are able to detect this and send all new requests to servers that
are still online - making the failure of a server node almost transparent to
the client.
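
To make the failure detection above concrete, here is a minimal sketch in Python of how a load balancer might track node health. It is not how the Server Iron is implemented; the class names, node names, and the three-failure threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One server node behind the VIP (names are illustrative)."""
    name: str
    online: bool = True
    failed_checks: int = 0


class HealthTracker:
    """Takes a node out of rotation after `max_failures` failed checks in a row."""

    def __init__(self, nodes, max_failures=3):
        self.nodes = {n.name: n for n in nodes}
        self.max_failures = max_failures  # assumed threshold, not a Foundry default

    def record_check(self, name, ok):
        node = self.nodes[name]
        if ok:
            # A successful check resets the counter and restores the node.
            node.failed_checks = 0
            node.online = True
        else:
            node.failed_checks += 1
            if node.failed_checks >= self.max_failures:
                node.online = False

    def eligible(self):
        """Names of the nodes that may receive new requests."""
        return [n.name for n in self.nodes.values() if n.online]


tracker = HealthTracker([Node("web1"), Node("web2"), Node("web3")])
for _ in range(3):
    tracker.record_check("web2", ok=False)
print(tracker.eligible())  # web2 is no longer listed

tracker.record_check("web2", ok=True)
print(tracker.eligible())  # web2 is listed again
```

Only new requests are steered away from a failed node in this sketch; what happens to requests already in flight depends on the real device and the application.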

The traffic can be distributed using several different algorithms.
The most common are:

  • Round Robin: The switches send requests to each server in rotation, regardless
    of how many connections each server has or how fast it may reply.
  • Fastest response: The switches select the server node with the fastest response
    time and send new connection requests to that server.
  • Least connections: The switches send traffic to whichever server node shows
    as having the fewest active connections.
  • Active-passive: This is called Local/Remote on a Foundry switch, but is
    still basically active/passive. This allows one or more servers to be designated
    as "local", which marks them as primary for all traffic. This is
    combined with one of the methods above to determine how requests are
    distributed among the "local" server nodes. If all the "local" (active)
    server nodes were down, traffic would then be sent to the "remote"
    server nodes. Note that "remote"
    in this case doesn't really have to mean remote - the "remote" server
    could be sitting right next to the "local" servers but it is marked
    as remote in the configuration to let it operate as a hot-standby server.
    This setting can also be used in a true remote situation where there are servers
    in a different physical data center - perhaps for extreme disaster recovery
    situations.
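
The four methods above can be sketched in a few lines of Python. This is an illustration of the selection logic only, not Foundry's implementation; the `Node` fields, class names, and sample numbers are assumptions made up for the example.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Node:
    name: str
    connections: int = 0       # active connections
    response_ms: float = 0.0   # most recent measured response time
    online: bool = True
    local: bool = True         # False marks a "remote" (standby) node


class RoundRobin:
    """Hand out nodes in rotation, ignoring load and speed."""

    def __init__(self):
        self._counter = count()

    def pick(self, nodes):
        return nodes[next(self._counter) % len(nodes)]


class LeastConnections:
    def pick(self, nodes):
        return min(nodes, key=lambda n: n.connections)


class FastestResponse:
    def pick(self, nodes):
        return min(nodes, key=lambda n: n.response_ms)


def choose(nodes, method):
    """Local/remote: use online "local" nodes, fall back to "remote" ones."""
    online = [n for n in nodes if n.online]
    local = [n for n in online if n.local]
    candidates = local or online
    if not candidates:
        raise RuntimeError("no server nodes online")
    return method.pick(candidates)


farm = [
    Node("web1", connections=12, response_ms=40.0),
    Node("web2", connections=3, response_ms=95.0),
    Node("standby", local=False),
]

rr = RoundRobin()
print([choose(farm, rr).name for _ in range(4)])  # web1, web2, web1, web2
print(choose(farm, LeastConnections()).name)      # web2
print(choose(farm, FastestResponse()).name)       # web1

# With every "local" node down, traffic goes to the "remote" standby.
farm[0].online = farm[1].online = False
print(choose(farm, LeastConnections()).name)      # standby
```

Note how the local/remote rule only narrows the list of candidates; one of the other three methods still makes the final pick, as described above.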

Which method is best? It really depends on your application and its surrounding
factors. Each method is sound, though, and would probably satisfy most
requirements, especially if you are closely monitoring each server node with an
external tool (rather than relying only on the load-balancing switch). With
external monitoring you can confirm that all server nodes are operating without
error and within the response-time thresholds you have set.
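
As a rough idea of what such an external check does, the Python sketch below times a probe and classifies the result against a speed threshold. The function name, the status labels, and the two-second default are assumptions for illustration; a real monitoring tool will offer far more than this.

```python
import time


def check_node(probe, max_seconds=2.0):
    """Run `probe` (any callable that raises on failure) and classify the result.

    Returns a (status, detail) pair where status is "ok", "slow", or "error".
    """
    start = time.perf_counter()
    try:
        probe()
    except Exception as exc:
        return ("error", str(exc))
    elapsed = time.perf_counter() - start
    if elapsed > max_seconds:
        return ("slow", f"{elapsed:.2f}s")
    return ("ok", f"{elapsed:.2f}s")


def failing_probe():
    raise ConnectionError("connection refused")


print(check_node(lambda: None))                               # status "ok"
print(check_node(lambda: time.sleep(0.05), max_seconds=0.01)) # status "slow"
print(check_node(failing_probe))                              # status "error"
```

In practice the probe would request a page from each node directly, bypassing the VIP, so that a problem on a single node is not masked by the load balancer.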

Also, remember that, regardless of which traffic algorithm is chosen, if a node
goes down, traffic is sent to the other nodes. And when a node comes back online,
it can automatically be placed back into the webfarm and start getting client
requests again.

Clustered hosting does require some consideration of how state is managed within
applications, which will be covered in a future article.

Happy hosting!


By Brad Kingsley

Brad Kingsley is President and Founder of ORCS Web, Inc., a company that
provides managed hosting services for clients who develop and deploy their
applications on Microsoft Windows platforms.