Weighted Round-Robin Distribution
The round-robin algorithm is often used as a simple-yet-effective method of distributing requests from a single point of entry to multiple servers in the background. It's used by DNS servers, peer-to-peer networks, and many other multiple-node clusters/networks.
In a nutshell, round-robin algorithms pair an incoming request to a specific machine by cycling (or, more specifically, circling) through a list of servers capable of handling the request. It's a common solution to many network load balancing needs, even though it does not result in a perfectly balanced load distribution, strictly speaking. In the non-ideal world our servers live in, there are many reasons why the stock round-robin algorithm just isn't good enough when it comes to properly balancing server loads.
The first and most important thing to keep in mind is that not all servers are created equal. One should be able to take advantage of all available resources, and it's impossible to guarantee that all the servers available to process incoming requests are capable of dealing with the same load quantities, take as long to carry out each command, and deal with larger/longer queues as elegantly. Nor can all requests be treated the same, either. Some take longer to process than others, involve more work, and are generally more demanding than the rest – just as others are finished relatively fast and with far fewer resources.
Weighted round-robin is one way of addressing these shortcomings. In particular, it provides a clean and effective way of solving the first half of the problem by focusing on fairly distributing the load amongst available resources, versus attempting to equally distribute the requests.
In the server environment, a difference in server capacities and processing capabilities can result in a more marked difference in the performance of the different servers (read: less balanced) when compared to the effect of a difference in the requests themselves.
In a weighted round-robin algorithm, each destination (in this case, server) is assigned a value that signifies, relative to the other servers in the list, how that server performs. This "weight" determines how many more (or fewer) requests are sent that server's way compared to the other servers on the list.
Take, for example, the case of web servers. Assume you have three servers that have been individually benchmarked and configured to be deployed in a weighted round-robin environment. The first can handle 100 req/sec, the second can do 300 req/sec, and the last can only do 25 req/sec (all on average, tested with an automated benchmark serving the same data). Normally, with such a discrepancy in the servers' performance, the third would be excluded from the setup entirely. But in a weighted round-robin, each server can be assigned as much as it can handle in the round-robin configuration script/file:
| Resource | Weight |
|---|---|
| server1 | 4 |
| server2 | 12 |
| server3 | 1 |
It's pretty clear what happens next: for every 12 requests sent to server two, 4 will be sent to server one, and just 1 will be sent to server three. The result is a more even, if less equal, load distribution.