
Fundamentals of system design Chapter 3: Load Balancers

The components of system design

Load Balancers

When the number of requests to an application increases, it can overload a server and degrade system performance, because a single server has limited throughput and resources. Take, for example, an online marketplace like Amazon. During Black Friday or the Christmas season, it experiences an unusual surge in traffic. A single server can quickly become overloaded, so the system needs to scale to handle the increased demand. Scaling can be done in two ways: vertically (adding resources to one server) or horizontally (adding more servers). To scale horizontally, you need a load balancer. A load balancer is a device or piece of software that distributes application traffic across a number of servers. It sits between clients and servers and routes client requests across the pool, so that no single server is overworked to the point where the application becomes unavailable or unreliable.

Hardware vs Software Load Balancing

Load balancers typically come in two flavours: software-based or hardware-based. Hardware-based load balancers are physical devices that often come with specialized processors and proprietary software customized for load balancing. Software-based load balancers, on the other hand, run on commodity hardware, which generally makes them less expensive and more flexible.

Load Balancing Algorithms

Effective load balancers use algorithms to decide which server in the pool should process each request. A load balancing algorithm is the logic that a load balancer uses to distribute incoming traffic between servers. The following are examples of load balancing algorithms:

a) Round Robin - The load balancer queues the client requests and directs them in a round-robin fashion. The first request goes to the first server, the second goes to the second server, and so on. When the load balancer reaches the end of the list, it directs the next request back to the first server. The round-robin approach is easy to implement and gives every server the same number of requests. However, because the algorithm does not consider each server’s capacity, a server with low capacity can receive more requests than it can handle and become overloaded. The algorithm works well in a pool where all servers have the same processing power.

b) Weighted Round Robin - This algorithm is an advanced version of the round-robin algorithm. It distributes traffic based on a weight assigned to each server. For instance, if server one is twice as powerful as servers two and three, server one is given a higher weight than servers two and three (say 2, 1 and 1). When there are 5 sequential client requests, the load balancer routes 2 requests to server one, 1 request each to servers two and three, and the last request to server one again. The more powerful the server, the more requests it handles.

c) IP Hash Algorithm - The client and destination IP addresses are hashed to generate a hash key, which is used to allocate a user to a specific server. Because the same addresses always produce the same key, the key can be regenerated if a session is broken, and the user is directed back to the same server. This helps with caching, as the server can cache data for that specific user. The algorithm is appropriate in scenarios where it’s vital for a client to return to the same server for each successive connection.

d) Least Connection - This algorithm checks which server has the fewest open connections at the moment and sends traffic to that server. Like the round-robin approach, it assumes all the servers have equal processing power.

e) Weighted Least Connection - This algorithm is an advanced version of the least connection method in which you assign different weights to the servers depending on their processing power. The algorithm decides where to route traffic based on both the active connections and the weights of the servers. If two servers are tied for the fewest connections, the server with the highest weight is chosen.

f) Weighted Response Time - It combines each server’s average response time with its number of active connections to determine where to route the request. The algorithm aims to give users faster service by favouring the server with the quickest response time.

g) Random Algorithm - This algorithm uses a random number generator to distribute client requests randomly across the servers. The algorithm assumes the servers have similar configurations.
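
To make a few of the algorithms above more concrete, here is a minimal Python sketch of round robin, weighted round robin, least connection, IP hash and random selection. It is an illustration under simplifying assumptions, not how any particular load balancer is implemented: the class and function names, the server names such as "s1", and the choice of SHA-256 for the hash are all hypothetical, and real load balancers also deal with concurrency, health checks and timeouts.

```python
import hashlib
import itertools
import random


class RoundRobin:
    """Cycle through the servers in order, wrapping back to the first."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        return next(self._cycle)


class WeightedRoundRobin:
    """Round robin where a server with weight w appears w times per cycle."""

    def __init__(self, weights):
        # weights: dict mapping server name -> positive integer weight
        expanded = [s for s, w in weights.items() for _ in range(w)]
        self._cycle = itertools.cycle(expanded)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each request to the server with the fewest open connections."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        # Ties go to the server that was listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a connection to `server` closes.
        self.active[server] -= 1


def ip_hash(client_ip, dest_ip, servers):
    """Map a (client, destination) address pair to the same server each time."""
    key = f"{client_ip}-{dest_ip}".encode("utf-8")
    digest = hashlib.sha256(key).digest()  # assumption: any stable hash would do
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


def pick_random(servers):
    """Choose a server uniformly at random."""
    return random.choice(servers)
```

With weights of 2, 1 and 1 for "s1", "s2" and "s3", five calls to `WeightedRoundRobin.pick()` return "s1" twice, then "s2", "s3", and "s1" again, which matches the example in (b). Note that this simple IP hash remaps most clients if the server list changes size, which is one reason production systems often use more elaborate schemes.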

Benefits of load balancers

a) Scalability - A load balancer enables an application to handle a traffic spike effectively, maintaining smooth operation and fast responses to clients. This helps keep the application highly available and reliable.

b) Fault Tolerance - A single point of failure can be eliminated by having multiple servers in your infrastructure. When one server fails, a load balancer will route traffic to an available server. This way, redundancy can be achieved.

c) Avoiding Downtime - A load balancer will enable you to perform server maintenance without incurring downtime by automatically routing traffic to other available servers. This way you can reduce application downtime and improve system availability.

d) Improved Security - A load balancer can help mitigate DDoS attacks. During a traffic surge, it spreads the traffic across the servers. This protects your application’s availability and gives the load balancer time to determine whether the spike in traffic is legitimate, offering a traffic-scrubbing effect by blocking malicious requests.

e) SSL Decryption - A load balancer can handle incoming HTTPS connections, decrypting the requests and passing the unencrypted requests on to the web servers. This removes the need to install SSL certificates on each backend web server by providing a single point of configuration. It also takes the processing load of encryption and decryption away from the web servers.
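
The fault tolerance and maintenance benefits above come down to one behaviour: the load balancer skips servers that are not available. A rough Python sketch of that idea follows, assuming a round-robin rotation and a hypothetical `HealthAwareRoundRobin` class with `mark_down` and `mark_up` methods. In practice the health status would come from periodic health checks rather than manual calls.

```python
class HealthAwareRoundRobin:
    """Round robin that skips servers currently marked as unavailable."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server in the rotation

    def mark_down(self, server):
        # Take a server out of rotation (failure or planned maintenance).
        self.healthy.discard(server)

    def mark_up(self, server):
        # Put a known server back into rotation.
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        # Try each server at most once, starting from the rotation pointer.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

For example, with servers "s1", "s2" and "s3", calling `mark_down("s2")` makes the next four picks alternate between "s1" and "s3"; after `mark_up("s2")`, "s2" rejoins the rotation without any restart.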

Load Balancers vs Reverse Proxy

Both are components that sit between clients and servers, accepting requests from the former and delivering responses from the latter. The two are largely similar; however, a load balancer is commonly deployed when an application needs multiple servers for scalability, whereas a reverse proxy can be useful even when you have a single server in place.


Load balancers are a key component in improving the performance of a system. They ensure high application availability and reliability by making sure no single server gets overloaded, giving clients a good user experience.

Thank You!

I’d love to keep in touch! Feel free to follow me on Twitter at @codewithfed. If you enjoyed this article please consider sponsoring me for more. Your feedback is welcome anytime. Thanks again!

