Load balancing a website
Most applications today are web-based, whether they expose a traditional web interface or a RESTful API. This first tier is typically made highly available (HA) with a load balancer, a system that distributes incoming network traffic or workload across multiple servers or resources. Its main goals are to optimize resource utilization, improve performance, and keep applications and services reliable and available.
When multiple servers serve the same application or service, the load balancer acts as an intermediary between the clients and the server pool. It receives incoming client requests and distributes them across the available servers according to an algorithm, such as round-robin, least connections, or weighted distribution.
The load balancer also monitors the servers and steers traffic away from overloaded or failing ones. Spreading the workload this way helps prevent any one server from being overwhelmed, which improves response times and increases overall system capacity and scalability.
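To make two of these algorithms concrete, the following is a minimal Bash sketch of how round-robin and least-connections selection could work. It is an illustration only, not how any particular load balancer implements them. The pool names and the connection counts are made-up assumptions.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: the pool and connection counts are assumed values.
servers=(web1 web2)
active=(3 1)   # assumed active connections, in the same order as servers
next=0         # index of the server the next round-robin pick will use

# Round-robin: hand out servers in a fixed rotation.
# The result is stored in the global variable "picked".
pick_round_robin() {
    picked="${servers[next]}"
    next=$(( (next + 1) % ${#servers[@]} ))
}

# Least connections: choose the server with the fewest active connections.
pick_least_conn() {
    local i best=0
    for i in "${!servers[@]}"; do
        if (( active[i] < active[best] )); then
            best=$i
        fi
    done
    picked="${servers[best]}"
}

for request in 1 2 3 4; do
    pick_round_robin
    echo "request ${request} -> ${picked}"
done

pick_least_conn
echo "least connections -> ${picked}"
```

The functions store their result in a variable rather than printing it, because calling them through command substitution would run them in a subshell and lose the rotation state.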
Note
While a load balancer helps distribute the workload, the actual load on each server depends on other factors as well, so do not expect all servers to show the same CPU, RAM, or network utilization.
Load balancers not only distribute traffic but also offer advanced features, such as SSL termination, session persistence, caching, and content routing. They are extensively used in web applications, cloud-based services, and other environments that demand HA and scalability.
One of the most popular load balancers is HAProxy.
HAProxy, which stands for High Availability Proxy, is an open source load balancer that is widely used for its performance and its ability to improve the availability and scalability of applications. It can operate at the transport layer (Layer 4, TCP mode) or at the application layer (Layer 7, HTTP mode) of the OSI model. In HTTP mode, it can make routing decisions based on application-level information, such as HTTP headers and cookies, which allows more advanced load balancing and traffic routing than a load balancer that works only at Layer 4.
Some key features and capabilities of HAProxy include the following:
- Load balancing: With HAProxy, incoming traffic can be evenly distributed across multiple servers using various algorithms, such as round-robin, least connections, and source IP, among others.
- High availability: HAProxy supports active-passive failover setups, in which a standby server takes over if the active server becomes unavailable. It also monitors the health of the servers and automatically removes servers from, or returns them to, the load-balancing pool.
- Proxying: HAProxy's primary function is to act as a reverse proxy, receiving client requests and directing them to the correct backend servers. It is not designed to be a general-purpose forward proxy of the kind that browsers use to reach external sites.
- SSL/TLS termination: HAProxy can handle SSL/TLS encryption and decryption itself, taking that load off the backend servers.
- Session persistence: HAProxy can preserve session affinity by routing follow-up requests from a client to the same backend server, which session-based applications often need in order to work correctly.
- Health checks and monitoring: To help ensure the availability and performance of backend servers, HAProxy runs regular health checks. It can identify failed servers and promptly exclude them from the load-balancing pool.
- Logging and statistics: HAProxy's detailed logging and statistics let administrators monitor and analyze traffic patterns, performance metrics, and error conditions.
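Several of the features above correspond to a few lines of HAProxy configuration. The following fragment is a rough sketch rather than this recipe's final configuration. It assumes the web servers listen on port 80 and answer a GET request for /, and the names www, webservers, and SERVERID are arbitrary labels chosen for illustration.

```
# Sketch of an haproxy.cfg fragment (illustrative; ports and names are assumptions)
defaults
    mode http                                # Layer 7 (HTTP) mode
    timeout connect 5s
    timeout client  30s
    timeout server  30s

frontend www
    bind *:80                                # accept client requests on port 80
    default_backend webservers

backend webservers
    balance roundrobin                       # load-balancing algorithm
    option httpchk GET /                     # HTTP health check
    cookie SERVERID insert indirect nocache  # cookie-based session persistence
    server web1 192.168.56.200:80 check cookie web1
    server web2 192.168.56.201:80 check cookie web2
```

The check keyword on each server line enables health checking for that server, so a server that fails its checks is taken out of rotation until it recovers.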
HAProxy can be deployed on various operating systems and is often used in high-traffic web environments, cloud infrastructure, and containerized deployments. Its versatility and extensive feature set make it a powerful open source tool for managing and optimizing application traffic.
For this recipe, we will install HAProxy on one system, which acts as the load balancer, and balance traffic across two identical web servers:
Figure 6.1 – HAProxy example diagram
Getting ready
To get started, we first need three servers. For this exercise, we will call them lb1, web1, and web2. They are identical systems, each with 8 GB RAM, 4 vCPUs, and 100 GB of drive space. The filesystems have 50 GB in /, 5 GB in /home, and 8 GB in swap. The remaining disk space is unallocated. You will also need the IP address for each host. In this example, the following IP addresses were used:
Host | IP address |
web1 | 192.168.56.200 |
web2 | 192.168.56.201 |
lb1 | 192.168.56.202 |
Table 6.3 – HAProxy IP addresses
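If these hostnames are not resolvable through DNS in your environment, one option is to add them to /etc/hosts on each of the three systems. This is an assumption of this sketch and not a step the recipe requires. The addresses are the ones from the table above.

```
# /etc/hosts entries (assumed; only needed if DNS does not resolve these names)
192.168.56.200  web1
192.168.56.201  web2
192.168.56.202  lb1
```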
Once the servers are built, patch each one to the current software level by running the following command with root privileges:
dnf update -y
Once the software is patched, reboot the systems.