Abstract:
Load balancer performance is increasingly essential for distributing application traffic
efficiently across multiple instances while sustaining a large number of concurrent users
and preserving application reliability. Many factors influence the performance of a
load balancer; this study investigates in detail the impact of the server architecture's
concurrency model on performance. It examines how three server architectures
(thread-per-connection, reactor and disruptor) can be used to build load balancers, and
analyses the strengths and weaknesses of their concurrency models under a heavy concurrent
workload. Two reference implementations of thread-per-connection,
with and without a thread pool, were created to understand the impact of a thread pool. The
reactor architecture was implemented using the HTTPCore-NIO library, and the disruptor
architecture was developed using the Netty transport and the LMAX Java library.
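The difference between the two thread-per-connection variants can be illustrated with a minimal Java sketch. It is not the thesis's reference implementation: the class name ThreadPerConnectionSketch, the Dispatcher interface and the placeholder handle method are hypothetical names introduced here, and the handler only writes a fixed response instead of forwarding the request to a backend.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ExecutorService;

public class ThreadPerConnectionSketch {

    /** Hypothetical strategy for running the blocking handler of one connection. */
    interface Dispatcher {
        void dispatch(Runnable connectionHandler);
    }

    /** Variant without a pool: every connection gets a brand-new thread. */
    static Dispatcher unpooled() {
        return handler -> new Thread(handler).start();
    }

    /** Variant with a pool: connections are queued onto reusable worker threads. */
    static Dispatcher pooled(ExecutorService pool) {
        return pool::execute;
    }

    /** Blocking accept loop; each accepted socket is handed to the dispatcher. */
    static void serve(ServerSocket server, Dispatcher dispatcher) throws IOException {
        while (!server.isClosed()) {
            Socket client = server.accept(); // blocks until a client connects
            dispatcher.dispatch(() -> handle(client));
        }
    }

    /** Placeholder handler (assumption): a real load balancer would proxy to a backend. */
    static void handle(Socket client) {
        try (Socket c = client) {
            c.getOutputStream()
             .write("HTTP/1.1 204 No Content\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        } catch (IOException ignored) {
            // The client went away; nothing to clean up in this sketch.
        }
    }
}
```

With pooled(Executors.newFixedThreadPool(n)), the number of threads is bounded by n regardless of how many connections arrive, whereas unpooled() creates one thread per connection, so its thread count grows with the concurrent load.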
In addition, each implementation was extensively tuned, and the performance of the
best-performing load balancer in each server architecture was compared. Each chosen load
balancer architecture has a distinct set of properties that control its performance; therefore,
tuning each implementation was treated as a separate effort. Extensive experiments were
conducted with the benchmarking tool JMeter, and response time, throughput, CPU and memory
usage were measured to analyse the impact of server architecture on performance. This
study produced a comprehensive survey of several concurrency models of load balancer
architectures, an experimental illustration and a detailed analysis of load balancers, in terms
of performance, under high concurrent load. The results show that, with proper tuning, the peak
throughput of each load balancer increases by more than 15% compared to its baseline
configuration.
The reactor-based architecture, configured with 4 threads per reactor and 512 worker threads,
produces a peak throughput of 4362 requests per second, which is the highest throughput
produced in the experiments. In terms of throughput, the peak throughput of the reactor-based
architecture is 19% higher than the maximum throughput generated by the disruptor-based
architecture. The average response time increases exponentially for all load balancer architectures
but at different rates. The reactor-based architecture produces the best response time at every
concurrency level used in the experiments. A long-tail problem is clearly visible in the
thread-per-connection architecture under high concurrency, where the response time distribution
has a long, thick tail. The key factors affecting the performance of these architectures
are the non-blocking nature of I/O operations, CPU usage, the handling of contention for shared
resources, memory footprint and support for a high number of concurrent connections.
Citation:
Thiruchittampalam, R. (2021). Comparing the performance of concurrency models of load balancer architectures [Master's thesis, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.uom.lk/handle/123/20762