Abstract:
With the introduction of Internet of Things (IoT), scalable Complex Event
Processing (CEP) and stream processing on memory, CPU, and bandwidth constraint
infrastructure have become essential. While several related work focuses on
replication of CEP engines to enhance scalability, they do not provide expected
performance while scaling stateful queries for event streams that do not have predefined
partitions. Most of the CEP systems provide scalability for stateless queries or
for the stateful queries where the event streams can be partitioned based on one or
more event attributes. These systems can only scale up to the pre-defined number of
partitions, limiting the number of events they can process. Meanwhile, some CEP
systems do not support cloud-native and microservices features such as startup time in
milliseconds.
In this research, we address the scalability of CEP systems for stateful
operators such as windows, joins, and pattern by scaling data processing nodes and
connecting them as a directed acyclic graph. This enabled us to scale the processing
and working memory using the scatter and gather based approach. We tested the
proposed technique by implementing it using a set of Siddhi CEP engines running on
Docker containers managed by Kubernetes container orchestration system. The tests
were carried out for a fixed data rate, on uniform capacity nodes, to understand the
processing capacity of the deployment. As we scale the nodes, for all cases, the
proposed system was able to scale almost linearly while producing zero errors for
patterns, 0.1% for windows, and 6.6% for joins, respectively. By reordering events the
error rate of window and join queries was reduced to 0.03% and 1% while introducing
54ms and 260ms of delays, respectively.
Citation:
Suhothayan, S. (2019). Scatter-gather based approach in scaling complex event processing systems for stateful operators [Master’s theses, University of Moratuwa]. Institutional Repository University of Moratuwa. http://dl.lib.mrt.ac.lk/handle/123/16177