Abstract:
In the last few years, we have seen significant interest in data processing technologies. Although initial interest focused on batch processing technologies such as MapReduce, the need for more responsive technologies such as stream processing and complex event processing (CEP) has since become clear. Complex event processing has historically been used within a single node or a cluster of tightly interconnected nodes. However, to be effective with Big Data use cases, CEP technologies need to scale to handle much larger workloads. This paper presents Wihidum, which offers several approaches to scaling complex event processing by distributing it across multiple nodes. It discusses how to balance the workload among nodes efficiently, how complex event processing queries can be broken into simpler sub-queries, and how queries can be deployed efficiently across the cluster. The paper focuses on three techniques for scaling queries: pipelining, partitioning, and distributed operators. It then discusses in detail the distribution of several CEP operators: filters, joins, pattern matching, and partitions. Empirical results show that the techniques used in Wihidum improve the performance of the CEP solution.
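To make the partitioning idea concrete, here is a minimal sketch (not from the paper) of the generic pattern the abstract alludes to: events are hash-partitioned by a key attribute so that each node sees a disjoint slice of the stream and can evaluate its filter sub-query independently. All names here (StockEvent, FilterWorker, the "price > 100" condition) are hypothetical illustrations, not Wihidum's actual API.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative sketch of hash-based stream partitioning for distributed CEP.
 * Events are routed to a worker by hashing a partition key, so each worker
 * runs its filter sub-query over a disjoint slice of the stream.
 */
public class PartitionSketch {

    // Hypothetical event type; requires Java 16+ for records.
    record StockEvent(String symbol, double price) {}

    /** A hypothetical worker standing in for one node's filter sub-query. */
    static class FilterWorker {
        final int id;
        final List<StockEvent> matched = new ArrayList<>();

        FilterWorker(int id) { this.id = id; }

        // The filter condition each node evaluates locally, e.g. a
        // "price > 100" predicate from a larger decomposed query.
        void process(StockEvent e) {
            if (e.price() > 100.0) {
                matched.add(e);
            }
        }
    }

    public static void main(String[] args) {
        int nodes = 3;
        List<FilterWorker> workers = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            workers.add(new FilterWorker(i));
        }

        List<StockEvent> stream = List.of(
                new StockEvent("IBM", 120.0),
                new StockEvent("MSFT", 95.0),
                new StockEvent("IBM", 130.0),
                new StockEvent("GOOG", 210.0));

        // Route each event by hashing its partition key (the symbol),
        // so all events for a given symbol reach the same worker.
        for (StockEvent e : stream) {
            int target = Math.floorMod(e.symbol().hashCode(), nodes);
            workers.get(target).process(e);
        }

        for (FilterWorker w : workers) {
            System.out.println("worker " + w.id + " matched " + w.matched);
        }
    }
}
```

Keying the partition on a stream attribute keeps stateful per-key operators correct, since all events for one key land on the same node; the trade-off is potential skew if keys are unevenly distributed, which is why the paper's workload-balancing discussion matters.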
Citation:
Jayasekara, S., Kannangara, S., Dahanayakage, T., Ranawaka, I., Perera, S., & Nanayakkara, V. (2015). Wihidum: Distributed complex event processing. Journal of Parallel and Distributed Computing, 79–80, 42–51. https://doi.org/10.1016/j.jpdc.2015.03.002