Evaluating Micro-batch and Data Frequency for Stream Processing Applications on Multi-cores

Adriano Marques Garcia (first author)
2022-01-01

Abstract

In stream processing, data arrives constantly and is often unpredictable. It can show large fluctuations in arrival frequency, size, complexity, and other factors. These fluctuations can strongly impact application latency and throughput, which are critical factors in this domain. Therefore, there is a significant amount of research on self-adaptive techniques that use elasticity or micro-batching to mitigate this impact. However, there is a lack of benchmarks and tools to help researchers investigate the implications of micro-batching and data stream frequency. In this paper, we extend a benchmarking framework to support dynamic micro-batching and data stream frequency management. We use it to create custom benchmarks and compare the latency and throughput of two different parallel libraries. We validate our solution through an extensive analysis of the impact of micro-batching and data stream frequency on stream processing applications using Intel TBB and FastFlow, two libraries that leverage stream parallelism on multi-core architectures. Our results show up to a 33% throughput gain over latency when using micro-batches. Additionally, while TBB ensures lower latency, FastFlow ensures higher throughput in the parallel applications for different data stream frequency configurations.
Year: 2022
Conference: 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
Location: Valladolid, Spain
Proceedings: Proceedings of the 30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2022
Publisher: IEEE
Pages: 10-17
ISBN: 978-1-6654-6958-6
URL: https://www.computer.org/csdl/proceedings/pdp/2022/1CFRSRKyZbO
Keywords: Benchmark; FastFlow; Parallel programming; Performance analysis; Stream Parallelism; TBB
Authors: Adriano Marques Garcia; Dalvan Griebler; Claudio Schepke; Luiz Gustavo L. Fernandes
Files in this item:
Evaluating_Micro-batch_and_Data_Frequency_for_Stream_Processing_Applications_on_Multi-cores.pdf

Restricted access

File type: Publisher's PDF
Size: 1.07 MB
Format: Adobe PDF

PDP_2022__SPBench_with_Batch_and_Data_Frequency_.pdf

Open access

File type: Preprint (first draft)
Size: 406.1 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1949990
Citations
  • PMC: not available
  • Scopus: 4
  • ISI: 3