A Comparison of Big Data Frameworks on a Layered Dataflow Model

Misale, Claudia; Drocco, Maurizio; Aldinucci, Marco
2017-01-01

Abstract

In Big Data analytics, a range of tools aims to simplify the programming of applications that run on clusters. Each tool claims to provide better programming, data, and execution models, for which generally only informal (and often confusing) semantics are provided, yet all share a common underlying model: the DataFlow model. The model we propose shows how the various tools share the same expressiveness at different levels of abstraction. The contribution of this work is twofold. First, we show that the proposed model is at least as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm), which makes it easier to understand high-level data-processing applications written in those frameworks. Second, we provide a layered model that can represent tools and applications following the DataFlow paradigm, and we show how the analyzed tools fit into each level.
2017
1
20
http://www.worldscientific.com/doi/abs/10.1142/S0129626417400035
Misale, Claudia; Drocco, Maurizio; Aldinucci, Marco; Tremblay, Guy
Files in this record:

preprintPPL_4aperto.pdf
Access: open
Description: main article, preprint
File type: PREPRINT (FIRST DRAFT)
Size: 722.39 kB
Format: Adobe PDF

published.pdf
Access: restricted
Description: main article, publisher's version
File type: PUBLISHER'S PDF
Size: 548.43 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1626287
Citations
  • PubMed Central: not available
  • Scopus: 16
  • Web of Science: 9