PiCo: a Novel Approach to Stream Data Analytics

Claudia Misale;Maurizio Drocco;Guy Tremblay;Marco Aldinucci
2018-01-01

Abstract

In this paper, we present PiCo (Pipeline Composition), a new C++ API with a fluent interface. PiCo's programming model aims to make data analytics applications easier to program while preserving or enhancing their performance. This is attained through three key design choices: 1) unifying batch and stream data access models, 2) decoupling processing from data layout, and 3) exploiting a stream-oriented, scalable, efficient C++11 runtime system. PiCo proposes a programming model based on pipelines and operators that are polymorphic with respect to data types, in the sense that the same algorithms and pipelines can be reused on different data models (e.g., streams, lists, sets). Preliminary results show that PiCo can achieve shorter execution times and substantially better memory utilization than Spark and Flink in both batch and stream processing.
2018
English
contribution
4 - Workshop
Euro-Par Workshops: 1st Intl. Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP)
Santiago de Compostela
29/08/2017
International
Proc. of Euro-Par Workshops: 1st Intl. Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP)
Anonymous reviewers
Springer
New York
UNITED STATES OF AMERICA
10659
1
12
12
978-3-319-75178-8
978-3-319-75177-1
UNITED STATES OF AMERICA
CANADA
   Rephrase
   H2020
1 – product with file in Open Access version (I will attach the file at step 6 - Upload)
4
info:eu-repo/semantics/conferenceObject
04-CONTRIBUTION IN CONFERENCE PROCEEDINGS::04A-Conference paper in volume
Claudia, Misale; Maurizio, Drocco; Guy, Tremblay; Marco, Aldinucci
273
partially_open
Files in this item:
File Size Format
autodasp.pdf

Open access

File type: POSTPRINT (AUTHOR'S FINAL VERSION)
Size 395.84 kB
Format Adobe PDF
2018_pico_autodasp.pdf

Restricted access

Description: publisher's PDF
File type: PUBLISHER'S PDF
Size 212.81 kB
Format Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1659344
Citations
  • PMC ND
  • Scopus 5
  • ISI 2