CAPIO: a Middleware for Transparent I/O Streaming in Data-Intensive Workflows

Alberto Riccardo Martinelli (first author); Marco Aldinucci; Iacopo Colonnelli; Barbara Cantalupo
2023-01-01

Abstract

With the increasing amount of digital data available for analysis and simulation, the class of I/O-intensive HPC workflows is fated to quickly expand, further exacerbating the performance gap between computing, memory, and storage technologies. This paper introduces CAPIO (Cross-Application Programmable I/O), a middleware capable of injecting I/O streaming capabilities into file-based workflows, improving the computation-I/O overlap without the need to change the application code. The contribution is twofold: 1) at design time, a new I/O coordination language allows users to annotate workflow data dependencies with synchronization semantics; 2) at run time, a user-space middleware automatically and transparently to the user turns a workflow batch execution into a streaming execution according to the semantics expressed in the configuration file. CAPIO has been tested on synthetic benchmarks simulating typical workflow I/O patterns and two real-world workflows. Experiments show that CAPIO reduces the execution time by 10% to 66% for data-intensive workflows that use the file system as a communication medium.
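The computation-I/O overlap described in the abstract can be illustrated with a minimal, generic sketch. This is not CAPIO's API or its coordination language: the function names, the file name `stage1.out`, and the use of a `threading.Event` as a "file is complete" signal are all assumptions made for illustration. The sketch only shows the general idea of a consumer reading a file while the producer is still writing it, instead of waiting for the producer to terminate as in a batch execution.

```python
import os
import tempfile
import threading
import time


def producer(path, n_records, done):
    """Append records one at a time, flushing so that readers can see them."""
    with open(path, "a") as f:
        for i in range(n_records):
            f.write(f"record {i}\n")
            f.flush()
            time.sleep(0.001)  # stand-in for computation between writes
    done.set()  # hypothetical "file is complete" signal (an assumption)


def streaming_consumer(path, done):
    """Collect complete lines as they appear, without waiting for the producer to exit."""
    records = []
    buffer = ""
    with open(path, "r") as f:
        while True:
            finished = done.is_set()  # sample the signal before reading
            chunk = f.read()
            if chunk:
                buffer += chunk
                # Keep any trailing partial line in the buffer.
                *lines, buffer = buffer.split("\n")
                records.extend(lines)
            elif finished:
                break  # producer finished and nothing is left to read
            else:
                time.sleep(0.001)  # no new data yet, poll again
    return records


def run_demo(n_records=50):
    """Run producer and consumer concurrently on the same temporary file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stage1.out")  # hypothetical file name
        open(path, "w").close()  # create the file so the consumer can open it
        done = threading.Event()
        result = []
        reader = threading.Thread(
            target=lambda: result.extend(streaming_consumer(path, done))
        )
        writer = threading.Thread(target=producer, args=(path, n_records, done))
        reader.start()
        writer.start()
        writer.join()
        reader.join()
    return result


if __name__ == "__main__":
    print(len(run_demo()), "records consumed while the producer was running")
```

In this sketch the synchronization rule is hard-coded in the consumer. According to the abstract, CAPIO instead lets users declare such semantics in a configuration file and applies them transparently, without changes to the application code.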
2023
English
contribution
1 - Conference
International Conference on High Performance Computing
Goa, India
18-21 December 2023
International
30th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC)
Scientific committee
IEEE
Piscataway, NJ
UNITED STATES OF AMERICA
153
163
11
979-8-3503-8322-5
Workflow, In situ model, I/O coordination
no
   Third Party CINI - "ADMIRE - Adaptive multi-tier intelligent data manager for Exascale" - Call H2020-JTI-EuroHPC-2019-1 - Grant Agreement n. 956748 - CUP F69J21003450007
   ADMIRE
   EUROPEAN COMMISSION
   H2020
   ALDINUCCI M.-H2020 RIA - G.A. 956748

   Third Party CINI - "EUPEX - EUROPEAN PILOT FOR EXASCALE" (H2020-JTI-EuroHPC-2020-1)
   EUPEX
   EUROPEAN COMMISSION
   H2020
   ALDINUCCI M. - H2020 RIA G.A. n. 101033975

   Future HPC & Big Data - funded by PNRR funds, MUR - M4C2 - Investment 1.4 - Call "Centri Nazionali" - D.D. n. 3138 of 16/12/2021, amended by D.D. n. 3175 of 18/12/2021, MUR code CN00000013, CUP D13C22001340001
   CN-HPC
   Ministero dell'Università e della Ricerca
   ALDINUCCI M.- CN-HPC
1 - product with file in Open Access version (I will attach the file at step 6 - Upload)
5
info:eu-repo/semantics/conferenceObject
04-CONTRIBUTION IN CONFERENCE PROCEEDINGS::04A-Conference paper in volume
Alberto Riccardo Martinelli, Massimo Torquati, Marco Aldinucci, Iacopo Colonnelli, Barbara Cantalupo
273
partially_open
Files in this record:

CAPIO.pdf
Access: restricted
Description: Publisher's PDF
File type: Publisher's PDF
Size: 731.41 kB
Format: Adobe PDF

CAPIO-HiPC23-preprint.pdf
Access: open
File type: Preprint (first draft)
Size: 1.04 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1948632
Citations
  • PMC: N/A
  • Scopus: 6
  • ISI: 4