Overcoming Dynamic I/O Boundaries: a Double-Sided Streaming Methodology with dispel4py and CAPIO
Santimaria M. E.;Medic D.;Colonnelli I.;Aldinucci M.
2025-01-01
Abstract
This work introduces a novel double-sided streaming methodology that combines control-plane and data-plane streaming. Our goal is to implement the long-advocated separation of concerns in workflow orchestration without introducing artificial boundaries in workflow execution. Our approach is exemplified by the integration of control-plane streaming provided by dispel4py and the transparent data-plane streaming provided by CAPIO. Our integration eliminates file synchronization barriers without requiring modifications to existing workflow logic. To support this, we extend CAPIO with a new commit rule that allows streaming over dynamically generated file sets, enabling hybrid workflows that blend in-memory dataflows with file-based communication. We validate our approach using a real-world seismic cross-correlation workflow, achieving performance improvements between 23% and 40%. Unlike previous solutions, our method supports streaming across the entire workflow, including phase boundaries where file I/O would typically enforce strict execution ordering. Therefore, our approach can be straightforwardly extended to other multi-stage streaming applications.

| File | Description | Access | Size | Format |
|---|---|---|---|---|
| 3731599.3767577.pdf | Publisher's PDF | Open access | 1.18 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.



