Coordinated multi-robot systems are an effective way to harvest data from sensor networks and to implement active perception strategies. However, achieving efficient coordination in a way which guarantees a target QoS while adapting dynamically to changes (in the environment, due to sensors' mobility, and/or in the value of harvested data) is to date a key open issue. In this paper, we propose a novel decentralized Monte Carlo Tree Search algorithm (MCTS) which allows agents to optimize their own actions while achieving some form of coordination, in a changing environment. Its key underlying idea is to balance in an adaptive manner the exploration-exploitation trade-off to deal effectively with abrupt changes caused by the environment and random changes caused by other agents' actions. Critically, outdated and irrelevant samples - an inherent and prevalent feature in all multi-agent MCTS-based algorithms - are filtered out by means of a sliding window mechanism. We show both theoretically and through simulations that our algorithm provides a log-factor (in terms of time steps) smaller regret than state-of-the-art decentralized multi-agent planning methods. We instantiate our approach on the problem of underwater data collection, showing on a set of different models for changes that our approach greatly outperforms the best available algorithms for that setting, both in terms of convergence speed and of global utility.
Multi-Agent Data Collection in Non-Stationary Environments
Rizzo G.;
2022-01-01
Abstract
Coordinated multi-robot systems are an effective way to harvest data from sensor networks and to implement active perception strategies. However, achieving efficient coordination in a way which guarantees a target QoS while adapting dynamically to changes (in the environment, due to sensors' mobility, and/or in the value of harvested data) is to date a key open issue. In this paper, we propose a novel decentralized Monte Carlo Tree Search algorithm (MCTS) which allows agents to optimize their own actions while achieving some form of coordination, in a changing environment. Its key underlying idea is to balance in an adaptive manner the exploration-exploitation trade-off to deal effectively with abrupt changes caused by the environment and random changes caused by other agents' actions. Critically, outdated and irrelevant samples - an inherent and prevalent feature in all multi-agent MCTS-based algorithms - are filtered out by means of a sliding window mechanism. We show both theoretically and through simulations that our algorithm provides a log-factor (in terms of time steps) smaller regret than state-of-the-art decentralized multi-agent planning methods. We instantiate our approach on the problem of underwater data collection, showing on a set of different models for changes that our approach greatly outperforms the best available algorithms for that setting, both in terms of convergence speed and of global utility.| File | Dimensione | Formato | |
|---|---|---|---|
|
SW_MCTS_paper (6).pdf
Accesso aperto
Tipo di file:
POSTPRINT (VERSIONE FINALE DELL’AUTORE)
Dimensione
1.9 MB
Formato
Adobe PDF
|
1.9 MB | Adobe PDF | Visualizza/Apri |
|
Multi-Agent_Data_Collection_in_Non-Stationary_Environments.pdf
Accesso riservato
Tipo di file:
PDF EDITORIALE
Dimensione
1.22 MB
Formato
Adobe PDF
|
1.22 MB | Adobe PDF | Visualizza/Apri Richiedi una copia |
I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.



