Comparison between different approaches for the creation of the training set: how clustering and dimensionality impact the performance of a Deep Learning model
Panic J.;Defeudis A.;Regge D.;Giannini V.
2023-01-01
Abstract
In recent years, deep learning (DL) has become an active research topic in medical image analysis, in particular for the automatic segmentation of pathological volumes. To develop a robust and generalizable system, it is crucial to define the most suitable training set for both the model and the aim. Nevertheless, there are still no common guidelines specifying the most appropriate sampling strategy and size (dimensionality) of the set. The aim of this study is to assess how different sampling methods (e.g., stratified and random) and different training set sizes affect the performance of DL models for automatic segmentation. All DL algorithms were based on a U-Net structure and were trained using a real-world multi-center, multi-scanner pelvic MRI database. Performance was evaluated and compared using the Dice Similarity Coefficient between manual and automatic masks and the number of false negatives obtained by the different algorithms. Our results suggest that, if the training set is sufficiently large, a stratified approach based on dendrograms does not strongly affect the performance of the networks; otherwise, stratification leads to higher performance. Further analysis is needed using different stratification methods and sample sizes.
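The abstract names two ingredients that a short sketch can make concrete: the Dice Similarity Coefficient used for evaluation, and the contrast between stratified and random selection of training cases. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names (`dice_coefficient`, `stratified_sample`, `random_sample`) are hypothetical, masks are assumed to be flattened binary sequences, cluster labels are assumed to have been obtained beforehand (for instance by cutting a dendrogram at some level), and the proportional allocation per cluster is an assumption the abstract does not specify.

```python
import random
from collections import defaultdict


def dice_coefficient(manual, automatic):
    """Dice Similarity Coefficient between two flat binary masks (0/1 values)."""
    if len(manual) != len(automatic):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for m, a in zip(manual, automatic) if m and a)
    total = sum(1 for m in manual if m) + sum(1 for a in automatic if a)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def stratified_sample(cluster_of, n_train, seed=0):
    """Pick roughly n_train case IDs, proportionally from each cluster.

    cluster_of maps a case ID to its cluster label (hypothetical input,
    e.g. labels produced by cutting a dendrogram).
    """
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for case_id, label in cluster_of.items():
        clusters[label].append(case_id)
    fraction = n_train / len(cluster_of)
    sample = []
    for label in sorted(clusters):
        members = sorted(clusters[label])
        # Keep at least one case per cluster so small clusters are represented.
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, min(k, len(members))))
    return sample


def random_sample(cluster_of, n_train, seed=0):
    """Pick n_train case IDs uniformly at random, ignoring cluster labels."""
    rng = random.Random(seed)
    return rng.sample(sorted(cluster_of), n_train)
```

Because of rounding and the one-case-per-cluster floor, `stratified_sample` can return slightly more or fewer than `n_train` cases; a real pipeline would need to decide how to reconcile that.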
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.