Comparing agreement indices to assess inter-observer reliability in the case of dichotomous and trichotomous animal-based welfare indicators with three raters
Benedetta Torsiello; Luca Battaglini; Manuela Renna
2026-01-01
Abstract
This study evaluates inter-observer reliability (IOR) among three raters for dichotomous and trichotomous individual animal-based welfare indicators. The performance of the most documented agreement indices in the literature was compared, using udder asymmetry (UA) as the dichotomous indicator and body condition score (BCS) as the trichotomous indicator, both taken from the AWIN Goat protocol. Data were collected on nine dairy goat farms using three alpine pastures (AP1 to AP3). Krippendorff’s α, the agreement indices of the Kappa family and their weighted forms were in some cases affected by paradoxical behaviour, that is, a low index value despite a high observed agreement (P0). This phenomenon was observed for both UA and BCS [e.g., P0(BCS-AP2) = 80%; Fleiss’ K = 0.22]. For UA, Gwet’s γ(AC1), followed by the BP coefficient and Quatto’s S, gave the best agreement results [e.g., P0(UA-AP1) = 86%; γ(AC1) = 0.84]. For BCS, the best agreement results were obtained with Gwet’s γ(AC2), followed by the weighted forms of BP and S. When the evaluation is performed by three raters, γ(AC1), BP and S are suggested for evaluating IOR with both dichotomous and trichotomous indicators, while the related weighted forms are suitable for trichotomous indicators only.
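The paradox described in the abstract can be illustrated with a minimal sketch that computes Fleiss' K, Gwet's γ(AC1) and the BP coefficient for three raters scoring a dichotomous indicator. The data are hypothetical (they are not the study's UA or BCS scores), the function name `agreement_indices` is an illustrative assumption, and observed agreement is taken here as the share of agreeing rater pairs, which may differ from the P0 definition used in the paper. The chance-agreement terms follow the standard textbook formulas for each index.

```python
from typing import Dict, List


def agreement_indices(counts: List[List[int]]) -> Dict[str, float]:
    """counts[i][k] = number of raters who assigned animal i to category k."""
    n = len(counts)      # number of animals
    q = len(counts[0])   # number of categories
    r = sum(counts[0])   # raters per animal, assumed constant across animals

    # Observed agreement: share of agreeing rater pairs, averaged over animals.
    p_o = sum(
        sum(c * (c - 1) for c in row) / (r * (r - 1)) for row in counts
    ) / n

    # Overall proportion of ratings falling in each category.
    pi = [sum(row[k] for row in counts) / (n * r) for k in range(q)]

    # Chance agreement under each index's own assumption.
    chance = {
        "fleiss_kappa": sum(p * p for p in pi),
        "gwet_ac1": sum(p * (1 - p) for p in pi) / (q - 1),
        "brennan_prediger": 1 / q,
    }

    result = {"p_o": p_o}
    for name, p_e in chance.items():
        result[name] = (p_o - p_e) / (1 - p_e)
    return result


# Hypothetical data: 20 animals, 3 raters, categories (absent, present).
# 17 animals are scored "absent" by all raters; 3 animals get a 2-vs-1 split.
ratings = [[3, 0]] * 17 + [[2, 1]] * 3
indices = agreement_indices(ratings)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With this strongly skewed category distribution, observed agreement is 0.90, yet Fleiss' K comes out slightly negative, whereas γ(AC1) is about 0.89 and BP is 0.80. The chance-agreement term of Fleiss' K approaches the observed agreement when one category dominates, which is what drives the paradox.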



