Regression Models and Cluster Detection. An Application to Anthropometric Measurements

ISAIA, Ennio Davide;DURIO, Alessandra
2007-01-01

Abstract

The purpose of this paper is to investigate the use of the Integrated Square Error, or L2 distance, as a practical estimation tool for building useful regression models. Exploiting its robustness properties, we show how it can be particularly helpful in the study of large data sets, where regression models based on M-estimators are likely to be unstable because of a substantial number of outliers or clustered data. We propose a regression analysis technique that compares the results of L2 estimates with those obtained from some common M-estimators. The discrepancy between the estimated regression models is measured by means of a new concept of similarity between functions, and a system of statistical hypotheses, based on a Monte Carlo significance test, is introduced to verify the similarity of the estimates. The theory is outlined, and a case study based on the Health Professionals Follow-Up Study (Harvard School of Public Health), on estimating waist circumference as a predictor of type 2 diabetes risk, is presented.
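
To illustrate the kind of L2 estimate the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian error model, for which the L2 criterion commonly takes the form 1/(2*sqrt(pi)*sigma) - (2/n) * sum of phi(r_i | 0, sigma^2), with r_i the regression residuals. The function names, the synthetic data, and the coarse grid search (used here in place of a numerical optimizer) are all assumptions made for illustration.

```python
import math


def l2e_criterion(residuals, sigma):
    """L2 criterion for residuals under an assumed N(0, sigma^2) error model."""
    n = len(residuals)
    # Integral of the squared N(0, sigma^2) density.
    integral_term = 1.0 / (2.0 * math.sqrt(math.pi) * sigma)
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    density_sum = sum(norm * math.exp(-0.5 * (r / sigma) ** 2) for r in residuals)
    return integral_term - 2.0 * density_sum / n


def ols_line(xs, ys):
    """Ordinary least squares intercept and slope, for comparison."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def l2e_line(xs, ys, intercepts, slopes, sigmas):
    """Grid search for the (intercept, slope, sigma) minimizing the L2 criterion."""
    best = None
    for a in intercepts:
        for b in slopes:
            residuals = [y - (a + b * x) for x, y in zip(xs, ys)]
            for s in sigmas:
                value = l2e_criterion(residuals, s)
                if best is None or value < best[0]:
                    best = (value, a, b, s)
    return best[1], best[2], best[3]


# Hypothetical data: 20 points close to y = 1 + 2x, plus a cluster of 6 outliers.
xs = list(range(20))
ys = [1.0 + 2.0 * x + (0.1 if x % 2 == 0 else -0.1) for x in xs]
xs += [8, 8, 8, 9, 9, 9]
ys += [40.0] * 6

intercepts = [0.25 * i for i in range(33)]   # 0.0, 0.25, ..., 8.0
slopes = [1.0 + 0.25 * i for i in range(9)]  # 1.0, 1.25, ..., 3.0
sigmas = [0.125, 0.25, 0.5, 1.0, 2.0]

ols_a, ols_b = ols_line(xs, ys)
a_hat, b_hat, sigma_hat = l2e_line(xs, ys, intercepts, slopes, sigmas)

print("OLS fit:", ols_a, ols_b)
print("L2 fit: ", a_hat, b_hat, sigma_hat)
```

On this synthetic example the least squares intercept is pulled well above 1 by the outlier cluster, while the L2 grid search selects the line followed by the majority of the points. A large gap between the two fits is the kind of discrepancy that, in the approach described in the abstract, would then be assessed formally.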
2007
International Symposium on Business & Industrial Statistics
Ponta Delgada (Azores), Portugal
18-20 August, 2007
Proceedings of the International Symposium on Business & Industrial Statistics
ISBIS07
9789899548909
M-estimators; Minimum Integrated Square Error; Monte Carlo Significance Test; Robust Regression
E. ISAIA; A. DURIO
Use this identifier to cite or link to this document: https://hdl.handle.net/2318/35219