Use of partial least squares regression to predict single nucleotide polymorphism marker genotypes when some animals are genotyped with a low-density panel

GASPA, Giustino;
2011-01-01

Abstract

High-density single nucleotide polymorphism (SNP) platforms are currently used in genomic selection (GS) programs to enhance the selection response. However, genotyping a large number of animals with high-throughput platforms is expensive and may constrain large-scale implementation of GS. The use of low-density marker (LDM) platforms could overcome this problem, but different SNP chips may be required for each trait and/or breed. In this study, an imputation strategy that is independent of trait and breed is proposed. A simulated population of 5865 individuals with a genome of 6000 SNP equally distributed on six chromosomes was considered. First, reference and prediction populations were generated by mimicking high- and low-density SNP platforms, respectively. Then, the partial least squares regression (PLSR) technique was applied to reconstruct the missing SNP in the low-density chip. The proportion of SNP correctly reconstructed by the PLSR method ranged from 0.78 to 0.97 when 90% and 50% of genotypes, respectively, were predicted. Moreover, data sets consisting of a mixture of actual and PLSR-predicted SNP or of actual SNP only were used to predict genomic breeding values (GEBVs). Correlations between GEBV and true breeding values were 0.74 and 0.76, respectively. The results of the study indicate that the PLSR technique can be considered a reliable computational strategy for predicting SNP genotypes in an LDM platform with reasonable accuracy.
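The imputation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes genotypes coded as 0, 1, 2 (copies of one allele), fits a single-response PLS model (NIPALS PLS1) per missing SNP on the reference population, and rounds the continuous prediction to the nearest genotype code. The coding, the rounding rule, the one-SNP-at-a-time fitting, and all function names (`fit_pls1`, `predict_genotype`, `impute`, `proportion_correct`) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Model = Tuple[List[float], float, list]


def fit_pls1(X: Sequence[Sequence[float]], y: Sequence[float],
             n_components: int) -> Model:
    """Fit a single-response PLS regression with the NIPALS algorithm."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        if tt < 1e-12:
            break
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next latent component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps


def predict_genotype(model: Model, x: Sequence[float]) -> int:
    """Predict one SNP and round to a genotype code in {0, 1, 2} (assumed rule)."""
    x_mean, y_mean, comps = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return min(2, max(0, round(y_hat)))


def impute(reference: Sequence[Sequence[int]],
           low_density: Sequence[Sequence[int]],
           observed_idx: Sequence[int], missing_idx: Sequence[int],
           n_components: int = 2) -> List[List[int]]:
    """Rebuild full genotype rows for animals typed only at observed_idx."""
    X_ref = [[row[j] for j in observed_idx] for row in reference]
    models = {m: fit_pls1(X_ref, [row[m] for row in reference], n_components)
              for m in missing_idx}
    completed = []
    for row in low_density:
        full = dict(zip(observed_idx, row))
        for m in missing_idx:
            full[m] = predict_genotype(models[m], row)
        completed.append([full[j] for j in sorted(full)])
    return completed


def proportion_correct(predicted: Sequence[Sequence[int]],
                       true: Sequence[Sequence[int]]) -> float:
    """Share of genotype calls that match the true genotypes."""
    total = sum(len(r) for r in true)
    hits = sum(a == b for pr, tr in zip(predicted, true) for a, b in zip(pr, tr))
    return hits / total


# Toy data (hypothetical): SNP 1 is in complete linkage with SNP 0.
reference = [[0, 0, 1], [1, 1, 0], [2, 2, 1], [1, 1, 2], [0, 0, 2], [2, 2, 0]]
low_density = [[0, 1], [2, 0], [1, 2]]          # only SNP 0 and SNP 2 are typed
imputed = impute(reference, low_density, observed_idx=[0, 2], missing_idx=[1])
print(imputed)
```

In practice the number of latent components would be chosen by cross-validation, and a multi-response PLSR that predicts all missing SNP jointly may be preferable to the per-SNP fitting used here for brevity.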
Year: 2011
Volume: 5
Issue: 6
Pages: 833-837
Keywords: genomic selection, SNP prediction, genotype imputation
Authors: DIMAURO, Corrado; STERI, R; PINTUS, M. A; GASPA, Giustino; MACCIOTTA, Nicolo' Pietro Paolo
Files in this record:
File: dimauro 2011.pdf
Access: open access
File type: postprint (author's final version)
Size: 536.94 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1686986