Genome Annotation

Sergio Lanteri;Lorenzo Barchi
2019-01-01

Abstract

Genome annotation identifies the coding and non-coding regions of a genome, such as exons and introns, regulatory elements and repeats, together with gene functions and locations. The newly developed eggplant genome sequence (see Chap. 7) was masked using RepeatMasker, combining homology-based and de novo approaches, and ~73% of the eggplant genome was found to consist of transposable elements (TEs). In total, 34,916 protein-coding genes were predicted, confirming that the diploid gene number in the Solanaceae is around 35,000, as previously reported for tomato (Solanum lycopersicum L.), potato (S. tuberosum L.) and pepper (Capsicum spp.). A total of 108,360 protein sequences from eggplant, pepper and potato were clustered into 22,337 gene families (excluding singletons) using OrthoMCL; 12,568 gene families (comprising 76,920 genes) were common to the four Solanaceae crops, while 674 clusters containing 1,999 genes were specific to eggplant. The high-quality eggplant genome sequence enables comparative genomic studies both within species, to find variation across individuals for genetic association and linkage analyses, and between species, for evolutionary studies. Furthermore, it provides a key resource for understanding Solanaceae biology and a key tool for future breeding programmes. The newly developed eggplant genome was also surveyed for single-locus SSR markers: nearly 133,000 perfect SSRs (a density of 125.5 SSRs/Mbp) and about 178,400 imperfect SSRs were identified. Using these data, a public dynamic microsatellite database was developed (www.eggplantmicrosatellite.org), which represents a one-stop resource for the global community of scientists and breeders.
2019
The Eggplant Genome
Springer Nature
Compendium of Plant Genomes
pp. 71–80
978-3-319-99207-5
https://doi.org/10.1007/978-3-319-99208-2_8; https://link.springer.com/chapter/10.1007/978-3-319-99208-2_8
Use this identifier to cite or link to this document: https://hdl.handle.net/2318/1718659