Quantifying Genetic Risk in Complex Diseases through the integration of Rare and Common Variants (2025 Dec 18).

Quantifying Genetic Risk in Complex Diseases through the integration of Rare and Common Variants

DEBERNARDI, CARLA
2025-12-18

Abstract

Complex diseases result from the interaction of numerous genetic and environmental determinants, yet the precise contribution of genetic variation remains only partly resolved. Common variants capture only a fraction of individual susceptibility, even when summarized in polygenic risk scores (PRSs), whereas rare variants may exert larger effects but are harder to model because of their low frequency. The main aim of this thesis is to examine how well genetic variation predicts complex disease risk, and how this predictive power varies with the specific pathology. We investigated three complex diseases: coronary artery disease (CAD), breast cancer (BC) and metabolic dysfunction-associated steatotic liver disease (MASLD).
CAD has a large polygenic component, which makes it a suitable candidate for PRS-based prevention. We evaluated the predictive power and transferability of published PRSs, in particular those derived in European-ancestry populations, in the Italian population by examining whether each PRS improves risk prediction beyond conventional risk factors. Using data from two projects, EPICOR and ATVB, we found that 100 of 266 PRSs could identify individuals at high risk of CAD, indicating that PRSs could enable early identification of high-risk individuals and hence early intervention (e.g., statins or aspirin). The results also suggested that a high PRS is associated with a risk of more severe disease. When evaluated across the genetically heterogeneous Italian population, PRS performance fluctuated between macro-areas.
BC is another disease with an important polygenic component. To address the uncertainty of PRSs in BC risk prediction, we built a new PRS with a novel algorithm that starts from previously published PRSs and identifies 4,606 variants consistently associated with BC, with the aim of improving robustness and transferability across ancestries. PRS4606 was evaluated in four large European biobanks (Genomics England, UK Biobank, FinnGen and HUNT) and in a newly enrolled UNITO cohort. This careful variant selection yielded comparable performance across the different ancestries represented in UK Biobank. The PRS was then integrated with rare variants in BC risk genes in the UNITO cohort to make combined risk assessment more feasible. Although BRCA1/2 variants played the major role in stratifying lifetime BC risk, PRS4606 improved risk estimation in non-carriers and in carriers of variants in moderate-risk BC genes such as ATM, CHEK2, BARD1, BRIP1, RAD51C and RAD51D.
Finally, MASLD is strongly influenced by lifestyle, and PRSs performed poorly in predicting its severity. To explore the potential of rare genetic variants underlying disease progression, we performed whole-exome sequencing in two cohorts of patients with MASLD from geographically distinct populations: one from the University of Turin (154 patients) and one from Columbia University (151 patients). This analysis highlighted two variants with prior metabolic links: rs4252128 in PLG (associated with lipoprotein levels) and rs2257401 in CYP3A7 (correlated with adiposity and BMI). Gene-level analyses revealed five signals associated with metabolic dysfunction-associated steatohepatitis (MASH) in both cohorts (FANCM, TMEM201, CALHM3, TSPO2 and EXTL1), with stronger evidence in the combined meta-analysis (p < 0.001). Moreover, enrichment analysis showed robust involvement of cilium-related pathways in MASH versus MASL in both cohorts (FDR < 10⁻⁵).
These findings illustrate how rare and common genetic variants can enhance disease risk prediction, ultimately improving patient stratification and giving clinicians more precise guidance when selecting preventive strategies or treatment pathways.
18-Dec-2025
37
BIOMEDICAL SCIENCES AND ONCOLOGY
MATULLO, Giuseppe
Files in this item:
Debernardi_phD_Thesis.pdf

Access: Open access

Description: Thesis
Size: 5.34 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2111830