Prediction and functional interpretation of inter-chromosomal genome architecture from DNA sequence with TwinC

Bertero, Alessandro
2024-01-01

Abstract

Three-dimensional nuclear DNA architecture comprises well-studied intra-chromosomal (cis) folding and less characterized inter-chromosomal (trans) interfaces. Current predictive models of 3D genome folding can effectively infer pairwise cis-chromatin interactions from the primary DNA sequence but generally ignore trans contacts. There is an unmet need for robust models of trans-genome organization that provide insights into their underlying principles and functional relevance. We present TwinC, an interpretable convolutional neural network model that reliably predicts trans contacts measurable through genome-wide chromatin conformation capture (Hi-C). TwinC uses a paired sequence design from replicate Hi-C experiments to learn single base pair relevance in trans interactions across two stretches of DNA. The method achieves high predictive accuracy (AUROC=0.80) on a cross-chromosomal test set from Hi-C experiments in heart tissue. Mechanistically, the neural network learns the importance of compartments, chromatin accessibility, clustered transcription factor binding and G-quadruplexes in forming trans contacts. In summary, TwinC models and interprets trans genome architecture, shedding light on this poorly understood aspect of gene regulation.
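As a reading aid for the accuracy figure quoted above, AUROC can be interpreted as the probability that a randomly chosen true trans contact receives a higher predicted score than a randomly chosen non-contact. The minimal Python sketch below computes it by direct pairwise comparison. The function name `auroc` and the toy labels and scores are illustrative assumptions, not data or code from the preprint.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy values only (assumed for illustration): 1 = trans contact, 0 = no contact.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.3, 0.4, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of contacts from non-contacts.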
2024
bioRxiv
Jha, Anupama; Hristov, Borislav; Wang, Xiao; Wang, Sheng; Greenleaf, William J.; Kundaje, Anshul; Aiden, Erez Lieberman; Bertero, Alessandro; Noble, W...
Files in this item:
File: Jha 2024 bioRxiv.pdf
Access: Open access
Description: Preprint
File type: Publisher's PDF
Size: 8.74 MB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/2318/2066150