Title: Statistical methods for profiling 3-dimensional chromatin interactions from repetitive regions of genomes
Authors: Ye Zheng - University of Wisconsin Madison (United States) [presenting]
Ferhat Ay - La Jolla Institute for Allergy and Immunology (United States)
Sunduz Keles - University of Wisconsin Madison (United States)
Abstract: Recently developed chromatin conformation capture-based assays have enabled the study of 3-dimensional chromosome architecture in a high-throughput fashion. Hi-C, in particular, has elucidated genome-wide long-range interactions among loci. Although the number of statistical analysis and inference methods for Hi-C data is growing rapidly, a key limitation of available approaches is their inability to accommodate reads that align to multiple locations, i.e. multi-mapping reads. This limitation hinders the comprehensive investigation of both intra-chromosomal and inter-chromosomal interactions involving repetitive regions. We developed mHi-C, a multi-mapping strategy for Hi-C data that integrates a hierarchical model to probabilistically allocate multi-mapping reads to their most likely genomic interaction positions. Application to published Hi-C data with varying sequencing depths demonstrates that a large fraction of novel significant contacts originates from heterochromatin regions of the genome, which are discarded in typical Hi-C pipelines due to their repetitive structure. Further analysis of these newly detected contacts for potential promoter-enhancer interactions highlights the importance of long-range contacts originating from duplicated segments. mHi-C is organized into a complete workflow, from read alignment to significant contact detection, and is implemented as a flexible and robust Python pipeline that allows each main step to be run independently.
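Illustrative sketch: the idea of probabilistically allocating multi-mapping reads can be shown with a toy example. The code below is not the mHi-C hierarchical model and not its API; it is a simplified, EM-style stand-in in which each multi-mapping read pair is split among its candidate bin pairs in proportion to the current contact weight of each candidate. The function name `allocate_multireads`, its parameters, and the bin labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def allocate_multireads(unique_counts, multi_candidates, n_iter=50, pseudocount=1e-6):
    """Toy EM-style allocation of multi-mapping read pairs (hypothetical helper).

    unique_counts: dict mapping a bin pair to its count of uniquely mapped read pairs.
    multi_candidates: list with one entry per multi-mapping read pair; each entry
        is the list of candidate bin pairs that read pair could come from.
    Returns one list of allocation probabilities per multi-mapping read pair,
    ordered like its candidate list.
    """
    # Start from the contact counts supported by uniquely mapped reads only.
    weights = defaultdict(float, unique_counts)
    posteriors = []
    for _ in range(n_iter):
        # E-step: split each multi-read among its candidates in proportion
        # to the current contact weights (pseudocount avoids division by zero).
        posteriors = []
        for cands in multi_candidates:
            raw = [weights[c] + pseudocount for c in cands]
            total = sum(raw)
            posteriors.append([r / total for r in raw])
        # M-step: contact weight = unique count + fractional multi-read support.
        new_weights = defaultdict(float, unique_counts)
        for cands, post in zip(multi_candidates, posteriors):
            for c, p in zip(cands, post):
                new_weights[c] += p
        weights = new_weights
    return posteriors


# One multi-mapping read pair with two candidate bin pairs (made-up labels).
probs = allocate_multireads(
    {("binA", "binB"): 9, ("binC", "binD"): 1},
    [[("binA", "binB"), ("binC", "binD")]],
)
print(probs)
```

In this toy setting, the candidate with more support from uniquely mapped reads receives the larger share of the multi-mapping read. The abstract describes a hierarchical model for this allocation, which a full method would use in place of the flat proportional rule assumed here.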