Figure - available from: Genome Biology
5C analysis performance. HiFive normalization of 5C data and their correlation to corresponding HiC data. (a) Correlation of 5C data (intra-regional only) with the same cell type and bin-coordinates in HiC data, normalized using HiFive’s probability algorithm for two different datasets and using each of HiFive’s algorithms. (b) Heatmaps for a select region from each dataset, un-normalized, normalized using HiFive’s probability algorithm, and the corresponding HiC data, normalized and dynamically binned.
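
The comparison in the caption (5C values against HiC values at the same bin coordinates) can be sketched as below. This is a minimal illustration and not HiFive's API or the authors' procedure: the function names, the dictionary layout keyed by bin-pair coordinates, the log transform, and the choice of Pearson correlation are all assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def matched_bin_correlation(fivec, hic):
    """Correlate two contact maps over the bin pairs they share.

    Both arguments are hypothetical dicts mapping (bin_i, bin_j) to a
    normalized interaction value. Only bin pairs present in both maps
    with positive values are used; values are log-transformed first
    (an assumption for this sketch).
    Returns (correlation, number_of_bin_pairs_used).
    """
    shared = sorted(
        key for key in fivec.keys() & hic.keys()
        if fivec[key] > 0 and hic[key] > 0
    )
    xs = [math.log(fivec[key]) for key in shared]
    ys = [math.log(hic[key]) for key in shared]
    return pearson(xs, ys), len(shared)
```

Bin pairs that are missing or non-positive in either map are dropped here before correlating, so the returned count shows how much of the region contributed to the score.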

Source publication
Article
Full-text available
The chromatin interaction assays 5C and HiC have advanced our understanding of genomic spatial organization, but analysis approaches for these data are limited by usability and flexibility. The HiFive tool suite provides efficient data handling and a variety of normalization approaches for easy, fast analysis and method comparison. Integration of M...

Similar publications

Preprint
Full-text available
The chromatin interaction assays 5C and HiC have advanced our understanding of genomic spatial organization but analysis approaches for these data are limited by usability and flexibility. The HiFive tool suite provides efficient data handling and a variety of normalization approaches for easy, fast analysis and method comparison. Integration of MP...

Citations

... HOMER [44], implemented in Perl and C++, offers motif discovery, functional annotation, and Hi-C data analysis. HiFive, implemented in Python and Cython, features functionalities for contact matrix normalization, TAD calling and differential analysis [41]. HiCdat [38], an R package, aids in visualizing and analyzing Hi-C data, exploring chromatin interactions and structure. ...
Article
Genomic data analysis has witnessed a surge in complexity and volume, primarily driven by the advent of high-throughput technologies. In particular, studying chromatin loops and structures has become pivotal in understanding gene regulation and genome organization. This systematic investigation explores the realm of specialized bioinformatics pipelines designed specifically for the analysis of chromatin loops and structures. Our investigation incorporates two protein (CTCF and Cohesin) factor-specific loop interaction datasets from six distinct pipelines, amassing a comprehensive collection of 36 diverse datasets. Through a meticulous review of existing literature, we offer a holistic perspective on the methodologies, tools and algorithms underpinning the analysis of this multifaceted genomic feature. We illuminate the vast array of approaches deployed, encompassing pivotal aspects such as data preparation pipeline, preprocessing, statistical features and modelling techniques. Beyond this, we rigorously assess the strengths and limitations inherent in these bioinformatics pipelines, shedding light on the interplay between data quality and the performance of deep learning models, ultimately advancing our comprehension of genomic intricacies.
... More details are shown in Supplementary Fig. S7. Taking into account the unique characteristics of the Hi-C matrix, such as sequencing depth, distance dependence, and domain structure, we evaluate the biological reproducibility of the two matrices by adopting the GenomeDISCO (Ursu et al., 2018), HiCRep (Yang et al., 2017), HiC-Spector (Yan et al., 2017) and QuASAR-Rep (Sauria et al., 2015) tools to calculate the scores. We use 3DChromatin_ReplicateQC (Yardımcı et al., 2019), which is composed of these four repeatability measures. ...
Article
Full-text available
Motivation: Hi-C is the most widely used chromosome conformation capture (3C) experiment. It measures the frequency of all paired interactions across the entire genome, making it a powerful tool for studying the 3D structure of the genome. The fineness of the constructed genome structure depends on the resolution of the Hi-C data. However, because high-resolution Hi-C data require deep sequencing and thus high experimental cost, most available Hi-C data are low-resolution. Hence, it is essential to enhance the quality of Hi-C data by developing effective computational methods. Results: In this work, we propose a novel method, called DFHiC, which generates a high-resolution Hi-C matrix from a low-resolution Hi-C matrix using a dilated convolutional neural network. The dilated convolution can effectively explore the global patterns in the overall Hi-C matrix by exploiting information in the Hi-C matrix across longer genomic distances. Consequently, DFHiC can improve the resolution of the Hi-C matrix reliably and accurately. More importantly, the super-resolution Hi-C data enhanced by DFHiC are more in line with real high-resolution Hi-C data than those produced by other existing methods, in terms of both significant chromatin interactions and the identification of topologically associating domains (TADs). Availability: https://github.com/BinWangCSU/DFHiC. Supplementary information: Supplementary data are available at Bioinformatics online.
... One approach corrects raw individual Hi-C maps for known biases, including GC content and mappability [11]. Other programs, such as HiCNorm [12], HiFive [13], Hi-Corrector [14] and HiCorr [15], normalize Hi-C data for unknown biases, assuming that all differences in genome coverage are caused by technical reasons and should be eliminated. HiCrep [16], GenomeDISCO [17], QuASAR [18] and HiC-Spector [19] approaches use a variety of metrics to compare Hi-C maps and estimate the reproducibility of experiments. ...
Article
Full-text available
The chromatin interaction assays, particularly Hi-C, enable detailed studies of genome architecture in multiple organisms and model systems, resulting in a deeper understanding of gene expression regulation mechanisms mediated by epigenetics. However, the analysis and interpretation of Hi-C data remain challenging due to technical biases, limiting direct comparisons of datasets obtained in different experiments and laboratories. As a result, removing biases from Hi-C-generated chromatin contact matrices is a critical data analysis step. Our novel approach, HiConfidence, eliminates biases from the Hi-C data by weighing chromatin contacts according to their consistency between replicates so that low-quality replicates do not substantially influence the result. The algorithm is effective for the analysis of global changes in chromatin structures such as compartments and topologically associating domains. We apply the HiConfidence approach to several Hi-C datasets with significant technical biases, that could not be analyzed effectively using existing methods, and obtain meaningful biological conclusions. In particular, HiConfidence aids in the study of how changes in histone acetylation pattern affect chromatin organization in Drosophila melanogaster S2 cells. The method is freely available at GitHub: https://github.com/victorykobets/HiConfidence.
... Finally, interaction contact matrices are generated using valid interaction counts and normalized for distance and background signals using statistical methods such as quantile normalization [153,154]. Software such as HiFive [155] and my5C [156] have been developed for 5C data analysis. HiFive is capable of mapping, filtering, normalizing, and visualizing 5C as well as Hi-C data sets, allowing users to analyze the data with a single program [155]. ...
Article
Full-text available
Epigenetic marks do not change the sequence of DNA but affect gene expression in a cell-type specific manner by altering the activities of regulatory elements. Development of new molecular biology assays, sequencing technologies, and computational approaches enables us to profile the human epigenome in three-dimensional structure genome-wide. Here we describe various molecular biology techniques and bioinformatic tools that have been developed to measure the activities of regulatory elements and their chromatin interactions. Moreover, we list currently available three-dimensional epigenomic data sets that are generated in various human cell types and tissues to assist in the design and analysis of research projects.
... In addition, it is natively compatible and inter-convertible with the widespread Cooler [38] and Juicer [37] Hi-C file formats and can import a large variety of different text-based matrix inputs, such as those generated by HiC-Pro [42] and the 4D Nucleome project [51]. FAN-C includes a fully automated FASTQ-to-matrix pipeline, which can be adapted to accommodate the complexities and individual requirements of each specific Hi-C analysis, such as different species or analysis parameters. ...
... Table 1: Feature comparison of different Hi-C analysis tools. Tools included in the comparison are Cooler [38]/HiGlass [39], Juicer [37]/Juicebox [40], HOMER [41], HiC-Pro [42], HiC-bench [43], TADbit [44], HiFive [45], HicDat [46], HiCInspector [47], HiCUP, HiCExplorer [48,49], and HiCeekR [50]. Footnotes: 1: Only for interactive plotting; 2: Support for Juicer and Cooler multi-resolution files, but no native support; 3: Cooler ecosystem includes pairtools, cooler, cooltools, HiGlass, and distiller; 4: In conjunction with Juicebox; 5: Provides instructions for mapping, but no dedicated command; 6: Visualisation through Treeview; 7: With export for Fit-Hi-C; 8: Through compatibility with HiCPlotter; 9: Via HiCNorm; 10: Fit-Hi-C, C-loops, and targeted virtual 5C (in-house); 11: Only pre-processing; 12 ...
Article
Full-text available
Chromosome conformation capture data, particularly from high-throughput approaches such as Hi-C, are typically very complex to analyse. Existing analysis tools are often single-purpose, or limited in compatibility to a small number of data formats, frequently making Hi-C analyses tedious and time-consuming. Here, we present FAN-C, an easy-to-use command-line tool and powerful Python API with a broad feature set covering matrix generation, analysis, and visualisation for C-like data (https://github.com/vaquerizaslab/fanc). Due to its compatibility with the most prevalent Hi-C storage formats, FAN-C can be used in combination with a large number of existing analysis tools, thus greatly simplifying Hi-C matrix analysis.
... (bioRxiv preprint, doi: https://doi.org/10.1101/2020.10.07.325332) Figure S4: Representative example of a heatmap region containing highly conserved long-range contacts, i.e. two genomic intervals: 87300kb-88100kb and 111900kb-112700kb in chromosome 2. Resolution: 10 kb. Raw Hi-C reads of four cell types were from Rao et al. [2014], aligned by Bowtie2, and then processed by HiFive [Sauria et al., 2015]. Here "Binning" and "Express" are two normalization algorithms in HiFive. ...
Preprint
Full-text available
Three-dimensional chromosomal structure plays an important role in gene regulation. Chromosome conformation capture techniques, especially the high-throughput, sequencing-based technique Hi-C, provide new insights on spatial architectures of chromosomes. However, Hi-C data contains artifacts and systemic biases that substantially influence subsequent analysis. Computational models have been developed to address these biases explicitly, however, it is difficult to enumerate and eliminate all the biases in models. Other models are designed to correct biases implicitly, but they will also be invalid in some situations such as copy number variations. We characterize a new kind of artifact in Hi-C data. We find that this artifact is caused by incorrect alignment of Hi-C reads against approximate repeat regions and can lead to erroneous chromatin contact signals. The artifact cannot be corrected by current Hi-C correction methods. We design a probabilistic method and develop a new Hi-C processing pipeline by integrating our probabilistic method with the HiC-Pro pipeline. We find that the new pipeline can remove this new artifact effectively, while preserving important features of the original Hi-C matrices.
... These computational methods and tools can be coarsely divided into two categories, Hi-C data processing and downstream analysis. For the first category, there are some existing tools used to generate valid chromatin interactions from raw sequencing reads [4][5][6][7][8][9][10][11][12]. They follow similar processing steps and may adopt different sequence alignment strategies (pre-truncation, iterative and trimming), filtering criteria (read-level, read-pair level, strand and distance) and normalization methods (explicit-factor correction, matrix balancing and joint correction). Besides, there are some computational tools to examine the quality of Hi-C data by measuring the reproducibility of Hi-C replicates [10][13][14][15]. For the second category, there are several major analysis tasks to gain insights into the spatial structure and function of chromatin. ...
Article
Full-text available
Background: Chromosome conformation capture-based methods, especially Hi-C, enable scientists to detect genome-wide chromatin interactions and study the spatial organization of chromatin, which plays important roles in gene expression regulation, DNA replication and repair etc. Thus, developing computational methods to unravel patterns behind the data becomes critical. Existing computational methods focus on intrachromosomal interactions and ignore interchromosomal interactions partly because there is no prior knowledge for interchromosomal interactions and the frequency of interchromosomal interactions is much lower while the search space is much larger. With the development of single-cell technologies, the advent of single-cell Hi-C makes interrogating the spatial structure of chromatin at single-cell resolution possible. It also brings a new type of frequency information, the number of single cells with chromatin interactions between two disjoint chromosome regions. Results: Considering the lack of computational methods on interchromosomal interactions and the unsurprisingly frequent intrachromosomal interactions along the diagonal of a chromatin contact map, we propose a computational method dedicated to analyzing interchromosomal interactions of single-cell Hi-C with this new frequency information. To the best of our knowledge, our proposed tool is the first to identify regions with statistically frequent interchromosomal interactions at single-cell resolution. We demonstrate that the tool utilizing networks and binomial statistical tests can identify interesting structural regions through visualization, comparison and enrichment analysis and it also supports different configurations to provide users with flexibility. Conclusions: It will be a useful tool for analyzing single-cell Hi-C interchromosomal interactions.
... To offer more flexibility, it is also possible to install Galaxy HiC-Explorer on a local Galaxy instance. Hi-C data processing and downstream analysis are supported by many tool suites, such as Juicer (18), HiCUP (19), HOMER (20), HiC-Pro (21), HiFive (22) and the recently published HiCeekR (23). Juicer, HiC-Pro and HiCeekR offer several tools but are limited to a local installation. ...
Article
Full-text available
The Galaxy HiCExplorer provides a web service at https://hicexplorer.usegalaxy.eu. It enables the integrative analysis of chromosome conformation by providing tools and computational resources to pre-process, analyse and visualize Hi-C, Capture Hi-C (cHi-C) and single-cell Hi-C (scHi-C) data. Since the last publication, Galaxy HiCExplorer has been expanded considerably with new tools to facilitate the analysis of cHi-C and to provide an in-depth analysis of Hi-C data. Moreover, it supports the analysis of scHi-C data by offering a broad range of tools. With the help of the standard graphical user interface of Galaxy, presented workflows, extensive documentation and tutorials, novices as well as Hi-C experts are supported in their Hi-C data analysis with Galaxy HiCExplorer.
... Yet, it has a broader scope of applications. Future work will expand the utility of boundary score by developing a similarity/reproducibility score to measure the agreement between (multiple) Hi-C matrices, in the same vein as HiCRep (Yang et al., 2017), Selfish (Ardakany et al., 2019), GenomeDISCO (Ursu et al., 2018), HiC-Spector (Yan et al., 2017), QuASAR-Rep (Sauria et al., 2015). Furthermore, for differential boundary detection, our method is still limited to the comparison of two profiles of (consensus) boundary scores. ...
Article
Full-text available
Recent research using chromatin conformation capture technologies, such as Hi-C, has demonstrated the importance of topologically associated domains (TADs) and smaller chromatin loops, collectively referred hereafter as “interacting domains.” Many such domains change during development or disease, and exhibit cell- and condition-specific differences. Quantification of the dynamic behavior of interacting domains will help to better understand genome regulation. Methods for comparing interacting domains between cells and conditions are highly limited. We developed TADCompare, a method for differential analysis of boundaries of interacting domains between two or more Hi-C datasets. TADCompare is based on a spectral clustering-derived measure called the eigenvector gap, which enables a loci-by-loci comparison of boundary differences. Using this measure, we introduce methods for identifying differential and consensus boundaries of interacting domains and tracking boundary changes over time. We further propose a novel framework for the systematic classification of boundary changes. Colocalization- and gene enrichment analysis of different types of boundary changes demonstrated distinct biological functionality associated with them. TADCompare is available on https://github.com/dozmorovlab/TADCompare and Bioconductor (submitted).
... BAM files were normalized, binned and analyzed for interaction with EOTr loci using the HiFive tool v1.5.6 (Supplementary file Methods) (66). ...
Article
Full-text available
Analysis of ENCODE long RNA-Seq and ChIP-seq (Chromatin Immunoprecipitation Sequencing) datasets for HepG2 and HeLa cell lines uncovered 1647 and 1958 transcripts that interfere with transcription factor binding to human enhancer domains. TFBSs (Transcription Factor Binding Sites) intersected by these 'Enhancer Occlusion Transcripts' (EOTrs) displayed significantly lower relative transcription factor (TF) binding affinities compared to TFBSs for the same TF devoid of EOTrs. Expression of most EOTrs was regulated in a cell line specific manner; analysis for the same TFBSs across cell lines, i.e. in the absence or presence of EOTrs, yielded consistently higher relative TF/DNA-binding affinities for TFBSs devoid of EOTrs. Lower activities of EOTr-associated enhancer domains coincided with reduced occupancy levels for histone tail modifications H3K27ac and H3K9ac. Similarly, the analysis of EOTrs with allele-specific expression identified lower activities for alleles associated with EOTrs. ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag Sequencing) and 5C (Carbon Copy Chromosome Conformation Capture) uncovered that enhancer domains associated with EOTrs preferentially interacted with poised gene promoters. Analysis of EOTr regions with GRO-seq (Global run-on) data established the correlation of RNA polymerase pausing and occlusion of TF-binding. Our results implied that EOTr expression regulates human enhancer domains via transcriptional interference.