Help

General information

What is ZRdb?

ZRdb is a database that describes both KZFPs and Repeats (including LTRs, SINEs, LINEs, and SVAs). It contains information on the sequences, domains, and DNA-binding specificities of KZFPs, and provides the classification and consensus sequences of Repeats. Using existing ChIP-seq data for KZFPs, ZRdb also describes the binding, repression, and co-evolution relationships between KZFPs and Repeats, and ranks these relationships by reliability.

Where does the ZRdb data come from?

The descriptive data for KZFPs in ZRdb, including annotation score, sequences, cellular localization, molecular function, and cellular pathway information, were sourced from UniProt annotations. For the DNA-binding specificity of each KZFP, we searched for its ChIP-seq data on Cistrome. If ChIP-seq data were available, we used the motif calculated by MEME-ChIP to describe its DNA-binding specificity. Where ChIP-seq data were unavailable, we used the motif predicted from the Zinc Finger Recognition Code instead. The annotation data for repeats in ZRdb come primarily from Dfam (https://dfam.org/home), including the description, classification, and consensus sequence of each repeat. To describe the distribution of repeats across the human genome, we used the repeat annotation files for hg19 and hg38 provided in IGV.

What are the binding probability ratings of KZFPs for retrotransposons?

To enhance the reliability of our calculations and provide a multidimensional evaluation of KZFPs binding to retrotransposons, we have developed a rating system. This system categorizes the likelihood of a KZFP binding to a retrotransposon into three levels, ranging from low to high. At the first level, we identify ChIP-seq peaks that intersect specific repeat sequences in the genome, using the BED files provided by Cistrome and the annotation data of repeats in the human genome. At the second level, we use the most confident motif position obtained from MEME-ChIP analysis within the retrotransposon sequences to calculate the binding probability. Finally, the third level builds on the second by locating the corresponding region of a given retrotransposon type on its consensus sequence.
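
The three levels can be read as a simple evidence ladder. The Python sketch below is illustrative only: the function name binding_rating, its boolean inputs, and the assumption that each level requires the evidence of the level below it are hypothetical and are not taken from the ZRdb pipeline.

```python
def binding_rating(peak_overlaps_repeat: bool,
                   motif_in_repeat: bool,
                   motif_in_consensus: bool) -> int:
    """Return the highest evidence level reached (0 = none, 3 = strongest).

    Assumption: the levels are cumulative, so a higher level is only
    reported when every lower level is also satisfied.
    """
    # Level 1: a ChIP-seq peak intersects a copy of the repeat in the genome.
    if not peak_overlaps_repeat:
        return 0
    # Level 2: the most confident MEME-ChIP motif position lies in the repeat.
    if not motif_in_repeat:
        return 1
    # Level 3: the corresponding region is also located on the consensus sequence.
    if not motif_in_consensus:
        return 2
    return 3


# A peak overlaps the repeat and carries the motif, but the motif
# region was not located on the consensus sequence.
print(binding_rating(True, True, False))  # prints 2
```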

Summary

Domain of KZFPs :

KRAB, ZF, and degenerated ZF domains of KZFPs.

Mya, Myr :

Million years ago (Mya) and million years (Myr).

Time of Emergence :

The time when Repeats or KZFPs first emerged in the genome.

Time of Change in binding specificity :

The time at which a KZFP underwent a change in DNA-binding specificity, resulting in a new DNA-binding specificity identical to that in humans.

Time correlation :

The proximity in time of the emergence of Repeats and KZFPs in the genome. The strength of the correlation can be categorized into three levels: Highly correlated, Moderately correlated, and Weakly correlated. Highly correlated indicates that the emergence times of the KZFP and the Repeat are exactly the same. Moderately correlated indicates that the Repeat emerged earlier than the KZFP, but by no more than 15 Myr. Weakly correlated indicates that the Repeat emerged later than the KZFP, or earlier by more than 15 Myr.
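
The three categories translate directly into a small rule. The following Python sketch assumes that both emergence times are given in Mya, so a larger value means an earlier emergence; the function name time_correlation and its parameter names are hypothetical.

```python
def time_correlation(repeat_mya: float, kzfp_mya: float,
                     window_myr: float = 15.0) -> str:
    """Classify how closely a Repeat and a KZFP emerged in time.

    Both times are in Mya (million years ago): larger means earlier.
    """
    # Positive gap: the Repeat emerged before the KZFP.
    gap = repeat_mya - kzfp_mya
    if gap == 0:
        return "Highly correlated"
    if 0 < gap <= window_myr:
        return "Moderately correlated"
    # The Repeat emerged later than the KZFP, or earlier by more than the window.
    return "Weakly correlated"


print(time_correlation(43.0, 43.0))  # prints Highly correlated
print(time_correlation(50.0, 40.0))  # prints Moderately correlated
print(time_correlation(30.0, 40.0))  # prints Weakly correlated
```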

Gene Ontology Annotation of KZFPs :

The Gene Ontology (GO) describes knowledge of the biological domain based on three aspects: Molecular function, Biological process, and Cellular component.

Is there ChIP-seq data :

Whether ChIP-seq data for the KZFP are available in the Cistrome database.

Consensus sequence of Repeats :

The consensus sequences obtained from Dfam are derived from the seed alignment using a specialized caller.

Repeats structure :

For certain families, there exists a curated collection of TE-derived proteins, which constitutes a non-redundant set. As a result, several families may lack a coding sequence directly linked to them. The annotation for these families is obtained from Dfam.

Motif(meme-ChIP) :

DNA-binding motifs of KZFPs obtained using MEME-ChIP and subsequently plotted using pylogo.

Motif(Predicted) :

DNA-binding motifs predicted using the Zinc Finger Recognition Code.

Peak overlapping :

A ChIP-seq peak of a KZFP that intersects a given repeat sequence in the genome.
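
As a minimal illustration of what "intersects" means here, the Python sketch below counts, for each repeat family, the peaks that overlap at least one annotated copy. It assumes BED-style half-open intervals (start included, end excluded); the function name count_peak_overlaps and the coordinates in the example are made up for illustration and are not ZRdb data.

```python
from collections import Counter


def count_peak_overlaps(peaks, repeats):
    """Count, per repeat family, the peaks that intersect at least one copy.

    peaks:   iterable of (chrom, start, end)
    repeats: iterable of (chrom, start, end, family)
    Intervals are half-open, as in BED files.
    """
    counts = Counter()
    for p_chrom, p_start, p_end in peaks:
        hit_families = set()
        for r_chrom, r_start, r_end, family in repeats:
            # Two half-open intervals overlap when each starts before the other ends.
            if p_chrom == r_chrom and p_start < r_end and r_start < p_end:
                hit_families.add(family)
        # Each peak contributes at most once per family.
        counts.update(hit_families)
    return counts


peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
repeats = [("chr1", 150, 400, "L1PA4"),
           ("chr1", 590, 900, "L1PA4"),
           ("chr2", 200, 300, "AluY")]
print(count_peak_overlaps(peaks, repeats))  # prints Counter({'L1PA4': 2})
```

The nested loop is kept for readability; for genome-scale files a sorted sweep or an interval tree would be the usual choice.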

Motif in Repeats :

The motif of a KZFP identified by MEME-ChIP is present in repeat sequences.

Motif in Repeats Consensus :

The motif of KZFPs identified by MEME-ChIP analysis can be utilized to locate the corresponding region within the consensus sequences of repeats.