linquynus
diff --git a/inst/extdata/motif_consistency.png b/inst/extdata/motif_consistency.png
83.8 KB
diff --git a/vignettes/TFregulomeR.Rmd b/vignettes/TFregulomeR.Rmd
Lines changed: 27 additions & 21 deletions
@@ -43,21 +43,27 @@ source, TF source ID, number of peaks and number of peaks with motif.
 In TFregulomeR project, we
 used MEME-ChIP to perform motif de novo discovery in each ChIP-seq. Highly and centrally enriched 
 motifs were selected and compared with the existing TFBS databases, such as HOCOMOCO and JASPAR. 
-Around 6% of highly enriched motifs were not consistent with the TFBS databases, and it might
-be due to the fact that in those cell types, the given TFs are indirectly recruited to genome. 
-Besides, approximately 9% of motifs were not recorded for their corresponding TFs in the databases. 
-In order to confirm the reliability of these 15% motifs, we used HOMER to perform a de novo motif 
+89 highly enriched motifs were not consistent with the TFBS databases. This might
+be because, in those cell types, the given TFs are indirectly recruited to the genome, 
+and/or because a highly abundant presence of cohesin and polycomb group proteins masks the motif 
+enrichment of the ChIP’ed TF. 
+In addition, 136 motifs were not recorded for their corresponding TFs in the databases. 
+In order to confirm the reliability of these 225 motifs, we used HOMER to perform a de novo motif 
 discovery again. Motif results by HOMER were compared with those by MEME-ChIP and their similarity
-were measured by normalised Pearson correlation coefficient using the formula: Ncor = cor * w / w_smaller, 
+were measured by the normalised Pearson correlation coefficient using the compare-matrices function in 
+RSAT with the formula: Ncor = cor * w / w_smaller, 
 where cor is raw Pearson correlation coefficient, w is the alignment width of two matrices from 
 MEME-ChIP and HOMER (the minimum value of w was set as 5), and w_smaller is the width of smaller 
 motifs from MEME-ChIP and HOMER. We found that majority of those PWM matrices generated by MEME-ChIP, 
 a combined algorithm suite of expectation maximization and regular expressions, were able to be 
-recapitulated by HOMER, which takes advantage of hypergeometric enrichment. We have added the
-information into the last two columns of `dataBrowser` output (from v1.2.0).
+recapitulated by HOMER, which takes advantage of hypergeometric enrichment (Figure 1). We have added the information into the last two columns of `dataBrowser` output (from v1.2.0).
 
 In particular, if no input is given for the function, all records in TFregulomeR compendium will be returned.
 
+```{r echo=FALSE, fig.cap="Figure 1. Similarity of de novo enriched motifs by MEME-ChIP and HOMER. The beeswarm and violin plots show the normalised Pearson correlation coefficient of de novo motifs called by MEME-ChIP and HOMER, and the red dash denotes normalised Pearson correlation coefficient value 0.7.", out.width = '80%',fig.align="center"}
+knitr::include_graphics("../inst/extdata/motif_consistency.png")
+```
+
 ```{r eval=FALSE}
 library(TFregulomeR)
 # browse all records in TFregulomeR TFBS compendium
@@ -158,7 +164,7 @@ class(K562_CEBPB)
 #> [1] "TFregulomeR"
 ```
 
-After obtaining a MethMotif matrix, user can use `plotLogo` to plot logo as below (Figure 1). 
+After obtaining a MethMotif matrix, user can use `plotLogo` to plot logo as below (Figure 2). 
 If the TFBS source is MethMotif, then a MethMotif logo will be saved. Two options are available for motif logo, 
 "entropy" and "frequency", and also different methylation levels 
 ("all", "methylated" and "unmethylated") can be opted for methylation bar charts. However, 
@@ -194,7 +200,7 @@ plotLogo(HL60_CEBPB, logo_type = "entropy")
 #> Success: a PDF named 'GTRD-EXP040801_HSA_HL-60_CEBPB-logo-entropy.pdf' has been saved!
 ```
 
-```{r echo=FALSE, fig.cap="Figure 1. CEBPB (Meth)Motif logos in K562 and HL-60", out.width = '80%',fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 2. CEBPB (Meth)Motif logos in K562 and HL-60", out.width = '80%',fig.align="center"}
 knitr::include_graphics("../inst/extdata/plotLogo.png")
 ```
 
@@ -252,7 +258,7 @@ head(K562_CEBPB_peaks)
 ## Study TFBS propensity
 
 ### Common peak regions
-```{r echo=FALSE, fig.cap="Figure 2. functionality in TFregulomeR for common peak analysis", out.width = '100%',fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 3. functionality in TFregulomeR for common peak analysis", out.width = '100%',fig.align="center"}
 knitr::include_graphics("../inst/extdata/commonPeaks_tutorial.png")
 ```
 
@@ -430,13 +436,13 @@ methylation_profile$HCT116_CEBPB_common_peaks
 #> [1,]   NA
 ```
 
-```{r echo=FALSE, fig.cap="Figure 3. MethMotif logo of K562 CEBPB common peaks", out.width = '40%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 4. MethMotif logo of K562 CEBPB common peaks", out.width = '40%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/MM1_HSA_K562_CEBPB_common_peaks-logo-entropy.png")
 ```
 
 ### Exclusive peak regions
 
-```{r echo=FALSE, fig.cap="Figure 4. functionality in TFregulomeR for exclusive peak analysis", out.width = '100%',fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 5. functionality in TFregulomeR for exclusive peak analysis", out.width = '100%',fig.align="center"}
 knitr::include_graphics("../inst/extdata/exclusivePeaks_tutorial.png")
 ```
 
@@ -589,13 +595,13 @@ methylation_profile$MM1_HSA_K562_CEBPB_exclusive_peaks
 #> 90-100%      40 #40 CpG methylation scores are more than 0.9 ((homogeneously methylated)) in +/-100bp window around exclusive peak summits
 ```
 
-```{r echo=FALSE, fig.cap="Figure 5. MethMotif logo of K562 CEBPB exclusive peaks", out.width = '40%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 6. MethMotif logo of K562 CEBPB exclusive peaks", out.width = '40%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/MM1_HSA_K562_CEBPB_exclusive_peaks-logo-entropy.png")
 ```
 
 ### Intersected peak matrix
 
-```{r echo=FALSE, fig.cap="Figure 6. functionality in TFregulomeR for cofactor and interactome analysis", out.width = '100%',fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 7. functionality in TFregulomeR for cofactor and interactome analysis", out.width = '100%',fig.align="center"}
 knitr::include_graphics("../inst/extdata/intersectPeakMatrix_tutorial.png")
 ```
 
@@ -969,15 +975,15 @@ cofactorReport(intersectPeakMatrix = intersectMatrix_exclusive)
 #> ... ... ... Cofactor report for id 'MM1_HSA_K562_CEBPB' has been saved as MM1_HSA_K562_CEBPB_cofactor_report.pdf
 ```
 
-```{r echo=FALSE, fig.cap="Figure 7. MethMotif logo of K562 CEBPB common peaks intersected with K562 CEBPD peaks", out.width = '40%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 8. MethMotif logo of K562 CEBPB common peaks intersected with K562 CEBPD peaks", out.width = '40%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/MM1_HSA_K562_CEBPB_overlapped_with_MM1_HSA_K562_CEBPD-logo-entropy.png")
 ```
 
-```{r echo=FALSE, fig.cap="Figure 8. MethMotif logo of K562 CEBPB exclusive peaks intersected with K562 ATF4 peaks", out.width = '40%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 9. MethMotif logo of K562 CEBPB exclusive peaks intersected with K562 ATF4 peaks", out.width = '40%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/MM1_HSA_K562_CEBPB_overlapped_with_MM1_HSA_K562_ATF4-logo-entropy.png")
 ```
 
-```{r echo=FALSE, fig.cap="Figure 9. cofactorReport output", out.width = '100%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 10. cofactorReport output", out.width = '100%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/cofactor_report.png")
 ```
 
@@ -1071,7 +1077,7 @@ plotDistrib(motifDistrib = motifDistrib_output)
 #> Distribution of motif MM1_HSA_K562_CEBPB in peak set my_peak has been saved!
 ```
 
-```{r echo=FALSE, fig.cap="Figure 10. Motif distribution", out.width = '100%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 11. Motif distribution", out.width = '100%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/motif_distribution.png")
 ```
 
@@ -1124,13 +1130,13 @@ head(K562_CEBPB_exclusivePeak_loc)
 #> 6               77;117;73
 ```
 
-```{r echo=FALSE, fig.cap="Figure 11. HTML annotation report of the genomic locations of K562 CEBPB exclusive peak", out.width = '100%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 12. HTML annotation report of the genomic locations of K562 CEBPB exclusive peak", out.width = '100%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/genomeAnnotate.png")
 ```
 
 ## Annotate TFBS functions
 
-The key function of transcription factors is to regulate gene expression. By working with Genomic Regions Enrichment of Annotations Tool (GREAT), TFregulomeR allows users to annotate the functions of TFBSs using `greatAnnotate`. Given that GREAT server doesn't support hg38, liftOver R package has been incorporated in TFregulomeR to convert hg38 to hg19. The annotation output of `greatAnnotate` is intuitive, not only will a data.frame containing annotation results be returned, but also an HTML report will be saved. The HTML report takes advantage of `rbokeh` package, which presents a vivid and dynamic interface (Figure 12).
+The key function of transcription factors is to regulate gene expression. By working with Genomic Regions Enrichment of Annotations Tool (GREAT), TFregulomeR allows users to annotate the functions of TFBSs using `greatAnnotate`. Given that GREAT server doesn't support hg38, liftOver R package has been incorporated in TFregulomeR to convert hg38 to hg19. The annotation output of `greatAnnotate` is intuitive, not only will a data.frame containing annotation results be returned, but also an HTML report will be saved. The HTML report takes advantage of `rbokeh` package, which presents a vivid and dynamic interface (Figure 13).
 
 ```{r eval=FALSE}
 # annotate the functions of K562 CEBPB exclusive peaks
@@ -1172,7 +1178,7 @@ head(K562_CEBPB_exclusivePeak_func)
 ```
 
 
-```{r echo=FALSE, fig.cap="Figure 12. HTML annotation report of the genes targeted by K562 CEBPB exclusive peak", out.width = '100%', fig.align="center"}
+```{r echo=FALSE, fig.cap="Figure 13. HTML annotation report of the genes targeted by K562 CEBPB exclusive peak", out.width = '100%', fig.align="center"}
 knitr::include_graphics("../inst/extdata/greatAnnotate.png")
 ```
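
Note on the first hunk: the normalised correlation formula quoted there, Ncor = cor * w / w_smaller, can be illustrated with a few lines of R. This is only a minimal sketch. The helper name `normalised_cor`, its argument names, and the input values are hypothetical and are not part of TFregulomeR or RSAT. Returning `NA` for alignments narrower than the minimum width of 5 is an assumption about how that cut-off might be applied.

```r
# Hypothetical helper illustrating Ncor = cor * w / w_smaller.
# Not part of TFregulomeR or RSAT; names and defaults are assumptions.
normalised_cor <- function(cor, w, w_smaller, min_w = 5) {
  # Assumption: alignments narrower than the minimum width are not scored.
  if (w < min_w) {
    return(NA_real_)
  }
  # Scale the raw Pearson correlation by the fraction of the smaller
  # motif that is covered by the alignment.
  cor * w / w_smaller
}

# Made-up values: raw correlation 0.9, alignment width 8,
# smaller motif width 10.
normalised_cor(cor = 0.9, w = 8, w_smaller = 10)
#> [1] 0.72
```

With these made-up values the result, 0.72, falls just above the 0.7 value marked by the red dash in the Figure 1 caption. A full-length alignment (`w` equal to `w_smaller`) leaves the raw correlation unchanged.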