Skip to content

geom_gene() displays incorrect annotations for subregions #55

@AdrianoWeber

Description

@AdrianoWeber

Hello,

I'm using your package to generate plots from a named list of regions. The script takes BAM files and a GTF file for a global region (downloaded from GENCODE and then subseted to my region of interest) and then plots several subregions. Here's an example of my workflow (heavily inspired from the [vignette]{https://showteeth.github.io/ggcoverage/}):

#Track loading
track.folder<-"C:/Users/adria/Documents/Unige/Masters_project/Data_PacBio/Fichiers_BAM"

#Loci definition
cyp <- "chr10:94760062-95072497"
cyp2c9 <- "chr10:94935603-94993561"
cyp2c19 <- "chr10:94760062-94858911"
cyp2c8 <- "chr10:95035045-95072497"
loci<-list(cyp=cyp, cyp2c19=cyp2c19, cyp2c9=cyp2c9, cyp2c8=cyp2c8)

# Adding genome annotations
gtf.file <- "CYP_info.gtf"
gtf.gr <- rtracklayer::import.gff(con = gtf.file, format = "gtf")

#Function
coverage.plot<-function(gene_name, threshold=10){
  #Loading datas from bam
  track.df <- LoadTrackFile(track.folder = track.folder, format = "bam",
                            region = loci[[gene_name]], extend = 2000, norm.method="None",
                            bin.size = 1)
  #Making start and end pos absolute (will be an issue for reproductibility, this step is  added because my position in BAM are relatives to the start)
  track.df$start <- track.df$start + 94760062
  track.df$end <- track.df$end + 94760062

  ##Preparing meta data
  #SampleName    Type Group
  sample.meta <- data.frame(SampleName= unique(track.df$Type))
  sample.meta$Type <- unlist(Map(function(x) strsplit(x,"\\.")[[1]][1], sample.meta$SampleName))
  sample.meta$Group <- c("POP1", "POP2","POP2","POP2","POP2","POP1","POP2","POP1","POP2","POP2","POP2","POP2","POP3", "POP2", "POP2", "POP2")

  #Changing Type
  track.df$Type<-as.factor(track.df$Type)
  levels(track.df$Type) <- sample.meta$Type
  ##Preparing mark.region for CYP genes (specific to my project)
  mark.region<-data.frame(start=c(94760063, 94773001, 94819540, 94935603, 94944159, 94970426, 95035045),
                          end=c(94773000, 94818512, 94858911, 94943679, 94970066, 94993561, 95072497),
                          label=c("CYP2C19_1", "CYP2C19_2", "CYP2C19_3", "CYP2C9_1", "CYP2C9_2", "CYP2C9_3", "CYP2C8"))
  #Doing a basic plot
  color_palette <- viridis::viridis(16)#changing color palette
  basic.coverage <- ggcoverage(data = track.df, color = color_palette, 
                               plot.type = "join", facet.key = "Type", group.key = "Type",
                               mark.region = mark.region, range.position = "out", mark.label.size=2,
                               mark.color="azure2")
  basic.coverage.thershold <- basic.coverage + geom_hline(yintercept=threshold, colour="red")#Adding the threshold used for data validation
  full.plot <- basic.coverage.thershold + geom_gene(gtf.gr=gtf.gr)
  ggsave(full.plot, file=paste0(gene_name,"_fullplot3.png"), width=20, height = 12,dpi=500)
  
}
Map(coverage.plot, names(loci))

And when generating the subplots, the coverage.plot() display the correct subregion (but not the correct X-axis) but the geom_gene() still display annotations from the start of global region so it does not correspond to what i want.
Exemple of global region plot (correct behaviour):

Image

Exemple of working subplot (because starting at the same position than the global plot):

Image

Exemple of problematic plot:

Image

Is there a way for geom_gene() to adapt to the displayed region rather than defaulting to the global annotation positions and/or am I missing something trivial?
Thanks in advance for your help! 😊

Have a nice day,
Adriano

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions