CSI Implementation for Larger Genomes #414

@emilytrybulec

Description of feature

Hi,

I've been helping my coworker run this pipeline, and we hit an error in SAMTOOLS_INDEX, which I see isn't that uncommon. Using ext.args, we were able to resolve the error, but the pipeline still only ran partway. The last step that ran at that point was MERGED_LIBRARY_MARKDUPLICATES_PICARD, so I cloned the repository and modified line 480 of atacseq.nf to read
.join(MERGED_LIBRARY_MARKDUPLICATES_PICARD.out.csi, by: [0])

This change allowed more of the pipeline to run, but it still did not complete. This time, the pipeline halted in the bam_filter_bamtools.nf subworkflow. I modified line 89 to read
bai = BAM_SORT_STATS_SAMTOOLS.out.csi.mix(SAMTOOLS_INDEX.out.csi) // channel: [ val(meta), [ bai ] ]

With these changes, the pipeline completed using the csi index, and I think it would be useful to implement them as more global changes. Without a thorough Nextflow background, I don't think we would have been able to run the complete pipeline for this species, so I think making csi usage easier would benefit the community as a whole! Let me know if you'd like any additional info!
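
For reference, the ext.args step mentioned above can be sketched as a small custom config. This is a minimal sketch under assumptions, not necessarily the exact configuration we used: the file name `custom.config` is hypothetical, and it assumes the `SAMTOOLS_INDEX` selector matches every indexing process in the pipeline and that the module passes `ext.args` straight through to `samtools index`. The `-c` flag is the `samtools index` option for writing a CSI index instead of a BAI.

```groovy
// custom.config (hypothetical file name), supplied to the run with `-c custom.config`
process {
    // Assumption: this selector matches every process whose simple name is SAMTOOLS_INDEX;
    // a more specific selector in the pipeline's own config may take precedence.
    withName: 'SAMTOOLS_INDEX' {
        // `samtools index -c` writes a .csi index instead of the default .bai
        ext.args = '-c'
    }
}
```

On its own this only changes which index file is written; downstream steps that still read the `bai` output are why the two edits above were needed.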

Labels: enhancement (New feature or request)
