Updated to reflect all input arguments needed for each part of the pipeline.
Signed-off-by: Ian M. B. <99409346+iPsychonaut@users.noreply.github.com>
README.md: 15 additions & 9 deletions
@@ -1,3 +1,4 @@
+
 # FunDiS Pipeline
 
 FunDiS Pipeline is a suite of scripts intended to streamline the processing of Next-Generation Sequencing (NGS) data. The scripts can be run individually or as a whole to form a complete pipeline.
@@ -11,6 +12,11 @@ This application is designed to be run on a Linux/WSL environment and requires t
 - pandas
 - pysam
 - biopython
+- multiprocessing
+- math
+- queue
+- glob
+- shutil
 
 The application also relies on the following tools:
 
@@ -34,7 +40,7 @@ Each module of the pipeline can be run individually or as a whole.
 To run the whole pipeline, use the `fundis_main.py` script. For example:
@@ -44,28 +50,28 @@ Each module can also be run individually. Here's what each module does:
-**fundis_minibar_ngsid.py**: This script processes the input FASTQ file with MiniBar and NGSpeciesID. MiniBar is a tool for demultiplexing barcoded read data and NGSpeciesID is a tool used for the identification of specimens in NGS datasets. The script starts by checking the operating system, installing missing libraries, and setting up the working environment. It then moves on to demultiplexing and identifying species from the input FASTQ data. The results are output in a directory specified by the user.
-**fundis_haplotype_phaser.py**: This script takes the output from the `fundis_minibar_ngsid.py` script and phases the haplotypes for each sample. Phasing is the process of determining the specific set of variants found on each physical copy of a particular gene or genomic region. The phased haplotypes are output in the NGSpeciesID output directory.
-**fundis_summarize2.py**: This script summarizes the output from the `fundis_haplotype_phaser.py` script. It provides a summary of the results, including counts of unique samples, total consensus sequences, and total reads in consensus sequences. It also copies and updates the names of all FASTQ and consensus FASTA files. The results are output in a summary directory named after the source directory.
+
-**fundis_summarize.py**: This script summarizes the output from the `fundis_haplotype_phaser.py` script. It provides a summary of the results, including counts of unique samples, total consensus sequences, and total reads in consensus sequences. It also copies and updates the names of all FASTQ and consensus FASTA files. The results are output in a summary directory named after the source directory.
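
As a rough illustration of the summary step described above (counts of unique samples, total consensus sequences, and total reads in consensus sequences), here is a minimal Python sketch. It is not the code from `fundis_summarize.py`. The function name `summarize_consensus` and the FASTA header layout `>SAMPLE|CONSENSUS_ID|reads=N` are assumptions made purely for illustration, and the real NGSpeciesID output may be laid out differently.

```python
from collections import defaultdict


def summarize_consensus(fasta_text):
    """Count samples, consensus sequences, and supporting reads in FASTA text.

    Assumes a hypothetical header layout: >SAMPLE|CONSENSUS_ID|reads=N
    """
    reads_per_sample = defaultdict(int)
    total_consensus = 0
    for line in fasta_text.splitlines():
        if not line.startswith(">"):
            continue  # sequence lines do not affect the counts
        sample, _consensus_id, reads_field = line[1:].strip().split("|")
        reads_per_sample[sample] += int(reads_field.split("=", 1)[1])
        total_consensus += 1
    return {
        "unique_samples": len(reads_per_sample),
        "total_consensus_sequences": total_consensus,
        "total_reads_in_consensus": sum(reads_per_sample.values()),
    }


# Made-up records in the assumed header layout.
example = """>S001|consensus_1|reads=40
ACGT
>S001|consensus_2|reads=12
ACGA
>S002|consensus_1|reads=25
TTGA
"""
print(summarize_consensus(example))
```

Keying the read counts by sample keeps the unique-sample count and the read total in one pass. A real implementation would more likely parse the files with Biopython, which the README already lists as a dependency.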