Commit 08c15db

Update README.md
Updated to reflect all input arguments needed for each part of the pipeline.

Signed-off-by: Ian M. B. <99409346+iPsychonaut@users.noreply.github.com>
1 parent 045c84e commit 08c15db

1 file changed: +15 -9 lines changed

README.md

Lines changed: 15 additions & 9 deletions
@@ -1,3 +1,4 @@
+
 # FunDiS Pipeline

 FunDiS Pipeline is a suite of scripts intended to streamline the processing of Next-Generation Sequencing (NGS) data. The scripts can be run individually or as a whole to form a complete pipeline.
@@ -11,6 +12,11 @@ This application is designed to be run on a Linux/WSL environment and requires t
 - pandas
 - pysam
 - biopython
+- multiprocessing
+- math
+- queue
+- glob
+- shutil

 The application also relies on the following tools:

@@ -34,7 +40,7 @@ Each module of the pipeline can be run individually or as a whole.
 To run the whole pipeline, use the `fundis_main.py` script. For example:

 ```
-python /path/to/fundis_main.py -i /path/to/input.fastq -t /path/to/primers.txt -p 80
+python /path/to/fundis_main.py -i /path/to/input.fastq -x /path/to/index.txt -t /path/to/primers.txt -p 80
 ```

 ### Running Individual Modules
@@ -44,28 +50,28 @@ Each module can also be run individually. Here's what each module does:
 - **fundis_minibar_ngsid.py**: This script processes the input FASTQ file with MiniBar and NGSpeciesID. MiniBar is a tool for demultiplexing barcoded read data and NGSpeciesID is a tool used for the identification of specimens in NGS datasets. The script starts by checking the operating system, installing missing libraries, and setting up the working environment. It then moves on to demultiplexing and identifying species from the input FASTQ data. The results are output in a directory specified by the user.

 ```
-python /path/to/fundis_minibar_ngsid.py -i /path/to/input.fastq -t /path/to/primers.txt -p 80
+python /path/to/fundis_minibar_ngsid.py -i /path/to/input.fastq -x /path/to/index.txt -t /path/to/primers.txt -p 80
 ```

 - **fundis_haplotype_phaser.py**: This script takes the output from the `fundis_minibar_ngsid.py` script and phases the haplotypes for each sample. Phasing is the process of determining the specific set of variants found on each physical copy of a particular gene or genomic region. The phased haplotypes are output in the NGSpeciesID output directory.

 ```
-python /path/to/fundis_haplotype_phaser.py -i /path/to/input_directory -p 80
+python /path/to/fundis_haplotype_phaser.py -i /path/to/input_dir -p 80
 ```

-- **fundis_summarize2.py**: This script summarizes the output from the `fundis_haplotype_phaser.py` script. It provides a summary of the results, including counts of unique samples, total consensus sequences, and total reads in consensus sequences. It also copies and updates the names of all FASTQ and consensus FASTA files. The results are output in a summary directory named after the source directory.
+- **fundis_summarize.py**: This script summarizes the output from the `fundis_haplotype_phaser.py` script. It provides a summary of the results, including counts of unique samples, total consensus sequences, and total reads in consensus sequences. It also copies and updates the names of all FASTQ and consensus FASTA files. The results are output in a summary directory named after the source directory.

 ```
-python /path/to/fundis_summarize2.py -s /path/to/source_folder -p 80
+python /path/to/fundis_summarize.py -i /path/to/input_dir -p 80
 ```

 ## Arguments

-- `-i`, `--input_fastq`: Path to the FASTQ file containing ONT nrITS data.
+- `-i`, `--input_fastq` or `--input_dir`: Path to the FASTQ file containing ONT nrITS data or path to the directory containing the data.
 - `-t`, `--primers_text_path`: Path to Text file containing the Primers used to generate input_fastq.
-- `-p`, `--percent_system_use`: Percent system use written as integer.
-- `-s`, `--source_folder`: Path to the source folder.
+- `-x`, `--minbar_index_path`: Path to the text file containing the MiniBar index used to parse input_fastq.
+- `-p`, `--percent_system_use`: Percent system use written as an integer.

 ## Author

-Ian Michael Bollinger (ian.michael.bollinger@gmail.com)
+Ian Michael Bollinger (ian.michael.bollinger@gmail.com/researchconsultants@critical.consulting)
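
As a reading aid for the updated Arguments section, here is a minimal sketch of how the four documented flags might be declared with `argparse`. It is not taken from the repository: the function names `build_parser` and `cpu_threads_for`, the `input_path` destination, and the default of 80 (borrowed from the README examples) are all hypothetical. The diff also does not say how `--percent_system_use` is applied, so the percent-to-worker conversion below is an assumed interpretation.

```python
import argparse
import math
import multiprocessing


def build_parser():
    """Hypothetical parser mirroring the flags listed in the README's Arguments section."""
    parser = argparse.ArgumentParser(description="FunDiS Pipeline (illustrative sketch)")
    # The README documents -i as a FASTQ file for some modules and a directory
    # for others, so both long names are mapped to one destination here.
    parser.add_argument("-i", "--input_fastq", "--input_dir", dest="input_path", type=str,
                        help="Path to the input FASTQ file or to the input directory.")
    parser.add_argument("-x", "--minbar_index_path", type=str,
                        help="Path to the text file containing the MiniBar index.")
    parser.add_argument("-t", "--primers_text_path", type=str,
                        help="Path to the text file containing the primers.")
    # Default of 80 is an assumption based on the README's example commands.
    parser.add_argument("-p", "--percent_system_use", type=int, default=80,
                        help="Percent system use written as an integer.")
    return parser


def cpu_threads_for(percent, total_cpus=None):
    """Assumed interpretation: turn a percentage into a worker count of at least 1."""
    if total_cpus is None:
        total_cpus = multiprocessing.cpu_count()
    return max(1, math.floor(total_cpus * percent / 100))
```

Under these assumptions, `build_parser().parse_args(["-i", "in.fastq", "-x", "index.txt", "-t", "primers.txt", "-p", "80"])` would accept the same flags as the `fundis_main.py` example in the diff, and the same `-i` flag would accept a directory for the phaser and summarize modules.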
