@@ -11,50 +11,68 @@ peakScout is a user-friendly and reversible peak-to-gene translator for genomic
11
11
12
12
## Overview
13
13
14
-
PeakScout is a bioinformatics tool designed to bridge the gap between genomic peak data and gene annotations, enabling researchers to understand the relationship between regulatory elements and their target genes. At its core, peakScout processes genomic peak files from common peak callers like MACS2 and SEACR and maps them to nearby genes using reference genome annotations. The workflow begins with input processing, where peak files are standardized and reference GTF files are decomposed into chromosome-specific feature collections. The core analysis modules then perform bidirectional mapping: peak-to-gene identifies which genes are potentially regulated by specific genomic regions, while gene-to-peak reveals which regulatory elements might influence particular genes. Throughout this process, nearest-feature detection algorithms handle the complex spatial relationships between genomic elements, considering factors like distance constraints and feature overlaps. Finally, the results are formatted into researcher-friendly CSV and Excel outputs, providing a comprehensive view of the genomic landscape that connects regulatory elements to their potential gene targets.
14
+
PeakScout is a bioinformatics tool designed to bridge the gap between genomic peak data and gene annotations, enabling researchers to understand the relationship between regulatory elements and their target genes. At its core, peakScout processes genomic peak files generated by popular peak callers like MACS2 and SEACR and maps them to nearby genes using reference genome annotations.
15
15
16
-
## Installation
16
+
peakScout performs:
17
+
- **Peak-to-Gene Mapping**: This function identifies the nearest genes to each peak, allowing researchers to infer which genes might be regulated by specific genomic regions. Users can specify how many nearest genes (k) they want to retrieve for each peak.
18
+
- **Gene-to-Peak Mapping**: Conversely, this function finds the nearest peaks to a list of genes, helping researchers identify potential regulatory elements that may influence gene expression.
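
To make the idea of "k nearest genes per peak" concrete, here is a minimal Python sketch. It is not peakScout's actual implementation: the `Gene` tuple, the `interval_distance` and `k_nearest_genes` helpers, and the toy coordinates are all hypothetical, and distance is assumed to be the base-pair gap between intervals (0 when they overlap).

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (not a peakScout data structure)."""
    name: str
    chrom: str
    start: int
    end: int


def interval_distance(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Gap in base pairs between two intervals; 0 if they overlap."""
    return max(0, b_start - a_end, a_start - b_end)


def k_nearest_genes(
    chrom: str, start: int, end: int, genes: List[Gene], k: int
) -> List[Tuple[str, int]]:
    """Return up to k (gene name, distance) pairs on the same chromosome."""
    same_chrom = [g for g in genes if g.chrom == chrom]
    ranked = sorted(
        same_chrom,
        key=lambda g: interval_distance(start, end, g.start, g.end),
    )
    return [
        (g.name, interval_distance(start, end, g.start, g.end))
        for g in ranked[:k]
    ]


# Toy annotation: three genes on chr1 and one on chr2.
genes = [
    Gene("geneA", "chr1", 100, 200),
    Gene("geneB", "chr1", 500, 900),
    Gene("geneC", "chr1", 1500, 1800),
    Gene("geneD", "chr2", 100, 200),
]

# A peak at chr1:950-1000, asking for the two nearest genes.
nearest = k_nearest_genes("chr1", 950, 1000, genes, k=2)
print(nearest)
```

Gene-to-peak mapping is the same search with the roles swapped: the query interval is a gene and the candidates are peaks.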
17
19
18
-
These instructions should generally work without modification in Linux-based environments. If you are using Windows, we strongly recommend you use WSL2 to have a Linux environment within Windows.
20
+
peakScout expects two inputs:
21
+
1. A **peak file** (in BED6 format or as output from MACS2 or SEACR)
22
+
2. A **reference GTF file** containing gene annotations. The tool can decompose the GTF file into chromosome-specific collections of genomic features, which are then used to perform bidirectional mapping between peaks and genes.
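
The decomposition step described above can be pictured as grouping GTF records by their first column. The sketch below is an illustration only, not the code peakScout ships: `decompose_gtf`, `GTF_COLUMNS`, and the three example records are hypothetical, and it assumes the standard nine tab-separated GTF columns.

```python
import io
from collections import defaultdict
from typing import Dict, List, TextIO

# The nine standard GTF columns, in order.
GTF_COLUMNS = [
    "seqname", "source", "feature", "start", "end",
    "score", "strand", "frame", "attribute",
]


def decompose_gtf(handle: TextIO) -> Dict[str, List[dict]]:
    """Group GTF records by chromosome (hypothetical helper)."""
    by_chrom = defaultdict(list)
    for line in handle:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != len(GTF_COLUMNS):
            continue  # ignore malformed rows in this sketch
        record = dict(zip(GTF_COLUMNS, fields))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        by_chrom[record["seqname"]].append(record)
    return dict(by_chrom)


# Made-up GTF content for demonstration.
example = io.StringIO(
    "#!genome-build example\n"
    'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id "g1";\n'
    'chr2\tsrc\tgene\t300\t400\t.\t-\t.\tgene_id "g2";\n'
    'chr1\tsrc\texon\t100\t150\t.\t+\t.\tgene_id "g1";\n'
)

decomposed = decompose_gtf(example)
for chrom, records in sorted(decomposed.items()):
    print(chrom, len(records))
```

Splitting by chromosome means a nearest-feature lookup for a peak only has to search features on that peak's chromosome.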
19
23
20
-
### 1. Clone the Repository
24
+
peakScout can be run via:
25
+
- **Command line**: peakScout is designed to be run from the command line, making it accessible for users comfortable with terminal operations.
26
+
- **Cloud computing**: for instant web access, we have set up peakScout in the cloud at https://vandydata.github.io/peakScout.
Alternatively, edit your `~/.bashrc` to make this change permanent, but be sure to include the complete path in the file itself, not the `$(pwd)`.
32
+
### From source
38
33
39
-
### 4. Set Up Virtual Environment
34
+
These instructions should generally work without modification in Linux-based environments. If you are using Windows, we strongly recommend you use WSL2 to have a Linux environment within Windows.
peakScout gene2peak --peak_file /path/to/peak/file --peak_type MACS2/SEACR/BED6 --gene_file /path/to/gene/file --species species of gtf --k number of nearest peaks --ref_dir /path/to/reference/directory --output_name name of output file --o /path/to/save/output --output_type csv/xslx
126
144
```
127
145
128
-
## Decomposed references for common organisms
129
-
130
-
For your convenience, we have prepared decomposed GTF files for common organisms, generated by `src/utils/decompose-common-organisms.sh` from the following GTF files:
## peakScout ready-made references for common organisms
143
147
144
-
You can download these from a public AWS S3 bucket as ZSTD-compressed files, which can be decompressed with `zstd -d`:
148
+
For your convenience, we have prepared reference files for common organisms, generated by `src/utils/decompose-common-organisms.sh`. The source GTF files are listed below, and the S3 links point to the downloadable peakScout reference files.