Fine mapping TWAS associations
The main aim of FOCUS is to fine-map TWAS associations at GWAS risk regions. FOCUS takes as input 1) GWAS summary statistics, 2) reference LD, and 3) an eQTL weight database. Given these data, FOCUS can fine-map using either a tissue-agnostic or a tissue-prioritized approach.
The basic command for fine-mapping is
focus finemap SUMSTATS PLINK_REFLD WEIGHT_DB
where SUMSTATS is the GWAS summary file, PLINK_REFLD is the path to PLINK-formatted genotype data for computing reference LD, and WEIGHT_DB is the path to a FOCUS weight database. Help on all the options and functionality can be listed by entering
focus finemap --help
For example, the command to perform tissue-agnostic fine-mapping on chromosome 1 for GWAS summary data LDL_2010.clean.sumstats.gz, using 1000G.EUR.QC.1 reference genotypes and gtex_v7.db eQTL weights, is
focus finemap LDL_2010.clean.sumstats.gz 1000G.EUR.QC.1 gtex_v7.db --chr 1 --out LDL_2010.chr1
This command will scan LDL_2010.clean.sumstats.gz for risk regions and then perform TWAS and fine-mapping using LD estimated from the PLINK-formatted 1000G.EUR.QC.1 genotypes and eQTL weights from gtex_v7.db.
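Because the example above covers a single chromosome, a genome-wide analysis means repeating the command once per chromosome. The sketch below only builds and prints those command lines; it does not run FOCUS. It assumes the reference genotypes are split per chromosome and named by appending the chromosome number to a common prefix (as `1000G.EUR.QC.1` suggests), and the helper names `focus_finemap_cmd` and `per_chromosome_cmds` are hypothetical, not part of FOCUS.

```python
import shlex


def focus_finemap_cmd(sumstats, ref_ld, weight_db, chrom, out, extra_args=()):
    """Return the argument list for one `focus finemap` run.

    `extra_args` is passed through untouched, for any further flags
    listed by `focus finemap --help`.
    """
    cmd = ["focus", "finemap", sumstats, ref_ld, weight_db, "--chr", str(chrom)]
    cmd += list(extra_args)
    cmd += ["--out", out]
    return cmd


def per_chromosome_cmds(sumstats, ref_prefix, weight_db, out_prefix,
                        chroms=range(1, 23), extra_args=()):
    """Build one command per chromosome.

    ASSUMPTION: reference files are named <ref_prefix>.<chrom>, and
    outputs are written to <out_prefix>.chr<chrom>.
    """
    return [
        focus_finemap_cmd(sumstats, f"{ref_prefix}.{c}", weight_db, c,
                          f"{out_prefix}.chr{c}", extra_args)
        for c in chroms
    ]


# Print the commands for the 22 autosomes so they can be inspected first.
for cmd in per_chromosome_cmds("LDL_2010.clean.sumstats.gz", "1000G.EUR.QC",
                               "gtex_v7.db", "LDL_2010"):
    print(shlex.join(cmd))
```

Each element is a plain argument list, so once the printed commands look right they could be handed to `subprocess.run(cmd, check=True)` one at a time.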
To take the tissue-prioritized approach, add the --tissue TISSUE flag:
focus finemap LDL_2010.clean.sumstats.gz 1000G.EUR.QC.1 gtex_v7.db --chr 1 --tissue LIVER --out LDL_2010.chr1
FOCUS can generate a figure for each region showing the predicted expression correlation, TWAS summary statistics, and PIP for each gene. To do this, add the --plot flag:
focus finemap LDL_2010.clean.sumstats.gz 1000G.EUR.QC.1 gtex_v7.db --chr 1 --tissue LIVER --plot --out LDL_2010.chr1
Here is an example image illustrating the local correlation structure, TWAS p-values, and PIPs for each model
The output from the finemap operation is a table:
Column | Description |
---|---|
ens_gene_id | Ensembl gene ID |
ens_tx_id | Ensembl transcript ID |
mol_name | Name of the gene/linc/pseudogene |
tissue | Tissue the original expression was measured in |
ref_name | Name of the QTL reference panel |
type | Type of molecular feature (gene, lncRNA, lincRNA, pseudogene) |
chrom | Chromosome |
tx_start | Transcription start site |
tx_stop | Transcription stop site |
inference | Inference procedure for model (e.g., LASSO, BSLMM) |
cv.R2 | Cross-validation predictive R-squared |
cv.R2.pval | P-value of the cross-validation R-squared |
twas_z | Marginal TWAS Z score |
pip | Marginal posterior inclusion probability |
in_cred_set | Flag indicating whether the model is included in the credible set |
region | Identifier for the genomic region |
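As an illustration of working with this table, the sketch below groups the models flagged as in the credible set by region, orders them by PIP, and converts each marginal TWAS Z score into a two-sided normal p-value. It relies on several assumptions that are not stated on this page: that the table is tab-delimited with a header row using the column names above, and that `in_cred_set` is written as `1`/`0` or `True`/`False`. The function names are hypothetical.

```python
import csv
import math
from collections import defaultdict


def two_sided_p(z):
    """Two-sided p-value for a Z score under a standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def _to_float(text):
    """Parse a number; non-numeric entries become NaN."""
    try:
        return float(text)
    except ValueError:
        return float("nan")


def credible_sets(handle):
    """Group credible-set models by region, highest PIP first.

    `handle` is an open text file (or any iterable of lines).
    """
    by_region = defaultdict(list)
    # ASSUMPTION: tab-delimited with a header row.
    for row in csv.DictReader(handle, delimiter="\t"):
        # ASSUMPTION: the flag is encoded as 1/0 or True/False.
        if row["in_cred_set"].strip().lower() not in {"1", "true"}:
            continue
        by_region[row["region"]].append({
            "gene": row["mol_name"],
            "tissue": row["tissue"],
            "pip": _to_float(row["pip"]),
            "twas_p": two_sided_p(_to_float(row["twas_z"])),
        })
    for models in by_region.values():
        models.sort(key=lambda m: m["pip"], reverse=True)
    return dict(by_region)
```

To use it, open whichever results file FOCUS wrote under the `--out` prefix (check the actual file name in your output directory) and pass the handle to `credible_sets`.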
We recommend using reference LD from LDSC.
We recommend using a multiple-tissue, multiple-eQTL-reference-panel weight database here. This combines GTEx v7 weights from PrediXcan with METSIM, NTR, YFS, and CMC weights from the FUSION software into a single database usable by FOCUS.