scMetric is an R package that applies a metric learning algorithm to scRNA-seq data. Users provide a few weakly annotated samples that indicate the angle from which they want to analyze the data; the package learns a metric from these examples and applies it to downstream clustering and visualization. The package also outputs the genes that receive the highest weights in the learned metric.
For more information, please refer to the manuscript by Wenchang Chen and Xuegong Zhang.
The package is developed . Users should . To install the development version from GitHub:
if(!require(devtools)) install.packages("devtools")
devtools::install_github("chenwenchang/scMetric", build_vignettes = TRUE)
To load the installed scMetric in R:
library(scMetric)
scMetric takes 7 inputs:
- X: a scRNA-seq gene expression matrix, with cells as rows and genes as columns
- label: a vector specifying which group each cell belongs to, corresponding to the rows of X
- constraints: weak supervision information, a few pairs of cells together with whether each pair is similar or not
- num_constraints: total number of similar and dissimilar pairs that are used
- thresh: threshold that decides when the metric learning iteration stops. Default: 0.01
- max_iters: maximum number of metric learning iterations. Default: 100000
- draw_tSNE: whether to draw a tSNE plot or not
If users provide constraints themselves, the input label is used for visualization only. If users want scMetric to select constraints automatically, label is used to select the similar and dissimilar pairs: two cells with the same label are similar, and two cells with different labels are dissimilar.
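The rule above (same label means similar, different label means dissimilar) can be sketched in a few lines of base R. This is only an illustration of the idea, not the selection code scMetric itself runs: the helper name `sample_pair_constraints` and the three-column layout (`cell1`, `cell2`, `similar`) are hypothetical, and the format scMetric expects for its constraints argument may differ.

```r
# Hypothetical helper: draw random cell pairs and mark each pair as
# similar (same label) or dissimilar (different label).
sample_pair_constraints <- function(labels, num_constraints) {
  n <- length(labels)
  # Enumerate all unordered pairs of distinct cells (fine for small n;
  # for large data sets, sampling indices directly would be cheaper).
  all_pairs <- t(combn(n, 2))
  picked <- all_pairs[sample(nrow(all_pairs), num_constraints), , drop = FALSE]
  data.frame(
    cell1   = picked[, 1],
    cell2   = picked[, 2],
    similar = labels[picked[, 1]] == labels[picked[, 2]]
  )
}

set.seed(1)
toy_labels <- rep(c("A", "B", "C"), times = c(40, 30, 30))  # toy grouping of 100 cells
toy_constraints <- sample_pair_constraints(toy_labels, num_constraints = 20)
head(toy_constraints)
```

Each row describes one pair of cells by row index, so the same pairs could be looked up in the expression matrix.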
The default value of num_constraints is 100. Users should set a value suited to their own data and analysis.
Users can load the test data included in scMetric with:
library(scMetric)
data(testData)
The toy data counts in testData is a scRNA-seq read count matrix with 1000 cells (rows) and 1000 genes (columns). The objects label1 and label2 are two vectors that specify two different groupings of the cells.
Here is an example of running scMetric with a read count matrix as input:
# Load the library and the test data for scMetric
library(scMetric)
data(testData)
# Learning metric using label1 as similarity
res <- scMetric(counts, label = label1, num_constraints = 50, thresh = 0.1, draw_tSNE = TRUE)
scMetric outputs 4 objects:
- newData: the data transformed by the learned metric, which can be used for downstream analysis
- newMetric: the learned metric, a d by d matrix where d is the number of genes
- constraints: the constraints that scMetric uses
- sortGenes: genes sorted by importance score
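To give some intuition for how a d by d metric and a transformed data matrix can relate, the sketch below uses simulated data and the standard Mahalanobis construction: if a symmetric positive-definite metric M is factored as M = t(R) %*% R, then Euclidean distances between cells after the transform X %*% t(R) equal the distances under M in the original space. That scMetric derives newData from newMetric in exactly this way is an assumption made here for illustration; the matrix `M` below is a random stand-in, not a learned metric.

```r
set.seed(42)
n_cells <- 6
n_genes <- 4
X <- matrix(rnorm(n_cells * n_genes), nrow = n_cells)  # cells as rows, genes as columns

# Random symmetric positive-definite matrix standing in for a learned metric
A <- matrix(rnorm(n_genes * n_genes), nrow = n_genes)
M <- crossprod(A) + diag(n_genes)  # t(A) %*% A + I

# Cholesky factor: upper-triangular R with t(R) %*% R equal to M
R <- chol(M)

# Transform every cell x to R %*% x; with cells as rows this is X %*% t(R)
X_new <- X %*% t(R)

# Distance between cells 1 and 2 under M in the original space ...
diff12 <- X[1, ] - X[2, ]
d_metric <- sqrt(drop(t(diff12) %*% M %*% diff12))

# ... and plain Euclidean distance between the same cells after the transform
d_euclid <- as.matrix(dist(X_new))[1, 2]

all.equal(d_metric, d_euclid)
```

Under this construction, any tool that works with Euclidean distance (for example hierarchical clustering on `dist(X_new)`) implicitly uses the metric M.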
- Wenchang Chen - wrote scMetric and analyzed data
- Xuegong Zhang - planned the study
This work is supported by the CZI HCA pilot project, the National Key R&D Program of China grant 2018YFC0910400, and the NSFC grant 61721003.
Jason V. Davis, Brian Kulis, Prateek Jain, Suvrit Sra, and Inderjit S. Dhillon. "Information-theoretic Metric Learning." Proc. 24th International Conference on Machine Learning (ICML), 2007.