feat: add third-party evaluation metrics #661

Merged
merged 25 commits into from
Apr 2, 2025
325169f
fix(metrics): add external trajectory metrics
cameronraysmith Apr 1, 2025
0f1cc42
fix(tasks): add evaluation task
cameronraysmith Apr 1, 2025
21d3e18
fix(tasks): export evaluation task
cameronraysmith Apr 1, 2025
1c80ccb
fix(workflows): add trajectory evaluation outputs
cameronraysmith Apr 1, 2025
f965992
fix(workflows): add task to evaluate trajectory metrics
cameronraysmith Apr 1, 2025
49e0a76
test(metrics): add smoke tests for trajectory module
cameronraysmith Apr 1, 2025
9edb447
fix(metrics): handle sparse and dense neighbor graph data
cameronraysmith Apr 1, 2025
6bb0eba
test(tasks): add smoke tests for evaluate module
cameronraysmith Apr 1, 2025
832e359
fix(evaluate): accept in-memory objects in addition to file paths
cameronraysmith Apr 1, 2025
f86f056
fix(evaluate): perform all dataframe manipulation outside plot helper
cameronraysmith Apr 1, 2025
912b107
fix(utils): refactor string diff line endings
cameronraysmith Apr 1, 2025
ff996fc
fix(constants): presume developmental cluster orderings
cameronraysmith Apr 2, 2025
71032aa
fix(constants): map model identifiers to velocity layer keys
cameronraysmith Apr 2, 2025
08f7044
fix(evaluate): decompose calculate_cross_boundary_correctness
cameronraysmith Apr 2, 2025
d03e4dc
test(tasks): update evaluate module tests
cameronraysmith Apr 2, 2025
bb99a9e
fix(workflows): use evaluate_trajectory_metrics task
cameronraysmith Apr 2, 2025
65890b2
fix(workflows): increase memory limit for evaluate trajectory metrics
cameronraysmith Apr 2, 2025
9487e4d
fix(tasks): set evaluation data set and model ordering
cameronraysmith Apr 2, 2025
fde8c65
fix(constants): update model ordering
cameronraysmith Apr 2, 2025
3c0fc4a
fix(workflows): set time lineage fate correlation upload path
cameronraysmith Apr 2, 2025
c41bca4
fix(workflows): set trajectory metrics upload path
cameronraysmith Apr 2, 2025
f58f365
chore(workflows): bump trajectory evaluation cache `2024.8.15.1`
cameronraysmith Apr 2, 2025
c9d90b4
fix(tasks): update time fate correlation output directory
cameronraysmith Apr 2, 2025
a1d3429
fix(workflows): update combine time fate correlation task output dire…
cameronraysmith Apr 2, 2025
66cbf09
chore(version): `0.4.2`
cameronraysmith Apr 2, 2025
2 changes: 1 addition & 1 deletion MODULE.bazel
@@ -4,7 +4,7 @@ pyrovelocity MODULE

module(
name = "pyrovelocity",
version = "0.4.1",
version = "0.4.2",
compatibility_level = 1,
)

4 changes: 2 additions & 2 deletions Makefile
@@ -934,8 +934,8 @@ approve-prs: ## Approve github pull requests from bots: PR_ENTRIES="2-5 10 12-18
fi; \
done

PREVIOUS_VERSION := 0.4.0
NEXT_VERSION := 0.4.1
PREVIOUS_VERSION := 0.4.1
NEXT_VERSION := 0.4.2
VERSION_FILES := \
pyproject.toml \
conda/colab/construct.yaml \
8 changes: 4 additions & 4 deletions conda/colab/construct.yaml
@@ -1,5 +1,5 @@
name: pyrovelocity-colab
version: 0.4.1
version: 0.4.2

channels:
- pytorch
@@ -89,7 +89,7 @@ specs:
- brotli-python=1.1.0=py311hfdbb021_2
- bzip2=1.0.8=h4bc722e_7
- c-ares=1.34.4=hb9d3cd8_0
- ca-certificates=2025.1.31=hbcca0.4.1
- ca-certificates=2025.1.31=hbcca054_0
- cached-property=1.5.2=hd8ed1ab_1
- cached_property=1.5.2=pyha770c72_1
- cachetools=5.5.2=pyhd8ed1ab_0
@@ -218,7 +218,7 @@ specs:
- h2=4.2.0=pyhd8ed1ab_0
- h5netcdf=1.6.1=pyhd8ed1ab_0
- h5py=3.13.0=nompi_py311hb639ac4_100
- harfbuzz=10.4.1=h76408a6_0
- harfbuzz=10.4.0=h76408a6_0
- hdf5=1.14.3=nompi_h2d575fe_109
- hicolor-icon-theme=0.17=ha770c72_2
- hpack=4.1.0=pyhd8ed1ab_0
@@ -294,7 +294,7 @@ specs:
- libcufile=1.13.1.3=0
- libcups=2.3.3=h4637d8d_4
- libcurand=10.3.9.90=0
- libcurl=8.12.1=h332b0.4.1
- libcurl=8.12.1=h332b0f4_0
- libcusolver=11.6.1.9=0
- libcusparse=12.3.1.170=0
- libdeflate=1.23=h4ddbbb0_0
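The build-string corrections in this file (`hbcca0.4.1` back to `hbcca054_0`, `h332b0.4.1` back to `h332b0f4_0`, and `harfbuzz=10.4.1` back to `10.4.0`) are consistent with an earlier `0.4.0` to `0.4.1` bump whose search pattern treated `.` as a regex wildcard. The PR does not state the cause, so this is an inference. The sketch below reproduces the corruption and shows one way to avoid it; `bump_version` is a hypothetical helper, not part of the repository's tooling.

```python
import re


def bump_version(text: str, old: str, new: str) -> str:
    """Replace `old` with `new` only where it stands alone as a version.

    Hypothetical helper: the dots in `old` are escaped so they match
    literally, and the lookarounds reject matches embedded in a longer
    token (for example the `0.4.0` inside `10.4.0`).
    """
    pattern = re.compile(r"(?<![\w.])" + re.escape(old) + r"(?!\w)")
    return pattern.sub(new, text)


lines = [
    "version: 0.4.0",
    "- ca-certificates=2025.1.31=hbcca054_0",
    "- harfbuzz=10.4.0=h76408a6_0",
    "--version 0.4.0.dev1",
]

# Unescaped pattern: each "." matches any character, so "054_0" also matches.
naive = [re.sub("0.4.0", "0.4.1", line) for line in lines]

# Escaped and bounded pattern: only standalone version strings change.
safe = [bump_version(line, "0.4.0", "0.4.1") for line in lines]

for before, bad, good in zip(lines, naive, safe):
    print(f"{before!r}\n  naive: {bad!r}\n  safe:  {good!r}")
```

The lookahead deliberately allows a trailing `.` so that pre-release strings such as `0.4.0.dev1` are still bumped, matching how the `scripts/conda` usage example is updated in this PR.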
2 changes: 1 addition & 1 deletion containers/gpu.Dockerfile
@@ -77,7 +77,7 @@ COPY . /root
# development
RUN pip install --no-deps -e .
# distribution
# RUN pip install pyrovelocity==0.4.1
# RUN pip install pyrovelocity==0.4.2

ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
2 changes: 1 addition & 1 deletion containers/pkg.Dockerfile
@@ -34,7 +34,7 @@ COPY . /root
# development
RUN pip install --no-deps -e .
# distribution
# RUN pip install pyrovelocity==0.4.1
# RUN pip install pyrovelocity==0.4.2

ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
2 changes: 1 addition & 1 deletion docs/source/notebooks/pyrovelocity_colab_template.ipynb
@@ -70,7 +70,7 @@
}
],
"source": [
"pyrovelocity_version = \"0.4.1\"\n",
"pyrovelocity_version = \"0.4.2\"\n",
"pyrovelocity_colab_script_url = (\n",
" \"https://storage.googleapis.com/pyrovelocity/data/scripts/\"\n",
" + f\"pyrovelocity-colab-{pyrovelocity_version}-Linux-x86_64.sh\"\n",
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "pyrovelocity"
version = "0.4.1"
version = "0.4.2"
packages = [{ include = "pyrovelocity", from = "src" }]
description = "A multivariate RNA Velocity model to estimate future cell states with uncertainty using probabilistic modeling with pyro."
authors = ["pyrovelocity team"]
8 changes: 4 additions & 4 deletions scripts/conda
@@ -3,7 +3,7 @@
set -euo pipefail

PACKAGE_NAME="pyrovelocity"
PACKAGE_VERSION="0.4.1"
PACKAGE_VERSION="0.4.2"
CONDA_BUILD_STRING="pyhff70e4c"
CONDA_BUILD_NUMBER="0"
# CONDA_CHANNEL_LABEL="pyrovelocity_dev"
@@ -32,7 +32,7 @@ Example:

./conda \\
--name pyrovelocity \\
--version 0.4.1.dev1 \\
--version 0.4.2.dev1 \\
--build-string pyhff70e4c \\
--build-number 0 \\
--label pyrovelocity_dev
@@ -67,9 +67,9 @@ PACKAGE_SPEC="conda-forge/label/\
$CONDA_CHANNEL_LABEL::\
$PACKAGE_NAME=$PACKAGE_VERSION=$CONDA_BUILD_STRING"_"$CONDA_BUILD_NUMBER"

BLUE="\0.4.1;34;1m"
BLUE="\0.4.2;34;1m"
BOLD="\033[1m"
NO_COLOR="\0.4.1m"
NO_COLOR="\0.4.2m"
if [ "$USE_COLOR" = false ]; then
BLUE=""
BOLD=""
200 changes: 200 additions & 0 deletions src/pyrovelocity/metrics/trajectory.py
@@ -0,0 +1,200 @@
"""
Trajectory evaluation metrics for velocity models.

This module contains metrics for evaluating velocity model trajectories,
including directional correctness and coherence measures taken from:

> Qiao C, Huang Y. Representation learning of RNA velocity reveals robust cell
> transitions. Proc Natl Acad Sci U S A. 2021;118. doi:10.1073/pnas.2105859118.
"""

import numpy as np
from anndata import AnnData
from beartype import beartype
from beartype.typing import Dict, List, Tuple, Union
from scipy import sparse
from sklearn.metrics.pairwise import cosine_similarity


@beartype
def keep_type(
    adata, nodes: np.ndarray, target: str, k_cluster: str
) -> np.ndarray:
    """
    Select cells of given target type.

    This implementation is included as a dependency of cross_boundary_correctness.
    It is based on the original implementation:
    https://github.com/qiaochen/VeloAE/blob/v0.2.0/veloproj/eval_util.py#L28-L41

    Args:
        adata: AnnData object
        nodes: Indexes for cells
        target: Cluster name
        k_cluster: Cluster key in adata.obs dataframe

    Returns:
        Selected cells matching the target cluster
    """
    return nodes[adata.obs[k_cluster][nodes].values == target]


@beartype
def cross_boundary_correctness(
    adata: AnnData,
    k_cluster: str,
    cluster_edges: List[Tuple[str, str]],
    k_velocity: str = "velocity",
    return_raw: bool = False,
    x_emb: str = "X_umap",
) -> Union[
    Dict[Tuple[str, str], List[float]],
    Tuple[Dict[Tuple[str, str], float], float],
]:
    """
    Cross-Boundary Direction Correctness Score.

    Calculates how well velocity vectors point toward neighboring cells in
    adjacent clusters, measuring the model's ability to predict correct
    developmental trajectories.

    Qiao C, Huang Y. Representation learning of RNA velocity reveals robust cell
    transitions. Proc Natl Acad Sci U S A. 2021;118. doi:10.1073/pnas.2105859118

    This implementation is based on the original implementation:
    https://github.com/qiaochen/VeloAE/blob/v0.2.0/veloproj/eval_util.py#L146-L200

    Args:
        adata: AnnData object
        k_cluster: Key to the cluster column in adata.obs DataFrame
        cluster_edges: Pairs of clusters with transition direction A->B
        k_velocity: Key to the velocity matrix in adata.obsm
        return_raw: Whether to return raw cell scores or aggregated scores
        x_emb: Key to embedding for visualization

    Returns:
        Raw cell scores by cluster edge or mean scores by cluster edge and overall mean
    """
    if "neighbors" not in adata.uns:
        raise ValueError("AnnData object must have neighbors computed")

    if "indices" not in adata.uns["neighbors"]:
        k = adata.uns["neighbors"]["params"]["n_neighbors"]
        connectivities = adata.obsp["connectivities"]

        if sparse.issparse(connectivities):
            connectivities_array = connectivities.toarray()
        else:
            connectivities_array = connectivities

        neighbor_indices = np.argsort(-connectivities_array, axis=1)[:, :k]
        adata.uns["neighbors"]["indices"] = neighbor_indices

    scores = {}
    all_scores = {}

    x_emb_data = adata.obsm[x_emb]

    if x_emb == "X_umap":
        v_emb = adata.obsm[f"{k_velocity}_umap"]
    else:
        v_emb = adata.obsm[
            [key for key in adata.obsm if key.startswith(k_velocity)][0]
        ]

    for u, v in cluster_edges:
        sel = adata.obs[k_cluster] == u
        nbs = adata.uns["neighbors"]["indices"][sel]  # [n * 30]

        boundary_nodes = map(
            lambda nodes: keep_type(adata, nodes, v, k_cluster), nbs
        )
        x_points = x_emb_data[sel]
        x_velocities = v_emb[sel]

        type_score = []
        for x_pos, x_vel, nodes in zip(x_points, x_velocities, boundary_nodes):
            if len(nodes) == 0:
                continue

            position_dif = x_emb_data[nodes] - x_pos
            dir_scores = cosine_similarity(
                position_dif, x_vel.reshape(1, -1)
            ).flatten()
            type_score.append(np.mean(dir_scores))

        scores[(u, v)] = np.mean(type_score) if type_score else 0.0
        all_scores[(u, v)] = type_score

    if return_raw:
        return all_scores

    return scores, np.mean([sc for sc in scores.values()]) if scores else 0.0


@beartype
def inner_cluster_coherence(
    adata, k_cluster: str, k_velocity: str, return_raw: bool = False
) -> Union[Dict[str, List[float]], Tuple[Dict[str, float], float]]:
    """
    In-cluster Coherence Score.

    Measures how aligned velocity vectors are within the same cluster,
    indicating consistency in predicted cellular trajectories.

    Qiao C, Huang Y. Representation learning of RNA velocity reveals robust cell
    transitions. Proc Natl Acad Sci U S A. 2021;118. doi:10.1073/pnas.2105859118

    This implementation is based on the original implementation:
    https://github.com/qiaochen/VeloAE/blob/v0.2.0/veloproj/eval_util.py#L203-L237

    Args:
        adata: AnnData object
        k_cluster: Key to the cluster column in adata.obs DataFrame
        k_velocity: Key to the velocity matrix in adata.layers
        return_raw: Whether to return raw scores or aggregated scores

    Returns:
        Raw scores by cluster or mean scores by cluster and overall mean
    """
    if "neighbors" not in adata.uns:
        raise ValueError("AnnData object must have neighbors computed")

    if "indices" not in adata.uns["neighbors"]:
        k = adata.uns["neighbors"]["params"]["n_neighbors"]
        connectivities = adata.obsp["connectivities"]

        if sparse.issparse(connectivities):
            connectivities_array = connectivities.toarray()
        else:
            connectivities_array = connectivities

        neighbor_indices = np.argsort(-connectivities_array, axis=1)[:, :k]
        adata.uns["neighbors"]["indices"] = neighbor_indices

    clusters = np.unique(adata.obs[k_cluster])
    scores = {}
    all_scores = {}

    for cat in clusters:
        sel = adata.obs[k_cluster] == cat
        nbs = adata.uns["neighbors"]["indices"][sel]
        same_cat_nodes = map(
            lambda nodes: keep_type(adata, nodes, cat, k_cluster), nbs
        )
        velocities = adata.layers[k_velocity]
        cat_vels = velocities[sel]

        cat_score = [
            cosine_similarity(cat_vels[[ith]], velocities[nodes]).mean()
            for ith, nodes in enumerate(same_cat_nodes)
            if len(nodes) > 0
        ]

        all_scores[cat] = cat_score
        scores[cat] = np.mean(cat_score) if cat_score else 0.0

    if return_raw:
        return all_scores

    return scores, np.mean([sc for sc in scores.values()]) if scores else 0.0
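To make the cross-boundary score in `trajectory.py` concrete, here is a minimal NumPy-only sketch on a three-cell toy data set. It mirrors the two steps the module performs: deriving neighbor indices from a dense connectivity matrix with `np.argsort`, then averaging the cosine similarity between each source cell's velocity and the displacement to its neighbors in the target cluster. All data, the connectivity weights, and the helper names `cosine` and `toy_cross_boundary` are hypothetical; this does not use `AnnData` or call the functions added in this PR.

```python
import numpy as np


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical toy data: three cells on a line in a 2-D embedding.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
clusters = np.array(["A", "A", "B"])
# Every velocity points in +x, i.e. from cluster A toward cluster B.
velocities = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])

# Hypothetical dense connectivity weights (symmetric, zero diagonal).
connectivities = np.array(
    [
        [0.0, 0.9, 0.4],
        [0.9, 0.0, 0.8],
        [0.4, 0.8, 0.0],
    ]
)
k = 2
# Same fallback as the module: the k most strongly connected cells per row.
neighbors = np.argsort(-connectivities, axis=1)[:, :k]


def toy_cross_boundary(source: str, target: str) -> float:
    """Mean cosine between velocity and displacement to target neighbors."""
    cell_scores = []
    for i in np.flatnonzero(clusters == source):
        boundary = neighbors[i][clusters[neighbors[i]] == target]
        if len(boundary) == 0:
            continue  # this cell has no neighbor in the target cluster
        displacements = positions[boundary] - positions[i]
        cell_scores.append(
            np.mean([cosine(d, velocities[i]) for d in displacements])
        )
    return float(np.mean(cell_scores)) if cell_scores else 0.0


forward = toy_cross_boundary("A", "B")   # velocities agree with A -> B
backward = toy_cross_boundary("B", "A")  # velocities oppose B -> A
print(forward, backward)
```

Scores near 1 indicate velocities that point across the boundary in the stated direction, and scores near -1 indicate the opposite, which is why the order within each `cluster_edges` pair matters.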
3 changes: 2 additions & 1 deletion src/pyrovelocity/tasks/__init__.py
@@ -1,12 +1,13 @@
import pyrovelocity.tasks.data
import pyrovelocity.tasks.evaluate
import pyrovelocity.tasks.postprocess
import pyrovelocity.tasks.preprocess
import pyrovelocity.tasks.summarize
import pyrovelocity.tasks.train


__all__ = [
"data",
"evaluate",
"postprocess",
"preprocess",
"summarize",