@@ -501,6 +501,85 @@ differential equations proposed in velocyto [@La_Manno2018-lj] and scVelo
&= \beta_g u\left(\tau^{\left(k_{cg}\right)}\right)-\gamma_g s\left(\tau^{\left(k_{cg}\right)}\right). \label{eq-dsdt}
\end{align}

+ ## Single-cell data preprocessing {#sec-methods-preprocessing}
+
+ We used scanpy and scVelo to handle data input and output; thus, both h5ad
+ and loom files generated by velocyto and kallisto [@Melsted2021-ap] are
+ supported. The fully mature PBMC dataset was processed following the procedure
+ proposed in a review paper [@Bergen2021-qz]
+ (<https://scvelo.readthedocs.io/perspectives/Perspectives/>). We reproduced this
+ procedure using the scVelo package and the raw read counts of the same three
+ dynamical genes with the highest likelihoods: NKG7, IGHM, and GNLY. The pancreas
+ dataset was processed with scVelo using the following options:
+
+ ```python
+ import scvelo as scv
+
+ # Filter genes, normalize counts, and keep the top variable genes.
+ scv.pp.filter_and_normalize(
+     adata,
+     min_shared_counts=30,
+     n_top_genes=2000,
+ )
+ # Compute first- and second-order moments over the neighbor graph.
+ scv.pp.moments(
+     adata,
+     n_pcs=30,
+     n_neighbors=30,
+ )
+ ```
+
+ The same top variable genes with raw spliced and unspliced read counts were used
+ as input for the Pyro&thinsp;-Velocity model. The original LARRY dataset of in
+ vitro hematopoiesis, containing $130,887$ cells, was first filtered to remove
+ cells without LARRY barcodes. This step retained $49,302$ cells with at least
+ one LARRY barcode. For simplicity, we refer to this filtered dataset, which
+ contains multiple cell fates (multi-fate), as the full dataset. From this dataset,
+ we created two uni-fate datasets with progression toward either monocytes or
+ neutrophils, based on the LARRY lineage barcodes and time information. Specifically,
+ we selected sets of cells with a single LARRY barcode that span three time points
+ (days 2, 4, and 6) and whose cells from the last time point (day 6) all belong to
+ a single cell type (either monocyte or neutrophil). The two uni-fate datasets were
+ combined to form the bi-fate LARRY dataset. The multi-fate full dataset was processed
+ using the same options as the pancreas dataset; the uni-fate and bi-fate datasets
+ were processed using the following parameters:
+
543
+ ``` python
544
+ scv.pp.filter_and_normalize(
545
+ adata = adata,
546
+ n_top_genes = 2000 ,
547
+ min_shared_counts = 20 ,
548
+ )
549
+ scv.pp.moments(adata)
550
+ ```
551
+
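The uni-fate clone selection described above can be sketched in plain Python. This is a minimal illustration, not the code used for the manuscript: the `(barcode, day, cell_type)` record layout, the function name `select_unifate_clones`, and the toy cell labels are all hypothetical, and each cell is assumed to carry at most one barcode.

```python
from collections import defaultdict

REQUIRED_DAYS = {2, 4, 6}  # time points named in the text
FINAL_DAY = 6


def select_unifate_clones(cells, fate):
    """Return barcodes of clones that span all required days and whose
    final-day cells all have the cell type `fate`.

    `cells` is an iterable of (barcode, day, cell_type) tuples; this record
    layout is a hypothetical stand-in for the real annotation table.
    """
    days_by_clone = defaultdict(set)
    final_types_by_clone = defaultdict(set)
    for barcode, day, cell_type in cells:
        if barcode is None:  # drop cells without a LARRY barcode
            continue
        days_by_clone[barcode].add(day)
        if day == FINAL_DAY:
            final_types_by_clone[barcode].add(cell_type)
    return sorted(
        barcode
        for barcode, days in days_by_clone.items()
        if REQUIRED_DAYS <= days and final_types_by_clone[barcode] == {fate}
    )


# Toy records (made up for illustration only).
cells = [
    ("A", 2, "Undifferentiated"), ("A", 4, "Undifferentiated"), ("A", 6, "Monocyte"),
    ("B", 2, "Undifferentiated"), ("B", 4, "Undifferentiated"),
    ("B", 6, "Monocyte"), ("B", 6, "Neutrophil"),          # mixed fate: excluded
    ("C", 4, "Undifferentiated"), ("C", 6, "Neutrophil"),  # day 2 missing: excluded
    ("D", 2, "Undifferentiated"), ("D", 4, "Undifferentiated"), ("D", 6, "Neutrophil"),
    (None, 6, "Monocyte"),                                 # no barcode: excluded
]

monocyte_clones = select_unifate_clones(cells, "Monocyte")
neutrophil_clones = select_unifate_clones(cells, "Neutrophil")
# The bi-fate set is the union of the two uni-fate sets.
bifate_clones = sorted(monocyte_clones + neutrophil_clones)
```

In practice the same filter would be applied to the cell annotations of the AnnData object, and the retained barcodes would be used to subset the cells before preprocessing.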
+ ## scVelo model {#sec-methods-scvelo}
+
+ We benchmarked the dynamical RNA velocity model implemented in scVelo `v0.2.4`
+ on the pancreas and the four LARRY datasets using the same user options:
+
+ ```python
+ import scvelo
+
+ # Fit the per-gene splicing dynamics, using 30 parallel jobs.
+ scvelo.tl.recover_dynamics(
+     data=adata,
+     n_jobs=30,
+ )
+ # Estimate velocities from the fitted dynamical model.
+ scvelo.tl.velocity(
+     data=adata,
+     mode="dynamical",
+ )
+ ```
+
+ Then, we tested a set of user options in the pancreas dataset, including the
+ number of neighboring cells and the number of top variable genes, to explore the
+ stability of the scVelo dynamical model. For the fully mature PBMC dataset, we
+ followed the notebook provided by the original authors
+ (<https://scvelo.readthedocs.io/perspectives/Perspectives/>), i.e., we used the
+ stochastic RNA velocity model implemented in scVelo with the three genes with the
+ highest likelihoods. The latent time from scVelo was computed using the provided
+ function:
+
+ ```python
+ scvelo.tl.latent_time(
+     data=adata,
+ )
+ ```
+
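The option sweep on the pancreas dataset mentioned above can be organized as a small grid over the two options named in the text. The sketch below is only an illustration: the helper names `option_grid` and `run_sweep` and the numeric values are hypothetical and are not the settings used in the benchmark. The `fit` callable is a placeholder; in real use it would wrap the scVelo preprocessing and model calls shown earlier, applied to a fresh copy of the data for each option set.

```python
from itertools import product


def option_grid(n_neighbors_values, n_top_genes_values):
    """Yield one dict of options per (n_neighbors, n_top_genes) combination."""
    for n_neighbors, n_top_genes in product(n_neighbors_values, n_top_genes_values):
        yield {"n_neighbors": n_neighbors, "n_top_genes": n_top_genes}


def run_sweep(grid, fit):
    """Call the user-supplied `fit` once per option set and collect the results."""
    return [(options, fit(**options)) for options in grid]


# Hypothetical values, chosen only to show the shape of the sweep.
grid = list(option_grid([10, 30, 50], [1000, 2000]))

# Placeholder `fit` that returns nothing; replace with a function that
# runs the model and returns the quantity whose stability is of interest.
results = run_sweep(grid, fit=lambda n_neighbors, n_top_genes: None)
```

Keeping the grid separate from the fitting function makes it easy to rerun the same combinations with a different model or dataset.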
## Trajectory inference {#sec-methods-trajectory-inference}

### Velocity vector field
@@ -517,10 +596,10 @@ scvelo.tl.velocity_embedding(
```

We used the default options for projecting the vector fields from scVelo models.
- Unlike scVelo, Pyro-Velocity uses statistics derived from posterior samples of
+ Unlike scVelo, Pyro&thinsp;-Velocity uses statistics derived from posterior samples of
the denoised spliced gene expression and posterior samples of the velocity
estimation for building the cell state transition matrix estimates using cosine
- similarity. Pyro-Velocity uses the same projection method as scVelo for
+ similarity. Pyro&thinsp;-Velocity uses the same projection method as scVelo for
projecting the transition matrix into the two-dimensional vector field on the
user-provided embedding space.
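
To make the cosine-similarity step concrete, the following is a minimal sketch of how transition weights between neighboring cells could be derived from an expression estimate and a velocity estimate. It is not the Pyro&thinsp;-Velocity or scVelo implementation: it assumes point estimates (for example, posterior means) rather than full posterior samples, and the function names, the exponential kernel, and the `scale` parameter are illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def transition_probabilities(expression, velocity, neighbors, scale=10.0):
    """For each cell i, return {j: probability} over its neighbors j.

    The weight of moving from i to j grows with the cosine similarity between
    the velocity of i and the displacement expression[j] - expression[i].
    `scale` (an assumed kernel width) controls how sharply weights concentrate.
    """
    rows = []
    for i, nbrs in enumerate(neighbors):
        sims = [
            cosine([xj - xi for xi, xj in zip(expression[i], expression[j])], velocity[i])
            for j in nbrs
        ]
        weights = [math.exp(scale * s) for s in sims]
        total = sum(weights)
        rows.append({j: w / total for j, w in zip(nbrs, weights)})
    return rows


# Toy example: three cells in a two-gene space (made-up numbers).
expression = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
velocity = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
neighbors = [[1, 2], [0, 2], [0, 1]]

rows = transition_probabilities(expression, velocity, neighbors)
```

In the toy example, cell 0 has a velocity pointing toward cell 1, so almost all of its transition weight falls on cell 1. Each row sums to one and can then be projected onto an embedding.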