Fix dataset description itemizations (#471)

bittremieux · web-flow · commit c469bb1506d4 · 2025-05-15T11:19:55.000+02:00
diff --git a/docs/faq.md b/docs/faq.md
@@ -108,11 +108,13 @@ These PSMs were obtained by collecting up to the top 100 PSMs for each of the pr
 The data are available as annotated MGF files on Zenodo:
     - [MassIVE-KB v1 30 million PSMs](https://doi.org/10.5281/zenodo.14973856)
     - [MassIVE-KB v1 30 million PSMs training/validation/test splits](https://doi.org/10.5281/zenodo.14967861)
-The dataset was originally compiled through the following steps:
-    - On the [MassIVE website](https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp), go to [MassIVE Knowledge Base](https://massive.ucsd.edu/ProteoSAFe/static/massive-kb-libraries.jsp) > [Human HCD Spectral Library](https://massive.ucsd.edu/ProteoSAFe/status.jsp?task=82c0124b6053407fa41ba98f53fd8d89) > [All Candidate library spectra](https://massive.ucsd.edu/ProteoSAFe/result.jsp?task=82c0124b6053407fa41ba98f53fd8d89&view=candidate_library_spectra) > Download.
+
+    The dataset was originally compiled through the following steps:
+   
+   - On the [MassIVE website](https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp), go to [MassIVE Knowledge Base](https://massive.ucsd.edu/ProteoSAFe/static/massive-kb-libraries.jsp) > [Human HCD Spectral Library](https://massive.ucsd.edu/ProteoSAFe/status.jsp?task=82c0124b6053407fa41ba98f53fd8d89) > [All Candidate library spectra](https://massive.ucsd.edu/ProteoSAFe/result.jsp?task=82c0124b6053407fa41ba98f53fd8d89&view=candidate_library_spectra) > Download.
     - This will give you a zipped TSV file with the metadata and peptide identifications for all 30 million PSMs.
     - Using the filename (column "filename") you can then retrieve the corresponding peak files from the MassIVE FTP server (done using a wget script) and extract the desired spectra using their scan number (column "scan").
-2. The _non-enzymatic dataset_, used to train a non-tryptic version of Casanovo, was created by selecting PSMs with a uniform distribution of amino acids at the C-terminal peptide positions from two datasets: MassIVE-KB and PROSPECT.
+3. The _non-enzymatic dataset_, used to train a non-tryptic version of Casanovo, was created by selecting PSMs with a uniform distribution of amino acids at the C-terminal peptide positions from two datasets: MassIVE-KB and PROSPECT.
 Training, validation, and test splits for the non-enzymatic dataset are available as annotated MGF files on MassIVE with [dataset identifier MSV000094014](https://doi.org/doi:10.25345/C5KS6JG0W).
 
 **Casanovo v4.2**
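
Note on the retrieval step described in the hunk above: the FAQ says peak files are looked up via the "filename" column and spectra are extracted via the "scan" column. A minimal sketch of the bookkeeping this implies is given below, assuming the unzipped download is a tab-separated file with a header row. The function name `scans_by_file`, the example file names, and the `peptide` column are hypothetical; only the "filename" and "scan" column names come from the FAQ text. The download from the MassIVE FTP server itself is not shown.

```python
import csv
import io
from collections import defaultdict


def scans_by_file(tsv_handle):
    """Group the scan numbers to extract by the peak file that contains them.

    Assumes a tab-separated table with a header row that includes the
    "filename" and "scan" columns mentioned in the FAQ.
    """
    wanted = defaultdict(set)
    for row in csv.DictReader(tsv_handle, delimiter="\t"):
        wanted[row["filename"]].add(int(row["scan"]))
    # Sort the scans so each peak file can be read in a single forward pass.
    return {name: sorted(scans) for name, scans in wanted.items()}


# Made-up rows standing in for the real metadata file (hypothetical values).
example_tsv = io.StringIO(
    "filename\tscan\tpeptide\n"
    "run_a.mzML\t20\tPEPTIDEK\n"
    "run_b.mzML\t7\tSAMPLER\n"
    "run_a.mzML\t3\tANOTHERK\n"
)

for name, scans in scans_by_file(example_tsv).items():
    print(name, scans)
```

Grouping by file first means each peak file only has to be fetched and parsed once, regardless of how many PSMs refer to it.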