Link: youtu.be/vpgKecMkcTk
- Download the dataset from https://www.kaggle.com/datasets/najzeko/steam-reviews-2021
- Save and extract it to a known path.
- Implement the code according to the assignment statement at https://concurrentes-fiuba.github.io/2025_1C_tp1.html
cargo run --release <input-path> <num-threads> <output-file-name>
for example
cargo run --release ~/Downloads/dataset 4 output.json
- The output of a run over the full dataset must match the file expected_output.json, regardless of the order in which the keys appear in the maps.
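The order-insensitive comparison mentioned in the last item can be sketched with the standard library alone. This is a minimal illustration, not the project's actual check: the `Counts` type, the helper names, and the sample keys are hypothetical, and the real output schema is the one defined by the assignment.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one level of the output: a map from a
/// string key to a count. The real schema is defined by the assignment.
type Counts = HashMap<String, u64>;

/// Returns true when both maps hold the same key/value pairs,
/// regardless of the order in which the keys were inserted.
fn same_entries(actual: &Counts, expected: &Counts) -> bool {
    // `HashMap` equality compares entries, not iteration order.
    actual == expected
}

/// Builds a map from a slice of (key, count) pairs.
fn counts_from(pairs: &[(&str, u64)]) -> Counts {
    pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect()
}
```

Two maps built from the same pairs in a different order compare equal, while a single differing count makes them unequal. The same property holds for nested `HashMap` values, since equality is applied recursively to each entry.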
To run all tests (unit tests and integration tests):
cargo test
There is one integration test marked as ignored because it runs against the full CSV from the dataset rather than the test files. It is located at tests/tests_over_full_csv.rs.
To run this test, set the variable FULL_CSV_DIRPATH to the path of a directory that contains a single full CSV from the dataset. The output is checked against the provided expected JSON output.
To run it execute:
cargo test --release -- --ignored
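For readers unfamiliar with ignored tests, the following is a rough sketch of how such a test might be declared. Only the name FULL_CSV_DIRPATH comes from this project; the placeholder path, the helper `dataset_dir_exists`, and the test name are assumptions made for illustration, and the real test body lives in tests/tests_over_full_csv.rs.

```rust
use std::path::Path;

/// Placeholder value; replace it with the directory that holds the full CSV.
const FULL_CSV_DIRPATH: &str = "/path/to/full/dataset";

/// Hypothetical helper: reports whether the given directory exists.
fn dataset_dir_exists(dir: &str) -> bool {
    Path::new(dir).is_dir()
}

// `#[ignore]` keeps this test out of a plain `cargo test` run;
// it only executes when `-- --ignored` is passed.
#[test]
#[ignore = "requires the full CSV dataset on disk"]
fn full_csv_matches_expected_output() {
    assert!(
        dataset_dir_exists(FULL_CSV_DIRPATH),
        "update FULL_CSV_DIRPATH before running this test"
    );
    // The real test would run the parser here and compare its output
    // against the expected JSON.
}
```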
The project documentation is in the docs folder, at this path:
docs/doc/steam_review_parser_tp1/index.html
Small utility that runs batches of timed runs of the CsvParser library over different file sizes. It was used throughout the project to track changes in performance.
It is heavy to run: it executes several tests with different numbers of threads against datasets whose locations are fixed in variables in the code. After configuring those folders, you can run it with the following command:
cargo run --bin bench --release
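The idea of a batch of timed runs can be sketched as follows. This is not the bench binary's code: `time_runs` and `mean` are hypothetical helpers written for illustration, and the closure stands in for whatever call into the library is being measured.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `runs` times and returns the wall-clock duration of each run.
fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect()
}

/// Returns the average duration, or `None` when there are no samples.
fn mean(durations: &[Duration]) -> Option<Duration> {
    if durations.is_empty() {
        return None;
    }
    let total: Duration = durations.iter().sum();
    Some(total / durations.len() as u32)
}
```

A benchmark in this style would call `time_runs` once per combination of dataset size and thread count and then compare the means. Building with `--release`, as in the command above, matters because unoptimized builds can distort the timings.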
Install pre-commit hooks:
pre-commit install
To run all pre-commit hooks without making a commit run:
pre-commit run --all-files
To create a commit without running the pre-commit hooks run:
git commit --no-verify
To build the documentation for the project run:
cargo doc --no-deps --document-private-items --verbose --open --color always --release --target-dir ./docs