Commit 0f21027

Author: Ganyu Teng
Update README to include evaluation command line, and update format
Parent: 6f39508

File tree: 3 files changed, +11 −5 lines


README.md

Lines changed: 11 additions & 5 deletions
@@ -52,7 +52,7 @@ Overwrite pyod version to avoid bugs
 pip install pyod==2.0.1
 ```
 
-## Reproduce our experimental results
+## Rerun our experiments
 
 1. Download the following datasets from Kaggle and put them to ``data/[dataset_name]/``
 - [vifd](https://www.kaggle.com/datasets/khusheekapoor/vehicle-insurance-fraud-detection/data) (Vehicle Insurance Fraud Detection)
@@ -68,11 +68,11 @@ pip install pyod==2.0.1
 bash scripts/exp4-model_size/run_anollm_1.7B_odds.sh
 ```
 
-### Using your own dataset
+## Using your own datasets
 
 To use a custom dataset, create a dataframe with the following structure: ``{feature_name:feature_values}``. Please refer to ``load_dataset()`` function in ``src/data_utils.py`` for further guidance.
 
-## Training Models
+### Training Models
 
 For AnoLLM, we use the following command:
 
@@ -87,9 +87,15 @@ For baselines, we use the following command:
 CUDA_VISIBLE_DEVICES=0 python evaluate_baselines.py --dataset $dataset --n_splits $n_splits --normalize --setting semi_supervised --split_idx $split_idx
 ```
 
-Check the argument parser in ``train_anollm.py`` for options for datasets
+Check the argument parser in ``evaluate_baselines.py`` for options for datasets
+
+### Evaluation
+
+To evaluate AnoLLM, we use the following command:
+```
+CUDA_VISIBLE_DEVICES=$INFERENCE_GPUS torchrun --nproc_per_node=$n_test_node evaluate_anollm.py --dataset $dataset --n_splits $n_splits --split_idx 0 --setting semi_supervised --batch_size $eval_batch_size --n_permutations $n_permutations --model $model --binning standard
+```
 
-## Evaluation
 We evaluate the quality of synthetic data using metrics from various aspects.
 ```
 python src/get_results.py --dataset $dataset --n_splits $n_splits --setting semi_supervised
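The "Using your own datasets" section added in this diff asks for a dataframe shaped as ``{feature_name:feature_values}``. A minimal sketch of what such a frame might look like, with hypothetical column names, values, and label convention (none of these are taken from the repository's datasets):

```python
import pandas as pd

# Hypothetical example of the ``{feature_name:feature_values}`` structure the
# README describes; column names and the 0/1 anomaly-label convention are
# assumptions for illustration only.
df = pd.DataFrame({
    "claim_amount": [1200.0, 560.5, 9800.0],
    "vehicle_type": ["sedan", "suv", "truck"],
    "label": [0, 0, 1],  # assumed: 1 marks an anomalous (e.g. fraudulent) row
})
```

Per the README, ``load_dataset()`` in ``src/data_utils.py`` is the place to consult for how such a frame is actually wired into the pipeline.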
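The evaluation command added by this commit references several unset shell variables (``$INFERENCE_GPUS``, ``$n_test_node``, ``$dataset``, and so on). One hypothetical way to populate them before invoking it; every value below is a placeholder, not the project's actual configuration, which lives in the repo's ``scripts/`` directory:

```shell
# Placeholder values for the variables the README's AnoLLM evaluation
# command expects; the repo's scripts/ define the real ones.
INFERENCE_GPUS=0
n_test_node=1
dataset=vifd
n_splits=5
eval_batch_size=16
n_permutations=8
model=some-model-1.7B   # placeholder model identifier

# Assemble (without running) the command line from the README for inspection.
CMD="CUDA_VISIBLE_DEVICES=$INFERENCE_GPUS torchrun --nproc_per_node=$n_test_node evaluate_anollm.py --dataset $dataset --n_splits $n_splits --split_idx 0 --setting semi_supervised --batch_size $eval_batch_size --n_permutations $n_permutations --model $model --binning standard"
echo "$CMD"
```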

scripts/.DS_Store (−6 KB, binary file not shown)

src/.DS_Store (−6 KB, binary file not shown)

0 commit comments