Criteo dataset
40M dataset - 5000 trees
python3 src/criteo_speed_test.py xgboost ; python3 src/criteo_speed_test.py lightgbm; python3 src/criteo_speed_test.py arboretum
reading data....
startring benchmark xgboost
2388.4275090694427
roc auc train:0.8894494708973515 cv:0.782368838905777
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 1031045, number of negative: 30968955
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3872
[LightGBM] [Info] Number of data: 32000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (1220.70 MB) transfered to GPU in 0.967707 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.032220 -> initscore=-3.402412
[LightGBM] [Info] Start training from score -3.402412
2522.5609214305878
roc auc train:0.8973754627179997 cv:0.7764168713748687
reading data....
startring benchmark arboretum
feature 0 has been reduced to 15 bits
feature 1 has been reduced to 13 bits
feature 2 has been reduced to 10 bits
feature 3 has been reduced to 15 bits
feature 4 has been reduced to 12 bits
feature 5 has been reduced to 10 bits
feature 6 has been reduced to 9 bits
feature 7 has been reduced to 14 bits
feature 8 has been reduced to 10 bits
feature 9 has been reduced to 4 bits
feature 10 has been reduced to 8 bits
feature 11 has been reduced to 19 bits
feature 12 has been reduced to 11 bits
max feature size 19
Total bytes 8513978368 available 2080964608
Memory usage estimation 220 per record 7040000000 in total
copied features data 13 from 13
copied category features 1 from 26
roc auc train:0.8108035778617945 cv:0.7864887547868098
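The `[binary:BoostFromScore]` lines in the LightGBM output can be sanity-checked by hand. A minimal sketch, assuming the init score is the log-odds of the mean label (which matches the logged numbers); the function name `boost_from_average` is made up for illustration and is not a LightGBM API:

```python
import math


def boost_from_average(n_pos: int, n_neg: int) -> tuple[float, float]:
    """Return (mean label, log-odds of the mean label)."""
    pavg = n_pos / (n_pos + n_neg)
    return pavg, math.log(pavg / (1.0 - pavg))


# Class counts copied from the "Number of positive / negative" log lines
# of the two LightGBM runs in this issue.
runs = {
    "40M": (1031045, 30968955),
    "10M": (247552, 7752448),
}

for name, (n_pos, n_neg) in runs.items():
    pavg, score = boost_from_average(n_pos, n_neg)
    print(f"{name}: pavg={pavg:.6f} initscore={score:.6f}")
```

The printed values should land close to the `pavg` and `initscore` figures reported in the logs for the respective run.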
10M dataset - 5000 trees
reading data....
startring benchmark xgboost
662.0523405075073
roc auc train:0.965875942403954 cv:0.756494298707683
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 247552, number of negative: 7752448
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3844
[LightGBM] [Info] Number of data: 8000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (305.18 MB) transfered to GPU in 0.311981 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.030944 -> initscore=-3.444143
[LightGBM] [Info] Start training from score -3.444143
805.6181969642639
roc auc train:0.9904928384666543 cv:0.7458449429346047
reading data....
startring benchmark arboretum
feature 0 has been reduced to 14 bits
feature 1 has been reduced to 13 bits
feature 2 has been reduced to 9 bits
feature 3 has been reduced to 14 bits
feature 4 has been reduced to 12 bits
feature 5 has been reduced to 9 bits
feature 6 has been reduced to 9 bits
feature 7 has been reduced to 13 bits
feature 8 has been reduced to 9 bits
feature 9 has been reduced to 4 bits
feature 10 has been reduced to 8 bits
feature 11 has been reduced to 19 bits
feature 12 has been reduced to 10 bits
max feature size 19
Total bytes 8513978368 available 7124615168
Memory usage estimation 180 per record 1440000000 in total
copied features data 13 from 13
copied category features 26 from 26
roc auc train:0.8339605258327059 cv:0.7800383186995146
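The arboretum "Memory usage estimation" lines are consistent with a simple per-record multiplication. A rough sketch, assuming the figures are bytes and that arboretum trains on the same row counts LightGBM reports (32,000,000 and 8,000,000); `estimate_bytes` is a hypothetical helper, not part of arboretum:

```python
def estimate_bytes(bytes_per_record: int, n_records: int) -> int:
    """Total estimate as per-record cost times number of records."""
    return bytes_per_record * n_records


# Per-record and "available" figures copied from the arboretum logs;
# record counts taken from LightGBM's "Number of data" lines (assumption:
# both libraries see the same training split).
runs = {
    "40M": {"per_record": 220, "records": 32_000_000, "available": 2_080_964_608},
    "10M": {"per_record": 180, "records": 8_000_000, "available": 7_124_615_168},
}

for name, run in runs.items():
    need = estimate_bytes(run["per_record"], run["records"])
    fits = need <= run["available"]
    print(f"{name}: estimated={need} available={run['available']} fits={fits}")
```

In the 40M run the estimate exceeds the reported available memory and the log shows only 1 of 26 category features copied, while in the 10M run the estimate fits and all 26 are copied. Whether the first causes the second is a guess from the logs, not something the output states.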