Results #1

@sh1ng

Criteo dataset

40M dataset - 5000 trees

python3 src/criteo_speed_test.py xgboost ; python3 src/criteo_speed_test.py lightgbm; python3 src/criteo_speed_test.py arboretum
reading data....
startring benchmark xgboost
2388.4275090694427
roc auc train:0.8894494708973515 cv:0.782368838905777
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 1031045, number of negative: 30968955
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3872
[LightGBM] [Info] Number of data: 32000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (1220.70 MB) transfered to GPU in 0.967707 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.032220 -> initscore=-3.402412
[LightGBM] [Info] Start training from score -3.402412
2522.5609214305878
roc auc train:0.8973754627179997 cv:0.7764168713748687
reading data....
startring benchmark arboretum
feature 0 has been reduced to 15 bits 
feature 1 has been reduced to 13 bits 
feature 2 has been reduced to 10 bits 
feature 3 has been reduced to 15 bits 
feature 4 has been reduced to 12 bits 
feature 5 has been reduced to 10 bits 
feature 6 has been reduced to 9 bits 
feature 7 has been reduced to 14 bits 
feature 8 has been reduced to 10 bits 
feature 9 has been reduced to 4 bits 
feature 10 has been reduced to 8 bits 
feature 11 has been reduced to 19 bits 
feature 12 has been reduced to 11 bits 
max feature size 19 
Total bytes 8513978368 available 2080964608 
Memory usage estimation 220 per record 7040000000 in total 
copied features data 13 from 13 
copied category features 1 from 26 


roc auc train:0.8108035778617945 cv:0.7864887547868098
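
Two figures in the 40M logs above can be cross-checked by hand. The sketch below assumes that the LightGBM `initscore` is the log-odds of the positive rate `pavg`, and that the arboretum "in total" figure is the per-record estimate multiplied by the row count LightGBM reports as "Number of data". Both readings are inferred from the numbers in the logs, not from either library's source; all constants are copied from the output above.

```python
import math

# Copied from the 40M LightGBM log.
positives = 1_031_045
negatives = 30_968_955
n_train = positives + negatives  # matches "Number of data: 32000000"

# Assumption: initscore is the log-odds of the positive rate.
pavg = positives / n_train
init_score = math.log(pavg / (1.0 - pavg))

# Copied from the 40M arboretum log.
bytes_per_record = 220
available_bytes = 2_080_964_608

# Assumption: the logged total is per-record estimate times row count.
estimated_bytes = bytes_per_record * n_train

print(f"pavg={pavg:.6f} initscore={init_score:.6f}")
print(f"estimated={estimated_bytes} available={available_bytes}")
```

Under these assumptions the computed values line up with the logged `pavg`, `initscore` and "in total" figures. The estimate is larger than the "available" figure in the same log line; whether that explains "copied category features 1 from 26" is not stated in the output.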

10M dataset - 5000 trees

reading data....
startring benchmark xgboost
662.0523405075073
roc auc train:0.965875942403954 cv:0.756494298707683
reading data....
startring benchmark lightgbm
[LightGBM] [Warning] Starting from the 2.1.2 version, default value for the "boost_from_average" parameter in "binary" objective is true.
This may cause significantly different results comparing to the previous versions of LightGBM.
Try to set boost_from_average=false, if your old models produce bad results
[LightGBM] [Info] Number of positive: 247552, number of negative: 7752448
[LightGBM] [Info] This is the GPU trainer!!
[LightGBM] [Info] Total Bins 3844
[LightGBM] [Info] Number of data: 8000000, number of used features: 39
[LightGBM] [Info] Using GPU Device: GeForce GTX 1070, Vendor: NVIDIA Corporation
[LightGBM] [Info] Compiling OpenCL Kernel with 256 bins...
[LightGBM] [Info] GPU programs have been built
[LightGBM] [Info] Size of histogram bin entry: 12
[LightGBM] [Info] 38 dense feature groups (305.18 MB) transfered to GPU in 0.311981 secs. 1 sparse feature groups
[LightGBM] [Info] [binary:BoostFromScore]: pavg=0.030944 -> initscore=-3.444143
[LightGBM] [Info] Start training from score -3.444143
805.6181969642639
roc auc train:0.9904928384666543 cv:0.7458449429346047
reading data....
startring benchmark arboretum
feature 0 has been reduced to 14 bits 
feature 1 has been reduced to 13 bits 
feature 2 has been reduced to 9 bits 
feature 3 has been reduced to 14 bits 
feature 4 has been reduced to 12 bits 
feature 5 has been reduced to 9 bits 
feature 6 has been reduced to 9 bits 
feature 7 has been reduced to 13 bits 
feature 8 has been reduced to 9 bits 
feature 9 has been reduced to 4 bits 
feature 10 has been reduced to 8 bits 
feature 11 has been reduced to 19 bits 
feature 12 has been reduced to 10 bits 
max feature size 19 
Total bytes 8513978368 available 7124615168 
Memory usage estimation 180 per record 1440000000 in total 
copied features data 13 from 13 
copied category features 26 from 26 
roc auc train:0.8339605258327059 cv:0.7800383186995146
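
To compare the runs side by side, the raw output can be reduced to one record per library. This is a minimal sketch, not part of `src/criteo_speed_test.py`: the `summarize` helper and its field names are hypothetical, and it assumes that a line consisting only of a decimal number is the elapsed training time (the logs do not state the unit). The sample log is a shortened excerpt of the 10M run above.

```python
import re

# Shortened excerpt of the 10M run; values copied from the output above.
SAMPLE_LOG = """\
reading data....
startring benchmark xgboost
662.0523405075073
roc auc train:0.965875942403954 cv:0.756494298707683
reading data....
startring benchmark arboretum
max feature size 19
roc auc train:0.8339605258327059 cv:0.7800383186995146
"""

# "startring" is matched as spelled in the script's output.
BENCH_RE = re.compile(r"^startring benchmark (\w+)$")
TIME_RE = re.compile(r"^\d+\.\d+$")  # assumption: bare number = elapsed time
AUC_RE = re.compile(r"^roc auc train:([\d.]+) cv:([\d.]+)$")


def summarize(log_text):
    """Return one dict per benchmark section found in log_text."""
    results = []
    current = None
    for raw in log_text.splitlines():
        line = raw.strip()
        match = BENCH_RE.match(line)
        if match:
            current = {"library": match.group(1), "elapsed": None,
                       "train_auc": None, "cv_auc": None}
            results.append(current)
            continue
        if current is None:
            continue  # skip anything before the first benchmark header
        if TIME_RE.match(line):
            current["elapsed"] = float(line)
            continue
        match = AUC_RE.match(line)
        if match:
            current["train_auc"] = float(match.group(1))
            current["cv_auc"] = float(match.group(2))
    return results


results = summarize(SAMPLE_LOG)
for row in results:
    # The train/cv gap gives a rough view of how much each model overfits.
    gap = row["train_auc"] - row["cv_auc"]
    print(row["library"], row["elapsed"], f"gap={gap:.4f}")
```

A section without a timing line, as in the arboretum output above, keeps `elapsed` as `None` rather than guessing a value.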
