Near-uniform Aggregation of Gradient Boosting Machines for KDD Cup 2015

and more...

CLMS, National Taiwan University

Ming-Lun Cai, Chih-Wei Chang, Liang-Wei Chen, Si-An Chen,
Hsien-Chun Chiu, Hong-Min Chu, Yu-Jheng Fang, Yi Huang,
Kuan-Hao Huang, Chih-Te Lai, Yi-An Lin, Chieh-En Tsai, Yeh-Wen Tsao,
Yu-Lin Tsou, Wei-Cheng Wang, Yu-Ping Wu, Yao-Yuan Yang,
Sheng-Chi You, Sz-Han Yu, Hsuan-Tien Lin and Shou-De Lin

Approaches

Data Set

training : test = 3 : 2

Validation (2/2)

sub-training : validation = 4 : 1
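
For concreteness, the internal 4:1 split could be produced as in the sketch below with scikit-learn; X and y are hypothetical stand-ins for the extracted features and dropout labels, not the team's actual pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in data; the real features come from the competition's log files.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))        # hypothetical feature matrix
y = rng.integers(0, 2, size=1000)      # hypothetical dropout labels

# sub-training : validation = 4 : 1
X_sub, X_val, y_sub, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
```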

A General Framework

Feature Extraction

Basic Feature (2/2)

Leak Feature (2/2)

Label Based Feature (1/2)

  • include label information
    • could be risky
    • consider only labels from other instances
  • examples (sketched below):
    • # of dropped courses for this user
    • drop rate on other courses for this user
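
A minimal pandas sketch of this leave-one-out computation, assuming one row per (user, course) enrollment; the table and column names are hypothetical:

```python
import pandas as pd

# Hypothetical enrollment table: one row per (user, course) instance.
df = pd.DataFrame({
    "user":    ["u1", "u1", "u1", "u2", "u2"],
    "course":  ["c1", "c2", "c3", "c4", "c5"],
    "dropped": [1, 0, 1, 1, 0],
})

g = df.groupby("user")["dropped"]
total = g.transform("sum")      # dropped courses per user, own label included
count = g.transform("count")    # enrollments per user

# Leave-one-out: subtract this instance's own label so the feature uses
# only labels from *other* instances of the same user (less risky).
df["n_dropped_other"] = total - df["dropped"]
df["drop_rate_other"] = (total - df["dropped"]) / (count - 1).clip(lower=1)
```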

Label Based Feature (2/2)

Single Model

model                 validation   public     private
gradient boosting     0.907365     0.907532   0.905854
random forest         0.905666     0.907497   0.905588
neural network        0.905160     0.904746   0.902830
adaptive boosting     0.904177     N/A        N/A
logistic regression   0.902474     N/A        N/A
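
As a concrete illustration of a single model, the sketch below trains one gradient boosting machine and reports validation AUC, reusing the X_sub/X_val split from the earlier sketch; the hyperparameters are illustrative, not the team's tuned values.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

gbm = GradientBoostingClassifier(n_estimators=500, learning_rate=0.05,
                                 max_depth=5, random_state=0)
gbm.fit(X_sub, y_sub)

val_pred = gbm.predict_proba(X_val)[:, 1]   # predicted dropout probability
print("validation AUC: %.6f" % roc_auc_score(y_val, val_pred))
```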

Ensemble Framework

Validation-set Blending (2/3)

  • use the blend set as the blender's training data and the validation set as its validation data

Validation-set Blending (3/3)

  • linear large-scale rankSVM from LIBSVM Tools (sketched below)
    • optimize AUC
    • no significant benefit from non-linear models
  • combining over 70 different models from 5 sub-teams
    • doesn't perform well with all models blended
    • heuristic model (feature) selection
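
A rough sketch of pairwise-hinge-loss blending, using scikit-learn's LinearSVC on a pairwise transform as a stand-in for the large-scale rankSVM from LIBSVM Tools; P_blend and y_blend are hypothetical names for the stacked model predictions and the labels on the blend set.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
P_blend = rng.uniform(size=(500, 5))     # stand-in: 5 models' blend-set predictions
y_blend = rng.integers(0, 2, size=500)   # stand-in: blend-set dropout labels

# Pairwise transform: AUC counts correctly ordered (positive, negative)
# pairs, so hinge loss on prediction differences approximates AUC,
# as a rankSVM does.
pos = np.where(y_blend == 1)[0]
neg = np.where(y_blend == 0)[0]
pairs = [(i, j) for i in pos for j in neg]
pairs = [pairs[k] for k in rng.permutation(len(pairs))[:20000]]  # subsample
diffs = np.array([P_blend[i] - P_blend[j] for i, j in pairs])

# Symmetrize so both pair orderings appear with opposite labels.
Xp = np.vstack([diffs, -diffs])
yp = np.concatenate([np.ones(len(diffs)), -np.ones(len(diffs))])

ranker = LinearSVC(C=1.0, loss="hinge", dual=True, fit_intercept=False)
ranker.fit(Xp, yp)
blend_scores = P_blend @ ranker.coef_.ravel()   # blended ranking scores
```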

Comparisons

Validation-set Blending:
  • pairwise hinge loss to approximate AUC
  • under more control
  • smaller training set

Test-set Blending:
  • square error: easier to optimize
  • directly exploits leaderboard information
  • needs public score

Results (1/2)

                          Public        Private
Best Single Model         0.907532475   0.905853623
Validation-set Blending   0.908343215   0.906487001
Test-set Blending         0.908204930   0.906601438

Results (2/2)

Near-uniform Aggregation of Gradient Boosting Machines

  • near-uniform: the weight vector found by ridge regression in test-set blending is nearly uniform
  • the selected models are all GBM models (see the sketch below)
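
A minimal sketch of what near-uniform aggregation amounts to: averaging several GBMs' predictions with (near-)equal weights. Here the models differ only in random seed for brevity, and the split from the earlier sketch is reused; the team's GBMs differed in features and hyperparameters.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

models = [
    GradientBoostingClassifier(n_estimators=300, max_depth=5,
                               random_state=seed).fit(X_sub, y_sub)
    for seed in range(5)
]
preds = np.array([m.predict_proba(X_val)[:, 1] for m in models])
avg_pred = preds.mean(axis=0)      # uniform weights, i.e. 1/5 each
```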

Conditional Blending

  • when predictions are combined uniformly, many instances end up with the same blended prediction value
  • for the instances that models 1 and 2 cannot decide, introduce model 3 to decide (sketched below)
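
One way to read this scheme, sketched under the assumption that predictions are first converted to ranks: average the ranks of models 1 and 2, then let model 3's ranking break the remaining ties. The vectors p1, p2, p3 are hypothetical predictions, not the team's actual models.

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
p1, p2, p3 = (rng.uniform(size=1000) for _ in range(3))  # stand-in predictions

r1, r2, r3 = rankdata(p1), rankdata(p2), rankdata(p3)
base = (r1 + r2) / 2.0             # uniform combination of models 1 and 2

# 2-level conditional blending: wherever models 1 and 2 tie (cannot decide
# the ordering), model 3 decides; eps keeps its influence strictly below
# the smallest gap (0.5) between distinct averaged ranks.
eps = 0.5 / (len(r3) + 1)
blended = base + eps * r3
```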

Scores

                              Public        Private
Best Single Model             0.907532475   0.905853623
Validation-set Blending       0.908343215   0.906487001
Test-set Blending             0.908204930   0.906601438
Non-risky                     0.905802465   0.903825326
2-Level Conditional Blending  0.908572416   0.906612375
3-Level Conditional Blending  0.908541224   0.906632903

Actual Best Private Score

  • another test-set blending result
  • nearly uniform blending of 2 validation-set blending results and 5 GBM models
                              Public        Private
Test-set Blending             0.908204930   0.906601438
Highest Public Score          0.908572416   0.906612375
3-Level Conditional Blending  0.908541224   0.906632903
Best Private                  0.908370011   0.906652903