An XGBoost parameter-tuning function (using GridSearchCV)

Reposted and reorganized from [XGBoost parameter tuning](https://segmentfault.com/a/1190000014040317), which is very practical.

1. Introduction

Tuning is done with sklearn's GridSearchCV. Here x_train and y_train are the features and labels of the training set, respectively.

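One thing to watch before tuning: passing x_train and y_train in directly can fail with the error "too many indices in the array". This happens when the label array has shape (n, 1) rather than (n,), so flatten it first: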

# See: https://stackoverflow.com/questions/42928855/gridsearchcv-error-too-many-indices-in-the-array
c, r = y_train_array.shape
y_train_ = y_train_array.reshape(c,)  # labels must have shape (n,), not (n, 1)
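
The reshape can be checked on a toy array. This is a minimal sketch that assumes the labels are a NumPy array of shape (n, 1); the array contents are made up for illustration, and `ravel()` is shown only as an equivalent alternative to `reshape(c,)`.

```python
import numpy as np

# Hypothetical labels stored as a column vector, shape (4, 1).
y_train_array = np.array([[0], [1], [1], [0]])

# Same reshape as in the text: keep the row count, drop the second axis.
c, r = y_train_array.shape
y_train_ = y_train_array.reshape(c,)

# ravel() also flattens to 1-D and does not need the row count.
y_flat = y_train_array.ravel()

print(y_train_array.shape)  # (4, 1)
print(y_train_.shape)       # (4,)
```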

With that done, the tuning function can be run. The function itself comes first; the full code is:

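from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

def Tuning(cv_params, other_params):
    # other_params are fixed on the classifier; cv_params is the grid to search.
    model2 = XGBClassifier(**other_params)
    optimized_GBM = GridSearchCV(estimator=model2,
                                 param_grid=cv_params,
                                 cv=5,
                                 n_jobs=4)
    optimized_GBM.fit(x_train_array, y_train_)
    # grid_scores_ was removed in scikit-learn 0.20; cv_results_ replaces it.
    evaluate_result = optimized_GBM.cv_results_
    print('Results for each parameter combination: {0}'.format(evaluate_result))
    print('Best parameter values: {0}'.format(optimized_GBM.best_params_))
    print('Best model score: {0}'.format(optimized_GBM.best_score_))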
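
In scikit-learn 0.20 and later, the per-candidate results live in `optimized_GBM.cv_results_`, a dict of parallel arrays that is hard to read when printed whole. Below is a minimal sketch of turning it into a ranked list. The helper name `summarize_cv_results` and the sample numbers are illustrative assumptions; the keys `params`, `mean_test_score` and `std_test_score` are the ones scikit-learn stores in `cv_results_`.

```python
def summarize_cv_results(cv_results):
    """Return (params, mean, std) rows sorted by mean test score, best first."""
    rows = zip(cv_results['params'],
               cv_results['mean_test_score'],
               cv_results['std_test_score'])
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up stand-in for optimized_GBM.cv_results_ (values are not real results).
fake_results = {
    'params': [{'n_estimators': 700}, {'n_estimators': 800}, {'n_estimators': 900}],
    'mean_test_score': [0.81, 0.84, 0.83],
    'std_test_score': [0.02, 0.01, 0.03],
}

for params, mean, std in summarize_cv_results(fake_results):
    print('{0}: {1:.3f} (+/- {2:.3f})'.format(params, mean, std))
```

Inside Tuning, the same helper could be called as `summarize_cv_results(optimized_GBM.cv_results_)`.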

2. Tuning

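From here on, Tuning only needs two arguments, cv_params and other_params. The first holds the parameters you want to tune, the second holds all the remaining parameters.
For an introduction to the xgboost parameters, see the xgboost documentation and its list of commonly used parameters.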

For example, to tune n_estimators:

cv_params = {'n_estimators': [700, 800, 900, 1000, 1100, 1200, 1300, 1400]}
other_params = {
    'scale_pos_weight': scale_pos_weight,  # assumed to be defined earlier
    'learning_rate': 0.07,  # 'eta' is an alias of learning_rate, so only one of them is set
    'n_estimators': 700,  # to be tuned
    'max_depth': 6,  # to be tuned
    'min_child_weight': 1,  # to be tuned
    'gamma': 0.2,  # to be tuned
    'subsample': 0.639,  # to be tuned
    'colsample_bytree': 0.2,  # to be tuned
    'objective': 'binary:logistic'
}
Tuning(cv_params, other_params)  # after tuning one parameter, update its value in other_params

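Once a parameter has been tuned, write its best value back into other_params before moving on.
Every parameter marked with a # comment in the code can be tuned with the same procedure.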
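
The one-parameter-at-a-time procedure can be sketched as a loop. This is an illustration rather than the original author's code: `tune_sequentially` and `toy_search` are hypothetical names, and `toy_search` stands in for a call that would run GridSearchCV and return its `best_params_`.

```python
def tune_sequentially(other_params, grids, search):
    """Tune one parameter at a time, folding each best value into the fixed set.

    grids  : list of (name, candidate_values) pairs, tuned in that order
    search : callable(cv_params, fixed_params) -> dict of best values
             (in practice a wrapper around GridSearchCV returning best_params_)
    """
    params = dict(other_params)  # copy so the caller's dict is left untouched
    for name, values in grids:
        best = search({name: values}, params)
        params.update(best)      # later searches start from the tuned value
    return params


def toy_search(cv_params, fixed_params):
    # Hypothetical stand-in for a real grid search: it picks the candidate
    # closest to an arbitrary target so that the example runs instantly.
    targets = {'n_estimators': 1000, 'max_depth': 5}
    (name, values), = cv_params.items()
    return {name: min(values, key=lambda v: abs(v - targets[name]))}


start = {'n_estimators': 700, 'max_depth': 6, 'learning_rate': 0.07}
grids = [('n_estimators', [700, 800, 900, 1000, 1100]),
         ('max_depth', [3, 4, 5, 6, 7])]
tuned = tune_sequentially(start, grids, toy_search)
print(tuned)
```

Tuning parameters one at a time needs far fewer fits than a joint grid, but it can miss interactions between parameters, so the order of `grids` may matter.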

3. An open question

Tuning with GridSearchCV is very, very slow. I don't know of a good way to speed it up yet; to be added!
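
Part of the slowness is simply the number of model fits: GridSearchCV trains one model per parameter combination per fold, plus a final refit on the full training set by default. A rough count follows, using the n_estimators grid from section 2 and a hypothetical two-parameter grid for comparison; `count_fits` is an illustrative helper, not part of scikit-learn.

```python
from itertools import product


def count_fits(cv_params, cv=5):
    """Number of cross-validation fits for a full grid (excluding the final refit)."""
    n_candidates = len(list(product(*cv_params.values())))
    return n_candidates * cv


single = {'n_estimators': [700, 800, 900, 1000, 1100, 1200, 1300, 1400]}
# Hypothetical joint grid, only to show how quickly the count grows.
joint = {'max_depth': [3, 4, 5, 6, 7, 8], 'min_child_weight': [1, 2, 3, 4, 5, 6]}

print(count_fits(single))  # 8 candidates * 5 folds = 40
print(count_fits(joint))   # 36 candidates * 5 folds = 180
```

If the full grid is too expensive, scikit-learn also provides RandomizedSearchCV, which evaluates only `n_iter` sampled combinations. Whether it works well for this particular problem has not been tested here.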