iFLYTEK: Telecom Customer Churn Prediction Challenge baseline

Table of contents

* 1. Examining the distribution of each field

  + 1.2 Automated data analysis with pandas_profiling
* 2. Training with the baseline parameters
* 3. Feature selection with Null Importances

  + 3.2 Computing the score
  + 3.3 Selecting the right features
* 4. Running the baseline

  + 4.1 Training with lgb
  + 4.2 Training with Xgb
  + 4.3 Training with cat
  + 4.4 Additionally dropping the '平均丢弃数据呼叫数' feature
* 5. Bayesian parameter tuning

from google.colab import drive
drive.mount('/content/drive')
import os
os.chdir('/content/drive/MyDrive/chinese task/讯飞-电信用户流失')
Mounted at /content/drive

Reference:

# unzip is available on Colab by default, so no pip install is needed
!unzip '/content/drive/MyDrive/chinese task/讯飞-电信用户流失/电信客户流失预测挑战赛数据集.zip'

Read the dataset:

import pandas as pd
train = pd.read_csv('./train.csv')
test = pd.read_csv('./test.csv')
train

[DataFrame preview of train: 客户ID, 地理区域, 是否双频, 是否翻新机, 当前手机价格, 手机网络功能, 婚姻状况, 家庭成人人数, 信息库匹配, 预计收入, …, 过去六个月的平均月费用, 是否流失]

150000 rows × 69 columns

1. Examining the distribution of each field

train['是否流失'].value_counts()

missing_counts = pd.DataFrame(train.isnull().sum())
missing_counts.columns = ['count_null']
missing_counts.describe()

for col in train.columns:
    print(f'{col} \t {train.dtypes[col]} \t {train[col].nunique()}')
import pandas as pd
import numpy as np

from sklearn.metrics import roc_auc_score
from sklearn.metrics import accuracy_score
from sklearn.model_selection import KFold
import time
from lightgbm import LGBMClassifier
import lightgbm as lgb

import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
import seaborn as sns
%matplotlib inline

import warnings
warnings.simplefilter('ignore', UserWarning)

import gc
gc.enable()
import time

1.2 Automated data analysis with pandas_profiling

Reference:

conda install -c conda-forge pandas-profiling

import pandas as pd
import pandas_profiling
data = pd.read_csv('./train.csv')
profile = data.profile_report(title='Pandas Profiling Report')
profile.to_file(output_file="telecom_customers_pandas_profiling.html")

Findings from the Pandas Profiling Report:

  • Categorical features: '地理区域', '是否双频', '是否翻新机', '手机网络功能', '婚姻状况', '家庭成人人数', '信息库匹配', '信用卡指示器', '新手机用户', '账户消费限额'
  • Binned features: '预计收入'
  • Features with outliers: '家庭中唯一订阅者的数量', '家庭活跃用户数'
  • Useless (highly imbalanced) features: '平均呼叫转移呼叫数', '平均丢弃数据呼叫数' (dominant-value counts [149797, 148912])
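The two outlier columns flagged above are not treated further in this post; one hedged option (an illustrative assumption, not part of the original pipeline) is quantile clipping, sketched here on toy data:

```python
import pandas as pd

# Toy stand-in for an outlier column such as train['家庭中唯一订阅者的数量'];
# this clipping step is illustrative only, not part of the baseline.
s = pd.Series([1, 2, 2, 3, 3, 3, 4, 100])
lo, hi = s.quantile(0.01), s.quantile(0.99)
clipped = s.clip(lower=lo, upper=hi)  # the extreme value 100 is capped near the 99th percentile
```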

features=list(train.columns)
categorical_features =['地理区域','是否双频','是否翻新机','手机网络功能','婚姻状况','预计收入',
                       '家庭成人人数','信息库匹配','信用卡指示器','新手机用户','账户消费限额']
numeric_features =[item for item in features if item not in categorical_features]
numeric_features=[i for i in numeric_features if i not in ['客户ID','是否流失']]

categorical_features1 =['是否双频','是否翻新机','手机网络功能','信息库匹配','信用卡指示器','新手机用户','账户消费限额']
categorical_features2 =['地理区域','婚姻状况','预计收入','家庭成人人数']

cols=['家庭中唯一订阅者的数量','家庭活跃用户数','数据超载的平均费用','平均漫游呼叫数','平均丢弃数据呼叫数','平均占线数据调用次数',
     '未应答数据呼叫的平均次数','尝试数据调用的平均数','完成数据调用的平均数','平均三通电话数','平均峰值数据调用次数',
      '非高峰数据呼叫的平均数量','平均呼叫转移呼叫数']
for i in cols:
    print(train[i].value_counts())
  • With lr=0.2, roc=0.84479; with lr=0.3, 0.8379; with lr=0.15, 0.84578
  • Changing 'num_leaves' from 30 to 45 gives 0.8468

Tuning one parameter at a time like this barely helps.
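Instead of nudging one parameter at a time, sampling joint configurations is usually more productive; the post later turns to Bayesian tuning (Section 5). A minimal random-search sketch over two of the parameters (ranges are illustrative assumptions bracketing the values tried above):

```python
import random

random.seed(2022)

def sample_params():
    # ranges are assumptions, not tuned values from the post
    return {
        'learning_rate': random.uniform(0.1, 0.3),
        'num_leaves': random.randint(16, 64),
    }

# each candidate would then be evaluated with the training loop and the best AUC kept
candidates = [sample_params() for _ in range(20)]
```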


null_cols = ['平均呼叫转移呼叫数', '平均占线数据调用次数', '未应答数据呼叫的平均次数', '平均丢弃数据呼叫数']

for col in null_cols:
    del train[col]
    del test[col]
train

2. Training with the baseline parameters


  1. All features: 10931 rounds, valid_acc=0.84298
  2. Null importance with 5000 boosting rounds:
  3. Features with split_feats > 0 (43 of them): 14402 rounds, valid_acc=0.83887
  4. Features with feats > 0 (23 of them): 10946 rounds, valid_acc=0.8193
  5. Null importance with 1000 boosting rounds:
  6. Features with split_feats > 0 (66 of them): 11817 rounds, valid_acc=0.84417
  7. Features with feats > 0 (58 of them): 11725 rounds, valid_acc=0.84345
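The "split_feats > 0" selection used in these experiments amounts to thresholding the score table built in Section 3; a minimal sketch with a toy scores_df (column names assumed to match the real one):

```python
import pandas as pd

# Toy stand-in for the scores_df computed in Section 3.2
scores_df = pd.DataFrame({
    'feature': ['每月平均使用分钟数', '平均呼叫转移呼叫数', '平均丢弃数据呼叫数'],
    'split_score': [4.152397, 0.693147, -23.025851],
})
# keep only features whose split_score is positive
split_feats = scores_df.loc[scores_df['split_score'] > 0, 'feature'].tolist()
# drops '平均丢弃数据呼叫数', keeps the other two
```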
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    train.drop(labels=['客户ID', '是否流失'], axis=1), train['是否流失'],
    random_state=10, test_size=0.2)
imp_df = pd.DataFrame()
lgb_train = lgb.Dataset(X_train, y_train,free_raw_data=False,silent=True)
lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train,free_raw_data=False,
                       silent=True)

lgb_params = {
      'boosting_type': 'gbdt',
      'objective': 'binary',
      'metric': 'auc',
      'min_child_weight': 5,
      'num_leaves': 2 ** 5,
      'lambda_l2': 10,
      'feature_fraction': 0.7,
      'bagging_fraction': 0.7,
      'bagging_freq': 10,
      'learning_rate': 0.2,
      'seed': 2022,
      'n_jobs':-1}

clf = lgb.train(params=lgb_params,train_set=lgb_train,valid_sets=lgb_eval,
          num_boost_round=50000,verbose_eval=300,early_stopping_rounds=200)
roc= roc_auc_score(y_test, clf.predict( X_test))
y_pred=[1 if x >0.5 else 0 for x in clf.predict(X_test)]
acc=accuracy_score(y_test,y_pred)
Training until validation scores don't improve for 200 rounds.

[300]   valid_0's auc: 0.733101
[600]   valid_0's auc: 0.754127
[900]   valid_0's auc: 0.766728
[1200]  valid_0's auc: 0.777367
[1500]  valid_0's auc: 0.78594
[1800]  valid_0's auc: 0.792209
[2100]  valid_0's auc: 0.798424
[2400]  valid_0's auc: 0.80417
[2700]  valid_0's auc: 0.808074
[3000]  valid_0's auc: 0.811665
[3300]  valid_0's auc: 0.814679
[3600]  valid_0's auc: 0.817462
[3900]  valid_0's auc: 0.820151
[4200]  valid_0's auc: 0.822135
[4500]  valid_0's auc: 0.824544
[4800]  valid_0's auc: 0.825994
Did not meet early stopping. Best iteration is:
[4994]  valid_0's auc: 0.826797
roc,acc
(0.8267972007033084, 0.7533)

3. Feature selection with Null Importances

def get_feature_importances(X_train, X_test, y_train, y_test,shuffle, seed=None):

    train_features = list(X_train.columns)

    y_train,y_test= y_train.copy(),y_test.copy()
    if shuffle:

        y_train,y_test= y_train.copy().sample(frac=1.0),y_test.copy().sample(frac=1.0)

    lgb_train = lgb.Dataset(X_train, y_train,free_raw_data=False,silent=True)
    lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train,free_raw_data=False,silent=True)

    lgb_params = {
      'boosting_type': 'gbdt',
      'objective': 'binary',
      'metric': 'auc',
      'min_child_weight': 5,
      'num_leaves': 2 ** 5,
      'lambda_l2': 10,
      'feature_fraction': 0.7,
      'bagging_fraction': 0.7,
      'bagging_freq': 10,
      'learning_rate': 0.2,
      'seed': 2022,
      'n_jobs':-1}

    clf = lgb.train(params=lgb_params,train_set=lgb_train,valid_sets=lgb_eval,
          num_boost_round=500,verbose_eval=50,early_stopping_rounds=30)

    imp_df = pd.DataFrame()
    imp_df["feature"] = list(train_features)
    imp_df["importance_gain"] = clf.feature_importance(importance_type='gain')
    imp_df["importance_split"] = clf.feature_importance(importance_type='split')
    imp_df['trn_score'] = roc_auc_score(y_test, clf.predict( X_test))

    return imp_df
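The shuffle above is the core of the null-importances idea: permuting the target destroys any real feature-target relationship, so importances measured on shuffled labels form a noise baseline. A toy correlation check of that effect (illustrative only, not part of the pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)                                   # one informative feature
y = (x + rng.normal(scale=0.5, size=1000) > 0).astype(int)  # target driven by x

corr_real = abs(np.corrcoef(x, y)[0, 1])                    # genuine association
corr_null = abs(np.corrcoef(x, rng.permutation(y))[0, 1])   # association after shuffling y
# corr_real is large; corr_null collapses toward zero
```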

np.random.seed(123)

actual_imp_df = get_feature_importances(X_train, X_test, y_train, y_test, shuffle=False)
actual_imp_df
Training until validation scores don't improve for 20 rounds.

[30]    valid_0's auc: 0.695549
[60]    valid_0's auc: 0.704629
[90]    valid_0's auc: 0.711638
[120]   valid_0's auc: 0.715182
[150]   valid_0's auc: 0.718961
[180]   valid_0's auc: 0.722121
[210]   valid_0's auc: 0.725615
[240]   valid_0's auc: 0.728251
[270]   valid_0's auc: 0.730962
[300]   valid_0's auc: 0.733101
[330]   valid_0's auc: 0.73578
[360]   valid_0's auc: 0.73886
[390]   valid_0's auc: 0.741238
[420]   valid_0's auc: 0.742486
[450]   valid_0's auc: 0.744295
[480]   valid_0's auc: 0.746555
Did not meet early stopping. Best iteration is:
[495]   valid_0's auc: 0.747792

                   feature  importance_gain  importance_split  trn_score
0                     地理区域      1956.600422               313   0.747792
1                     是否双频       442.401141                62   0.747792
2                    是否翻新机       269.466828                26   0.747792
3                   当前手机价格      3838.696197               365   0.747792
4                   手机网络功能       750.396258                51   0.747792
..                     ...              ...               ...        ...
62          过去三个月的平均每月通话次数      2540.721027               325   0.747792
63             过去三个月的平均月费用      2098.813867               304   0.747792
64         过去六个月的平均每月使用分钟数      2375.925741               337   0.747792
65          过去六个月的平均每月通话次数      2541.735172               346   0.747792
66             过去六个月的平均月费用      2103.062207               313   0.747792

67 rows × 4 columns

null_imp_df = pd.DataFrame()
nb_runs = 10
import time
start = time.time()
dsp = ''
for i in range(nb_runs):

    imp_df = get_feature_importances(X_train, X_test, y_train, y_test, shuffle=True)
    imp_df['run'] = i + 1

    null_imp_df = pd.concat([null_imp_df, imp_df], axis=0)

    for l in range(len(dsp)):
        print('\b', end='', flush=True)

    spent = (time.time() - start) / 60
    dsp = 'Done with %4d of %4d (Spent %5.1f min)' % (i + 1, nb_runs, spent)
    print(dsp, end='', flush=True)
null_imp_df

                   feature  importance_gain  importance_split  trn_score  run
0                     地理区域        38.622730                 5   0.505320    1
1                     是否双频         0.000000                 0   0.505320    1
2                    是否翻新机         0.000000                 0   0.505320    1
3                   当前手机价格        30.980300                 4   0.505320    1
4                   手机网络功能         0.000000                 0   0.505320    1
..                     ...              ...               ...        ...  ...
62          过去三个月的平均每月通话次数       109.945481                14   0.503911   10
63             过去三个月的平均月费用        35.344621                 4   0.503911   10
64         过去六个月的平均每月使用分钟数        55.200380                 7   0.503911   10
65          过去六个月的平均每月通话次数        53.439080                 6   0.503911   10
66             过去六个月的平均月费用        47.455200                 6   0.503911   10

670 rows × 5 columns

def display_distributions(actual_imp_df_, null_imp_df_, feature_):
    plt.figure(figsize=(13, 6))
    gs = gridspec.GridSpec(1, 2)

    ax = plt.subplot(gs[0, 0])
    a = ax.hist(null_imp_df_.loc[null_imp_df_['feature'] == feature_, 'importance_split'].values, label='Null importances')
    ax.vlines(x=actual_imp_df_.loc[actual_imp_df_['feature'] == feature_, 'importance_split'].mean(),
               ymin=0, ymax=np.max(a[0]), color='r',linewidth=10, label='Real Target')
    ax.legend()
    ax.set_title('Split Importance of %s' % feature_.upper(), fontweight='bold')
    plt.xlabel('Null Importance (split) Distribution for %s ' % feature_.upper())

    ax = plt.subplot(gs[0, 1])
    a = ax.hist(null_imp_df_.loc[null_imp_df_['feature'] == feature_, 'importance_gain'].values, label='Null importances')
    ax.vlines(x=actual_imp_df_.loc[actual_imp_df_['feature'] == feature_, 'importance_gain'].mean(),
               ymin=0, ymax=np.max(a[0]), color='r',linewidth=10, label='Real Target')
    ax.legend()
    ax.set_title('Gain Importance of %s' % feature_.upper(), fontweight='bold')
    plt.xlabel('Null Importance (gain) Distribution for %s ' % feature_.upper())

# Use an actual column from this dataset (the original notebook this was adapted
# from passed 'DESTINATION_AIRPORT', which does not exist here)
display_distributions(actual_imp_df_=actual_imp_df, null_imp_df_=null_imp_df, feature_='地理区域')

[figure xunfei_16_0.png: histograms of null importances (split and gain) with the actual importance marked in red]

# Use a Chinese-capable font so the feature names render in the plots
plt.rcParams['font.sans-serif'] = ['SimHei']
plt.rcParams['axes.unicode_minus'] = False
sns.set(font='SimHei')

3.2 Computing the score

As our score, we take the log-ratio of the unshuffled (actual) feature importance to the 75th percentile of the shuffled (null) importances.
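In formula form (matching the code below), for each feature f:

    score(f) = log(1e-10 + I_actual(f) / (1 + P75(I_null(f))))

where I is the gain (or split) importance and P75 is the 75th percentile over the 10 shuffled runs; the 1e-10 and the +1 guard against log(0) and division by zero.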


feature_scores = []
for _f in actual_imp_df['feature'].unique():
    f_null_imps_gain = null_imp_df.loc[null_imp_df['feature'] == _f, 'importance_gain'].values
    f_act_imps_gain = actual_imp_df.loc[actual_imp_df['feature'] == _f, 'importance_gain'].mean()
    gain_score = np.log(1e-10 + f_act_imps_gain / (1 + np.percentile(f_null_imps_gain, 75)))
    f_null_imps_split = null_imp_df.loc[null_imp_df['feature'] == _f, 'importance_split'].values
    f_act_imps_split = actual_imp_df.loc[actual_imp_df['feature'] == _f, 'importance_split'].mean()
    split_score = np.log(1e-10 + f_act_imps_split / (1 + np.percentile(f_null_imps_split, 75)))
    feature_scores.append((_f, split_score, gain_score))

scores_df = pd.DataFrame(feature_scores, columns=['feature', 'split_score', 'gain_score'])

plt.figure(figsize=(16, 16))
gs = gridspec.GridSpec(1, 2)

ax = plt.subplot(gs[0, 0])
sns.barplot(x='split_score', y='feature', data=scores_df.sort_values('split_score', ascending=False).iloc[0:70], ax=ax)
ax.set_title('Feature scores wrt split importances', fontweight='bold', fontsize=14)

ax = plt.subplot(gs[0, 1])
sns.barplot(x='gain_score', y='feature', data=scores_df.sort_values('gain_score', ascending=False).iloc[0:70], ax=ax)
ax.set_title('Feature scores wrt gain importances', fontweight='bold', fontsize=14)
plt.tight_layout()

null_imp_df.to_csv('null_importances_distribution_rf.csv')
actual_imp_df.to_csv('actual_importances_distribution_rf.csv')

[figure xunfei_19_1.png: feature scores w.r.t. split and gain importances]

[('当前设备使用天数', 21885.414773210883), ('当月使用分钟数与前三个月平均值的百分比变化', 17307.072956457734), ('每月平均使用分钟数', 12217.853455409408), ('在职总月数', 11940.929380342364), ('客户生命周期内的平均每月使用分钟数', 11776.946275830269), ('客户整个生命周期内的平均每月通话次数', 11571.01933504641), ('已完成语音通话的平均使用分钟数', 10899.402202293277), ('客户生命周期内的总费用', 10882.543393820524), ('当前手机价格', 10766.242197856307), ('使用高峰语音通话的平均不完整分钟数', 10392.122741535306), ('计费调整后的总费用', 10233.600193202496), ('当月费用与前三个月平均值的百分比变化', 10154.000930830836), ('客户生命周期内的总使用分钟数', 9959.518506526947), ('计费调整后的总分钟数', 9880.493449807167), ('客户生命周期内平均月费用', 9879.557141974568), ('客户生命周期内的总通话次数', 9863.276128590107), ('过去六个月的平均每月使用分钟数', 9739.2590110749), ('过去六个月的平均每月通话次数', 9574.12247480452), ('过去三个月的平均每月使用分钟数', 9345.73676533997), ('计费调整后的呼叫总数', 9230.227682426572)]

scores_df.sort_values(by="split_score",ascending=False,inplace=True)
scores_df

                 feature  split_score  gain_score
17              每月平均使用分钟数     4.152397    4.571279
60     客户整个生命周期内的平均每月通话次数     4.116323    4.226021
56             计费调整后的总分钟数     3.992808    3.961585
52          客户生命周期内的总通话次数     3.932502    4.008059
38           一分钟内的平均呼入电话数     3.832258    3.505356
..                    ...          ...         ...
35             完成数据调用的平均数     1.878771    2.220746
30          未应答数据呼叫的平均次数     1.791759    3.040400
28             平均占线数据调用次数     1.609438    2.565711
49              平均呼叫转移呼叫数     0.693147    2.221640
26              平均丢弃数据呼叫数   -23.025851  -23.025851

67 rows × 3 columns



correlation_scores = []
for _f in actual_imp_df['feature'].unique():
    f_null_imps = null_imp_df.loc[null_imp_df['feature'] == _f, 'importance_gain'].values
    f_act_imps = actual_imp_df.loc[actual_imp_df['feature'] == _f, 'importance_gain'].values
    gain_score = 100 * (f_null_imps < np.percentile(f_act_imps, 35)).sum() / f_null_imps.size
    f_null_imps = null_imp_df.loc[null_imp_df['feature'] == _f, 'importance_split'].values
    f_act_imps = actual_imp_df.loc[actual_imp_df['feature'] == _f, 'importance_split'].values
    split_score = 100 * (f_null_imps < np.percentile(f_act_imps, 35)).sum() / f_null_imps.size
    correlation_scores.append((_f, split_score, gain_score))

corr_scores_df = pd.DataFrame(correlation_scores, columns=['feature', 'split_score', 'gain_score'])

fig = plt.figure(figsize=(16, 16))
gs = gridspec.GridSpec(1, 2)

ax = plt.subplot(gs[0, 0])
sns.barplot(x='split_score', y='feature', data=corr_scores_df.sort_values('split_score', ascending=False).iloc[0:70], ax=ax)
ax.set_title('Feature scores wrt split importances', fontweight='bold', fontsize=14)

ax = plt.subplot(gs[0, 1])
sns.barplot(x='gain_score', y='feature', data=corr_scores_df.sort_values('gain_score', ascending=False).iloc[0:70], ax=ax)
ax.set_title('Feature scores wrt gain importances', fontweight='bold', fontsize=14)
plt.tight_layout()
plt.suptitle("Features' split and gain scores", fontweight='bold', fontsize=16)
fig.subplots_adjust(top=0.93)

[figure xunfei_22_1.png: features' split and gain correlation scores]

corr_scores_df.sort_values(by="split_score",ascending=False,inplace=True)
corr_scores_df

          feature  split_score  gain_score
0            地理区域        100.0       100.0
50      平均呼叫等待呼叫数        100.0       100.0
36      平均客户服务电话次数        100.0       100.0
37  使用客户服务电话的平均分钟数        100.0       100.0
38   一分钟内的平均呼入电话数        100.0       100.0
..            ...          ...         ...
29      平均未接语音呼叫数        100.0       100.0
30   未应答数据呼叫的平均次数        100.0       100.0
31   尝试拨打的平均语音呼叫次数        100.0       100.0
66     过去六个月的平均月费用        100.0       100.0
26      平均丢弃数据呼叫数          0.0         0.0

67 rows × 3 columns

3.3 Selecting the right features

corr_scores_df shows that 平均丢弃数据呼叫数 is useless and can be dropped. Dropping it does improve the result.

X_train,X_test,y_train,y_test=train_test_split(train.drop(labels=
      ['客户ID','是否流失','平均丢弃数据呼叫数'],axis=1),train['是否流失'],random_state=10,test_size=0.2)

imp_df = pd.DataFrame()
lgb_train = lgb.Dataset(X_train,y_train,free_raw_data=False,silent=True)
lgb_eval = lgb.Dataset(X_test,y_test,reference=lgb_train,free_raw_data=False,
                       silent=True)

lgb_params = {
      'boosting_type': 'gbdt',
      'objective': 'binary',
      'metric': 'auc',
      'min_child_weight': 5,
      'num_leaves': 2 ** 5,
      'lambda_l2': 10,
      'feature_fraction': 0.7,
      'bagging_fraction': 0.7,
      'bagging_freq': 10,
      'learning_rate': 0.2,
      'seed': 2022,
      'n_jobs':-1}

clf = lgb.train(params=lgb_params,train_set=lgb_train,valid_sets=lgb_eval,
          num_boost_round=50000,verbose_eval=300,early_stopping_rounds=200)
roc= roc_auc_score(y_test, clf.predict( X_test))
y_pred=[1 if x >0.5 else 0 for x in clf.predict(X_test)]
acc=accuracy_score(y_test,y_pred)
Training until validation scores don't improve for 200 rounds.

[300]   valid_0's auc: 0.734833
[600]   valid_0's auc: 0.753598
[900]   valid_0's auc: 0.767934
[1200]  valid_0's auc: 0.778701
[1500]  valid_0's auc: 0.785552
[1800]  valid_0's auc: 0.793379
[2100]  valid_0's auc: 0.799713
[2400]  valid_0's auc: 0.805404
[2700]  valid_0's auc: 0.809381
[3000]  valid_0's auc: 0.813516
[3300]  valid_0's auc: 0.816289
[3600]  valid_0's auc: 0.81927
[3900]  valid_0's auc: 0.821682
[4200]  valid_0's auc: 0.824342
[4500]  valid_0's auc: 0.82676
[4800]  valid_0's auc: 0.829004
[5100]  valid_0's auc: 0.830592
[5400]  valid_0's auc: 0.83205
[5700]  valid_0's auc: 0.833626
[6000]  valid_0's auc: 0.83478
[6300]  valid_0's auc: 0.835981
[6600]  valid_0's auc: 0.836975
[6900]  valid_0's auc: 0.837994
[7200]  valid_0's auc: 0.838715
[7500]  valid_0's auc: 0.83963
[7800]  valid_0's auc: 0.840372
[8100]  valid_0's auc: 0.840644
[8400]  valid_0's auc: 0.841068
[8700]  valid_0's auc: 0.841685
Early stopping, best iteration is:
[8768]  valid_0's auc: 0.841806
pred=clf.predict(X_test,num_iteration=clf.best_iteration)
roc,acc
(0.8418064634121478, 0.7683)

4. Running the baseline

Baseline reference: https://mp.weixin.qq.com/s/nLgaGMJByOqRVWnm1UfB3g

!pip install catboost
import pandas as pd
import os
import gc
import lightgbm as lgb
import xgboost as xgb
from catboost import CatBoostRegressor
from sklearn.linear_model import SGDRegressor, LinearRegression, Ridge
from sklearn.preprocessing import MinMaxScaler
from gensim.models import Word2Vec
import math
import numpy as np
from tqdm import tqdm
from sklearn.model_selection import StratifiedKFold, KFold
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, log_loss
import matplotlib.pyplot as plt
import time
import warnings
warnings.filterwarnings('ignore')
train = pd.read_csv('train.csv')
test = pd.read_csv('test.csv')
data = pd.concat([train, test], axis=0, ignore_index=True)

features = [f for f in data.columns if f not in ['是否流失','客户ID','平均丢弃数据呼叫数']]

train = data[data['是否流失'].notnull()].reset_index(drop=True)
test = data[data['是否流失'].isnull()].reset_index(drop=True)

x_train = train[features]
x_test = test[features]

y_train = train['是否流失']

4.1 Training with lgb

def cv_model(clf, train_x, train_y, test_x, clf_name):
    folds=5
    seed=2022
    kf=KFold(n_splits=folds,shuffle=True,random_state=seed)

    train=np.zeros(train_x.shape[0])
    test=np.zeros(test_x.shape[0])

    cv_scores = []

    for i, (train_index, valid_index) in enumerate(kf.split(train_x,train_y)):
        print('************************************ {} ************************************'.format(str(i+1)))
        trn_x,trn_y,val_x,val_y=train_x.iloc[train_index],train_y[train_index],train_x.iloc[valid_index],train_y[valid_index]

        if clf_name == "lgb":
            train_matrix=clf.Dataset(trn_x, label=trn_y)
            valid_matrix=clf.Dataset(val_x, label=val_y)

            params = {
                'boosting_type': 'gbdt',
                'objective': 'binary',
                'metric': 'auc',
                'num_leaves': 2 ** 5,
                'lambda_l2': 10,
                'feature_fraction': 0.7,
                'bagging_fraction': 0.7,
                'bagging_freq': 10,
                'learning_rate': 0.2,
                'seed': 2022,
                'n_jobs':-1}

            # params2: Bayesian-tuned hyperparameters (see Section 5); unused in this run
            params2={'boosting_type': 'gbdt',
                'objective': 'binary',
                'metric': 'auc',
                'bagging_fraction': 0.8864320989515848,
                'bagging_freq': 10,
                'feature_fraction': 0.7719195132945438,
                'lambda_l1': 4.0642058550131175,
                'lambda_l2': 0.7571744617226672,
                'learning_rate': 0.33853400726057015,
                'max_depth': 10,
                'min_gain_to_split': 0.47988339149638315,
                'num_leaves': 48,
                'seed': 2022,
                'n_jobs':-1}
            model = clf.train(params,train_matrix,50000,valid_sets=[train_matrix, valid_matrix],
                              categorical_feature=[],verbose_eval=3000, early_stopping_rounds=200)
            val_pred=model.predict(val_x,num_iteration=model.best_iteration)
            test_pred=model.predict(test_x,num_iteration=model.best_iteration)

            print(list(sorted(zip(features,model.feature_importance("gain")),key=lambda x: x[1], reverse=True))[:20])

        if clf_name == "xgb":
            train_matrix=clf.DMatrix(trn_x,label=trn_y)
            valid_matrix=clf.DMatrix(val_x,label=val_y)
            test_matrix=clf.DMatrix(test_x)

            params={'booster': 'gbtree',
                      'objective': 'binary:logistic',
                      'eval_metric': 'auc',
                      'gamma': 1,
                      'min_child_weight': 1.5,
                      'max_depth': 5,
                      'lambda': 10,
                      'subsample': 0.7,
                      'colsample_bytree': 0.7,
                      'colsample_bylevel': 0.7,
                      'eta': 0.2,
                      'tree_method': 'exact',
                      'seed': 2020,
                      'nthread': 36,
                      "silent": True,
                      }

            watchlist=[(train_matrix, 'train'),(valid_matrix, 'eval')]

            model=clf.train(params, train_matrix, num_boost_round=50000, evals=watchlist, verbose_eval=3000, early_stopping_rounds=200)
            val_pred=model.predict(valid_matrix, ntree_limit=model.best_ntree_limit)
            test_pred=model.predict(test_matrix , ntree_limit=model.best_ntree_limit)

        if clf_name=="cat":
            params={'learning_rate': 0.2, 'depth': 5, 'l2_leaf_reg': 10, 'bootstrap_type': 'Bernoulli',
                      'od_type': 'Iter', 'od_wait': 50, 'random_seed': 11, 'allow_writing_files': False}

            model=clf(iterations=20000, **params)
            model.fit(trn_x,trn_y,eval_set=(val_x, val_y),
                      cat_features=[],use_best_model=True, verbose=3000)

            val_pred=model.predict(val_x)
            test_pred=model.predict(test_x)

        train[valid_index]=val_pred           # out-of-fold predictions
        test+=test_pred/kf.n_splits           # accumulate the fold-averaged test prediction
        cv_scores.append(roc_auc_score(val_y,val_pred))

        print(cv_scores)

    print("%s_score_list:" % clf_name, cv_scores)
    print("%s_score_mean:" % clf_name, np.mean(cv_scores))
    print("%s_score_std:" % clf_name, np.std(cv_scores))
    return train, test
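The bookkeeping in cv_model — writing each fold's validation predictions into the train vector and averaging the test predictions across folds — can be sketched with a toy threshold "model" standing in for LightGBM (purely illustrative):

```python
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(10, dtype=float).reshape(-1, 1)
y = (X.ravel() > 4).astype(float)
X_new = np.array([[2.0], [7.0]])          # stand-in for the real test set

kf = KFold(n_splits=5, shuffle=True, random_state=0)
oof = np.zeros(len(X))                    # out-of-fold train predictions
test_pred = np.zeros(len(X_new))
for trn_idx, val_idx in kf.split(X):
    thresh = X[trn_idx].ravel().mean()    # toy "model": predict 1 above the train mean
    oof[val_idx] = (X[val_idx].ravel() > thresh).astype(float)
    test_pred += (X_new.ravel() > thresh).astype(float) / kf.n_splits  # fold average
```

Each training sample gets exactly one out-of-fold prediction (from the fold where it was held out), while each test sample gets the mean of five fold predictions — hence `test += test_pred / kf.n_splits` in cv_model.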

def lgb_model(x_train,y_train,x_test):
    lgb_train,lgb_test=cv_model(lgb,x_train,y_train,x_test,"lgb")
    return lgb_train,lgb_test

def xgb_model(x_train,y_train,x_test):
    xgb_train,xgb_test=cv_model(xgb,x_train,y_train,x_test,"xgb")
    return xgb_train, xgb_test

def cat_model(x_train,y_train,x_test):
    cat_train,cat_test=cv_model(CatBoostRegressor,x_train,y_train,x_test,"cat")
    return cat_train,cat_test
lgb_train,lgb_test=lgb_model(x_train,y_train,x_test)
test['是否流失'] = lgb_test
test[['客户ID','是否流失']].to_csv('lgb_base.csv',index=False)
************************************ 1 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999488    valid_1's auc: 0.811334
Early stopping, best iteration is:
[5163]  training's auc: 0.999996    valid_1's auc: 0.8289
[fold 1 top-20 features by gain importance (HTML-entity-encoded in the export); leading entries: 当前设备使用天数 21934.93, 当月使用分钟数与前三个月平均值的百分比变化 17126.36, 在职总月数 12409.96, 每月平均使用分钟数 12073.13, …]
[0.8288996222651557]
************************************ 2 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999472    valid_1's auc: 0.811878
Early stopping, best iteration is:
[4772]  training's auc: 0.999971    valid_1's auc: 0.827608
[fold 2 top-20 features by gain importance (HTML-entity-encoded in the export); leading entries: 当前设备使用天数 21505.16, 当月使用分钟数与前三个月平均值的百分比变化 16946.65, 每月平均使用分钟数 12132.77, 在职总月数 11971.83, …]
[0.8288996222651557, 0.8276084395403329]
************************************ 3 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999494    valid_1's auc: 0.811642
Early stopping, best iteration is:
[4663]  training's auc: 0.99999 valid_1's auc: 0.827114
[fold 3 top-20 features by gain importance (HTML-entity-encoded in the export); leading entries: 当前设备使用天数 21289.61, 当月使用分钟数与前三个月平均值的百分比变化 16997.81, 在职总月数 12316.05, …]
[0.8288996222651557, 0.8276084395403329, 0.8271140081312421]
************************************ 4 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999532    valid_1's auc: 0.814214
Early stopping, best iteration is:
[5281]  training's auc: 0.999996    valid_1's auc: 0.830897
[('当前设备使用天数', 21271.166813850403), ('当月使用分钟数与前三个月平均值的百分比变化', 17270.63153974712), ('每月平均使用分钟数', 12677.148315995932), ('在职总月数', 12486.456961512566), ('客户生命周期内的平均每月使用分钟数', 11930.549542114139), ('客户整个生命周期内的平均每月通话次数', 11403.163509890437), ('已完成语音通话的平均使用分钟数', 11126.607083335519), ('当前手机价格', 10973.327338501811), ('当月费用与前三个月平均值的百分比变化', 10719.836767598987), ('客户生命周期内的总费用', 10684.931542679667), ('计费调整后的总费用', 10567.041279122233), ('计费调整后的总分钟数', 10477.076363384724), ('客户生命周期内的总使用分钟数', 10404.941493198276), ('客户生命周期内平均月费用', 10015.077973127365), ('使用高峰语音通话的平均不完整分钟数', 9988.746752500534), ('过去六个月的平均每月使用分钟数', 9924.928602397442), ('客户生命周期内的总通话次数', 9658.558003604412), ('平均非高峰语音呼叫数', 9605.689363330603), ('过去六个月的平均每月通话次数', 9560.14350926876), ('计费调整后的呼叫总数', 9525.798342213035)]
[0.8288996222651557, 0.8276084395403329, 0.8271140081312421, 0.8308971625977979]
************************************ 5 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999444    valid_1's auc: 0.8118
Early stopping, best iteration is:
[5148]  training's auc: 0.999994    valid_1's auc: 0.829686
[('当前设备使用天数', 21662.356478646398), ('当月使用分钟数与前三个月平均值的百分比变化', 17710.528580009937), ('在职总月数', 12402.68640038371), ('每月平均使用分钟数', 11945.518620952964), ('客户生命周期内的平均每月使用分钟数', 11887.39459644258), ('已完成语音通话的平均使用分钟数', 11309.949122816324), ('客户整个生命周期内的平均每月通话次数', 11231.172733142972), ('客户生命周期内的总费用', 10822.351191923022), ('当前手机价格', 10691.375393077731), ('计费调整后的总费用', 10513.226110234857), ('当月费用与前三个月平均值的百分比变化', 10418.488398104906), ('客户生命周期内的总使用分钟数', 10276.142720848322), ('使用高峰语音通话的平均不完整分钟数', 10242.566086634994), ('计费调整后的总分钟数', 10193.664465650916), ('客户生命周期内的总通话次数', 10117.483586207032), ('客户生命周期内平均月费用', 9943.684495016932), ('过去六个月的平均每月通话次数', 9800.775234118104), ('过去三个月的平均每月通话次数', 9572.030710801482), ('过去六个月的平均每月使用分钟数', 9561.15305377543), ('平均非高峰语音呼叫数', 9292.315245553851)]
[0.8288996222651557, 0.8276084395403329, 0.8271140081312421, 0.8308971625977979, 0.8296855557324957]
lgb_scotrainre_list: [0.8288996222651557, 0.8276084395403329, 0.8271140081312421, 0.8308971625977979, 0.8296855557324957]
lgb_score_mean: 0.8288409576534048
lgb_score_std: 0.0013744978556818929


4.2 Training with XGBoost

xgb_train,xgb_test=xgb_model(x_train,y_train,x_test)
test['是否流失'] = xgb_test
test[['客户ID','是否流失']].to_csv('xgb_base.csv',index=False)
************************************ 1 ************************************
[0] train-auc:0.635939  eval-auc:0.634176
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.

Will train until eval-auc hasn't improved in 200 rounds.

[3000]  train-auc:0.992932  eval-auc:0.788708
[6000]  train-auc:0.999906  eval-auc:0.807173
[9000]  train-auc:0.999997  eval-auc:0.812868
Stopping. Best iteration:
[9945]  train-auc:0.999999  eval-auc:0.814055

[0.8140550495535315]
************************************ 2 ************************************
[0] train-auc:0.636635  eval-auc:0.633894
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.

Will train until eval-auc hasn't improved in 200 rounds.

[3000]  train-auc:0.992878  eval-auc:0.790387
[6000]  train-auc:0.99988   eval-auc:0.807621
Stopping. Best iteration:
[8538]  train-auc:0.999991  eval-auc:0.812347

[0.8140550495535315, 0.8123468873894992]
************************************ 3 ************************************
[0] train-auc:0.637058  eval-auc:0.630979
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.

Will train until eval-auc hasn't improved in 200 rounds.

[3000]  train-auc:0.992874  eval-auc:0.790023
[6000]  train-auc:0.999898  eval-auc:0.80827
[9000]  train-auc:0.999996  eval-auc:0.813291
Stopping. Best iteration:
[8933]  train-auc:0.999996  eval-auc:0.813342

[0.8140550495535315, 0.8123468873894992, 0.8133415339513355]
************************************ 4 ************************************
[0] train-auc:0.635278  eval-auc:0.633351
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.

Will train until eval-auc hasn't improved in 200 rounds.

[3000]  train-auc:0.993107  eval-auc:0.78905
[6000]  train-auc:0.999903  eval-auc:0.808401
Stopping. Best iteration:
[8343]  train-auc:0.999993  eval-auc:0.812439

[0.8140550495535315, 0.8123468873894992, 0.8133415339513355, 0.8124389857259089]
************************************ 5 ************************************
[0] train-auc:0.635985  eval-auc:0.633911
Multiple eval metrics have been passed: 'eval-auc' will be used for early stopping.

Will train until eval-auc hasn't improved in 200 rounds.

[3000]  train-auc:0.992892  eval-auc:0.788101
[6000]  train-auc:0.999904  eval-auc:0.805732
[9000]  train-auc:0.999997  eval-auc:0.810194
Stopping. Best iteration:
[10041] train-auc:0.999999  eval-auc:0.811155

[0.8140550495535315, 0.8123468873894992, 0.8133415339513355, 0.8124389857259089, 0.8111551410360852]
xgb_scotrainre_list: [0.8140550495535315, 0.8123468873894992, 0.8133415339513355, 0.8124389857259089, 0.8111551410360852]
xgb_score_mean: 0.8126675195312721
xgb_score_std: 0.000982024071432044

4.3 Training with CatBoost

cat_train,cat_test=cat_model(x_train,y_train,x_test)
test['是否流失'] = cat_test
test[['客户ID','是否流失']].to_csv('cat_base.csv',index=False)
************************************ 1 ************************************
0:  learn: 0.4955489    test: 0.4954619 best: 0.4954619 (0) total: 233ms    remaining: 1h 17m 39s
3000:   learn: 0.3769726    test: 0.4483572 best: 0.4483572 (3000)  total: 2m 5s    remaining: 11m 50s
6000:   learn: 0.3209359    test: 0.4391546 best: 0.4391520 (5999)  total: 4m   remaining: 9m 21s
Stopped by overfitting detector  (50 iterations wait)

bestTest = 0.4360869428
bestIteration = 7499

Shrink model to first 7500 iterations.

[0.78868229695141]
************************************ 2 ************************************
0:  learn: 0.4953117    test: 0.4954092 best: 0.4954092 (0) total: 39.5ms   remaining: 13m 10s
3000:   learn: 0.3763302    test: 0.4490481 best: 0.4490378 (2981)  total: 1m 46s   remaining: 10m 2s
6000:   learn: 0.3196365    test: 0.4402621 best: 0.4402621 (6000)  total: 3m 38s   remaining: 8m 30s
Stopped by overfitting detector  (50 iterations wait)

bestTest = 0.4361341716
bestIteration = 8001

Shrink model to first 8002 iterations.

[0.78868229695141, 0.7897985044313038]
************************************ 3 ************************************
0:  learn: 0.4954711    test: 0.4955905 best: 0.4955905 (0) total: 38.5ms   remaining: 12m 49s
3000:   learn: 0.3763265    test: 0.4477431 best: 0.4477431 (3000)  total: 1m 49s   remaining: 10m 21s
Stopped by overfitting detector  (50 iterations wait)

bestTest = 0.4406746798
bestIteration = 5128

Shrink model to first 5129 iterations.

[0.78868229695141, 0.7897985044313038, 0.7788144016087264]
************************************ 4 ************************************
0:  learn: 0.4955798    test: 0.4955669 best: 0.4955669 (0) total: 46.1ms   remaining: 15m 21s
3000:   learn: 0.3768704    test: 0.4486424 best: 0.4486421 (2997)  total: 1m 45s   remaining: 9m 59s
Stopped by overfitting detector  (50 iterations wait)

bestTest = 0.4426386429
bestIteration = 4903

Shrink model to first 4904 iterations.

[0.78868229695141, 0.7897985044313038, 0.7788144016087264, 0.7744056829683829]
************************************ 5 ************************************
0:  learn: 0.4955262    test: 0.4956471 best: 0.4956471 (0) total: 38.9ms   remaining: 12m 57s
3000:   learn: 0.3761659    test: 0.4494234 best: 0.4494234 (3000)  total: 1m 47s   remaining: 10m 11s
6000:   learn: 0.3202277    test: 0.4407377 best: 0.4407330 (5999)  total: 3m 31s   remaining: 8m 12s
9000:   learn: 0.2781913    test: 0.4347233 best: 0.4347168 (8998)  total: 5m 14s   remaining: 6m 24s
Stopped by overfitting detector  (50 iterations wait)

bestTest = 0.4323322625
bestIteration = 10483

Shrink model to first 10484 iterations.

[0.78868229695141, 0.7897985044313038, 0.7788144016087264, 0.7744056829683829, 0.7982800693357867]
cat_scotrainre_list: [0.78868229695141, 0.7897985044313038, 0.7788144016087264, 0.7744056829683829, 0.7982800693357867]
cat_score_mean: 0.785996191059122
cat_score_std: 0.0084674009574612

4.4 Additionally, dropping the '平均丢弃数据呼叫数' feature

The score got worse.
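Dropping a single column is just a `DataFrame.drop` before rebuilding the feature matrices; a minimal illustration on a toy frame (the values below are made up, only the dropped column name is real):

```python
import pandas as pd

# Toy stand-in for x_train; only '平均丢弃数据呼叫数' is an actual dataset column.
x_train = pd.DataFrame({'平均丢弃数据呼叫数': [0.1, 0.0, 0.2],
                        '在职总月数': [12, 30, 7]})
x_train = x_train.drop(columns=['平均丢弃数据呼叫数'])
print(list(x_train.columns))  # the remaining feature columns
```

The same drop would be applied to `x_test` before re-running `lgb_model`.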

lgb_train,lgb_test=lgb_model(x_train,y_train,x_test)

************************************ 1 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999535    valid_1's auc: 0.811083
Early stopping, best iteration is:
[5495]  training's auc: 0.999999    valid_1's auc: 0.830252
[('当前设备使用天数', 21646.43981860578), ('当月使用分钟数与前三个月平均值的百分比变化', 17622.58995847404), ('在职总月数', 12633.31053687632), ('每月平均使用分钟数', 12317.316355511546), ('客户整个生命周期内的平均每月通话次数', 12213.196875602007), ('客户生命周期内的平均每月使用分钟数', 11988.745236545801), ('已完成语音通话的平均使用分钟数', 11742.254607230425), ('客户生命周期内的总费用', 10961.734202891588), ('客户生命周期内的总使用分钟数', 10739.284949079156), ('当前手机价格', 10717.661178082228), ('使用高峰语音通话的平均不完整分钟数', 10648.361330911517), ('当月费用与前三个月平均值的百分比变化', 10563.12071943283), ('客户生命周期内平均月费用', 10260.813065826893), ('计费调整后的总费用', 10214.983077257872), ('客户生命周期内的总通话次数', 10042.090887442231), ('过去六个月的平均每月使用分钟数', 10030.256944060326), ('计费调整后的总分钟数', 9833.17426289618), ('过去六个月的平均每月通话次数', 9658.642087131739), ('平均非高峰语音呼叫数', 9604.195981651545), ('过去三个月的平均每月使用分钟数', 9474.32663051784)]
[0.8302521387863329]
************************************ 2 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999484    valid_1's auc: 0.812252
Early stopping, best iteration is:
[4761]  training's auc: 0.999991    valid_1's auc: 0.827726
[('当前设备使用天数', 20778.929791480303), ('当月使用分钟数与前三个月平均值的百分比变化', 17059.72723968327), ('在职总月数', 12247.527016088367), ('每月平均使用分钟数', 12162.8245485425), ('客户生命周期内的平均每月使用分钟数', 11649.190486937761), ('客户整个生命周期内的平均每月通话次数', 11235.27798551321), ('已完成语音通话的平均使用分钟数', 10887.697177901864), ('客户生命周期内的总使用分钟数', 10537.405863419175), ('客户生命周期内的总费用', 10427.963113591075), ('当前手机价格', 10388.50929298997), ('当月费用与前三个月平均值的百分比变化', 10345.741146698594), ('使用高峰语音通话的平均不完整分钟数', 10325.746990069747), ('计费调整后的总费用', 10308.259309798479), ('客户生命周期内的总通话次数', 9878.29905757308), ('过去六个月的平均每月使用分钟数', 9860.522675991058), ('计费调整后的总分钟数', 9831.829701200128), ('客户生命周期内平均月费用', 9413.955781325698), ('平均月费用', 9256.14368981123), ('过去三个月的平均每月通话次数', 9233.180386424065), ('过去六个月的平均每月通话次数', 9178.422535061836)]
[0.8302521387863329, 0.8277260767493848]
************************************ 3 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999503    valid_1's auc: 0.812223
Early stopping, best iteration is:
[4737]  training's auc: 0.999988    valid_1's auc: 0.826507
[('当前设备使用天数', 21187.639979198575), ('当月使用分钟数与前三个月平均值的百分比变化', 17066.7826115638), ('在职总月数', 12178.71656690538), ('每月平均使用分钟数', 11915.060246050358), ('客户生命周期内的平均每月使用分钟数', 11457.53249040246), ('已完成语音通话的平均使用分钟数', 11197.47149656713), ('客户整个生命周期内的平均每月通话次数', 11062.857962206006), ('当前手机价格', 10535.98642912507), ('计费调整后的总费用', 10396.114720955491), ('当月费用与前三个月平均值的百分比变化', 10280.928569793701), ('使用高峰语音通话的平均不完整分钟数', 10159.540036082268), ('过去六个月的平均每月使用分钟数', 10114.058793380857), ('客户生命周期内的总使用分钟数', 10109.089174315333), ('客户生命周期内的总费用', 10081.144412502646), ('计费调整后的总分钟数', 10064.824367910624), ('客户生命周期内的总通话次数', 9710.811524420977), ('过去六个月的平均每月通话次数', 9568.110130429268), ('客户生命周期内平均月费用', 9536.692147105932), ('计费调整后的呼叫总数', 9272.926451265812), ('过去三个月的平均每月通话次数', 9104.1763061136)]
[0.8302521387863329, 0.8277260767493848, 0.8265070441159572]
************************************ 4 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999558    valid_1's auc: 0.812316
Early stopping, best iteration is:
[4955]  training's auc: 0.999985    valid_1's auc: 0.82816
[('当前设备使用天数', 20919.606680095196), ('当月使用分钟数与前三个月平均值的百分比变化', 17050.523352131248), ('在职总月数', 12673.502319052815), ('每月平均使用分钟数', 12145.743713662028), ('客户生命周期内的平均每月使用分钟数', 12082.749334529042), ('已完成语音通话的平均使用分钟数', 11270.388913482428), ('客户整个生命周期内的平均每月通话次数', 11032.332806184888), ('客户生命周期内的总费用', 10647.951857417822), ('计费调整后的总费用', 10599.385332718492), ('客户生命周期内的总使用分钟数', 10490.505580991507), ('当前手机价格', 10461.154125005007), ('当月费用与前三个月平均值的百分比变化', 10269.522361278534), ('使用高峰语音通话的平均不完整分钟数', 10231.192073732615), ('客户生命周期内的总通话次数', 9965.85817475617), ('计费调整后的总分钟数', 9773.746473029256), ('客户生命周期内平均月费用', 9764.829889595509), ('过去六个月的平均每月使用分钟数', 9703.316017881036), ('过去六个月的平均每月通话次数', 9595.259186178446), ('平均非高峰语音呼叫数', 9585.856355905533), ('计费调整后的呼叫总数', 9195.526195570827)]
[0.8302521387863329, 0.8277260767493848, 0.8265070441159572, 0.8281604518378232]
************************************ 5 ************************************
Training until validation scores don't improve for 200 rounds.

[3000]  training's auc: 0.999494    valid_1's auc: 0.809363
Early stopping, best iteration is:
[4829]  training's auc: 0.999983    valid_1's auc: 0.824736
[('当前设备使用天数', 20857.728651717305), ('当月使用分钟数与前三个月平均值的百分比变化', 17141.65538044274), ('在职总月数', 12623.7158523947), ('每月平均使用分钟数', 12155.711411625147), ('客户生命周期内的平均每月使用分钟数', 11755.307457834482), ('客户整个生命周期内的平均每月通话次数', 11121.649592876434), ('客户生命周期内的总费用', 10800.35821519792), ('当前手机价格', 10647.860997959971), ('已完成语音通话的平均使用分钟数', 10567.15585295856), ('客户生命周期内的总使用分钟数', 10455.313509970903), ('计费调整后的总费用', 10241.350874692202), ('当月费用与前三个月平均值的百分比变化', 10177.092842921615), ('客户生命周期内的总通话次数', 10139.20638936758), ('使用高峰语音通话的平均不完整分钟数', 9981.980402067304), ('计费调整后的总分钟数', 9756.786857843399), ('过去六个月的平均每月使用分钟数', 9725.03030230105), ('客户生命周期内平均月费用', 9604.02791416645), ('计费调整后的呼叫总数', 9452.47144331038), ('平均非高峰语音呼叫数', 9228.985016450286), ('过去六个月的平均每月通话次数', 9228.196154907346)]
[0.8302521387863329, 0.8277260767493848, 0.8265070441159572, 0.8281604518378232, 0.8247357260735417]
lgb_scotrainre_list: [0.8302521387863329, 0.8277260767493848, 0.8265070441159572, 0.8281604518378232, 0.8247357260735417]
lgb_score_mean: 0.8274762875126079
lgb_score_std: 0.0018267969533472914
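上面日志里的 lgb_score_mean 和 lgb_score_std 就是对五折 AUC 列表求均值和总体标准差(np.std 默认 ddof=0)。下面用纯 Python 独立复现这一步,分数直接取自上面的日志:

```python
# 五折交叉验证中每一折的验证集 AUC(取自上面的训练日志)
lgb_scores = [0.8302521387863329, 0.8277260767493848, 0.8265070441159572,
              0.8281604518378232, 0.8247357260735417]

# 均值;等价于 np.mean(lgb_scores)
lgb_score_mean = sum(lgb_scores) / len(lgb_scores)
# 总体标准差(ddof=0);等价于 np.std(lgb_scores) 的默认行为
lgb_score_std = (sum((s - lgb_score_mean) ** 2 for s in lgb_scores)
                 / len(lgb_scores)) ** 0.5

print('lgb_score_mean:', lgb_score_mean)  # 0.8274762875126079
print('lgb_score_std:', lgb_score_std)    # 0.0018267969533472914
```

注意汇报标准差时要说明用的是总体标准差还是样本标准差(np.std 的 ddof=1),两者在只有 5 折时差别不小。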

五、贝叶斯调参


from bayes_opt import BayesianOptimization
from sklearn.model_selection import train_test_split

# 先划分训练/验证集:去掉客户ID、标签列以及'平均丢弃数据呼叫数'特征
X_train, X_test, y_train, y_test = train_test_split(
    train.drop(labels=['客户ID', '是否流失', '平均丢弃数据呼叫数'], axis=1),
    train['是否流失'], random_state=10, test_size=0.2)

def LGB_bayesian(
    num_leaves,
    bagging_freq,
    learning_rate,
    feature_fraction,
    bagging_fraction,
    lambda_l1,
    lambda_l2,
    min_gain_to_split,
    max_depth):

    # 贝叶斯优化传入的都是浮点数,整型超参数需要显式取整
    num_leaves = int(num_leaves)
    max_depth = int(max_depth)
    bagging_freq = int(bagging_freq)

    param = {
        'num_leaves': num_leaves,
        'learning_rate': learning_rate,
        'bagging_fraction': bagging_fraction,
        'bagging_freq': bagging_freq,
        'feature_fraction': feature_fraction,
        'lambda_l1': lambda_l1,
        'lambda_l2': lambda_l2,
        # min_gain_to_split 要写进参数字典才会真正生效
        'min_gain_to_split': min_gain_to_split,
        'max_depth': max_depth,
        'objective': 'binary',
        'boosting_type': 'gbdt',
        'verbose': 1,
        'metric': 'auc',
        'seed': 2022,
        'feature_fraction_seed': 2022,
        'bagging_seed': 2022,
        'drop_seed': 2022,
        'data_random_seed': 2022,
        'is_unbalance': True,
        'boost_from_average': False,
        'save_binary': True,
    }
    lgb_train = lgb.Dataset(X_train, y_train, free_raw_data=False, silent=True)
    lgb_eval = lgb.Dataset(X_test, y_test, reference=lgb_train,
                           free_raw_data=False, silent=True)
    num_round = 10000
    clf = lgb.train(param, lgb_train, num_round, valid_sets=[lgb_eval],
                    verbose_eval=500, early_stopping_rounds=200)
    # 以验证集 AUC 作为贝叶斯优化的目标函数值
    roc = roc_auc_score(y_test, clf.predict(X_test, num_iteration=clf.best_iteration))
    return roc

# 各超参数的搜索区间
bounds_LGB = {
    'num_leaves': (5, 50),
    'learning_rate': (0.03, 0.5),
    'feature_fraction': (0.1, 1),
    'bagging_fraction': (0.1, 1),
    'bagging_freq': (0, 10),
    'lambda_l1': (0, 5.0),
    'lambda_l2': (0, 10),
    'min_gain_to_split': (0, 1.0),
    'max_depth': (5, 15),
}

LGB_BO = BayesianOptimization(LGB_bayesian, bounds_LGB, random_state=13)
init_points = 5
n_iter = 15
print('-' * 130)

with warnings.catch_warnings():
    warnings.filterwarnings('ignore')
    LGB_BO.maximize(init_points=init_points, n_iter=n_iter, acq='ucb', xi=0.0)
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.756501
[1000]  valid_0's auc: 0.779654
[1500]  valid_0's auc: 0.795342
[2000]  valid_0's auc: 0.804397
[2500]  valid_0's auc: 0.812615
[3000]  valid_0's auc: 0.818713
[3500]  valid_0's auc: 0.82294
[4000]  valid_0's auc: 0.826771
[4500]  valid_0's auc: 0.82971
[5000]  valid_0's auc: 0.832648
Did not meet early stopping. Best iteration is:
[5000]  valid_0's auc: 0.832648
| iter | target | bagging_fraction | bagging_freq | feature_fraction | lambda_l1 | lambda_l2 | learning_rate | max_depth | min_gain_to_split | num_leaves |
| 1 | 0.8326 | 0.7999 | 2.375 | 0.8419 | 4.829 | 9.726 | 0.2431 | 11.09 | 0.7755 | 33.87 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.735036
[1000]  valid_0's auc: 0.754152
[1500]  valid_0's auc: 0.767662
[2000]  valid_0's auc: 0.778614
[2500]  valid_0's auc: 0.786152
[3000]  valid_0's auc: 0.792418
[3500]  valid_0's auc: 0.79872
[4000]  valid_0's auc: 0.803314
[4500]  valid_0's auc: 0.807683
[5000]  valid_0's auc: 0.81121
Did not meet early stopping. Best iteration is:
[5000]  valid_0's auc: 0.81121
| 2 | 0.8112 | 0.7498 | 0.3504 | 0.3686 | 0.2926 | 8.571 | 0.2052 | 11.8 | 0.2563 | 20.64 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.749544
[1000]  valid_0's auc: 0.774443
[1500]  valid_0's auc: 0.78951
[2000]  valid_0's auc: 0.800602
[2500]  valid_0's auc: 0.80823
[3000]  valid_0's auc: 0.814024
[3500]  valid_0's auc: 0.81853
[4000]  valid_0's auc: 0.821672
[4500]  valid_0's auc: 0.823975
[5000]  valid_0's auc: 0.826105
Did not meet early stopping. Best iteration is:
[5000]  valid_0's auc: 0.826105
| 3 | 0.8261 | 0.1085 | 3.583 | 0.9542 | 1.089 | 3.194 | 0.4614 | 5.319 | 0.06508 | 33.34 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.776662
[1000]  valid_0's auc: 0.804675
[1500]  valid_0's auc: 0.817655
[2000]  valid_0's auc: 0.826085
[2500]  valid_0's auc: 0.831839
[3000]  valid_0's auc: 0.835281
Early stopping, best iteration is:
[3179]  valid_0's auc: 0.836292
| 4 | 0.8363 | 0.8864 | 0.08716 | 0.7719 | 4.064 | 0.7572 | 0.3385 | 10.09 | 0.4799 | 48.0 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.751437
[1000]  valid_0's auc: 0.777091
[1500]  valid_0's auc: 0.793125
[2000]  valid_0's auc: 0.805084
[2500]  valid_0's auc: 0.812527
[3000]  valid_0's auc: 0.81902
[3500]  valid_0's auc: 0.823788
[4000]  valid_0's auc: 0.827882
[4500]  valid_0's auc: 0.831144
[5000]  valid_0's auc: 0.834175
Did not meet early stopping. Best iteration is:
[5000]  valid_0's auc: 0.834175
| 5 | 0.8342 | 0.1 | 2.47 | 0.741 | 1.623 | 2.77 | 0.3569 | 14.19 | 0.2445 | 25.61 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.731224
[1000]  valid_0's auc: 0.749423
[1500]  valid_0's auc: 0.760884
[2000]  valid_0's auc: 0.76975
[2500]  valid_0's auc: 0.777677
[3000]  valid_0's auc: 0.785282
[3500]  valid_0's auc: 0.791667
[4000]  valid_0's auc: 0.796303
[4500]  valid_0's auc: 0.800412
[5000]  valid_0's auc: 0.804301
Did not meet early stopping. Best iteration is:
[5000]  valid_0's auc: 0.804301
| 6 | 0.8043 | 0.6073 | 1.211 | 0.312 | 0.3293 | 7.263 | 0.2122 | 7.399 | 0.2959 | 19.23 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.758604
[1000]  valid_0's auc: 0.783179
[1500]  valid_0's auc: 0.798202
[2000]  valid_0's auc: 0.808477
[2500]  valid_0's auc: 0.816619
[3000]  valid_0's auc: 0.821904
[3500]  valid_0's auc: 0.825642
[4000]  valid_0's auc: 0.828837
[4500]  valid_0's auc: 0.83184
[5000]  valid_0's auc: 0.833605
Did not meet early stopping. Best iteration is:
[4998]  valid_0's auc: 0.833608
| 7 | 0.8336 | 0.5285 | 0.4363 | 0.5314 | 4.917 | 0.0 | 0.2589 | 15.0 | 0.5876 | 36.33 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.775361
[1000]  valid_0's auc: 0.802306
[1500]  valid_0's auc: 0.816144
[2000]  valid_0's auc: 0.823617
[2500]  valid_0's auc: 0.828743
[3000]  valid_0's auc: 0.831524
Early stopping, best iteration is:
[3085]  valid_0's auc: 0.831879
| 8 | 0.8319 | 1.0 | 8.511 | 1.0 | 4.544 | 7.213 | 0.4367 | 15.0 | 1.0 | 45.52 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.759645
[1000]  valid_0's auc: 0.786561
[1500]  valid_0's auc: 0.802323
[2000]  valid_0's auc: 0.81118
[2500]  valid_0's auc: 0.817364
[3000]  valid_0's auc: 0.821898
[3500]  valid_0's auc: 0.824679
Early stopping, best iteration is:
[3739]  valid_0's auc: 0.826167
| 9 | 0.8262 | 0.1 | 10.0 | 1.0 | 5.0 | 0.0 | 0.5 | 15.0 | 1.0 | 30.18 |
Training until validation scores don't improve for 200 rounds.

[500]   valid_0's auc: 0.708925
[1000]  valid_0's auc: 0.721584
[1500]  valid_0's auc: 0.729905
[2000]  valid_0's auc: 0.736464
[2500]  valid_0's auc: 0.741614
[3000]  valid_0's auc: 0.746397
[3500]  valid_0's auc: 0.750703
[4000]  valid_0's auc: 0.754139
[4500]  valid_0's auc: 0.757474
[5000]  valid_0's auc: 0.760932
Did not meet early stopping. Best iteration is:
[5000]  valid_0's auc: 0.760932
| 10 | 0.7609 | 0.1 | 0.0 | 0.1 | 0.0 | 7.904 | 0.03 | 15.0 | 0.0 | 43.18 |
=====================================================================================================================================
print(LGB_BO.max['target'])
LGB_BO.max['params']
0.8362916622722081

{'bagging_fraction': 0.8864320989515848,
 'bagging_freq': 0.08715732303784862,
 'feature_fraction': 0.7719195132945438,
 'lambda_l1': 4.0642058550131175,
 'lambda_l2': 0.7571744617226672,
 'learning_rate': 0.33853400726057015,
 'max_depth': 10.092622000835181,
 'min_gain_to_split': 0.47988339149638315,
 'num_leaves': 48.00083652189798}
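注意 LGB_BO.max['params'] 返回的所有值都是浮点数(如 max_depth 为 10.09、num_leaves 为 48.0),直接传给 LightGBM 前需要把整型超参数取整。下面是一个示意的小工具函数;其中 INT_PARAMS 集合和 to_lgb_params 都是为演示假设的名字,按实际参数空间调整:

```python
# 假设的整型超参数集合(按你的 bounds_LGB 实际内容调整)
INT_PARAMS = {'num_leaves', 'max_depth', 'bagging_freq'}

def to_lgb_params(bo_params):
    """把 BayesianOptimization 输出的全浮点参数字典转换成
    可直接传给 lgb.train 的参数:整型超参数四舍五入取整,其余保持不变。"""
    return {k: (int(round(v)) if k in INT_PARAMS else v)
            for k, v in bo_params.items()}

best = to_lgb_params({
    'max_depth': 10.092622000835181,
    'num_leaves': 48.00083652189798,
    'learning_rate': 0.33853400726057015,
})
print(best['max_depth'], best['num_leaves'])  # 10 48
```

转换后的字典再补上 objective、metric 等固定参数,就可以用全量训练数据重训一个最终模型。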
  • BayesianOptimization 库还提供一个很实用的 probe(探测)功能:如果你对最优参数已有大致判断,或者从其他 kernel 拿到了一组参数,可以让优化器直接评估这组点。下面就把上一步得到的最优参数拿来探测。
  • 探测点默认是惰性求值的(lazy=True),即只有在下一次调用 maximize 时才会真正被评估。因此 probe 之后还要对 LGB_BO 对象再调用一次 maximize。
LGB_BO.probe(
    params={'bagging_fraction': 0.8864320989515848,
        'bagging_freq': 0.08715732303784862,
        'feature_fraction': 0.7719195132945438,
        'lambda_l1': 4.0642058550131175,
        'lambda_l2': 0.7571744617226672,
        'learning_rate': 0.33853400726057015,
        'max_depth': 10,
        'min_gain_to_split': 0.47988339149638315,
        'num_leaves': 48},
    lazy=True,
)
LGB_BO.maximize(init_points=0, n_iter=0)

| iter | target | bagging_fraction | bagging_freq | feature_fraction | lambda_l1 | lambda_l2 | learning_rate | max_depth | min_gain_to_split | num_leaves |

Original: https://blog.csdn.net/m0_64375823/article/details/125324791
Author: 读书不觉已春深!
Title: 科大讯飞:电信客户流失预测挑战赛baseline

