LightGBM Getting Started: Notes, Part 1

Summary: a first set of notes on getting started with LightGBM.

1. Load the Python packages


from pathlib import Path
# for plotting
import matplotlib.pyplot as plt
%matplotlib inline
# for reading data
import pandas as pd
# the model library
import lightgbm as lgb
# silence warnings
import warnings
warnings.filterwarnings('ignore')
try:
    # To enable interactive mode you should install ipywidgets
    # https://github.com/jupyter-widgets/ipywidgets
    from ipywidgets import interact, SelectMultiple
    INTERACTIVE = True
except ImportError:
    INTERACTIVE = False


2. Load the data


# print the current working directory
print(Path().absolute())
# go up to the grandparent directory (parents[1] is two levels up)
print(Path().absolute().parents[1])
regression_example_dir = Path().absolute().parents[1] / 'regression'
# read the data with pandas
df_train = pd.read_csv(str(regression_example_dir / 'regression.train'), header=None, sep='\t')
df_test = pd.read_csv(str(regression_example_dir / 'regression.test'), header=None, sep='\t')
y_train = df_train[0]
y_test = df_test[0]
X_train = df_train.drop(0, axis=1)
X_test = df_test.drop(0, axis=1)
C:\Users\Administrator\Desktop\LightGBM-master\examples\python-guide\notebooks
C:\Users\Administrator\Desktop\LightGBM-master\examples
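The label/feature split used above (column 0 is the label, the rest are features) can be sketched on a tiny synthetic frame — the column names 0..3 here just stand in for the real 29-column file:

```python
import pandas as pd

# a tiny stand-in for df_train: column 0 is the label, the rest are features
df = pd.DataFrame([[1, 0.5, -0.2, 0.7],
                   [0, 1.1,  0.3, -0.4]])
y = df[0]                 # labels
X = df.drop(0, axis=1)    # features

print(X.shape, y.shape)   # (2, 3) (2,)
```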
df_train.head()

0 1 2 3 4 5 6 7 8 9 ... 19 20 21 22 23 24 25 26 27 28
0 1 0.869 -0.635 0.226 0.327 -0.690 0.754 -0.249 -1.092 0.000 ... -0.010 -0.046 3.102 1.354 0.980 0.978 0.920 0.722 0.989 0.877
1 1 0.908 0.329 0.359 1.498 -0.313 1.096 -0.558 -1.588 2.173 ... -1.139 -0.001 0.000 0.302 0.833 0.986 0.978 0.780 0.992 0.798
2 1 0.799 1.471 -1.636 0.454 0.426 1.105 1.282 1.382 0.000 ... 1.129 0.900 0.000 0.910 1.108 0.986 0.951 0.803 0.866 0.780
3 0 1.344 -0.877 0.936 1.992 0.882 1.786 -1.647 -0.942 0.000 ... -0.678 -1.360 0.000 0.947 1.029 0.999 0.728 0.869 1.027 0.958
4 1 1.105 0.321 1.522 0.883 -1.205 0.681 -1.070 -0.922 0.000 ... -0.374 0.113 0.000 0.756 1.361 0.987 0.838 1.133 0.872 0.808

5 rows × 29 columns

df_test.head()

0 1 2 3 4 5 6 7 8 9 ... 19 20 21 22 23 24 25 26 27 28
0 1 0.644 0.247 -0.447 0.862 0.374 0.854 -1.126 -0.790 2.173 ... -0.190 -0.744 3.102 0.958 1.061 0.980 0.875 0.581 0.905 0.796
1 0 0.385 1.800 1.037 1.044 0.349 1.502 -0.966 1.734 0.000 ... -0.440 0.638 3.102 0.695 0.909 0.981 0.803 0.813 1.149 1.116
2 0 1.214 -0.166 0.004 0.505 1.434 0.628 -1.174 -1.230 1.087 ... -1.383 1.355 0.000 0.848 0.911 1.043 0.931 1.058 0.744 0.696
3 1 0.420 1.111 0.137 1.516 -1.657 0.854 0.623 1.605 1.087 ... 0.731 1.424 3.102 1.597 1.282 1.105 0.730 0.148 1.231 1.234
4 0 0.897 -1.703 -1.306 1.022 -0.729 0.836 0.859 -0.333 2.173 ... -2.019 -0.289 0.000 0.805 0.930 0.984 1.430 2.198 1.934 1.684

5 rows × 29 columns


3. Creating the LightGBM Dataset


class lightgbm.Dataset(data, label=None, reference=None, weight=None, group=None, init_score=None, feature_name='auto', categorical_feature='auto', params=None, free_raw_data=True)
  • data (str, pathlib.Path, numpy array, pandas DataFrame, H2O DataTable's Frame, scipy.sparse, Sequence, list of Sequence or list of numpy array) – Data source of Dataset. If str or pathlib.Path, it represents the path to a text file (CSV, TSV, or LibSVM) or a LightGBM Dataset binary file.
  • label (list, numpy 1-D array, pandas Series / one-column DataFrame or None, optional (default=None)) – Label of the data.
  • reference (Dataset or None, optional (default=None)) – If this is Dataset for validation, training data should be used as reference.
  • weight (list, numpy 1-D array, pandas Series or None, optional (default=None)) – Weight for each instance. Weights should be non-negative.
  • group (list, numpy 1-D array, pandas Series or None, optional (default=None)) – Group/query data. Only used in the learning-to-rank task. sum(group) = n_samples. For example, if you have a 100-document dataset with group = [10, 20, 40, 10, 10, 10], that means that you have 6 groups, where the first 10 records are in the first group, records 11-30 are in the second group, records 31-70 are in the third group, etc.
  • init_score (list, list of lists (for multi-class task), numpy array, pandas Series, pandas DataFrame (for multi-class task), or None, optional (default=None)) – Init score for Dataset.
  • feature_name (list of str, or 'auto', optional (default="auto")) – Feature names. If ‘auto’ and data is pandas DataFrame, data columns names are used.
  • categorical_feature (list of str or int, or 'auto', optional (default="auto")) – Categorical features. If list of int, interpreted as indices. If list of str, interpreted as feature names (need to specify feature_name as well). If ‘auto’ and data is pandas DataFrame, pandas unordered categorical columns are used. All values in categorical features will be cast to int32 and thus should be less than int32 max value (2147483647). Large values could be memory consuming. Consider using consecutive integers starting from zero. All negative values in categorical features will be treated as missing values. The output cannot be monotonically constrained with respect to a categorical feature. Floating point numbers in categorical features will be rounded towards 0.
  • params (dict or None, optional (default=None)) – Other parameters for Dataset.
  • free_raw_data (bool, optional (default=True)) – If True, raw data is freed after constructing inner Dataset.
lgb_train = lgb.Dataset(X_train, y_train)
lgb_test = lgb.Dataset(X_test, y_test, reference=lgb_train)
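The `group` parameter described above only matters for learning-to-rank, but its bookkeeping can be checked with plain Python: the groups must sum to the number of rows, and the running sums give the group boundaries. The numbers below are the example from the docstring:

```python
from itertools import accumulate

group = [10, 20, 40, 10, 10, 10]      # example from the docstring above
n_samples = sum(group)                # must equal the number of rows
boundaries = list(accumulate(group))  # row index where each group ends

print(n_samples)   # 100
print(boundaries)  # [10, 30, 70, 80, 90, 100]
```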


4. Set up the training parameters


params = {
    'num_leaves': 5,
    'metric': ['l1', 'l2'],
    'verbose': -1
}
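A note on `num_leaves`: LightGBM grows trees leaf-wise, so this is the main complexity knob. If you also cap `max_depth`, a common sanity check (a sketch of the rule of thumb, not an official API) is that `num_leaves` fits inside a full binary tree of that depth:

```python
def leaves_fit_depth(num_leaves, max_depth):
    """A binary tree of depth d has at most 2**d leaves."""
    return num_leaves <= 2 ** max_depth

print(leaves_fit_depth(5, 3))   # True: 5 <= 8
print(leaves_fit_depth(31, 4))  # False: 31 > 16
```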


5. Train the model


  • train(params, train_set[, num_boost_round, ...]) Perform the training with given parameters.
evals_result = {}  # record eval results for plotting
gbm = lgb.train(params,
                lgb_train,
                num_boost_round=100,
                valid_sets=[lgb_train, lgb_test],
                feature_name=[f'f{i + 1}' for i in range(X_train.shape[-1])],
                categorical_feature=[21],
                callbacks=[
                    lgb.log_evaluation(10),
                    lgb.record_evaluation(evals_result)
                ])
[10]  training's l1: 0.457448 training's l2: 0.217995 valid_1's l1: 0.456464  valid_1's l2: 0.21641
[20]  training's l1: 0.436869 training's l2: 0.205099 valid_1's l1: 0.434057  valid_1's l2: 0.201616
[30]  training's l1: 0.421302 training's l2: 0.197421 valid_1's l1: 0.417019  valid_1's l2: 0.192514
[40]  training's l1: 0.411107 training's l2: 0.192856 valid_1's l1: 0.406303  valid_1's l2: 0.187258
[50]  training's l1: 0.403695 training's l2: 0.189593 valid_1's l1: 0.398997  valid_1's l2: 0.183688
[60]  training's l1: 0.398704 training's l2: 0.187043 valid_1's l1: 0.393977  valid_1's l2: 0.181009
[70]  training's l1: 0.394876 training's l2: 0.184982 valid_1's l1: 0.389805  valid_1's l2: 0.178803
[80]  training's l1: 0.391147 training's l2: 0.1828 valid_1's l1: 0.386476  valid_1's l2: 0.176799
[90]  training's l1: 0.388101 training's l2: 0.180817 valid_1's l1: 0.384404  valid_1's l2: 0.175775
[100] training's l1: 0.385174 training's l2: 0.179171 valid_1's l1: 0.382929  valid_1's l2: 0.175321


6. Inspect the training history


def render_metric(metric_name):
    ax = lgb.plot_metric(evals_result, metric=metric_name, figsize=(10, 5))
    plt.show()
if INTERACTIVE:
    # create widget to switch between metrics
    interact(render_metric, metric_name=params['metric'])
else:
    render_metric(params['metric'][0])


7. Plot feature importance


def render_plot_importance(importance_type, max_features=10,
                           ignore_zero=True, precision=3):
    ax = lgb.plot_importance(gbm, importance_type=importance_type,
                             max_num_features=max_features,
                             ignore_zero=ignore_zero, figsize=(12, 8),
                             precision=precision)
    plt.show()
if INTERACTIVE:
    # create widget for interactive feature importance plot
    interact(render_plot_importance,
             importance_type=['split', 'gain'],
             max_features=(1, X_train.shape[-1]),
             precision=(0, 10))
else:
    render_plot_importance(importance_type='split')
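The same importances can be read off as numbers via `gbm.feature_importance()` and `gbm.feature_name()` instead of a plot. This sketch sorts made-up split counts into a table — the values are hypothetical, not from the trained model:

```python
import pandas as pd

names = ['f1', 'f2', 'f3']       # hypothetical feature names
splits = [12, 40, 3]             # hypothetical split counts per feature
table = (pd.Series(splits, index=names, name='split')
           .sort_values(ascending=False))
print(table)   # f2 ranks first with 40 splits
```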


8. Plot split value histograms


def render_histogram(feature):
    ax = lgb.plot_split_value_histogram(gbm, feature=feature,
                                        bins='auto', figsize=(10, 5))
    plt.show()
if INTERACTIVE:
    # create widget for interactive split value histogram
    interact(render_histogram,
             feature=gbm.feature_name())
else:
    render_histogram(feature='f26')


9. Plot trees


9.1 Installing graphviz


pip install graphviz


9.2 Installing Graphviz on Windows


Download link: gitlab.com/api/v4/proj…

Install it, then add the Graphviz bin directory to the PATH environment variable.

import os
# point PATH at your Graphviz install directory (may be under Program Files (x86))
os.environ["PATH"] += os.pathsep + r'C:\Program Files\Graphviz\bin'
def render_tree(tree_index, show_info, precision=3):
    show_info = None if 'None' in show_info else show_info
    return lgb.create_tree_digraph(gbm, tree_index=tree_index,
                                   show_info=show_info, precision=precision)
if INTERACTIVE:
    # create widget to switch between trees and control info in nodes
    interact(render_tree,
             tree_index=(0, gbm.num_trees() - 1),
             show_info=SelectMultiple(  # allow multiple values to be selected
                 options=['None',
                          'split_gain',
                          'internal_value',
                          'internal_count',
                          'internal_weight',
                          'leaf_count',
                          'leaf_weight',
                          'data_percentage'],
                 value=['None']),
             precision=(0, 10))
    tree = None
else:
    tree = render_tree(53, ['None'])
tree


10. Save the model


# save the model to a text file
gbm.save_model('model.txt')
<lightgbm.basic.Booster at 0x247fb2265e0>


11. Predict


y_pred = gbm.predict(X_test, num_iteration=gbm.best_iteration)
print(y_pred)
[ 6.33479642e-01  4.25887937e-01  2.56217117e-01  4.86929519e-01  2.33292447e-01  1.63434010e-01  3.68667133e-01  4.09774178e-01 ...
  3.10834972e-01  5.84051349e-01  2.39132158e-01  3.94795155e-01]


12. Evaluate


from sklearn.metrics import mean_squared_error
rmse_test = mean_squared_error(y_test, y_pred) ** 0.5
print(f'The RMSE of prediction is: {rmse_test}')
The RMSE of prediction is: 0.4187133172590796
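As a cross-check of the formula above, RMSE is just the square root of the mean squared error and can be computed directly with NumPy (illustrative arrays, not the real y_test):

```python
import numpy as np

y_true = np.array([1.0, 0.0, 1.0, 0.0])
y_hat  = np.array([0.8, 0.2, 0.6, 0.1])
rmse = np.sqrt(np.mean((y_true - y_hat) ** 2))
print(rmse)   # approximately 0.25 for these values
```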

