ML — LiR & 2PolyR & 4PolyR: fitting (train) linear regression (LiR), 2nd-degree polynomial regression (2PolyR), and 4th-degree polynomial regression (4PolyR) models on a pizza dataset and predicting prices (test)

Overview: this post fits linear regression (LiR), 2nd-degree polynomial regression (2PolyR), and 4th-degree polynomial regression (4PolyR) models on a pizza dataset (train) and uses them for price regression prediction (test).

Output results


Design approach


Core code

from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Expand the training inputs into 4th-degree polynomial features
poly4 = PolynomialFeatures(degree=4)
X_train_poly4 = poly4.fit_transform(X_train)

# Fit an ordinary linear regression on the expanded features
r_poly4 = LinearRegression()
r_poly4.fit(X_train_poly4, y_train)

# Transform the plotting grid xx the same way and predict prices over it
xx_poly4 = poly4.transform(xx)
yy_poly4 = r_poly4.predict(xx_poly4)
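
For context, a minimal end-to-end sketch of the LiR / 2PolyR / 4PolyR comparison that the snippet above belongs to is given below. The pizza diameters/prices and the X_test/y_test values are assumed illustration data (the classic five-sample pizza example), not necessarily the exact dataset used in this article; X_train, y_train and xx mirror the names in the snippet above.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Assumed illustration data: pizza diameter (inches) -> price (dollars)
X_train = np.array([[6], [8], [10], [14], [18]], dtype=float)
y_train = np.array([[7], [9], [13], [17.5], [18]])
X_test = np.array([[6], [8], [11], [16]], dtype=float)    # assumed test diameters
y_test = np.array([[8], [12], [15], [18]])                # assumed test prices

xx = np.linspace(0, 26, 100).reshape(-1, 1)   # grid for drawing the fitted curves

plt.scatter(X_train, y_train, label='training samples')
for degree, label in [(1, 'LiR'), (2, '2PolyR'), (4, '4PolyR')]:
    poly = PolynomialFeatures(degree=degree)
    model = LinearRegression().fit(poly.fit_transform(X_train), y_train)
    r2 = model.score(poly.transform(X_test), y_test)      # R^2 on the held-out pizzas
    plt.plot(xx, model.predict(poly.transform(xx)), label='%s (test R^2=%.3f)' % (label, r2))
plt.xlabel('pizza diameter')
plt.ylabel('pizza price')
plt.legend()
plt.show()

As a rule of thumb, the degree-4 curve hugs the few training points almost perfectly but tends to generalize worse to the test pizzas than the degree-2 model — the overfitting behaviour that the PolynomialFeatures docstring below also warns about.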



class PolynomialFeatures(BaseEstimator, TransformerMixin):

   """Generate polynomial and interaction features.

 

   Generate a new feature matrix consisting of all polynomial combinations

   of the features with degree less than or equal to the specified degree.

   For example, if an input sample is two dimensional and of the form

   [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].

 

   Parameters

   ----------

   degree : integer

   The degree of the polynomial features. Default = 2.

 

   interaction_only : boolean, default = False

   If true, only interaction features are produced: features that are

   products of at most ``degree`` *distinct* input features (so not

   ``x[1] ** 2``, ``x[0] * x[2] ** 3``, etc.).

 

   include_bias : boolean

   If True (default), then include a bias column, the feature in which

   all polynomial powers are zero (i.e. a column of ones - acts as an

   intercept term in a linear model).

 

   Examples

   --------

   >>> X = np.arange(6).reshape(3, 2)

   >>> X

   array([[0, 1],

   [2, 3],

   [4, 5]])

   >>> poly = PolynomialFeatures(2)

   >>> poly.fit_transform(X)

   array([[  1.,   0.,   1.,   0.,   0.,   1.],

   [  1.,   2.,   3.,   4.,   6.,   9.],

   [  1.,   4.,   5.,  16.,  20.,  25.]])

   >>> poly = PolynomialFeatures(interaction_only=True)

   >>> poly.fit_transform(X)

   array([[  1.,   0.,   1.,   0.],

   [  1.,   2.,   3.,   6.],

   [  1.,   4.,   5.,  20.]])

 

   Attributes

   ----------

   powers_ : array, shape (n_output_features, n_input_features)

   powers_[i, j] is the exponent of the jth input in the ith output.

 

   n_input_features_ : int

   The total number of input features.

 

   n_output_features_ : int

   The total number of polynomial output features. The number of output

   features is computed by iterating over all suitably sized combinations

   of input features.

 

   Notes

   -----

   Be aware that the number of features in the output array scales

   polynomially in the number of features of the input array, and

   exponentially in the degree. High degrees can cause overfitting.

 

   See :ref:`examples/linear_model/plot_polynomial_interpolation.py
   <sphx_glr_auto_examples_linear_model_plot_polynomial_interpolation.py>`

   """

   def __init__(self, degree=2, interaction_only=False, include_bias=True):

       self.degree = degree

       self.interaction_only = interaction_only

       self.include_bias = include_bias

 

   @staticmethod

   def _combinations(n_features, degree, interaction_only, include_bias):

       comb = combinations if interaction_only else combinations_w_r

       start = int(not include_bias)

       return chain.from_iterable(comb(range(n_features), i) for

           i in range(start, degree + 1))

 

   @property

   def powers_(self):

       check_is_fitted(self, 'n_input_features_')

       combinations = self._combinations(self.n_input_features_, self.degree,
                                          self.interaction_only,
                                          self.include_bias)
       return np.vstack(np.bincount(c, minlength=self.n_input_features_)
                        for c in combinations)

 

   def get_feature_names(self, input_features=None):

       """

       Return feature names for output features

       Parameters

       ----------

       input_features : list of string, length n_features, optional

           String names for input features if available. By default,

           "x0", "x1", ... "xn_features" is used.

       Returns

       -------

       output_feature_names : list of string, length n_output_features

       """

       powers = self.powers_

       if input_features is None:

           input_features = ['x%d' % i for i in range(powers.shape[1])]

       feature_names = []

       for row in powers:

           inds = np.where(row)[0]

           if len(inds):

               name = " ".join(

                   "%s^%d" % (input_features[ind], exp) if exp != 1 else

                    input_features[ind] for

                   (ind, exp) in zip(inds, row[inds]))

           else:

               name = "1"

           feature_names.append(name)

     

       return feature_names

 

   def fit(self, X, y=None):

       """

       Compute number of output features.

       Parameters

       ----------

       X : array-like, shape (n_samples, n_features)

           The data.

       Returns

       -------

       self : instance

       """

       n_samples, n_features = check_array(X).shape

       combinations = self._combinations(n_features, self.degree,

           self.interaction_only,

           self.include_bias)

       self.n_input_features_ = n_features

       self.n_output_features_ = sum(1 for _ in combinations)

       return self

 

   def transform(self, X):

       """Transform data to polynomial features

       Parameters

       ----------

       X : array-like, shape [n_samples, n_features]

           The data to transform, row by row.

       Returns

       -------

       XP : np.ndarray shape [n_samples, NP]

           The matrix of features, where NP is the number of polynomial

           features generated from the combination of inputs.

       """

       check_is_fitted(self, ['n_input_features_', 'n_output_features_'])

       X = check_array(X, dtype=FLOAT_DTYPES)

       n_samples, n_features = X.shape

       if n_features != self.n_input_features_:

           raise ValueError("X shape does not match training shape")

       # allocate output data

       XP = np.empty((n_samples, self.n_output_features_), dtype=X.dtype)

       combinations = self._combinations(n_features, self.degree,

           self.interaction_only,

           self.include_bias)

       for i, c in enumerate(combinations):

           XP[:, i] = X[:, c].prod(1)

     

       return XP
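
To connect the source above back to the pizza example, here is a short, self-contained check (hypothetical diameter values, not taken from this article) of what a degree-4 expansion of a single feature actually produces, using the same get_feature_names/powers_ API shown in the listing:

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One input feature (pizza diameter); a degree-4 expansion yields [1, x, x^2, x^3, x^4]
poly4 = PolynomialFeatures(degree=4)
X = np.array([[6.0], [8.0], [10.0]])   # hypothetical diameters
XP = poly4.fit_transform(X)

# (newer scikit-learn versions rename this method to get_feature_names_out)
print(poly4.get_feature_names())   # ['1', 'x0', 'x0^2', 'x0^3', 'x0^4']
print(poly4.powers_)               # exponent of x0 in each output column: [[0], [1], [2], [3], [4]]
print(XP[0])                       # [   1.    6.   36.  216. 1296.]

Each output column is just the diameter raised to a power, so the "polynomial regression" in this post is still an ordinary LinearRegression, only fitted on these expanded columns.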


 
