Machine Learning in Action Study Notes, Part 7: Regression

2022-01-25

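Summary: This article demonstrates the general use of regression with three algorithms: ordinary least squares linear regression (OLS), locally weighted linear regression (LWLR), and classification and regression trees (CART).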

Step By Step

1. Introduction
2. Code Demo
3. Pros and Cons

I. Introduction

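Linear regression is a form of regression analysis that models the relationship between one or more independent variables and a dependent variable using a least-squares function called the linear regression equation. This function is a linear combination of one or more model parameters called regression coefficients. The case with a single independent variable is called simple regression; the case with more than one is called multiple regression (multivariable linear regression).
In linear regression, the data are modeled with a linear predictor function, and the unknown model parameters are estimated from the data. Such models are called linear models. Most commonly, the conditional mean of y given X is modeled as an affine function of X. Less commonly, a linear regression model expresses the median, or some other quantile of the conditional distribution of y given X, as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X, rather than on the joint probability distribution of X and y (which is the domain of multivariate analysis).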
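
To make "estimating the parameters from the data" concrete, here is a minimal sketch of the least-squares fit for a single feature. The data (y = 2 + 3x plus noise), the fixed seed, and all variable names are assumptions chosen for illustration; they are not part of the original notes. The sketch solves the normal equations (X^T X) beta = X^T y directly and checks the result against NumPy's own least-squares routine.

```python
import numpy as np

# Synthetic data (an assumption for illustration): y = 2 + 3x plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)

# Design matrix with a leading column of ones for the intercept.
X = np.column_stack([np.ones_like(x), x])

# Closed-form OLS: solve the normal equations (X^T X) beta = X^T y.
beta_normal = np.linalg.solve(X.T @ X, X.T @ y)

# The same problem solved by NumPy's least-squares routine.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print("normal equations:", beta_normal)
print("np.linalg.lstsq :", beta_lstsq)
```

Both approaches should return an intercept close to 2 and a slope close to 3. For ill-conditioned design matrices, `np.linalg.lstsq` is generally the safer choice than forming X^T X explicitly.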

II. Code Demo

2.1 OLS Regression

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

from sklearn import linear_model

# Synthetic data: a linear trend plus a sine wave and Gaussian noise
x = np.arange(1, 101, dtype=float)
y = x + 10 * np.sin(0.3 * x) + stats.norm.rvs(size=100, loc=0, scale=1.5)
x = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix
y = y.reshape(-1, 1)

# Fit ordinary least squares
ols = linear_model.LinearRegression()
model = ols.fit(x, y)

y_predict = model.predict(x)
print(model.score(x, y))  # coefficient of determination R^2 on the training data


# Plot the results
plt.figure()
plt.scatter(x, y, s=20, edgecolor="black", c="darkorange", label="data")
plt.plot(x, y_predict, color="cornflowerblue", label="y_predict", linewidth=2)
plt.xlabel("data")
plt.ylabel("target")
plt.title("LinearRegression OLS")
plt.legend()
plt.show()
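
The number printed by `model.score(x, y)` is the coefficient of determination R^2. As a rough sketch of what that number means, the block below recomputes it by hand as 1 - SS_res / SS_tot on the same kind of synthetic data. The fixed random seed and the use of a 1-D `y` are assumptions made here for reproducibility and brevity.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same kind of synthetic data as the demo: linear trend + sine wave + noise.
rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
x = np.arange(1, 101, dtype=float).reshape(-1, 1)
y = x.ravel() + 10 * np.sin(0.3 * x.ravel()) + rng.normal(scale=1.5, size=100)

model = LinearRegression().fit(x, y)
y_predict = model.predict(x)

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y - y_predict) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2_manual = 1.0 - ss_res / ss_tot

print("manual R^2 :", r2_manual)
print("model.score:", model.score(x, y))
print("slope:", model.coef_[0], "intercept:", model.intercept_)
```

A high R^2 here mostly reflects the strong linear trend; the straight line still ignores the sine component entirely, which is what the next two methods try to capture.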

2.2 LWLR Regression

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression


def getw(x0, x, k):
    # Diagonal Gaussian-kernel weight matrix centred on the query point x0.
    # k is the bandwidth: a smaller k gives a more local fit.
    m = len(x)
    w = np.zeros([m, m])
    for i in range(m):
        w[i, i] = np.exp(np.linalg.norm(x0 - x[i]) ** 2 / (-2 * k ** 2))
    return w


def getyvalue(x1, x, y, k):
    # Solve a separate weighted least-squares problem at every sample point.
    m = len(x)
    y_value = np.zeros(m)

    for i in range(m):
        w = getw(x[i], x, k)
        # Weighted normal equations: (X^T W X) theta = X^T W y
        theta = np.linalg.solve(x1.T.dot(w).dot(x1), x1.T.dot(w).dot(y))
        y_value[i] = x1[i].dot(theta).item()
    return y_value


def LR(x, y):
    # Global OLS fit, used as a baseline for comparison.
    lr = LinearRegression()
    lr.fit(x, y)
    print(lr.coef_, lr.intercept_)
    return lr.intercept_, lr.coef_


if __name__ == "__main__":
    # Synthetic data: a linear trend plus a sine wave and Gaussian noise
    x = np.arange(1, 101, dtype=float)
    y = x + 10 * np.sin(0.3 * x) + stats.norm.rvs(size=100, loc=0, scale=1.5)

    x = x.reshape(-1, 1)
    x1 = np.c_[np.ones((100, 1)), x]  # design matrix with an intercept column
    y = y.reshape(-1, 1)

    y_lwlr = getyvalue(x1, x, y, 1)
    a, b = LR(x, y)
    y_lr = a + b * x
    plt.figure(figsize=(12, 6))
    plt.scatter(x, y)
    plt.plot(x, y_lwlr, 'r', label="lwlr")
    plt.plot(x, y_lr, 'y', label="ols")
    plt.xlabel("data")
    plt.ylabel("target")
    plt.legend()
    plt.show()
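
The bandwidth k controls how local each fit is. The following sketch, under the assumption of a Gaussian kernel with weights exp(-(x_i - x_0)^2 / (2 k^2)), compares the training error for a few values of k on the same kind of synthetic data. The helper name `lwlr_predict`, the chosen k values, and the fixed seed are hypothetical choices for illustration. It also avoids building the full m-by-m weight matrix by scaling the columns of X^T directly.

```python
import numpy as np


def lwlr_predict(x_query, x, y, k):
    """Locally weighted linear prediction at a scalar x_query (1-D x and y)."""
    X = np.column_stack([np.ones_like(x), x])          # design matrix with intercept
    w = np.exp(-((x - x_query) ** 2) / (2 * k ** 2))   # Gaussian kernel weights
    XtW = X.T * w                                      # equals X.T @ diag(w)
    theta = np.linalg.solve(XtW @ X, XtW @ y)          # weighted normal equations
    return theta[0] + theta[1] * x_query


rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
x = np.arange(1, 101, dtype=float)
y = x + 10 * np.sin(0.3 * x) + rng.normal(scale=1.5, size=x.size)

mse = {}
for k in (1.0, 5.0, 100.0):
    y_hat = np.array([lwlr_predict(x0, x, y, k) for x0 in x])
    mse[k] = np.mean((y - y_hat) ** 2)
    print(f"k = {k:>5}: training MSE = {mse[k]:.2f}")
```

With a very large k all points receive nearly equal weight and the result approaches the global OLS line; with a small k the curve follows the sine wave closely but also starts to track the noise. Training error alone is therefore not a sound way to pick k; a held-out set or cross-validation would be the usual approach.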

2.3 CART Regression

import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

from sklearn.tree import DecisionTreeRegressor

# Synthetic data: a linear trend plus a sine wave and Gaussian noise
x = np.arange(1, 101, dtype=float)
y = x + 10 * np.sin(0.3 * x) + stats.norm.rvs(size=100, loc=0, scale=1.5)
x = x.reshape(-1, 1)  # scikit-learn expects a 2-D feature matrix
y = y.reshape(-1, 1)


# Fit regression model: a shallow tree and a deeper tree
regr_1 = DecisionTreeRegressor(max_depth=2)
regr_2 = DecisionTreeRegressor(max_depth=5)
regr_1.fit(x, y)
regr_2.fit(x, y)


# Predict on the inputs x
y_1 = regr_1.predict(x)
y_2 = regr_2.predict(x)

# Plot the results
plt.figure()
plt.scatter(x, y, s=20, edgecolor="black", c="darkorange", label="data")
plt.plot(x, y_1, color="cornflowerblue", label="Decision Tree Depth=2", linewidth=2)
plt.plot(x, y_2, color="yellowgreen", label="Decision Tree Depth=5", linewidth=2)
plt.xlabel("data")
plt.ylabel("target")
plt.title("CART Regression")
plt.legend()
plt.show()
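
The demo above evaluates both trees on the data they were trained on, where a deeper tree always looks at least as good. As a hedged sketch of how depth affects generalization, the block below holds out part of the data and reports R^2 on both parts. The 70/30 split, the seeds, and the extra unrestricted tree (`max_depth=None`) are assumptions added for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)  # fixed seed, an assumption for reproducibility
x = np.arange(1, 101, dtype=float).reshape(-1, 1)
y = x.ravel() + 10 * np.sin(0.3 * x.ravel()) + rng.normal(scale=1.5, size=100)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.3, random_state=0
)

train_r2 = {}
for depth in (2, 5, None):  # None lets the tree grow until every leaf is pure
    tree = DecisionTreeRegressor(max_depth=depth, random_state=0)
    tree.fit(x_train, y_train)
    train_r2[depth] = tree.score(x_train, y_train)
    print(f"max_depth={depth}: train R^2 = {train_r2[depth]:.3f}, "
          f"test R^2 = {tree.score(x_test, y_test):.3f}")
```

The unrestricted tree reaches a training R^2 of 1 because it memorizes every training point, so the held-out score is the more informative one. Note also that a regression tree predicts a constant within each leaf, so it cannot extrapolate the upward trend beyond the range of x seen during training.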

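III. Pros and Cons

Pros

Linear regression models are fast to build and need no complicated computation; they still run quickly on large data sets.
The coefficients give an understanding and interpretation of each variable.

Cons

Linear regression cannot fit nonlinear data well, so you first need to check whether the relationship between the variables is linear.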
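
Both points can be seen in a small sketch. The two data sets below (one exactly linear, one quadratic and symmetric about zero) are assumptions constructed for illustration: on the linear data the fitted slope reads directly as "change in y per unit of x", while on the quadratic data a straight line explains essentially none of the variance.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.linspace(-3.0, 3.0, 61).reshape(-1, 1)

y_linear = 1.0 + 2.0 * x.ravel()   # exactly linear in x
y_quad = x.ravel() ** 2            # nonlinear and symmetric about x = 0

lin = LinearRegression().fit(x, y_linear)
quad = LinearRegression().fit(x, y_quad)

# The slope is directly interpretable: the change in y per unit change in x.
print("linear data: slope =", lin.coef_[0], "intercept =", lin.intercept_)

r2_linear = lin.score(x, y_linear)
r2_quad = quad.score(x, y_quad)
print("R^2 on linear data   :", r2_linear)
print("R^2 on quadratic data:", r2_quad)
```

A scatter plot or a residual plot is a quick way to check for this kind of nonlinearity before relying on a linear fit.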
