1.3 Decision boundary
x1 = np.linspace(30, 100, 100)  # independent variable
x2 = -(final_theta[0] + x1 * final_theta[1]) / final_theta[2]  # dependent variable

fig, ax = plt.subplots(figsize=(12, 8))
ax.plot(x1, x2, 'y', label='Prediction')
ax.scatter(positive['Exam 1'], positive['Exam 2'], s=50, c='b', marker='o', label='Admitted')
ax.scatter(negative['Exam 1'], negative['Exam 2'], s=50, c='r', marker='x', label='Not Admitted')
ax.legend()
ax.set_xlabel('Exam 1 Score')
ax.set_ylabel('Exam 2 Score')
plt.show()
2 Regularized logistic regression
In this part of the exercise, you will implement regularized logistic regression to predict whether microchips from a fabrication plant pass quality assurance (QA). During QA, each microchip goes through various tests to ensure it is functioning correctly.
Suppose you are the product manager of the factory and you have the test results for some microchips on two different tests. From these two tests, you would like to determine whether the microchips should be accepted or rejected. To help you make the decision, you have a dataset of test results on past microchips, from which you can build a logistic regression model.
2.1 Visualizing the data
path = '../data_files/data/ex2data2.txt'
data2 = pd.read_csv(path, header=None, names=['Microchip 1', 'Microchip 2', 'Accepted'])
data2.head()
Next, draw the scatter plot: + marks the accepted chips (positive examples) and o marks the rejected chips (negative examples).
def plot_data():
    positive = data2[data2['Accepted'].isin([1])]
    negative = data2[data2['Accepted'].isin([0])]

    fig, ax = plt.subplots(figsize=(12, 8))
    ax.scatter(x=positive['Microchip 1'], y=positive['Microchip 2'], c='black', s=50, marker='+', label='Accepted')
    ax.scatter(x=negative['Microchip 1'], y=negative['Microchip 2'], c='y', s=50, marker='o', label='Reject')
    ax.legend()
    ax.set_xlabel('Microchip 1')
    ax.set_ylabel('Microchip 2')
    # plt.show()

plot_data()
2.2 Feature mapping
One way to fit the data better is to create more features from each data point. We will map the features into all polynomial terms of x1 and x2 up to the sixth power.
def feature_mapping(x1, x2, power, as_ndarray=False):
    # Equivalent explicit loop:
    # data = {}
    # for i in np.arange(power + 1):
    #     for p in np.arange(i + 1):
    #         data["f{}{}".format(i - p, p)] = np.power(x1, i - p) * np.power(x2, p)
    data = {
        "f{}{}".format(i - p, p): np.power(x1, i - p) * np.power(x2, p)
        for i in np.arange(power + 1)
        for p in np.arange(i + 1)
    }
    if as_ndarray:
        return np.array(pd.DataFrame(data))
    else:
        return pd.DataFrame(data)
x1 = np.array(data2['Microchip 1'])
x2 = np.array(data2['Microchip 2'])
_data2 = feature_mapping(x1, x2, power=6)
print(_data2.shape)
_data2.head()
As a result of this mapping, our vector of two features (the scores on two QA tests) has been transformed into a 28-dimensional vector. A logistic regression classifier trained on this higher-dimensional feature vector will have a more complex decision boundary, which will appear nonlinear when drawn in our 2-dimensional plot.
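The figure of 28 comes from counting the monomials x1^(i-p) * x2^p with total degree i from 0 to 6: 1 + 2 + ... + 7 = 28. A minimal sketch to check this count; the helper name `count_poly_terms` is hypothetical and is not part of the original code:

```python
def count_poly_terms(power):
    """Count the monomials x1**(i - p) * x2**p with 0 <= p <= i <= power."""
    return sum(1 for i in range(power + 1) for p in range(i + 1))


# Closed form for the same count: (power + 1) * (power + 2) / 2
n_terms = count_poly_terms(6)
closed_form = (6 + 1) * (6 + 2) // 2
print(n_terms, closed_form)
```

The count includes the constant term (i = 0), which plays the role of the intercept column, so no separate column of ones needs to be added.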
While the feature mapping allows us to build a more expressive classifier, it is also more susceptible to overfitting. In the next parts of the exercise, you will implement regularized logistic regression to fit the data and also see for yourself how regularization can help combat the overfitting problem.
2.3 Cost function
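As implemented in the code below, the regularized cost function is

$$J(\theta) = \frac{1}{m}\sum_{i=1}^{m}\left[-y^{(i)}\log\left(h_\theta(x^{(i)})\right) - \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

where λ is the regularization parameter. Note: by convention, θ0 is not penalized.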
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

theta = np.zeros(_data2.shape[1])
X = feature_mapping(x1, x2, power=6, as_ndarray=True)
print(X.shape)
y = np.array(data2.iloc[:, -1])
print(y.shape)

def regularized_cost(theta, X, y, l=1):
    thetaReg = theta[1:]
    first = (-y * np.log(sigmoid(X @ theta))) - (1 - y) * np.log(1 - sigmoid(X @ theta))
    reg = (thetaReg @ thetaReg) * l / (2 * len(X))
    return np.mean(first) + reg

regularized_cost(theta, X, y, l=1)
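A quick sanity check on the cost: with θ = 0 every prediction is 0.5, so the cost is ln 2 ≈ 0.6931 regardless of the data, and changing θ0 alone never changes the penalty. The sketch below re-declares the functions so it runs on its own, and uses small synthetic data (an assumption for illustration, not ex2data2.txt):

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def regularized_cost(theta, X, y, l=1):
    # Cross-entropy plus an L2 penalty that skips theta[0].
    h = sigmoid(X @ theta)
    data_term = np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
    penalty = l / (2 * len(X)) * (theta[1:] @ theta[1:])
    return data_term + penalty


# Synthetic stand-in data (assumption: NOT the microchip dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 5))
y_demo = rng.integers(0, 2, size=20)

cost_at_zero = regularized_cost(np.zeros(5), X_demo, y_demo)
print(cost_at_zero, np.log(2))

# Only theta[0] is non-zero here, so the penalty is zero for any l.
theta_bias_only = np.zeros(5)
theta_bias_only[0] = 5.0
same_cost = np.isclose(
    regularized_cost(theta_bias_only, X_demo, y_demo, l=10),
    regularized_cost(theta_bias_only, X_demo, y_demo, l=0),
)
print(same_cost)
```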
2.4 Regularized Gradient
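Note: the gradient looks the same as in linear regression, but because the hypothesis here is h_θ(x) = g(θ^T x), it is not the same.
1. Although the gradient descent expressions for regularized logistic regression and regularized linear regression look identical, the two use different hypotheses h_θ(x), so they are different algorithms.
2. θ0 is not regularized.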
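For reference, the gradient that the code in this section computes can be written out as follows. This is reconstructed from the code rather than copied from the exercise sheet, with m the number of training examples and n the index of the last feature:

```latex
\frac{\partial J(\theta)}{\partial \theta_0}
  = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}

\frac{\partial J(\theta)}{\partial \theta_j}
  = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}
  + \frac{\lambda}{m}\theta_j, \qquad j = 1, \dots, n
```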
def regularized_gradient(theta, X, y, l=1):
    thetaReg = theta[1:]
    first = (X.T @ (sigmoid(X @ theta) - y)) / len(X)
    # Prepend a 0 so that theta_0 is not penalized and the shapes line up.
    reg = np.concatenate([np.array([0]), (l / len(X)) * thetaReg])
    # With theta initialized to zeros, reg is a vector of 28 zeros,
    # so the result equals the unregularized gradient `first`.
    return first + reg

regularized_gradient(theta, X, y)
def cost(theta, X, y):
    first = (-y) * np.log(sigmoid(X @ theta))
    second = (1 - y) * np.log(1 - sigmoid(X @ theta))
    return np.mean(first - second)

def gradient(theta, X, y):
    # The gradient of the cost is a vector of the same length as θ,
    # with one element for each j = 0, 1, ..., n.
    return (X.T @ (sigmoid(X @ theta) - y)) / len(X)

def costReg(theta, X, y, l=1):
    # Do not penalize the first term (theta_0).
    _theta = theta[1:]
    reg = (l / (2 * len(X))) * (_theta @ _theta)  # _theta @ _theta == inner product
    return cost(theta, X, y) + reg
def gradientReg(theta, X, y, l=1):
    reg = (l / len(X)) * theta  # scale by the regularization parameter l
    reg[0] = 0  # do not penalize theta_0
    return gradient(theta, X, y) + reg
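One way to gain confidence in an analytic gradient is to compare it against central finite differences of the cost. The sketch below is self-contained; the names `cost_reg`, `gradient_reg`, and `numerical_gradient` and the synthetic data are assumptions made for illustration. Using a value of l other than 1 makes the check sensitive to the λ factor in the penalty term:

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def cost_reg(theta, X, y, l=1.0):
    h = sigmoid(X @ theta)
    data_term = np.mean(-y * np.log(h) - (1 - y) * np.log(1 - h))
    return data_term + l / (2 * len(X)) * (theta[1:] @ theta[1:])


def gradient_reg(theta, X, y, l=1.0):
    grad = X.T @ (sigmoid(X @ theta) - y) / len(X)
    reg = (l / len(X)) * theta
    reg[0] = 0  # theta_0 is not penalized
    return grad + reg


def numerical_gradient(f, theta, eps=1e-6):
    """Central finite differences, one coordinate at a time."""
    grad = np.zeros_like(theta)
    for j in range(len(theta)):
        step = np.zeros_like(theta)
        step[j] = eps
        grad[j] = (f(theta + step) - f(theta - step)) / (2 * eps)
    return grad


# Synthetic stand-in data with an intercept column (assumption).
rng = np.random.default_rng(1)
X_demo = np.hstack([np.ones((30, 1)), rng.normal(size=(30, 3))])
y_demo = rng.integers(0, 2, size=30)
theta_demo = rng.normal(scale=0.5, size=4)

analytic = gradient_reg(theta_demo, X_demo, y_demo, l=2.0)
numeric = numerical_gradient(
    lambda t: cost_reg(t, X_demo, y_demo, l=2.0), theta_demo
)
max_diff = np.max(np.abs(analytic - numeric))
print(max_diff)
```

A large `max_diff` would suggest that the cost and gradient functions disagree, for example because one of them drops the λ factor or penalizes θ0.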
2.5 Learning θ parameters
import scipy.optimize as opt

print('init cost = {}'.format(regularized_cost(theta, X, y)))
# init cost = 0.6931471805599454

res = opt.minimize(fun=regularized_cost, x0=theta, args=(X, y), method='CG', jac=regularized_gradient)
res
2.7 Evaluating logistic regression
def predict(theta, X):
    probability = sigmoid(X @ theta)
    return [1 if x >= 0.5 else 0 for x in probability]  # return a list
final_theta = res.x  # optimized parameters returned by opt.minimize
predictions = predict(final_theta, X)
correct = [1 if a == b else 0 for (a, b) in zip(predictions, y)]
accuracy = sum(correct) / len(correct)
accuracy
from sklearn.metrics import classification_report

final_theta = res.x
y_predict = predict(final_theta, X)
print(classification_report(y, y_predict))
from sklearn import linear_model  # use sklearn's linear_model package

model = linear_model.LogisticRegression(penalty='l2', C=1.0)
model.fit(X, y.ravel())
model.score(X, y) # 0.8305084745762712
2.6 Plotting the decision boundary
X θ = 0 (this is the decision boundary)
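The reason the boundary sits at X θ = 0 is that sigmoid(0) = 0.5, which is the threshold used by `predict`: a point is classified as 1 exactly when its linear score is non-negative. A small sketch that checks this equivalence on a few hand-picked scores (the values in `z` are arbitrary illustrations):

```python
import numpy as np


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


z = np.array([-2.0, -0.1, 0.0, 0.1, 2.0])
by_probability = sigmoid(z) >= 0.5  # threshold on the predicted probability
by_score = z >= 0                   # sign of the linear score X @ theta
print(by_probability, by_score)
```

This is why plotting the zero-level contour of the mapped features times θ is enough to draw the boundary.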
x = np.linspace(-1, 1.5, 50)  # 50 evenly spaced values from -1 to 1.5
xx, yy = np.meshgrid(x, x)  # combine them into 50 * 50 = 2500 grid points

z = np.array(feature_mapping(xx.ravel(), yy.ravel(), 6))
z = z @ final_theta
z = z.reshape(xx.shape)

plot_data()
# A contour is the projection of a 3D surface onto the 2D plane;
# levels=[0] draws only the curve where z = 0.
plt.contour(xx, yy, z, levels=[0], colors='black')
plt.ylim(-.8, 1.2)
Great, that wraps up exercise 2. I will try to get exercise 3 done in the next few days. Keep going!
Take control of your own destiny.