# Stanford Coursera Andrew Ng Machine Learning Course: Programming Assignment (Exercise 2) and Summary

Exercise 1: Linear Regression (implement a linear regression)

Exercise 2: Logistic Regression (implement a logistic regression)

34.62365962451697, 78.0246928153624,  0
30.28671076822607, 43.89499752400101, 0
35.84740876993872, 72.90219802708364, 0
60.18259938620976, 86.30855209546826, 1
....
....
....

function plotData(X, y)
%PLOTDATA Plots the data points X and y into a new figure
%   PLOTDATA(x,y) plots the data points with + for the positive examples
%   and o for the negative examples. X is assumed to be a Mx2 matrix.
% Create New Figure

figure; hold on;

% ====================== YOUR CODE HERE ======================
% Instructions: Plot the positive and negative examples on a
%               2D plot, using the option 'k+' for the positive
%               examples and 'ko' for the negative examples.
%

pos = find(y==1);
neg = find(y==0);
plot(X(pos, 1), X(pos, 2), 'k+', 'LineWidth', 2, 'MarkerSize', 7);
plot(X(neg, 1), X(neg, 2), 'ko', 'MarkerFaceColor', 'y', 'MarkerSize', 7);
% =========================================================================

hold off;
end

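Loading the data in Matlab:

%% Load Data
%  The first two columns contain the exam scores and the third column
%  contains the label.
%  (data is assumed to be loaded already; the load call is not shown in this post)

X = data(:, [1, 2]); y = data(:, 3); % X takes columns 1 and 2 of every row; y takes column 3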

plotData(X, y);

% Put some labels
hold on;
% Labels and Legend
xlabel('Exam 1 score') % label the X axis of the figure
ylabel('Exam 2 score') % label the Y axis of the figure

% Specified in plot order
hold off;

① Sigmoid function

The sigmoid function can be implemented in Matlab as follows:

function g = sigmoid(z)
%SIGMOID Compute sigmoid function
%   g = SIGMOID(z) computes the sigmoid of z.

% You need to return the following variables correctly
g = zeros(size(z));

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the sigmoid of each value of z (z can be a matrix,
%               vector or scalar).

g = 1./(ones(size(z)) + exp(-z)); % './' is element-wise division: 1 is divided by each element of the matrix (vector)

% =============================================================

end
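As a quick sanity check of the formula, the sketch below re-creates the sigmoid as an anonymous function (only so that the snippet runs on its own; the variable names s0, sv and sm are illustrative and not part of the assignment) and evaluates it on a scalar, a vector and a matrix. It is expected to run in Matlab or Octave.

```matlab
% Stand-alone copy of the sigmoid, written as an anonymous function
sigmoid = @(z) 1 ./ (1 + exp(-z));

s0 = sigmoid(0);             % exactly 0.5, because exp(0) = 1
sv = sigmoid([-10 0 10]);    % close to 0, exactly 0.5, close to 1
sm = sigmoid(zeros(2, 3));   % applied element-wise: a 2x3 matrix of 0.5

disp(s0); disp(sv); disp(sm);
```

Because the division is element-wise, the same one-line definition handles scalars, vectors and matrices, which is what the vectorized cost function relies on.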

② Cost function of the model

J(theta) can be written in vectorized form as:

J = ( log( sigmoid(theta'*X') ) * y + log( 1-sigmoid(theta'*X') ) * (1 - y) )/(-m);
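Written out, the vectorized line above corresponds to the standard logistic regression cost, where the hypothesis is the sigmoid g applied to the linear score (the summation notation is added here for reference and does not appear in the original post):

$$
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\,y^{(i)}\log h_\theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big)\Big],
\qquad h_\theta(x) = g(\theta^{T}x) = \frac{1}{1+e^{-\theta^{T}x}}
$$

In the code, `theta'*X'` is a 1-by-m row vector holding the scores of all m examples, so multiplying the row of logarithms by the column vector `y` (or `1 - y`) performs the sum over i.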

③ Gradient descent algorithm

grad = ( X' * ( sigmoid(X*theta)-y ) )/m; % X holds the feature variables of the training set; y holds the labels (outcomes) of the training examples
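For reference, the component form of this gradient and its vectorized equivalent are (notation added here, matching the symbols used in the code):

$$
\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\,x_j^{(i)}
\quad\Longleftrightarrow\quad
\nabla_\theta J = \frac{1}{m}\,X^{T}\big(g(X\theta) - y\big)
$$

Note that this line only evaluates the gradient. The post does not run an explicit update loop; the gradient is returned from costFunction and handed to fminunc, which chooses the steps itself.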

function [J, grad] = costFunction(theta, X, y)
%COSTFUNCTION Compute cost and gradient for logistic regression
%   J = COSTFUNCTION(theta, X, y) computes the cost of using theta as the
%   parameter for logistic regression and the gradient of the cost
%   w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%
% Note: grad should have the same dimensions as theta
%

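%J = (log(theta'*X')*y + (1-y)*log(1-theta'*X'))/(-m);
% Note on Matlab usage: the line above is wrong (it omits sigmoid(), and
% (1-y)*log(...) is a column times a row, i.e. an outer product); the corrected version follows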
J = ( log( sigmoid(theta'*X') ) * y + log( 1-sigmoid(theta'*X') ) * (1 - y) )/(-m);

% theta = theta - (alpha/m)*X'*(X*theta-y);
grad = ( X' * ( sigmoid(X*theta)-y ) )/m;

% =============================================================

end
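One way to gain confidence in a cost/gradient pair like this is a numerical gradient check. The sketch below is not part of the assignment: the toy data, the starting theta and the step size epsilon are all made up for illustration, and the cost and gradient are re-stated inline so the snippet runs on its own in Matlab or Octave.

```matlab
% Toy data (made up): intercept column plus two features, four examples
X = [1 0.5 1.2; 1 -1.0 0.3; 1 1.5 -0.7; 1 -0.2 -1.1];
y = [1; 0; 1; 0];
theta = [0.1; -0.2; 0.3];    % arbitrary point at which to check the gradient
m = length(y);

sigmoid = @(z) 1 ./ (1 + exp(-z));
cost = @(t) -( y' * log(sigmoid(X*t)) + (1 - y)' * log(1 - sigmoid(X*t)) ) / m;

% Analytic gradient, same formula as in costFunction
grad = ( X' * (sigmoid(X*theta) - y) ) / m;

% Central-difference approximation of each partial derivative
epsilon = 1e-4;              % assumed step size
numgrad = zeros(size(theta));
for j = 1:numel(theta)
    e = zeros(size(theta));
    e(j) = epsilon;
    numgrad(j) = ( cost(theta + e) - cost(theta - e) ) / (2 * epsilon);
end

max_diff = max(abs(numgrad - grad));
cost_at_zero = cost(zeros(size(theta)));   % equals log(2) for any data set
fprintf('max gradient difference: %g\n', max_diff);
fprintf('cost at theta = 0: %f\n', cost_at_zero);
```

With theta set to all zeros every prediction is 0.5, so the cost is log(2), roughly 0.693, regardless of the data. That makes a handy first check before running the optimizer.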

%% ============= Part 3: Optimizing using fminunc  =============
%  In this exercise, you will use a built-in function (fminunc) to find the
%  optimal parameters theta.

%  Set options for fminunc
options = optimset('GradObj', 'on', 'MaxIter', 400);

%  Run fminunc to obtain the optimal theta
%  This function will return theta and the cost
[theta, cost] = ...
fminunc(@(t)(costFunction(t, X, y)), initial_theta, options);
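For comparison with fminunc, here is a minimal sketch of plain batch gradient descent using the same gradient formula. It is an illustration under stated assumptions, not the method used in the post: the toy data, the learning rate alpha and the iteration count num_iters are all made up, and the features are deliberately small in magnitude so that a fixed step size behaves well.

```matlab
% Toy data (made up): intercept column plus two small-valued features
X = [1 0.5 1.2; 1 -1.0 0.3; 1 1.5 -0.7; 1 -0.2 -1.1];
y = [1; 0; 1; 0];
m = length(y);

sigmoid = @(z) 1 ./ (1 + exp(-z));
cost = @(t) -( y' * log(sigmoid(X*t)) + (1 - y)' * log(1 - sigmoid(X*t)) ) / m;

alpha = 0.1;                       % learning rate (assumed value)
num_iters = 500;                   % number of iterations (assumed value)
theta = zeros(size(X, 2), 1);
J_start = cost(theta);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    grad = ( X' * (sigmoid(X*theta) - y) ) / m;  % same gradient as costFunction
    theta = theta - alpha * grad;                % update all parameters at once
    J_history(iter) = cost(theta);
end

J_end = J_history(end);
fprintf('cost went from %f to %f\n', J_start, J_end);
```

On unscaled inputs such as the raw exam scores, a fixed alpha like this would likely need tuning or feature scaling, which is one practical reason to let fminunc pick the steps.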

④ Model evaluation (Evaluating logistic regression)

%% ============== Part 4: Predict and Accuracies ==============
%  After learning the parameters, you'll want to use them to predict the outcomes
%  on unseen data. In this part, you will use the logistic regression model
%  to predict the probability that a student with score 45 on exam 1 and
%  score 85 on exam 2 will be admitted.
%
%  Furthermore, you will compute the training and test set accuracies of
%  our model.
%

%  Predict probability for a student with score 45 on exam 1
%  and score 85 on exam 2

prob = sigmoid([1 45 85] * theta); % a test example: score 45 on the first exam and 85 on the second
fprintf(['For a student with scores 45 and 85, we predict an admission ' ...
'probability of %f\n\n'], prob);

% Compute accuracy on our training set
p = predict(theta, X); % call predict to test the model

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);

fprintf('\nProgram paused. Press enter to continue.\n');
pause;

For a student with scores 45 and 85, we predict an admission probability of 0.774323

Train Accuracy: 89.000000

function p = predict(theta, X)
%PREDICT Predict whether the label is 0 or 1 using learned logistic
%regression parameters theta
%   p = PREDICT(theta, X) computes the predictions for X using a
%   threshold at 0.5 (i.e., if sigmoid(theta'*x) >= 0.5, predict 1)

m = size(X, 1); % Number of training examples

% You need to return the following variables correctly
p = zeros(m, 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the following code to make predictions using
%               your learned logistic regression parameters.
%               You should set p to a vector of 0's and 1's
%
p = X*theta >= 0;

% =========================================================================

end
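The implementation thresholds the linear score at 0 instead of thresholding the sigmoid at 0.5. The two are equivalent because the sigmoid is increasing and sigmoid(0) = 0.5. The sketch below illustrates this on made-up inputs; the theta values and the second and third rows of X are hypothetical and are not the parameters learned in the post.

```matlab
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Rows are [1, exam1, exam2]; only the first row echoes the post's query point
X = [1 45 85; 1 30 40; 1 60 86];
theta = [-25; 0.2; 0.2];                 % hypothetical parameters

p_linear  = X*theta >= 0;                % rule used in predict.m
p_sigmoid = sigmoid(X*theta) >= 0.5;     % rule stated in the header comment

disp([p_linear p_sigmoid]);
```

Skipping the sigmoid saves an exponential per example, and the returned logical vector still compares correctly against y in the accuracy computation.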

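⑤ Regularized logistic regression

With lambda (λ) = 1, the trained model (hypothesis function) is as follows: Train Accuracy: 83.050847

With lambda (λ) = 0, i.e. without regularization, the trained model (hypothesis function) is as follows: Train Accuracy: 87.288136

With lambda (λ) = 100, the trained model (hypothesis function) is as follows: Train Accuracy: 61.016949

The Matlab implementation of the regularized cost function, costFunctionReg.m, is as follows: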

function [J, grad] = costFunctionReg(theta, X, y, lambda)
%COSTFUNCTIONREG Compute cost and gradient for logistic regression with regularization
%   J = COSTFUNCTIONREG(theta, X, y, lambda) computes the cost of using
%   theta as the parameter for regularized logistic regression and the
%   gradient of the cost w.r.t. the parameters.

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta.
%               You should set J to the cost.
%               Compute the partial derivatives and set grad to the partial
%               derivatives of the cost w.r.t. each parameter in theta
%J = ( log( sigmoid(theta'*X') ) * y + log( 1-sigmoid(theta'*X') ) * (1 - y) )/(-m);
%J = ( log( sigmoid(theta'*X') ) * y + log( 1-sigmoid(theta'*X') ) * (1 - y) )/(-m) + (lambda / (2*m)) * (theta'*theta);
J = ( log( sigmoid(theta'*X') ) * y + log( 1-sigmoid(theta'*X') ) * (1 - y) )/(-m) + (lambda / (2*m)) * ( ( theta( 2:length(theta) ) )' * theta(2:length(theta)) );
%grad = ( X' * ( sigmoid(X*theta)-y ) )/m;
grad = ( X' * ( sigmoid(X*theta)-y ) )/m + ( lambda / m ) * ( [0; ones( length(theta) - 1 , 1 )].*theta );

% =============================================================

end
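In equation form, the code above appears to implement the usual regularized cost and gradient, with the intercept excluded from the penalty (notation added here for reference; the math uses a zero-based index, so the intercept θ_0 is `theta(1)` in Matlab and n is the number of non-intercept parameters):

$$
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\Big[\,y^{(i)}\log h_\theta(x^{(i)}) + \big(1-y^{(i)}\big)\log\big(1-h_\theta(x^{(i)})\big)\Big] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^{2}
$$

$$
\frac{\partial J}{\partial \theta_0} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\,x_0^{(i)},
\qquad
\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\big(h_\theta(x^{(i)}) - y^{(i)}\big)\,x_j^{(i)} + \frac{\lambda}{m}\theta_j \quad (j \ge 1)
$$

The mask `[0; ones(length(theta) - 1, 1)]` is what zeroes out the penalty term for the intercept in the gradient, and `theta(2:length(theta))` does the same in the cost.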

%% ============= Part 2: Regularization and Accuracies =============
%  Optional Exercise:
%  In this part, you will get to try different values of lambda and
%  see how regularization affects the decision boundary
%
%  Try the following values of lambda (0, 1, 10, 100).
%
%  How does the decision boundary change when you vary lambda? How does
%  the training set accuracy vary?
%

% Initialize fitting parameters
initial_theta = zeros(size(X, 2), 1);

% Set regularization parameter lambda to 1 (you should vary this)
lambda = 1;

% Set Options
options = optimset('GradObj', 'on', 'MaxIter', 400);

% Optimize
[theta, J, exit_flag] = ...
fminunc(@(t)(costFunctionReg(t, X, y, lambda)), initial_theta, options);

% Plot Boundary
plotDecisionBoundary(theta, X, y);
hold on;
title(sprintf('lambda = %g', lambda))

% Labels and Legend
xlabel('Microchip Test 1')
ylabel('Microchip Test 2')

legend('y = 1', 'y = 0', 'Decision boundary')
hold off;

% Compute accuracy on our training set
p = predict(theta, X);

fprintf('Train Accuracy: %f\n', mean(double(p == y)) * 100);

⑥ Summary:
