Week 2 Programming Assignment - Logistic Regression with a Neural Network mindset (Part 3)


5 - Merge all functions into a model


You will now see how the overall model is structured by putting all the building blocks (functions implemented in the previous parts) together, in the right order.


Exercise: Implement the model function. Use the following notation:

- Y_prediction_test for your predictions on the test set

- Y_prediction_train for your predictions on the train set

- w, costs, grads for the outputs of optimize()

# GRADED FUNCTION: model
def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    """
    Builds the logistic regression model by calling the function you've implemented previously
    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to True to print the cost every 100 iterations
    Returns:
    d -- dictionary containing information about the model.
    """
    ### START CODE HERE ###
    # initialize parameters with zeros (≈ 1 line of code)
    w, b = initialize_with_zeros(X_train.shape[0])
    # Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]
    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)
    ### END CODE HERE ###
    # Print train/test Errors
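    # Since predictions and labels are 0/1 row vectors, np.mean(np.abs(pred - Y)) is the
    # fraction of misclassified examples, so accuracy is 100 minus that error rate (in %).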
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test, 
         "Y_prediction_train" : Y_prediction_train, 
         "w" : w, 
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}
    return d


Run the following cell to train your model.

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)

Cost after iteration 0: 0.693147
Cost after iteration 100: 0.584508
Cost after iteration 200: 0.466949
Cost after iteration 300: 0.376007
Cost after iteration 400: 0.331463
Cost after iteration 500: 0.303273
Cost after iteration 600: 0.279880
Cost after iteration 700: 0.260042
Cost after iteration 800: 0.242941
Cost after iteration 900: 0.228004
Cost after iteration 1000: 0.214820
Cost after iteration 1100: 0.203078
Cost after iteration 1200: 0.192544
Cost after iteration 1300: 0.183033
Cost after iteration 1400: 0.174399
Cost after iteration 1500: 0.166521
Cost after iteration 1600: 0.159305
Cost after iteration 1700: 0.152667
Cost after iteration 1800: 0.146542
Cost after iteration 1900: 0.140872
train accuracy: 99.04306220095694 %
test accuracy: 70.0 %


Expected Output:

<table style="width:40%">
  <tr>
    <td> <b>Train Accuracy</b> </td>
    <td> 99.04306220095694 % </td>
  </tr>
  <tr>
    <td> <b>Test Accuracy</b> </td>
    <td> 70.0 % </td>
  </tr>
</table>


Comment: Training accuracy is close to 100%. This is a good sanity check: your model is working and has high enough capacity to fit the training data. Test accuracy is 70%. That is actually not bad for this simple model, given the small dataset we used and that logistic regression is a linear classifier. But no worries, you'll build an even better classifier next week!


Also, you see that the model is clearly overfitting the training data. Later in this specialization you will learn how to reduce overfitting, for example by using regularization. Using the code below (and changing the index variable) you can look at predictions on pictures of the test set.

# Example of a test-set picture and the model's prediction (change index to inspect other examples).
index = 1
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
# Cast the prediction to int before indexing into classes; indexing with a float
# raises a DeprecationWarning and will become an error in future numpy versions.
print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[int(d["Y_prediction_test"][0,index])].decode("utf-8") +  "\" picture.")

y = 1, you predicted that it is a "cat" picture.


[Image: the test-set picture at index 1]

Let's also plot the cost function and the gradients.

# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()


[Plot: learning curve, cost vs. iterations (per hundreds), learning rate = 0.005]


Interpretation:

You can see the cost decreasing. It shows that the parameters are being learned. However, you see that you could train the model even more on the training set. Try to increase the number of iterations in the cell above and rerun the cells. You might see that the training set accuracy goes up, but the test set accuracy goes down. This is called overfitting.
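For example, an illustrative rerun with more iterations might look like the following (a sketch; the exact accuracies you get will depend on the run):

# Illustrative only: train longer to observe overfitting.
# Training accuracy typically keeps rising while test accuracy may stop improving or drop.
d_long = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 5000, learning_rate = 0.005, print_cost = True)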


6 - Further analysis (optional/ungraded exercise)


Congratulations on building your first image classification model. Let's analyze it further, and examine possible choices for the learning rate $\alpha$.

Choice of learning rate


Reminder:

In order for Gradient Descent to work you must choose the learning rate wisely. The learning rate $\alpha$  determines how rapidly we update the parameters. If the learning rate is too large we may "overshoot" the optimal value. Similarly, if it is too small we will need too many iterations to converge to the best values. That's why it is crucial to use a well-tuned learning rate.
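As a reminder, the update rule applied at each step of optimize() is

$$w := w - \alpha \, dw, \qquad b := b - \alpha \, db$$

where $dw$ and $db$ are the gradients of the cost with respect to $w$ and $b$ computed by propagate().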

Let's compare the learning curve of our model with several choices of learning rates. Run the cell below. This should take about 1 minute. Feel free also to try different values than the three we have initialized the learning_rates variable to contain, and see what happens.

learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')
for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))
plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')
legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

learning rate is: 0.01
train accuracy: 99.52153110047847 %
test accuracy: 68.0 %
-------------------------------------------------------
learning rate is: 0.001
train accuracy: 88.99521531100478 %
test accuracy: 64.0 %
-------------------------------------------------------
learning rate is: 0.0001
train accuracy: 68.42105263157895 %
test accuracy: 36.0 %
-------------------------------------------------------


[Plot: cost curves for learning rates 0.01, 0.001, and 0.0001]


Interpretation:

  • Different learning rates give different costs and thus different prediction results.
  • If the learning rate is too large (0.01), the cost may oscillate up and down. It may even diverge (though in this example, using 0.01 still eventually ends up at a good value for the cost).
  • A lower cost doesn't mean a better model. You have to check if there is possibly overfitting. It happens when the training accuracy is a lot higher than the test accuracy (see the sketch after this list).
  • In deep learning, we usually recommend that you:
    • Choose the learning rate that better minimizes the cost function.
    • If your model overfits, use other techniques to reduce overfitting. (We'll talk about this in later videos.)
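As a quick check (a sketch reusing the models dictionary built above), you can compare the train/test accuracy gap for each learning rate; a large gap suggests overfitting:

# Sketch: a large train/test accuracy gap is a sign of overfitting.
for lr in learning_rates:
    m = models[str(lr)]
    train_acc = 100 - np.mean(np.abs(m["Y_prediction_train"] - train_set_y)) * 100
    test_acc = 100 - np.mean(np.abs(m["Y_prediction_test"] - test_set_y)) * 100
    print("lr = {}: train = {:.1f} %, test = {:.1f} %, gap = {:.1f} %".format(lr, train_acc, test_acc, train_acc - test_acc))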


7 - Test with your own image (optional/ungraded exercise)


Congratulations on finishing this assignment. You can use your own image and see the output of your model. To do that:

1. Click on "File" in the upper bar of this notebook, then click "Open" to go to your Coursera Hub.

2. Add your image to this Jupyter Notebook's directory, in the "images" folder

3. Change your image's name in the following code

4. Run the code and check if the algorithm is right (1 = cat, 0 = non-cat)!

## START CODE HERE ## (PUT YOUR IMAGE NAME) 
my_image = "my_image.jpg"   # change this to the name of your image file 
## END CODE HERE ##
# We preprocess the image to fit your algorithm.
fname = "images/" + my_image
image = np.array(ndimage.imread(fname, flatten=False))
my_image = scipy.misc.imresize(image, size=(num_px,num_px)).reshape((1, num_px*num_px*3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)
plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") +  "\" picture.")

y = 0.0, your algorithm predicts a "non-cat" picture.


[Image: the uploaded test picture, predicted "non-cat"]
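Note that scipy.ndimage.imread and scipy.misc.imresize have been removed from recent SciPy releases. If you are running outside the course environment, a roughly equivalent preprocessing step using Pillow might look like this (a sketch; it assumes Pillow is installed and, unlike the cell above, also scales pixels to [0, 1] to match how the training set was preprocessed):

import numpy as np
from PIL import Image

fname = "images/" + "my_image.jpg"                                 # change to your file name
img = Image.open(fname).convert("RGB").resize((num_px, num_px))    # Pillow expects (width, height)

# Flatten to a (num_px*num_px*3, 1) column vector and scale to [0, 1].
my_image = np.array(img).reshape((1, num_px * num_px * 3)).T / 255.

my_predicted_image = predict(d["w"], d["b"], my_image)
plt.imshow(img)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image))].decode("utf-8") + "\" picture.")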



What to remember from this assignment:

  1. Preprocessing the dataset is important.
  2. You implemented each function separately: initialize(), propagate(), optimize(). Then you built a model().
  3. Tuning the learning rate (which is an example of a "hyperparameter") can make a big difference to the algorithm. You will see more examples of this later in this course!


Finally, if you'd like, we invite you to try different things on this Notebook. Make sure you submit before trying anything. Once you submit, things you can play with include:

- Play with the learning rate and the number of iterations

- Try different initialization methods and compare the results

- Test other preprocessing approaches (center the data, or divide each row by its standard deviation); a sketch of the latter is shown below
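For the last point, a minimal sketch of standardization, assuming train_set_x_flatten and test_set_x_flatten are the flattened (num_px * num_px * 3, m) arrays built in the earlier parts of this assignment, could be:

# Center each feature (row) and divide by its standard deviation,
# using statistics computed on the training set only.
mu = train_set_x_flatten.mean(axis=1, keepdims=True)
sigma = train_set_x_flatten.std(axis=1, keepdims=True) + 1e-8   # avoid division by zero
train_set_x_std = (train_set_x_flatten - mu) / sigma
test_set_x_std = (test_set_x_flatten - mu) / sigma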
