Week 3 Programming Assignment - Planar data classification with one hidden layer (Part 2)

4.4 - Integrate parts 4.1, 4.2 and 4.3 in nn_model()


Question: Build your neural network model in nn_model().

Instructions: The neural network model has to use the previous functions in the right order.


# GRADED FUNCTION: nn_model
def nn_model(X, Y, n_h, num_iterations = 10000, print_cost=False):
    """
    Arguments:
    X -- dataset of shape (2, number of examples)
    Y -- labels of shape (1, number of examples)
    n_h -- size of the hidden layer
    num_iterations -- Number of iterations in gradient descent loop
    print_cost -- if True, print the cost every 1000 iterations
    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """
    np.random.seed(3)
    n_x = layer_sizes(X, Y)[0]
    n_y = layer_sizes(X, Y)[2]
    # Initialize parameters, then retrieve W1, b1, W2, b2. Inputs: "n_x, n_h, n_y". Outputs = "W1, b1, W2, b2, parameters".
    ### START CODE HERE ### (≈ 5 lines of code)
    parameters = initialize_parameters(n_x,n_h,n_y)
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    ### END CODE HERE ###
    # Loop (gradient descent)
    for i in range(0, num_iterations):
        ### START CODE HERE ### (≈ 4 lines of code)
        # Forward propagation. Inputs: "X, parameters". Outputs: "A2, cache".
        A2, cache = forward_propagation(X, parameters)
        # Cost function. Inputs: "A2, Y, parameters". Outputs: "cost".
        cost = compute_cost(A2, Y, parameters)
        # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
        grads = backward_propagation(parameters, cache, X, Y)
        # Gradient descent parameter update. Inputs: "parameters, grads". Outputs: "parameters".
        parameters = update_parameters(parameters, grads)
        ### END CODE HERE ###
        # Print the cost every 1000 iterations
        if print_cost and i % 1000 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))
    return parameters


X_assess, Y_assess = nn_model_test_case()
parameters = nn_model(X_assess, Y_assess, 4, num_iterations=10000, print_cost=False)
print("W1 = " + str(parameters["W1"]))
print("b1 = " + str(parameters["b1"]))
print("W2 = " + str(parameters["W2"]))
print("b2 = " + str(parameters["b2"]))


/opt/conda/lib/python3.5/site-packages/ipykernel/__main__.py:20: RuntimeWarning: divide by zero encountered in log
/home/jovyan/work/Week 3/Planar data classification with one hidden layer/planar_utils.py:34: RuntimeWarning: overflow encountered in exp
  s = 1/(1+np.exp(-x))
W1 = [[-4.18494056  5.33220609]
 [-7.52989382  1.24306181]
 [-4.1929459   5.32632331]
 [ 7.52983719 -1.24309422]]
b1 = [[ 2.32926819]
 [ 3.79458998]
 [ 2.33002577]
 [-3.79468846]]
W2 = [[-6033.83672146 -6008.12980822 -6033.10095287  6008.06637269]]
b2 = [[-52.66607724]]


Expected Output:

W1 [[-4.18494056  5.33220609]
    [-7.52989382  1.24306181]
    [-4.1929459   5.32632331]
    [ 7.52983719 -1.24309422]]
b1 [[ 2.32926819]
    [ 3.79458998]
    [ 2.33002577]
    [-3.79468846]]
W2 [[-6033.83672146 -6008.12980822 -6033.10095287  6008.06637269]]
b2 [[-52.66607724]]
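
The two RuntimeWarning lines in the output above point at `np.exp` overflowing inside the sigmoid and at `np.log` being handed an exact 0, most likely in the cost computation. The printed parameters still match the expected output, so nothing has to change for grading. As a side note, here is a minimal sketch of how such warnings are commonly avoided. `stable_sigmoid` and `safe_log` are hypothetical helpers, not part of `planar_utils.py`, and the `eps` value is an arbitrary assumption.

```python
import numpy as np

def stable_sigmoid(x):
    """Sigmoid that never exponentiates a large positive number."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so there is no overflow.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use the equivalent form exp(x) / (1 + exp(x)); exp(x) < 1 here.
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

def safe_log(a, eps=1e-12):
    """Log with the input clipped away from 0 (eps is an arbitrary choice)."""
    return np.log(np.clip(a, eps, None))

z = np.array([-1000.0, 0.0, 1000.0])
a = stable_sigmoid(z)
print(a)
print(safe_log(a))
```

Clipping changes the cost slightly when an activation saturates at exactly 0, so it is a trade-off rather than a free fix.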


4.5 Predictions


Question: Use your model to predict by building predict().

Use forward propagation to predict results.


Reminder: predictions = $y_{prediction} = \mathbb{1}\{\text{activation} > 0.5\} = \begin{cases} 1 & \text{if } \text{activation} > 0.5 \\ 0 & \text{otherwise} \end{cases}$

As an example, if you would like to set the entries of a matrix X to 0 and 1 based on a threshold you would do: X_new = (X > threshold)
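
A small sketch of that thresholding idiom, using a made-up matrix `A` (the values are for illustration only). The comparison returns a boolean array, and NumPy treats `True`/`False` as 1/0 in arithmetic, which is why taking `np.mean` of the predictions works.

```python
import numpy as np

# Made-up activations, for illustration only.
A = np.array([[0.1, 0.5, 0.9],
              [0.7, 0.2, 0.51]])
threshold = 0.5

mask = A > threshold        # boolean array, strict comparison
as_int = mask.astype(int)   # same information as 0/1 integers

print(mask)
print(as_int)
print(np.mean(mask))        # fraction of entries above the threshold
```

Because the comparison is strict, an entry equal to the threshold (0.5 here) maps to 0.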


# GRADED FUNCTION: predict
def predict(parameters, X):
    """
    Using the learned parameters, predicts a class for each example in X
    Arguments:
    parameters -- python dictionary containing your parameters 
    X -- input data of size (n_x, m)
    Returns:
    predictions -- vector of predictions of our model (red: 0 / blue: 1)
    """
    # Computes probabilities using forward propagation, and classifies to 0/1 using 0.5 as the threshold.
    ### START CODE HERE ### (≈ 2 lines of code)
    A2, cache = forward_propagation(X, parameters)
    predictions = (A2>0.5)
    ### END CODE HERE ###
    return predictions


parameters, X_assess = predict_test_case()
predictions = predict(parameters, X_assess)
print("predictions mean = " + str(np.mean(predictions)))


predictions mean = 0.666666666667


Expected Output:

predictions mean 0.666666666667

It is time to run the model and see how it performs on a planar dataset. Run the following code to test your model with a single hidden layer of $n_h$ hidden units.


# Build a model with a n_h-dimensional hidden layer
parameters = nn_model(X, Y, n_h = 4, num_iterations = 10000, print_cost=True)
# Plot the decision boundary
plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y)
plt.title("Decision Boundary for hidden layer size " + str(4))


Cost after iteration 0: 0.693048
Cost after iteration 1000: 0.288083
Cost after iteration 2000: 0.254385
Cost after iteration 3000: 0.233864
Cost after iteration 4000: 0.226792
Cost after iteration 5000: 0.222644
Cost after iteration 6000: 0.219731
Cost after iteration 7000: 0.217504
Cost after iteration 8000: 0.219454
Cost after iteration 9000: 0.218607


[Figure: decision boundary for hidden layer size 4]

Expected Output:

Cost after iteration 9000 0.218607


# Print accuracy
predictions = predict(parameters, X)
print ('Accuracy: %d' % float((np.dot(Y,predictions.T) + np.dot(1-Y,1-predictions.T))/float(Y.size)*100) + '%')


Accuracy: 90%


Expected Output:

Accuracy 90%
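
The accuracy expression above counts correct predictions with two dot products: `np.dot(Y, predictions.T)` counts examples where label and prediction are both 1, and `np.dot(1-Y, 1-predictions.T)` counts examples where both are 0. A minimal sketch with made-up labels shows that this agrees with simply averaging `predictions == Y`. The arrays below are illustrative, not taken from the assignment data.

```python
import numpy as np

# Made-up labels and predictions, both of shape (1, m).
Y = np.array([[1, 0, 1, 1, 0]])
predictions = np.array([[True, False, False, True, True]])

P = predictions.astype(int)

both_one = np.dot(Y, P.T)             # (1, 1) array: label 1 and prediction 1
both_zero = np.dot(1 - Y, (1 - P).T)  # (1, 1) array: label 0 and prediction 0

# .item() extracts the scalar from the (1, 1) array.
acc_dot = (both_one + both_zero).item() / Y.size * 100
acc_mean = np.mean(P == Y) * 100

print(acc_dot, acc_mean)
```

Note that the `%d` format used in the cell above truncates the fractional part of the percentage.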

Accuracy is really high compared to Logistic Regression. The model has learnt the leaf patterns of the flower! Neural networks are able to learn even highly non-linear decision boundaries, unlike logistic regression.

Now, let's try out several hidden layer sizes.


4.6 - Tuning hidden layer size (optional/ungraded exercise)


Run the following code. It may take 1-2 minutes. You will observe different behaviors of the model for various hidden layer sizes.


# This may take about 2 minutes to run
plt.figure(figsize=(16, 32))
hidden_layer_sizes = [1, 2, 3, 4, 5, 20, 50]
for i, n_h in enumerate(hidden_layer_sizes):
    plt.subplot(5, 2, i+1)
    plt.title('Hidden Layer of size %d' % n_h)
    parameters = nn_model(X, Y, n_h, num_iterations = 5000)
    plot_decision_boundary(lambda x: predict(parameters, x.T), X, Y)
    predictions = predict(parameters, X)
    accuracy = float((np.dot(Y,predictions.T) + np.dot(1-Y,1-predictions.T))/float(Y.size)*100)
    print ("Accuracy for {} hidden units: {} %".format(n_h, accuracy))


Accuracy for 1 hidden units: 67.5 %
Accuracy for 2 hidden units: 67.25 %
Accuracy for 3 hidden units: 90.75 %
Accuracy for 4 hidden units: 90.5 %
Accuracy for 5 hidden units: 91.25 %
Accuracy for 20 hidden units: 90.0 %
Accuracy for 50 hidden units: 90.25 %


[Figure: decision boundaries for hidden layer sizes 1, 2, 3, 4, 5, 20 and 50]


Interpretation:

  • The larger models (with more hidden units) are able to fit the training set better, until eventually the largest models overfit the data.
  • The best hidden layer size seems to be around n_h = 5. Indeed, a value around here seems to fit the data well without incurring noticeable overfitting.
  • You will also learn later about regularization, which lets you use very large models (such as n_h = 50) without much overfitting.
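
To make "larger models" concrete, one can count the trainable parameters for each hidden layer size. The sketch below assumes the parameter shapes visible in the printed output of section 4.4 (W1 of shape (n_h, n_x), b1 of shape (n_h, 1), W2 of shape (n_y, n_h), b2 of shape (n_y, 1)), with n_x = 2 and n_y = 1 as in the docstring of nn_model. `count_parameters` is a hypothetical helper, not part of the notebook.

```python
def count_parameters(n_x, n_h, n_y):
    """Number of trainable scalars in W1, b1, W2 and b2."""
    w1 = n_h * n_x   # W1 has shape (n_h, n_x)
    b1 = n_h         # b1 has shape (n_h, 1)
    w2 = n_y * n_h   # W2 has shape (n_y, n_h)
    b2 = n_y         # b2 has shape (n_y, 1)
    return w1 + b1 + w2 + b2

for n_h in [1, 2, 3, 4, 5, 20, 50]:
    print(n_h, count_parameters(n_x=2, n_h=n_h, n_y=1))
```

With these shapes the count grows linearly in n_h, so the n_h = 50 model has roughly ten times as many parameters as the n_h = 5 model.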


Optional questions:

Note: Remember to submit the assignment by clicking the blue "Submit Assignment" button at the upper-right.

Some optional/ungraded questions that you can explore if you wish:

  • What happens when you change the tanh activation for a sigmoid activation or a ReLU activation?
  • Play with the learning_rate. What happens?
  • What if we change the dataset? (See part 5 below!)
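
For the first optional question, swapping the hidden activation means changing both the forward pass and the derivative used in backpropagation. The notebook's own forward_propagation and backward_propagation are defined in Part 1 and are not shown here, so the sketch below only lists the three activations with their derivatives. All function names are hypothetical, and setting the ReLU derivative to 0 at z = 0 is a convention, not something the assignment prescribes.

```python
import numpy as np

def tanh(z):
    return np.tanh(z)

def tanh_grad(z):
    # d/dz tanh(z) = 1 - tanh(z)^2
    return 1.0 - np.tanh(z) ** 2

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    # d/dz sigmoid(z) = sigmoid(z) * (1 - sigmoid(z))
    s = sigmoid(z)
    return s * (1.0 - s)

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    # 1 where z > 0, otherwise 0 (value at z == 0 chosen by convention)
    return (z > 0).astype(float)

z = np.array([-2.0, 0.0, 2.0])
for name, f, g in [("tanh", tanh, tanh_grad),
                   ("sigmoid", sigmoid, sigmoid_grad),
                   ("relu", relu, relu_grad)]:
    print(name, f(z), g(z))
```

Whichever activation is chosen for the hidden layer, the output layer would presumably stay a sigmoid so that predict() can keep thresholding at 0.5.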

You've learnt to:

  • Build a complete neural network with a hidden layer
  • Make good use of a non-linear unit
  • Implement forward propagation and backpropagation, and train a neural network
  • See the impact of varying the hidden layer size, including overfitting.

Nice work!


5) Performance on other datasets


If you want, you can rerun the whole notebook (minus the dataset part) for each of the following datasets.


# Datasets
noisy_circles, noisy_moons, blobs, gaussian_quantiles, no_structure = load_extra_datasets()
datasets = {"noisy_circles": noisy_circles,
            "noisy_moons": noisy_moons,
            "blobs": blobs,
            "gaussian_quantiles": gaussian_quantiles}
### START CODE HERE ### (choose your dataset)
dataset = "noisy_moons"
### END CODE HERE ###
X, Y = datasets[dataset]
X, Y = X.T, Y.reshape(1, Y.shape[0])
# make blobs binary
if dataset == "blobs":
    Y = Y%2
# Visualize the data
plt.scatter(X[0, :], X[1, :], c=Y, s=40, cmap=plt.cm.Spectral);


[Figure: scatter plot of the selected dataset]
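
The transpose and reshape in the cell above bring the data into the layout the rest of the notebook expects: X of shape (2, m) and Y of shape (1, m). The sketch below illustrates this on stand-in arrays. It assumes the extra datasets arrive as features of shape (m, 2) and labels of shape (m,), which is inferred from the reshaping code rather than stated in the source.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 6

# Stand-ins for one entry of `datasets` (assumed layout, see lead-in).
X_raw = rng.normal(size=(m, 2))         # features, one row per example
Y_raw = np.array([0, 1, 2, 3, 0, 1])    # multi-class labels, shape (m,)

X = X_raw.T                             # now (2, m): one column per example
Y = Y_raw.reshape(1, Y_raw.shape[0])    # now (1, m)

# Same trick as the "blobs" branch: fold the classes into two by parity.
Y_binary = Y % 2

print(X.shape, Y.shape)
print(Y_binary)
```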

Congrats on finishing this Programming Assignment!
