You will learn how to:

- Implement a 2-class classification neural network with a single hidden layer
- Use units with a non-linear activation function, such as tanh
- Compute the cross entropy loss
- Implement forward and backward propagation

# Packages

Let’s first import all the packages that you will need during this assignment.

- **numpy** is the fundamental package for scientific computing with Python.
- **sklearn** provides simple and efficient tools for data mining and data analysis.
- **matplotlib** is a library for plotting graphs in Python.
- **testCases** provides some test examples to assess the correctness of your functions.
- **planar_utils** provides various useful functions used in this assignment.

```python
# Package imports
```
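A minimal version of that import cell might look like the following. The `testCases` and `planar_utils` imports are assignment-specific helper files, so they are shown commented out here:

```python
# Package imports
import numpy as np
import matplotlib.pyplot as plt
import sklearn
import sklearn.datasets
import sklearn.linear_model

# Helper files shipped with the assignment (not on PyPI), hence commented out:
# from testCases import *
# from planar_utils import plot_decision_boundary, sigmoid, load_planar_dataset, load_extra_datasets

np.random.seed(1)  # fix the seed so results are reproducible
```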

# Dataset

First, let’s get the dataset you will work on. The following code will load a “flower” 2-class dataset into variables X and Y.

```python
X, Y = load_planar_dataset()
```

Visualize the dataset using matplotlib. The data looks like a “flower” with some red (label y=0) and some blue (y=1) points. Your goal is to build a model to fit this data.

```python
# Visualize the data:
```

How many training examples do you have? In addition, what is the shape of the variables X and Y?

```python
### START CODE HERE ### (≈ 3 lines of code)
```
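Those three lines might look like this. A random array is used here as a stand-in for the real flower data, but the shapes follow the layout described above: X is (2, m) with one example per column, Y is (1, m):

```python
import numpy as np

# Stand-in with the same layout as the planar dataset: X is (2, m), Y is (1, m)
X = np.random.randn(2, 400)
Y = (np.random.rand(1, 400) > 0.5).astype(int)

shape_X = X.shape        # (2, m): each column is one training example
shape_Y = Y.shape        # (1, m): one label per example
m = X.shape[1]           # number of training examples

print("The shape of X is:", shape_X)
print("The shape of Y is:", shape_Y)
print("I have m = %d training examples!" % m)
```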

# Simple Logistic Regression

Before building a full neural network, let's first see how logistic regression performs on this problem. You can use **sklearn's built-in functions** to do that.

```python
# Train the logistic regression classifier
```
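One way to sketch that cell, on stand-in data (the real X, Y come from load_planar_dataset). Note that sklearn expects examples in rows, hence the transposes:

```python
import numpy as np
import sklearn.linear_model

# Stand-in data; replace with the X, Y loaded above
rng = np.random.default_rng(1)
X = rng.standard_normal((2, 200))
Y = (X[0, :] + X[1, :] > 0).astype(int).reshape(1, -1)

# Train the logistic regression classifier
clf = sklearn.linear_model.LogisticRegressionCV()
clf.fit(X.T, Y.ravel())  # sklearn wants shape (n_samples, n_features)

LR_predictions = clf.predict(X.T)
accuracy = float(np.mean(LR_predictions == Y.ravel()))
print("Accuracy of logistic regression: %.2f" % accuracy)
```

On this linearly separable stand-in logistic regression does well; on the flower dataset it will not, which is the point of the section.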

You can now plot the decision boundary of these models.

```python
# Plot the decision boundary for logistic regression
```

**The dataset is not linearly separable**, so logistic regression doesn’t perform well. Hopefully a neural network will do better. Let’s try this now!

# Neural Network model

Logistic regression did not work well on the “flower dataset”. You are going to train a Neural Network with a single hidden layer.

The general methodology to build a Neural Network is to:

- Define the neural network structure (# of input units, # of hidden units, etc.)
- Initialize the model's parameters
- Loop:
  - Implement forward propagation
  - Compute loss
  - Implement backward propagation to get the gradients
  - Update parameters (gradient descent)

## Defining the neural network structure

Define three variables:

- n_x: the size of the input layer
- n_h: the size of the hidden layer (set this to 4)
- n_y: the size of the output layer

```python
# GRADED FUNCTION: layer_sizes
```

## Initialize the model’s parameters

- Make sure your parameters' sizes are right. Refer to the neural network figure above if needed.
- You will initialize the weights matrices with random values. Use np.random.randn(a,b) * 0.01 to randomly initialize a matrix of shape (a,b).
- You will initialize the bias vectors as zeros. Use np.zeros((a,b)) to initialize a matrix of shape (a,b) with zeros.

```python
# GRADED FUNCTION: initialize_parameters
```
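Following the recipe above, a sketch of initialize_parameters might look like:

```python
import numpy as np

def initialize_parameters(n_x, n_h, n_y):
    """Small random weights, zero biases."""
    W1 = np.random.randn(n_h, n_x) * 0.01  # small values keep tanh away from its flat regions early on
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```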

## The Loop

- Look above at the mathematical representation of your classifier.
- You can use the function sigmoid(). It is built in (imported) in the notebook.
- You can use the function np.tanh(). It is part of the numpy library.
- The steps you have to implement are:
  - Retrieve each parameter from the dictionary "parameters" (which is the output of initialize_parameters()) by using parameters[".."].
  - Implement forward propagation. Compute Z[1], A[1], Z[2] and A[2] (the vector of all your predictions on all the examples in the training set).
- Values needed in the backpropagation are stored in "cache". The cache will be given as an input to the backpropagation function.

```python
# GRADED FUNCTION: forward_propagation
```
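A sketch of the forward pass. Since planar_utils is not available outside the notebook, a stand-in sigmoid is defined inline:

```python
import numpy as np

def sigmoid(z):
    # stands in for the sigmoid imported from planar_utils
    return 1.0 / (1.0 + np.exp(-z))

def forward_propagation(X, parameters):
    """Compute A2 (predictions for all examples) and cache intermediate values."""
    W1, b1 = parameters["W1"], parameters["b1"]
    W2, b2 = parameters["W2"], parameters["b2"]

    Z1 = W1 @ X + b1      # (n_h, m)
    A1 = np.tanh(Z1)      # hidden-layer activations
    Z2 = W2 @ A1 + b2     # (n_y, m)
    A2 = sigmoid(Z2)      # predictions for every example

    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache
```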

Implement compute_cost() to compute the value of the cost J.

```python
# GRADED FUNCTION: compute_cost
```
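A sketch of compute_cost, averaging the cross-entropy loss over the m training examples:

```python
import numpy as np

def compute_cost(A2, Y):
    """Cross-entropy cost J = -(1/m) * sum(Y*log(A2) + (1-Y)*log(1-A2))."""
    m = Y.shape[1]
    logprobs = Y * np.log(A2) + (1 - Y) * np.log(1 - A2)
    cost = -np.sum(logprobs) / m
    return float(np.squeeze(cost))
```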

Using the cache computed during forward propagation, you can now implement backward propagation.

Implement the function backward_propagation().

Backpropagation is usually the hardest (most mathematical) part in deep learning. To help you, here again is the slide from the lecture on backpropagation. You’ll want to use the six equations on the right of this slide, since you are building a vectorized implementation.

```python
# GRADED FUNCTION: backward_propagation
```
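A sketch of the vectorized backward pass for this architecture (tanh hidden layer, sigmoid output with cross-entropy loss), following the six equations referenced above:

```python
import numpy as np

def backward_propagation(parameters, cache, X, Y):
    """Gradients of the cost with respect to W1, b1, W2, b2."""
    m = X.shape[1]
    W2 = parameters["W2"]
    A1, A2 = cache["A1"], cache["A2"]

    dZ2 = A2 - Y                                   # cross-entropy through sigmoid simplifies to this
    dW2 = dZ2 @ A1.T / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)             # (1 - A1^2) is tanh'(Z1)
    dW1 = dZ1 @ X.T / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m

    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
```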

Implement the update rule. Use gradient descent. You have to use (dW1, db1, dW2, db2) in order to update (W1, b1, W2, b2).

**General gradient descent rule**: $\theta = \theta - \alpha \frac{\partial J}{\partial \theta}$, where $\alpha$ is the learning rate and $\theta$ represents a parameter.

```python
# GRADED FUNCTION: update_parameters
```

## Integrate the previous parts in nn_model()

Build your neural network model in nn_model(). The neural network model has to use the previous functions in the right order.

```python
# GRADED FUNCTION: nn_model
```
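Putting the loop together. This sketch inlines the helpers so it runs on its own; the notebook version would instead call initialize_parameters, forward_propagation, compute_cost, backward_propagation and update_parameters in that order:

```python
import numpy as np

def sigmoid(z):
    # stands in for the sigmoid imported from planar_utils
    return 1.0 / (1.0 + np.exp(-z))

def nn_model(X, Y, n_h, num_iterations=10000, learning_rate=1.2, print_cost=False):
    """Train the 1-hidden-layer network: forward pass, cost, backprop, update."""
    n_x, n_y = X.shape[0], Y.shape[0]
    m = X.shape[1]

    # Initialization (as in initialize_parameters), seeded for reproducibility
    rng = np.random.default_rng(3)
    W1 = rng.standard_normal((n_h, n_x)) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = rng.standard_normal((n_y, n_h)) * 0.01
    b2 = np.zeros((n_y, 1))

    for i in range(num_iterations):
        # Forward propagation
        A1 = np.tanh(W1 @ X + b1)
        A2 = sigmoid(W2 @ A1 + b2)

        # Cross-entropy cost
        cost = float(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

        # Backward propagation
        dZ2 = A2 - Y
        dW2 = dZ2 @ A1.T / m
        db2 = np.sum(dZ2, axis=1, keepdims=True) / m
        dZ1 = (W2.T @ dZ2) * (1 - A1 ** 2)
        dW1 = dZ1 @ X.T / m
        db1 = np.sum(dZ1, axis=1, keepdims=True) / m

        # Gradient-descent update
        W1 -= learning_rate * dW1
        b1 -= learning_rate * db1
        W2 -= learning_rate * dW2
        b2 -= learning_rate * db2

        if print_cost and i % 1000 == 0:
            print(f"Cost after iteration {i}: {cost:.6f}")

    return {"W1": W1, "b1": b1, "W2": W2, "b2": b2}
```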

## Predictions

```python
# GRADED FUNCTION: predict
```
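A sketch of predict, thresholding the output activations at 0.5. The forward pass is inlined here so the snippet is self-contained; the notebook version would reuse forward_propagation:

```python
import numpy as np

def predict(parameters, X):
    """Predict 0/1 labels: y = 1 if the output activation A2 > 0.5, else 0."""
    A1 = np.tanh(parameters["W1"] @ X + parameters["b1"])
    A2 = 1.0 / (1.0 + np.exp(-(parameters["W2"] @ A1 + parameters["b2"])))
    return (A2 > 0.5).astype(int)
```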

It is time to run the model and see how it performs on a planar dataset. Run the following code to test your model with a single hidden layer of $n_h$ hidden units.

```python
# Build a model with a n_h-dimensional hidden layer
```

```python
# Print accuracy
```

Accuracy is really high compared to Logistic Regression. The model has learnt the leaf patterns of the flower! Neural networks are able to learn even highly non-linear decision boundaries, unlike logistic regression.

## Tuning hidden layer size

Run the following code. It may take 1-2 minutes. You will observe different behaviors of the model for various hidden layer sizes.

```python
# This may take about 2 minutes to run
```

# Performance on other datasets

If you want, you can rerun the whole notebook (minus the dataset part) for each of the following datasets.

```python
# Datasets
```

# Conclusion

You’ve learnt to:

- Build a complete neural network with a hidden layer
- Make good use of a non-linear unit
- Implement forward propagation and backpropagation, and train a neural network
- See the impact of varying the hidden layer size, including overfitting