Basics of Deep Learning p.13 - Implementing the Backpropagation Algorithm with NumPy

1/21/2020


This post is part of a series:

Part 1: Introduction
Part 2: Feedforward Algorithm explained
Part 3: Implementing the Feedforward Algorithm in pure Python
Part 4: Implementing the Feedforward Algorithm in pure Python cont'd
Part 5: Implementing the Feedforward Algorithm with NumPy
Part 6: Backpropagation explained - Cost Function and Derivatives
Part 7: Backpropagation explained - Gradient Descent and Partial Derivatives
Part 8: Backpropagation explained - Chain Rule and Activation Function
Part 9: Backpropagation explained Step by Step
Part 10: Backpropagation explained Step by Step cont'd
Part 11: Backpropagation explained Step by Step cont'd
Part 12: Backpropagation explained Step by Step cont'd
Part 13: Implementing the Backpropagation Algorithm with NumPy
Part 14: How to train a Neural Net

Here is the corresponding Jupyter Notebook for this post:

Notebook

Here are the corresponding slides for this post:

basics_of_dl_13.pdf

In the previous post, we left off at the point where we wanted to implement the backpropagation algorithm in code.

See slide 1

So, this is what we are going to do in this post. It is fairly easy now, because all we have to do is implement the equations shown in the slide.

Code

In the Jupyter Notebook, we first load the Iris flower data set and then select the 3 flowers that we have been working with throughout this series.

               See code cell 2 in the Jupyter Notebook

Then, we create our input matrix “x” and label matrix “y”. We also determine N, which we need in order to calculate the MSE.

               See code cell 3 in the Jupyter Notebook

Then, we define our activation function, namely the sigmoid function.

               See code cell 4 in the Jupyter Notebook
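For readers without the notebook at hand, a sigmoid activation function could be sketched like this. It is a minimal version that assumes NumPy arrays as input; the function name is a placeholder, and the notebook's own cell may look different.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic sigmoid: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Works on single numbers as well as on whole arrays
print(sigmoid(0.0))
print(sigmoid(np.array([-2.0, 0.0, 2.0])))
```

Because NumPy applies the function element-wise, the same definition can be used on a whole layer of weighted sums at once.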

After that, we specify the learning rate and the number of nodes in each layer.

               See code cell 5 in the Jupyter Notebook

In the cell after that, we randomly initialize our two weight matrices.

               See code cell 6 in the Jupyter Notebook
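As a rough sketch of these two steps, the hyperparameters and the random initialization might look like the following. The concrete values are assumptions made for illustration and are not taken from the notebook: 4 input nodes (the Iris data set has 4 features), 2 hidden nodes, 3 output nodes, a learning rate of 0.1, and weights drawn uniformly from [-1, 1). The names weights_1 and weights_2 are placeholders, too.

```python
import numpy as np

# Assumed values, for illustration only
learning_rate = 0.1
n_input, n_hidden, n_output = 4, 2, 3

# Seeded generator so that the sketch is reproducible
rng = np.random.default_rng(seed=0)

# One weight matrix per connection between two layers,
# initialized uniformly in [-1, 1) (an assumption)
weights_1 = rng.uniform(-1.0, 1.0, size=(n_input, n_hidden))
weights_2 = rng.uniform(-1.0, 1.0, size=(n_hidden, n_output))

print(weights_1.shape, weights_2.shape)
```

The shapes are chosen so that the matrix products "x times weights_1" and "hidden layer outputs times weights_2" line up in the feedforward pass.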

Finally, we run the feedforward and backpropagation algorithms and execute one gradient descent step.

               See slide 2 and code cell 7 in the Jupyter Notebook
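To give an idea of what such a step can look like in NumPy, here is a minimal sketch. It rests on several assumptions that are not confirmed by the slides or the notebook: a net with one hidden layer and no bias terms, sigmoid activations in both layers, and the MSE defined as the sum of squared errors divided by N, where N is the number of elements in “y”. The input values are random placeholders (not the actual Iris measurements), and all function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, weights_1, weights_2):
    hidden_layer_outputs = sigmoid(x @ weights_1)
    output_layer_outputs = sigmoid(hidden_layer_outputs @ weights_2)
    return hidden_layer_outputs, output_layer_outputs

def gradients(x, y, weights_1, weights_2):
    """Partial derivatives of the MSE with respect to both weight matrices."""
    N = y.size  # assumption: N counts all elements of y
    hidden, out = feedforward(x, weights_1, weights_2)
    # Chain rule: d(MSE)/d(out) times the sigmoid derivative out * (1 - out)
    output_delta = (2.0 / N) * (out - y) * out * (1.0 - out)
    grad_2 = hidden.T @ output_delta
    # Propagate the error back through weights_2 to the hidden layer
    hidden_delta = (output_delta @ weights_2.T) * hidden * (1.0 - hidden)
    grad_1 = x.T @ hidden_delta
    return grad_1, grad_2

def gradient_descent_step(x, y, weights_1, weights_2, learning_rate):
    grad_1, grad_2 = gradients(x, y, weights_1, weights_2)
    return (weights_1 - learning_rate * grad_1,
            weights_2 - learning_rate * grad_2)

# Placeholder data: 3 examples with 4 features, one-hot labels for 3 classes
rng = np.random.default_rng(seed=0)
x = rng.uniform(0.0, 1.0, size=(3, 4))
y = np.eye(3)
weights_1 = rng.uniform(-1.0, 1.0, size=(4, 2))
weights_2 = rng.uniform(-1.0, 1.0, size=(2, 3))

new_weights_1, new_weights_2 = gradient_descent_step(
    x, y, weights_1, weights_2, learning_rate=0.1)
```

The function returns new weight matrices instead of overwriting the old ones, which makes it easy to compare the net before and after the step.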

After that, we calculate the MSE (the “output_layer_outputs” are still based on our initial, random weights).

               See slide 3 and code cell 8 in the Jupyter Notebook
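The MSE calculation itself is short. The sketch below assumes that N is the total number of elements in the label matrix “y”; if N in the notebook is defined differently (for example, as the number of examples), the result only changes by a constant factor. The function name and the example values are placeholders.

```python
import numpy as np

def mean_squared_error(y, output_layer_outputs):
    N = y.size  # assumption: N counts all elements of y
    return np.sum((y - output_layer_outputs) ** 2) / N

# Placeholder example: one-hot labels and a net that outputs 0.5 everywhere
y = np.eye(3)
output_layer_outputs = np.full((3, 3), 0.5)
print(mean_squared_error(y, output_layer_outputs))
```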

Now, let’s see if the gradient descent step worked. If we run the feedforward algorithm again (now with the updated weights), the MSE should be somewhat lower.

               See code cells 9-10 in the Jupyter Notebook

And, as you can see, it is indeed somewhat lower.

Next Steps

But, as you can also see, the improvement is very small. So, we need to run the feedforward and backpropagation algorithms many more times. The question now is: How many times should we run them?

One approach might be to say: Let’s just run them until the MSE is zero. But, as we have seen in one of the earlier posts, the gradient descent algorithm only gives us an approximation of the minimum and not an exact value. So, we might never reach an MSE of zero.
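This behavior can be seen even in a toy example that has nothing to do with our neural net. The sketch below runs gradient descent on the made-up cost function (w - 3)^2, whose minimum is at w = 3 with a cost of exactly zero. The function, the starting point, the learning rate and the number of steps are all chosen purely for illustration.

```python
def cost(w):
    return (w - 3.0) ** 2

def cost_gradient(w):
    return 2.0 * (w - 3.0)

w = 0.0              # arbitrary starting point
learning_rate = 0.1  # arbitrary learning rate

for step in range(50):
    w = w - learning_rate * cost_gradient(w)

# w is now close to 3, but not exactly 3, so the cost is not exactly zero
print(w, cost(w))
```

Each step only removes a fixed fraction of the remaining distance to the minimum, so the cost keeps shrinking but stays above zero after these 50 steps.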

So, this raises the question: When should we stop training our neural net? This will be the topic of the next post.
