This post is part of a series:
Here is the corresponding Jupyter notebook for this post.
In the previous post, we saw how a deep learning algorithm makes a decision, namely with the feedforward algorithm.
See slide 1

How to implement the feedforward algorithm in code is the topic of this post.

Iris Flower Data Set
As before, we are using the Iris flower data set as our use case. The specific neural net that we want to apply to the data set looks like this:
See slide 2

So, we have four input values: the sepal length and width and the petal length and width. Then, there are two nodes in the hidden layer. And finally, we have three output nodes. That’s because our artificial neuron can only output either a 0 or a 1. So, to represent the three different species, we need three neurons in the output layer.

This means that we have to one-hot encode our label, the species of the flower. So, instead of having one column that contains the different classes, we are going to create three label columns, one for each class.

See slide 3

So, if a flower is an Iris-setosa, then there is a one in the Iris-setosa column and a zero in the other two columns. If it is a versicolor, then there is a one in the Iris-versicolor column and a zero in the other two. And if it is a virginica, then there is a one in the Iris-virginica column and a zero in the other two.

Accordingly, when we input a new, unknown flower into our neural net, only one of the nodes in the output layer should output a one while the others output a zero (depending on which species the neural net predicts the flower to be).

Output of one Neuron
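Before turning to the single neuron, here is a minimal sketch of the one-hot encoding described above. The helper name `one_hot_encode` and the order of the classes are assumptions made for illustration; they are not taken from the notebook.

```python
# Assumed column order of the three label columns (illustrative only).
species = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]


def one_hot_encode(label, classes):
    """Return a list with a 1 at the position of `label` and 0 elsewhere."""
    return [1 if label == c else 0 for c in classes]


# Three example labels, one per species.
labels = ["Iris-setosa", "Iris-virginica", "Iris-versicolor"]
encoded = [one_hot_encode(label, species) for label in labels]

print(encoded)  # [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
```

Each inner list corresponds to one flower and has exactly one 1, which is also the pattern we would like the three output nodes to produce.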
So now, to implement the feedforward algorithm in code, let’s start with implementing the functionality of just one neuron. After all, that’s the fundamental unit of a neural net.
See code cells 3-7 in the Jupyter Notebook

In cell 4, the “inputs” variable represents one example flower from the data set (it is the flower with index 1, as you can see in the output of cell 2). The “weights” variable contains the weights for the neuron depicted in cell 3 (I have simply specified them for the purpose of this tutorial).

In cell 5, the “weighted_sum” function is defined. It represents the converging arrows of the neuron, so it computes the value that goes into the circle or node. Then, in cell 6, the “step_function” is defined. It determines what comes out of the node.

In cell 7, we can see that, for our example flower, the value that goes into the node is 5.21. So, accordingly, the neuron outputs a 1.

Output of the Neural Network
So, that’s how you could implement one neuron, but obviously our neural net consists of many such neurons.
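For reference, the single neuron described above could be sketched roughly as follows. The function names match the ones used in the notebook, but the bodies, the input values, the weights, and the threshold of 0 are all assumptions made for illustration. In particular, this sketch does not try to reproduce the 5.21 from cell 7.

```python
def weighted_sum(inputs, weights):
    """Multiply each input by its weight and add everything up."""
    return sum(x * w for x, w in zip(inputs, weights))


def step_function(value, threshold=0.0):
    """Output 1 if the value reaches the threshold, otherwise 0.

    The threshold of 0 is an assumption, not taken from the notebook.
    """
    return 1 if value >= threshold else 0


# Illustrative measurements of one flower:
# sepal length, sepal width, petal length, petal width.
inputs = [4.9, 3.0, 1.4, 0.2]

# Hypothetical weights, one per input (not the ones from the notebook).
weights = [0.5, -0.2, 0.1, 0.3]

value = weighted_sum(inputs, weights)   # what goes into the node
output = step_function(value)           # what comes out of the node

print(round(value, 2), output)
```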
See code cell 8 in the Jupyter Notebook

So, how can we now determine the output for the whole net? One thing we could do is to simply use our “weighted_sum” function and “step_function” to determine the outputs of the two nodes in the hidden layer. Then, we could store those values in variables. And after that, we could use those variables to run the functions again to calculate the outputs of the three nodes in the output layer. But that seems a little bit tedious and, on top of that, for larger networks with hundreds, thousands or even millions of nodes, this approach is not practical.

So, what can we do instead? Well, we can take advantage of the fact that the neural net is structured in layers. So, instead of determining the output of the nodes individually, we can determine the output of a whole layer in just one step. That’s the function that we want to create now.

See code cell 10 in the Jupyter Notebook

With this function, we could determine the output of the hidden layer by putting in the “inputs” and “weights_1” (see cell 9). And then, since the neural net is made up of sequential layers, we can simply call this function again to determine the output of the next layer. Only this time, we put the outputs of the hidden layer and “weights_2” into the function. This will then give us the output of the whole neural net.

List of lists
So, that’s the function that we want to create. And now, let’s think about the structure of the “list_of_inputs”, “list_of_weights” and “layer_outputs”. Or in other words, let’s think about what they will actually look like in the code.
To do that, we are going to look at the neural net turned by 90 degrees so that it is upright.

See slide 4

This way, I think, it is easier to remember what the structure of the individual parts looks like in code (especially if you are not so familiar with neural nets yet). And since our function determines the outputs of just one layer, let’s look only at the bottom part of the neural net, that is, at the input layer and the hidden layer.

See slide 5

Okay, so first, let’s look at the weights.

See slide 6

This time, we have to consider several nodes and not just one like we did before when we determined the output of just one neuron. Consequently, the weights parameter of the function is not just going to be a list, but a list of lists (that’s why the parameter of the function is called “list_of_weights”). Each list in that overall list contains the weights going to a particular neuron. So, for example, the first list contains the weights that go to the first node.

See slide 7
So, the 0.9 is the weight going from x1 to the first node. The 0.8 is the weight going from x2 to the first node, and so on.
And the second list contains the weights that go to the second node.
See slide 8

If we had, for example, four nodes in the hidden layer, then we would have four lists in the overall list. So, that’s what the “list_of_weights” parameter looks like.

The “list_of_inputs” parameter is also going to be a list of lists.

See slide 9

That’s because in machine learning, we are always dealing with many examples and not just one (like we did before with the one neuron). In our case here, we have 3 examples. Each list in the list of lists represents one specific flower. And since we have four features, each sub-list in the overall list is going to have 4 elements.

Now let’s think about what the “layer_outputs” variable, which is returned by the function, is going to look like.

See slide 10

It will also be a list of lists, and the number of elements in each sub-list will be, in this case, two. That’s because there are two nodes in the hidden layer. So, the first number in a sub-list is the output of the first node and the second number is the output of the second node. And we want to determine the outputs of those hidden nodes for each of our examples. So, we are again going to have 3 sub-lists in the overall list. The first list contains the outputs of the hidden nodes for the first example. The second contains the outputs of the hidden nodes for the second example, and so on.

And now, you can hopefully understand why I turned the neural net upright. This way, you can basically just look at the neural net and see what the underlying structure of the different parts is, that is, what they will look like in the code.
Namely, in this case, we have 4 features (x1-x4). So, the inputs for this neural net are going to be a list of lists with, so to speak, 4 columns. Then, in the hidden layer, we have 2 nodes. So, the hidden layer outputs are going to be a list of lists with 2 columns.
And for the number of rows of those lists of lists, you then just have to remember the number of examples that we are working with, in this case 3. And no matter how many layers our neural net is going to have, each layer output will consist of a list of lists with, in this case, 3 rows. Just the number of columns is going to change based on how many nodes there are in the layer.
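To make the rows and columns concrete, here is a small sketch of what the three structures could look like for the input layer and the hidden layer. Only the 0.9 and 0.8 in the first weight list come from the slides; all other numbers, the placeholder outputs, and the `shape` helper are made up for illustration.

```python
# 3 examples (rows), each with 4 features (columns). Illustrative values.
list_of_inputs = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
]

# One list per hidden node, each holding the 4 weights going to that node.
# Only 0.9 and 0.8 are from the post; the rest are hypothetical.
list_of_weights = [
    [0.9, 0.8, -0.5, 0.1],   # weights going to the first hidden node
    [-0.4, 0.2, 0.6, -0.3],  # weights going to the second hidden node
]

# Placeholder values that only show the shape of the hidden layer outputs:
# one row per example, one column per hidden node.
layer_outputs = [
    [1, 0],
    [1, 0],
    [1, 0],
]


def shape(list_of_lists):
    """Hypothetical helper: return (number of rows, number of columns)."""
    return (len(list_of_lists), len(list_of_lists[0]))


print(shape(list_of_inputs))   # (3, 4)
print(shape(list_of_weights))  # (2, 4)
print(shape(layer_outputs))    # (3, 2)
print(list_of_weights[0][0])   # 0.9, the weight from x1 to the first node
```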
See slide 11

So, the structure of the output layer is going to be a list of lists with 3 rows and 3 columns. That’s how you can easily remember what those layers actually look like in the code.

Unfortunately, the structure of the weights is not as clearly depicted by this upright neural net. But in a later post, we will see how that underlying structure is also very easy to keep in mind.

Okay, so now that we know what the inputs and the output of the “determine_layer_outputs” function are going to look like, let’s start implementing it. And this will be the topic of the next post.
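As a rough preview, one possible way to write such a function with plain Python lists is sketched below. This is not necessarily how the next post implements it. The function bodies, the threshold of 0, and all weight values except the 0.9 and 0.8 are assumptions, and the weights are untrained, so the predictions themselves carry no meaning.

```python
def weighted_sum(inputs, weights):
    """Multiply each input by its weight and add everything up."""
    return sum(x * w for x, w in zip(inputs, weights))


def step_function(value, threshold=0.0):
    """Output 1 if the value reaches the (assumed) threshold, else 0."""
    return 1 if value >= threshold else 0


def determine_layer_outputs(list_of_inputs, list_of_weights):
    """Compute the outputs of every node in one layer for every example."""
    layer_outputs = []
    for inputs in list_of_inputs:            # one row per example
        node_outputs = [
            step_function(weighted_sum(inputs, weights))
            for weights in list_of_weights   # one column per node
        ]
        layer_outputs.append(node_outputs)
    return layer_outputs


# 3 illustrative flowers with 4 features each.
list_of_inputs = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
]

# Hypothetical weights: 2 hidden nodes x 4 inputs, 3 output nodes x 2 inputs.
weights_1 = [[0.9, 0.8, -0.5, 0.1], [-0.4, 0.2, 0.6, -0.3]]
weights_2 = [[0.3, -0.7], [-0.2, 0.5], [-0.6, 0.1]]

# Call the same function once per layer, feeding each result into the next.
hidden_outputs = determine_layer_outputs(list_of_inputs, weights_1)
net_outputs = determine_layer_outputs(hidden_outputs, weights_2)

print(hidden_outputs)  # 3 rows, 2 columns
print(net_outputs)     # 3 rows, 3 columns
```

The number of rows stays at 3 throughout, while the number of columns follows the number of nodes in each layer.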