Basics of Deep Learning p.1 - Introduction

1/6/2020

This post is part of a series:

Part 1: Introduction
Part 2: Feedforward Algorithm explained
Part 3: Implementing the Feedforward Algorithm in pure Python
Part 4: Implementing the Feedforward Algorithm in pure Python cont'd
Part 5: Implementing the Feedforward Algorithm with NumPy
Part 6: Backpropagation explained - Cost Function and Derivatives
Part 7: Backpropagation explained - Gradient Descent and Partial Derivatives
Part 8: Backpropagation explained - Chain Rule and Activation Function
Part 9: Backpropagation explained Step by Step
Part 10: Backpropagation explained Step by Step cont'd
Part 11: Backpropagation explained Step by Step cont'd
Part 12: Backpropagation explained Step by Step cont'd
Part 13: Implementing the Backpropagation Algorithm with NumPy
Part 14: How to train a Neural Net

Here are the corresponding slides for this post:

basics_of_dl_1.pdf

In this series of posts, we are going to cover the basics of deep learning. So, first of all: What is deep learning?

See slide 1

It is a specific type of machine learning algorithm. In fact, it’s a whole class of algorithms that all have some basic properties in common but that can look quite different depending on what they are supposed to achieve.

But right now, it’s only important to know that deep learning is a subset of machine learning. And this obviously means that it makes sense to first have a general understanding of machine learning itself.

So, if you don’t have that yet, you can check out my “What is Machine Learning?”-series where I talk about exactly that. And if you already have a good understanding, then let’s now talk about the two essential aspects that need to be considered when generally thinking about any supervised machine learning algorithm.

First Aspect of any Machine Learning Algorithm

The first one is to answer the question: How does the algorithm actually make a decision? Or in machine learning terms: How does it make a prediction?

See slide 2

So, for example, the decision tree algorithm makes a decision by consecutively asking questions about an example. That way, the number of potential classes that this example could belong to is narrowed down step by step. And eventually, just one class is left, which is then our prediction.
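To make that a bit more concrete, here is a minimal sketch of what "consecutively asking questions" could look like in code. The feature names, thresholds and class labels are made up for illustration and are not taken from the slides.

```python
# Hypothetical decision tree, written out as a sequence of questions.
# Feature names, thresholds and class labels are invented for illustration.
def predict_class(example):
    # Question 1: a "yes" leaves only one possible class.
    if example["petal_width"] <= 0.8:
        return "class_a"
    # Question 2: separates the two classes that are still possible.
    if example["petal_length"] <= 4.9:
        return "class_b"
    # Only one class is left, so that is the prediction.
    return "class_c"


print(predict_class({"petal_width": 0.2, "petal_length": 1.4}))
```

Each question rules out some of the classes, and the function returns as soon as only one class remains.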

In the case of linear regression, the way we make a prediction is that we first realize that there seems to be a linear relationship between one variable and another. So, in this example, between the height of a person and the weight of that person. And because there is such a linear relationship, it can be approximated by a line. And the respective equation for that line is depicted in the slide.

And then, to predict the weight of any new, unknown person, we simply measure their height and put that value into the equation. And that way, we get a prediction for how much this person probably weighs.
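In code, making such a prediction is just evaluating the line equation. The following is a small sketch. The slope and y-intercept values are hypothetical placeholders, not values fitted to real data.

```python
# Prediction step of linear regression: evaluate the line equation.
def predict_weight(height_cm, slope, intercept):
    return slope * height_cm + intercept


# Hypothetical parameter values, chosen only for illustration.
slope = 0.9        # kg per cm (assumed)
intercept = -90.0  # kg (assumed)

print(predict_weight(180, slope, intercept))
```

So, once the slope and the y-intercept are known, the prediction itself is a single multiplication and a single addition.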

Second Aspect of any Machine Learning Algorithm

The second essential component of any supervised machine learning algorithm is: How do you actually determine the right parameters of the algorithm?

See slide 3

And the parameters are, so to speak, the tuning knobs of our algorithm. They are the things that we can adjust so that the algorithm is eventually able to make correct predictions.

In the case of decision trees, the parameters that we can adjust are the specific questions that we are going to ask about the respective examples. And to find those, we used a particular algorithm.

See slide 4

And here, the actual step for determining the questions was the one called “Lowest Overall Entropy?”. And the formula for the overall entropy looked like this:

See slide 5

And here again, the most important element was the entropy itself. And its formula, in turn, looked like this:

See slide 6

So, at the core, we find the right parameters for our decision tree (that is, which questions to ask) by calculating this entropy.
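The formulas themselves are only shown on the slides, so here is a sketch under two assumptions: that the entropy is the standard Shannon entropy of the class labels, and that the overall entropy of a question is the weighted average of the entropies of the two groups that the question produces. The function and argument names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    probabilities = [count / total for count in Counter(labels).values()]
    return -sum(p * math.log2(p) for p in probabilities)


def overall_entropy(labels_yes, labels_no):
    """Weighted average entropy of the two groups a question creates (assumed form)."""
    total = len(labels_yes) + len(labels_no)
    weight_yes = len(labels_yes) / total
    weight_no = len(labels_no) / total
    return weight_yes * entropy(labels_yes) + weight_no * entropy(labels_no)


# A question that separates the classes perfectly vs. one that does not.
clean_split = overall_entropy(["a", "a"], ["b", "b"])
mixed_split = overall_entropy(["a", "b"], ["a", "b"])
print(clean_split < mixed_split)
```

The question with the lowest overall entropy is the one that separates the classes best, which is why it would be the one that gets picked.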

And for the linear regression algorithm, the parameters that we can adjust are the slope of the line and the y-intercept. By changing the slope, we can change the steepness of the line. And by changing the y-intercept, we can move the line up or down. And one way of determining the best values for those parameters, so that the line best represents the data points, is, for example, a method called ordinary least squares.
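For a single input variable, ordinary least squares has a well-known closed-form solution, which can be sketched as follows. The function name and the toy data are made up for illustration. The data points lie exactly on a line so that the result is easy to check by hand.

```python
# Ordinary least squares for a line y = slope * x + intercept
# (closed-form solution for one input variable).
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of the products of the deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sum of the squared deviations of x from its mean.
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (hypothetical) generated from y = 2x + 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

These two numbers are exactly the tuning knobs mentioned above: the slope sets the steepness of the line and the y-intercept moves it up or down.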

See slide 7

Conclusion

So, to conclude, as you can probably see in slide 7, the first aspect of a machine learning algorithm is, generally speaking, less heavily based on math. It is more about the conceptual idea of the algorithm, that is, how one might approach the problem of implementing an automated decision-making process. The second aspect then relies more heavily on math to actually make the implementation of this high-level concept possible (at least, that’s my observation).

See slide 8

And the same thing will be true for the deep learning algorithm.

See slide 9

So next, we are going to look at what the general idea behind deep learning is and how it makes decisions.
