The fundamental building block of most modern neural networks is a layer of neurons. In this video, you'll learn how to construct a layer of neurons, and once you have that down, you'll be able to take those building blocks and put them together to form a large neural network. Let's take a look at how a layer of neurons works.

Here's the example from demand prediction, where four input features were fed to a hidden layer of three neurons, which then sends its output to an output layer with just one neuron. Let's zoom in to the hidden layer to look at its computations. This hidden layer takes in four numbers, and those four numbers are inputs to each of the three neurons. Each of these three neurons is just implementing a little logistic regression unit, or a little logistic regression function.

Take the first neuron. It has two parameters, w and b, and to denote that this is the first hidden unit, I'm going to subscript them as w1 and b1. What it does is output some activation value a, which is g(w1 · x + b1), where w1 · x + b1 is the familiar value z that you learned about in logistic regression in the previous course, and g(z) is the familiar logistic function, 1 / (1 + e^(-z)). Maybe this ends up being the number 0.3, and that's the activation value a of the first neuron. To denote that it comes from the first neuron, I'm also going to add a subscript, so a1 may be a number like 0.3, meaning a 0.3 chance of the t-shirt being highly affordable based on the input features.

Now let's look at the second neuron. It has parameters w2 and b2, the parameters of the second logistic unit, and it computes a2 = g(w2 · x + b2). This may be some other number, say 0.7, because in this example there's a 0.7 chance that we think potential buyers will be aware of this t-shirt. Similarly, the third neuron has a third set of parameters, w3 and b3, and computes an activation value a3 = g(w3 · x + b3), which may be, say, 0.2.

So in this example, the three neurons output 0.3, 0.7, and 0.2, and this vector of three numbers becomes the vector of activation values a that is then passed to the final output layer of this neural network.

When you build neural networks with multiple layers, it will be useful to give the layers different numbers. By convention, this layer is called layer 1 of the neural network, this layer is called layer 2, and the input layer is also sometimes called layer 0. Today, there are neural networks that can have dozens or even hundreds of layers. To introduce notation that helps us distinguish between the different layers, I'm going to use a superscript in square brackets to index into them. In particular, the superscript [1] denotes a quantity from layer 1, the hidden layer of this neural network, so the output of that layer is written a^[1]. Similarly, w1 and b1 here are the parameters of the first unit in layer 1, so I'm also going to add the superscript [1] and write them as w^[1]_1 and b^[1]_1. And w2 and b2 are the parameters of the second hidden unit, or the second hidden neuron, in layer 1, so those parameters also get the superscript [1], like so.
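To make this concrete, here's a minimal NumPy sketch of the layer-1 computation just described. The input values and parameter values are made up purely for illustration; in a real network, w^[1] and b^[1] would be learned from data, not chosen by hand.

```python
import numpy as np

def sigmoid(z):
    # The logistic function g(z) = 1 / (1 + e^(-z))
    return 1 / (1 + np.exp(-z))

# Hypothetical input: 4 features, as in the demand-prediction example
x = np.array([200.0, 17.0, 3.5, 0.8])

# Layer 1 has 3 neurons: W1 has one row of weights per neuron,
# and b1 has one bias per neuron. These numbers are illustrative only;
# in practice they come from training.
W1 = np.array([[ 0.01, -0.02, 0.03, 0.05],
               [ 0.02,  0.01, 0.04, 0.02],
               [-0.03,  0.02, 0.01, 0.01]])
b1 = np.array([0.1, -0.2, 0.05])

# Each neuron j computes a1[j] = g(w_j . x + b_j); vectorized over all
# three neurons at once, that's:
a1 = sigmoid(W1 @ x + b1)
print(a1)  # a vector of 3 activation values, analogous to [0.3, 0.7, 0.2]
```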
Similarly, I can add the superscript square brackets, like so, to denote that these are the activation values of the hidden units of layer 1 of this neural network. I know this notation may be getting a little cluttered, but the thing to remember is that whenever you see the superscript [1], it refers to a quantity associated with layer 1 of the neural network, and the superscript [2] refers to a quantity associated with layer 2, and similarly for layer 3, layer 4, and so on in neural networks with more layers.

So that's the computation of layer 1 of this neural network. Its output is the activation vector a^[1], and I'm going to copy it over here, because this output a^[1] becomes the input to layer 2.

Now let's zoom in to the computation of layer 2, which is also the output layer. The input to layer 2 is the output of layer 1, so a^[1] is the vector [0.3, 0.7, 0.2] that we just computed on the previous part of this slide. Because the output layer has just a single neuron, all it does is compute a^[2]_1, the output of its first and only neuron, as g, the sigmoid function, applied to w1 · a^[1] + b1, where a^[1] is the input to this layer. This w1 · a^[1] + b1 is the quantity z that you're familiar with, and g, as before, is the sigmoid function that you apply to it. If this results in a number, say 0.84, then that becomes the output of the output layer of the neural network. And in this example, because the output layer has just a single neuron, the output is a scalar, a single number rather than a vector of numbers.

Sticking with our notational convention from before, we use the superscript [2] to denote quantities associated with layer 2 of this neural network. So a^[2] is the output of this layer, and I'm going to copy it here as the final output of the neural network. To make the notation consistent, you can also add the superscript [2] to denote that these are the parameters and activation values associated with layer 2.

Once the neural network has computed a^[2], there's one final, optional step that you can choose to implement or not. If you want a binary prediction, 1 or 0 (is this a top seller, yes or no?), you can take the number a^[2]_1, which is the 0.84 we computed, and threshold it at 0.5: if it's greater than 0.5, predict y-hat = 1, and if it's less than 0.5, predict y-hat = 0. We saw this thresholding as well when you learned about logistic regression in the first course of the specialization. So if you don't want just a probability of the t-shirt being a top seller, this gives you the final prediction y-hat as either 1 or 0.

So that's how a neural network works. Every layer inputs a vector of numbers, applies a bunch of logistic regression units to it, and computes another vector of numbers, which then gets passed from layer to layer until you get to the final output layer's computation, the prediction of the neural network. Then you can either threshold at 0.5 or not to come up with the final prediction.
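Here's a matching sketch of the layer-2 computation and the optional thresholding step. The layer-1 output is the [0.3, 0.7, 0.2] vector from the example; the parameter values are made up, chosen only so the output lands near the 0.84 mentioned above.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Output of layer 1 from the example: a^[1] = [0.3, 0.7, 0.2]
a1 = np.array([0.3, 0.7, 0.2])

# Layer 2 (the output layer) has a single logistic unit. These parameter
# values are illustrative; they happen to give an output near 0.84.
w2_1 = np.array([1.0, 1.5, 0.5])
b2_1 = 0.2

# a^[2]_1 = g(w^[2]_1 . a^[1] + b^[2]_1), a single scalar
a2 = sigmoid(np.dot(w2_1, a1) + b2_1)
print(a2)  # approximately 0.84

# Optional final step: threshold at 0.5 for a binary prediction y-hat
y_hat = 1 if a2 >= 0.5 else 0
print(y_hat)  # 1
```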
With that foundation in place, let's go on to look at some even larger, more complex neural network models. I hope that by seeing more examples, this concept of layers and how to put them together to build a neural network will become even clearer. Let's go on to the next video.