Let's see what happens when you run gradient descent for linear regression by watching the algorithm in action. Here's a plot of the model and data on the upper left, a contour plot of the cost function on the upper right, and, at the bottom, a surface plot of the same cost function. Often, w and b will both be initialized to 0, but for this demonstration, let's initialize w to negative 0.1 and b to 900. This corresponds to f(x) = -0.1x + 900.

Now, if we take one step using gradient descent, we move from this point on the cost function to the point just down and to the right, and notice that the straight-line fit has also changed a bit. Let's take another step. The cost has now moved to a third point, and again, the function f(x) has changed a bit. As you take more of these steps, the cost decreases at each update, so the parameters w and b follow this trajectory. And if you look on the left, you get a corresponding straight-line fit that, you know, fits the data better and better until we've reached the global minimum. The global minimum corresponds to this straight-line fit, which is a relatively good fit to the data. I mean, isn't that cool? So that's gradient descent, and we're going to use it to fit a model to the housing data.

You can now use this f(x) model to predict the price of your client's house, or anyone else's house. For instance, if your friend's house size is 1,250 square feet, you can read off the value and predict that maybe they could get, I don't know, $250,000 for the house.

To be more precise, this gradient descent process is called batch gradient descent. The term refers to the fact that on every step of gradient descent, we're looking at all of the training examples, rather than just a subset of the training data. So when computing the derivatives in gradient descent, we're computing the sum from i = 1 to m, and batch gradient descent looks at the entire batch of training examples at each update; the formulas and a small implementation sketch are shown below. I know that batch gradient descent may not be the most intuitive name, but this is what people in the machine learning community call it. If you've heard of the newsletter The Batch, published by deeplearning.ai, it was also named for this concept in machine learning. It turns out there are other versions of gradient descent that do not look at the entire training set, but instead look at smaller subsets of the training data at each update step. For linear regression, though, we'll use batch gradient descent.

So that's it for linear regression. Congratulations on getting through your first machine learning model. I hope you go and celebrate, or, I don't know, maybe take a nap in your hammock. In the optional lab that follows this video, you'll see a review of the gradient descent algorithm as well as how to implement it in code. You'll also see a plot that shows how the cost decreases as you continue training over more iterations, and a contour plot showing how the cost gets closer to the global minimum as gradient descent finds better and better values for the parameters w and b. Remember that to do the optional lab, you just need to read and run the code; you won't need to write any code yourself.
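For reference, here are those batch gradient derivatives written out. These are the standard results for the squared-error cost from earlier in the week, restated here for convenience, with each sum running over all m training examples:

$$\frac{\partial J(w,b)}{\partial w} = \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)\,x^{(i)}, \qquad \frac{\partial J(w,b)}{\partial b} = \frac{1}{m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)$$

where $f_{w,b}(x) = wx + b$ and $J(w,b) = \frac{1}{2m}\sum_{i=1}^{m}\bigl(f_{w,b}(x^{(i)}) - y^{(i)}\bigr)^2$.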
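And if you'd like a preview of what the lab implements, here's a minimal sketch of batch gradient descent for linear regression in Python. The training data, learning rate, and iteration count below are made-up placeholders for illustration, not the lab's actual values; only the initialization w = -0.1, b = 900 comes from the demo above.

```python
import numpy as np

# Hypothetical training data: house sizes in 1000s of square feet and
# prices in $1000s. These numbers are illustrative, not the course's.
x_train = np.array([1.0, 1.5, 2.0, 2.5, 3.0])
y_train = np.array([210.0, 290.0, 410.0, 490.0, 610.0])

def compute_gradients(x, y, w, b):
    """Batch gradients of the squared-error cost: the sums run over
    all m training examples, which is what makes this 'batch'."""
    m = x.shape[0]
    err = (w * x + b) - y            # f(x_i) - y_i for every example
    dj_dw = (err * x).sum() / m      # (1/m) * sum of err_i * x_i
    dj_db = err.sum() / m            # (1/m) * sum of err_i
    return dj_dw, dj_db

# Initialize as in the demo above: w = -0.1, b = 900
w, b = -0.1, 900.0
alpha = 0.01                         # learning rate (assumed)

for _ in range(10_000):              # every step sees the whole batch
    dj_dw, dj_db = compute_gradients(x_train, y_train, w, b)
    w, b = w - alpha * dj_dw, b - alpha * dj_db   # simultaneous update

# Predict the price of a 1,250-square-foot house (1.25 in these units)
print(f"f(1.25) = {w * 1.25 + b:.1f}  # roughly 250, i.e. ~$250,000")
```

Note that both parameters are updated simultaneously from gradients computed at the same (w, b), matching the update rule from the earlier videos.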
And I hope you take a few moments to do that, and to become familiar with the gradient descent code, because this will help you implement this and similar algorithms yourself in the future.

Thanks for sticking with me through the end of this last video for the first week, and congratulations for making it all the way here. You're on your way to becoming a machine learning person. In addition to the optional labs, if you haven't done so yet, I hope you also check out the practice quizzes, which are a nice way to double-check your own understanding of the concepts. It's also totally fine if you don't get them all right the first time, and you can take the quizzes multiple times until you get the score you want.

You now know how to implement linear regression with one variable, and that brings us to the close of this week. Next week, we'll learn to make linear regression much more powerful. Instead of one feature like the size of a house, you'll learn how to get it to work with lots of features. You'll also learn how to get it to fit nonlinear curves. These improvements will make the algorithm much more useful and valuable. Lastly, we'll go over some practical tips that will really help in getting linear regression to work on practical applications. I'm really happy to have you here with me in this class, and I look forward to seeing you next week.