In the previous video, I mentioned that broadcasting is another technique that you can use to make your Python code run faster. In this video, let's delve into how broadcasting in Python actually works.

Let's motivate broadcasting with an example. In this matrix, I've shown the number of calories from carbohydrates, proteins, and fats in 100 grams of four different foods. So for example, 100 grams of apples turns out to have 56 calories from carbs and much less from proteins and fats, whereas in contrast, 100 grams of beef has 104 calories from protein and 135 calories from fat.

Now let's say your goal is to calculate the percentage of calories from carbs, proteins, and fats for each of the four foods. So for example, if you look at this column and add up the numbers in that column, you get that 100 grams of apple has 56 plus 1.2 plus 1.8, so that's 59 calories. And so the percentage of calories from carbohydrates in an apple would be 56 over 59, that's about 94.9 percent. So most of the calories in an apple come from carbs, whereas in contrast, most of the calories in beef come from protein and fat, and so on.

So the calculation you want is really to sum up each of the four columns of this matrix to get the total number of calories in 100 grams of apple, beef, eggs, or potatoes, and then to divide throughout the matrix so as to get the percentage of calories from carbs, proteins, and fats for each of the four foods. So the question is, can you do this without an explicit for loop? Let's take a look at how you could do that. What I'm going to do is show you how you can set, say, this matrix equal to a 3 by 4 matrix A, and then with one line of Python code, we're going to sum down the columns.
So we're going to get four numbers corresponding to the total number of calories in 100 grams of these four different types of food, and I'm going to use a second line of Python code to divide each of the four columns by its corresponding sum. If that verbal description wasn't very clear, hopefully it'll be clearer in a second when we look at the Python code.

So here we are in the Jupyter notebook. I've already written this first piece of code to pre-populate the matrix A with the numbers we had just now, so let me hit shift enter and just run that. So there's the matrix A, and now here are the two lines of Python code. First, I'm going to compute cal equals A dot sum, and axis equals 0 means to sum vertically. I'll say more about that in a little bit. And then let's print cal. So we've summed vertically. So as we saw just now, 59 is the total number of calories in the apple, 239 is the total number of calories in the beef, and then the eggs, and the potatoes, and so on.

And then we're going to compute percentage equals A over cal dot reshape, 1 by 4. Actually, we want percentages, so let's multiply by 100 here. And then let's print percentage. Let's run that. And so with that command, we've taken the matrix A and divided it by this 1 by 4 matrix, and this gives us the matrix of percentages. So as we worked out by hand just now, in the apple, that was the first column, 94.9% of the calories are from carbs.

Let's go back to the slides. So just to repeat the two lines of code we had, this is what we had written out in the Jupyter notebook. To add a bit of detail, this parameter, axis equals 0, means that you want Python to sum vertically. So axis 0 means to sum vertically, whereas the horizontal axis is axis 1. So if you were to write axis 1, you'd sum horizontally instead of summing vertically.
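For readers following along without the notebook, the two lines described above might look roughly like the sketch below. The apple and beef columns use the numbers quoted in the video; the egg and potato columns are not read out in the transcript, so the values here are assumptions for illustration only. Note also that in NumPy `A.sum(axis=0)` returns a one-dimensional array of shape `(4,)`, which is why the `reshape(1, 4)` call makes the row shape explicit.

```python
import numpy as np

# Rows: carbs, protein, fat. Columns: apples, beef, eggs, potatoes.
# Apple and beef values follow the numbers quoted in the video;
# the egg and potato columns are assumed values for illustration.
A = np.array([
    [56.0,   0.0,  4.4, 68.0],
    [ 1.2, 104.0, 52.0,  8.0],
    [ 1.8, 135.0, 99.0,  0.9],
])

# axis=0 sums down each column, giving one calorie total per food.
cal = A.sum(axis=0)
print(cal)

# Divide every column by its total. The (1, 4) row is broadcast
# across the three rows of A; multiplying by 100 gives percentages.
percentage = 100 * A / cal.reshape(1, 4)
print(percentage)
```

Each column of `percentage` should sum to 100, which is a quick sanity check that the division ran down the columns rather than across the rows.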
And then this command here, the division by cal dot reshape, is an example of Python broadcasting, where you take a matrix A, so this is a 3 by 4 matrix, and you divide it by a 1 by 4 matrix. And technically, after this first line of code, the variable cal is already a row of four numbers that broadcasts the same way as a 1 by 4 matrix. So technically, you don't need to call reshape here again, so that's actually a little bit redundant. But when I'm writing Python code, if I'm not entirely sure what the dimensions of my matrix are, I often would just call a reshape command just to make sure that it's the right column vector or row vector or whatever you want it to be. The reshape command runs in constant time, it's an order 1 operation, so it's very cheap to call. So don't be shy about using the reshape command to make sure that your matrices are the size you need them to be.

Now, let's explain in greater detail how this type of operation works, right? We had a 3 by 4 matrix, and we divided it by a 1 by 4 matrix. So how can you divide a 3 by 4 matrix by a 1 by 4 matrix or by a 1 by 4 vector? Let's go through a few more examples of broadcasting.

If you take a 4 by 1 vector and add it to a number, what Python will do is take this number and auto-expand it into a 4 by 1 vector as well, as follows. And so the vector 1, 2, 3, 4 plus the number 100 ends up with that vector on the right. You're adding 100 to every element. And in fact, we used this form of broadcasting in an earlier video, where the constant we were adding to a vector was the parameter b in logistic regression. And this type of broadcasting works with both column vectors and row vectors.

Here's another example. Let's say you have a 2 by 3 matrix and you add it to this 1 by 3 matrix. So the general case would be if you have some m by n matrix here and you add it to a 1 by n matrix. What Python will do is copy the 1 by n matrix m times to turn it into an m by n matrix.
So instead of this 1 by 3 matrix, Python will copy it twice in this example to turn it into this 2 by 3 matrix as well. And it'll add these, so you end up with the sum on the right. Okay, so you've added 100 to the first column, added 200 to the second column, and added 300 to the third column. And this is basically what we did on the previous slide, except that we used a division operation instead of an addition operation.

Just one last example. What if you have an m by n matrix and you add this to an m by 1 vector, an m by 1 matrix? Then Python will copy this n times horizontally, so you end up with an m by n matrix. So as you can imagine, you'll copy it horizontally three times and add those. So when you add them, you end up with this. So we've added 100 to the first row and added 200 to the second row.

Here's the more general principle of broadcasting in Python. If you have an m by n matrix and you add or subtract or multiply or divide with a 1 by n matrix, then Python will copy it m times into an m by n matrix and then apply the addition, subtraction, multiplication, or division element-wise. If conversely, you were to take the m by n matrix and add, subtract, multiply, or divide by an m by 1 matrix, then Python will copy it n times, turn that into an m by n matrix, and then apply the operation element-wise.

Just one other form of broadcasting, which is if you have an m by 1 matrix, so that's really a column vector like 1, 2, 3, and you add, subtract, multiply, or divide by a real number, so maybe a 1 by 1 matrix, such as plus 100, then you end up copying this real number m times until you also get another m by 1 matrix, and then you perform the operation, such as addition in this example, element-wise. And something similar also works for row vectors.

The fully general version of broadcasting can do even a little bit more than this. If you're interested, you can read the documentation for NumPy and look at broadcasting in that documentation.
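The three forms of broadcasting described above can be sketched in NumPy as follows. The added values (100, 200, 300) and the column vector 1, 2, 3, 4 come from the video; the entries of the 2 by 3 matrix are not read out in the transcript, so the ones below are assumed for illustration.

```python
import numpy as np

# (m, 1) column vector plus a real number:
# the number is applied to every element.
col = np.array([[1], [2], [3], [4]])
col_plus_scalar = col + 100
print(col_plus_scalar)

# (m, n) matrix plus a (1, n) row:
# the row is applied to each of the m rows.
M = np.array([[1, 2, 3],
              [4, 5, 6]])            # assumed example values
row = np.array([[100, 200, 300]])
matrix_plus_row = M + row            # 100, 200, 300 added to columns 1, 2, 3
print(matrix_plus_row)

# (m, n) matrix plus an (m, 1) column:
# the column is applied to each of the n columns.
column = np.array([[100], [200]])
matrix_plus_column = M + column      # 100 added to row 1, 200 added to row 2
print(matrix_plus_column)
```

The "copying" in the description is best read as a mental model: the result is the same as if the smaller array had been copied out to the full shape, but NumPy does not need to build those copies in memory.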
The NumPy documentation gives an even slightly more general definition of broadcasting, but the ones on this slide are the main forms of broadcasting that you end up needing to use when you implement a neural network.

Before we wrap up, just one last comment, which is for those of you that are used to programming in either MATLAB or Octave. If you've ever used the MATLAB or Octave function bsxfun in neural network programming, bsxfun does something similar, not quite the same, but it is often used for a similar purpose as what we use broadcasting in Python for. But this is really only for very advanced MATLAB and Octave users. If you've not heard of this, don't worry about it. You don't need to know it when you're coding up neural networks in Python.

So that was broadcasting in Python. I hope that when you do the programming homework, broadcasting will allow you to not only make your code run faster, but also help you get what you want done with fewer lines of code. Before you dive into the programming exercise, I want to share with you just one more set of ideas, which is that there are some tips and tricks that I've found reduce the number of bugs in my Python code and that I hope will help you too. So let's talk about that in the next video.