So, let's take a look at the general form of how you multiply two matrices together. And then, in the last video after this one, we'll take this and apply it to the vectorized implementation of a neural network. Let's dive in.
Here's a matrix A, which is a 2 by 3 matrix, because it has two rows and three columns. As before, I'd encourage you to think of the columns of this matrix as three vectors: A1, A2, and A3. What we're going to do is take A transpose and multiply that with a matrix W.
First, what is A transpose? Well, A transpose is obtained by taking the first column of A and laying it on its side like this, then taking the second column of A and laying it on its side like this, and then the third column of A and laying it on its side like that. And so these rows are now A1 transpose, A2 transpose, and A3 transpose. Next, here's a matrix W. I encourage you to think of W as the vectors W1, W2, W3, and W4 stacked together as columns.
Let's look at how you then compute A transpose times W. Notice that I've used slightly different shades of orange to denote the different columns of A, where the same shade corresponds to numbers that we think of as grouped together into a vector. That same shade is used to indicate the different rows of A transpose, because the rows of A transpose are A1 transpose, A2 transpose, and A3 transpose. In a similar way, I've used different shades of blue to denote the different columns of W, because the numbers of the same shade of blue are the ones that are grouped together to form the vectors W1, W2, W3, or W4.
I'm going to draw vertical bars with the different shades of blue, and horizontal bars with the different shades of orange, to indicate which elements of Z, that is A transpose W, are influenced by the different rows of A transpose, and which are influenced by the different columns of W.
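Here is a minimal NumPy sketch of the transpose step described above. The entries of A are taken from the rows of A transpose that are read out later in the video; the variable names are my own choice and are not from the slides.

```python
import numpy as np

# A is 2 by 3. Its columns are the vectors A1, A2, A3 from the example.
A = np.array([[1.0, -1.0, 0.1],
              [2.0, -2.0, 0.2]])

# Transposing lays each column of A on its side, so it becomes a row of AT.
AT = A.T

print(A.shape)   # (2, 3)
print(AT.shape)  # (3, 2)

# Row i of AT holds the same numbers as column i of A.
for i in range(AT.shape[0]):
    print(f"row {i + 1} of A transpose:", AT[i], "| column of A:", A[:, i])
```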
So for example, let's look at the first column of W, that's W1, as indicated by the lightest shade of blue here. W1 will influence, or correspond to, the first column of Z, shown here by this lightest shade of blue. The values of the second column of W, that is W2, as indicated by the second lightest shade of blue, will affect the values computed in the second column of Z, and so on for the third and fourth columns.
Correspondingly, let's look at A transpose. A1 transpose is the first row of A transpose, as indicated by the lightest shade of orange, and A1 transpose will influence, or correspond to, the values in the first row of Z. A2 transpose will influence the second row of Z, and A3 transpose will influence the third row of Z.
So let's figure out how to compute the matrix Z, which is going to be a 3 by 4 matrix, so with 12 numbers altogether. Let's start with the number in the first row and the first column of Z, the upper-left element here. Because this is the first row and the first column, corresponding to the lightest shade of orange and the lightest shade of blue, the way you compute it is to grab the first row of A transpose and the first column of W, and take their inner product, or dot product. So this number is going to be the dot product of (1, 2) with (3, 4), which is 1 times 3 plus 2 times 4, which is equal to 11.
Let's look at a second example. How would you compute this element of Z? This is in the third row (row 1, row 2, row 3) and the second column (column 1, column 2). So to compute the number in row 3, column 2 of Z, you would now grab row 3 of A transpose and column 2 of W, and take the dot product of those. Notice that this corresponds to the darkest shade of orange and the second lightest shade of blue. To compute this, it is 0.1 times 5 plus 0.2 times 6, which is 0.5 plus 1.2, which is equal to 1.7.
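The row-times-column recipe above can be sketched in NumPy as follows. This is only an illustration: the helper name `matmul_by_dot_products` is hypothetical, and the fourth column of W is not read out in the transcript, so the values 9 and 0 used for it here are an assumption.

```python
import numpy as np

# Rows of A transpose, as read out in the worked examples.
AT = np.array([[1.0, 2.0],
               [-1.0, -2.0],
               [0.1, 0.2]])

# Columns W1, W2, W3 come from the worked examples.
# ASSUMPTION: the fourth column (9, 0) is a placeholder, not from the transcript.
W = np.array([[3.0, 5.0, 7.0, 9.0],
              [4.0, 6.0, 8.0, 0.0]])


def matmul_by_dot_products(left, right):
    """Fill each entry Z[i, j] with the dot product of row i of left and column j of right."""
    rows, inner_left = left.shape
    inner_right, cols = right.shape
    if inner_left != inner_right:
        raise ValueError("columns of left must equal rows of right")
    Z = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            Z[i, j] = np.dot(left[i, :], right[:, j])
    return Z


Z = matmul_by_dot_products(AT, W)

print(Z[0, 0])  # row 1, column 1: 1*3 + 2*4
print(Z[2, 1])  # row 3, column 2: 0.1*5 + 0.2*6, approximately 1.7
print(np.allclose(Z, AT @ W))  # the loop agrees with NumPy's built-in matrix product
```

Note that NumPy indexes from 0, so row 3, column 2 of Z is `Z[2, 1]`. The explicit double loop is written for clarity; in practice `AT @ W` or `np.matmul(AT, W)` does the same computation much faster.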
So to compute the number in row 3, column 2 of Z, you grab row 3 of A transpose and column 2 of W.
Let's look at one more example, and let's see if you can figure this one out. This is row 2, column 3 of the matrix Z. Why don't you take a look and see if you can figure out which row and which column to grab to dot product together, and therefore, what number will go in this element of the matrix. Hopefully, you got that you should be grabbing row 2 of A transpose and column 3 of W. When you take their dot product, A2 transpose W3 is negative 1 times 7 plus negative 2 times 8, which is negative 7 plus negative 16, which is equal to negative 23. And so that's how you compute this element of the matrix Z.
It turns out that if you do this for every element of the matrix Z, then you can compute all of the numbers in this matrix, which turns out to look like that. Feel free to pause the video if you want, pick any element, and double check that the formula we've been going through gives you the right value for Z.
I just want to point out one last interesting requirement for multiplying matrices together. A transpose here is a 3 by 2 matrix, because it has 3 rows and 2 columns, and W here is a 2 by 4 matrix, because it has 2 rows and 4 columns. One requirement in order to multiply two matrices together is that this number, the 2 columns of A transpose, must match that number, the 2 rows of W. That's because you can only take dot products between vectors that are the same length. You can take the inner product of a vector of length 2 only with another vector of length 2. You can't take the inner product of a vector of length 2 with a vector of length 3, for example.
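The matching requirement above can be checked with a short NumPy sketch. The arrays here are filled with zeros because only their shapes matter; the 3 by 4 array `bad` is a hypothetical second matrix invented to show a mismatch.

```python
import numpy as np

AT = np.zeros((3, 2))  # same shape as A transpose: 3 rows, 2 columns
W = np.zeros((2, 4))   # same shape as W: 2 rows, 4 columns

# Inner sizes match (2 and 2), so the product is defined.
Z = AT @ W
print(Z.shape)  # (3, 4)

# Hypothetical mismatch: a 3 by 2 matrix times a 3 by 4 matrix would need
# dot products between vectors of length 2 and length 3.
bad = np.zeros((3, 4))
try:
    AT @ bad
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("cannot multiply:", err)
```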
And that's why matrix multiplication is valid only if the number of columns of the first matrix, that is A transpose here, is equal to the number of rows of the second matrix, that is W here. That way, when you take dot products during this process, you're taking dot products of vectors of the same size. The other observation is about the output, Z equals A transpose W: the dimensions of Z are 3 by 4. So the output of this multiplication will have the same number of rows as A transpose, and the same number of columns as W. That too is a property of matrix multiplication.
So that's matrix multiplication. All of these videos are optional, so thank you for sticking with me through these. And if you're interested, later in this week there are also some purely optional quizzes to let you practice some more of these calculations yourself.
So with that, let's take what we've learned about matrix multiplication and apply it back to the vectorized implementation of a neural network. I have to say, the first time I understood the vectorized implementation, I thought it was actually really cool. I'd been implementing neural networks for a while myself without the vectorized implementation, and when I finally understood it and implemented it that way for the first time, it ran so much faster than anything I'd ever done before. And I thought, wow, I wish I had figured this out earlier. The vectorized implementation is a little bit complicated, but it makes neural networks run much faster. So let's take a look at that in the next video.