So far, you've seen an RNN architecture where the number of inputs, Tx, is equal to the number of outputs, Ty. It turns out that for other applications, Tx and Ty may not always be the same, and in this video you'll see a much richer family of RNN architectures. You might remember this slide from the first video of this week, where the input x and the output y can be many different types, and it's not always the case that Tx has to be equal to Ty. In particular, in this example, Tx can be length 1, or even an empty set. In an example like movie sentiment classification, the output y could be just an integer from 1 to 5, whereas the input is a sequence. In named entity recognition, in the example we've been using, the input length and the output length are identical. But there are also problems where the input and the output are both sequences with different lengths, such as machine translation, where a French sentence and an English sentence can need different numbers of words to say the same thing. It turns out that we can modify the basic RNN architecture to address all of these problems. The presentation in this video was inspired by a blog post by Andrej Karpathy, titled The Unreasonable Effectiveness of Recurrent Neural Networks. Let's go through some examples.

The example you've seen so far used Tx equal to Ty, where we had an input sequence x1, x2, up to xTx, and a recurrent neural network that works as follows: we input x1 to compute y hat 1, then y hat 2, and so on, up to y hat Ty. In earlier diagrams, I was drawing a bunch of circles to denote neurons, but I'm going to omit those little circles for most of this video, just to make the notation simpler. This is what you might call a many-to-many architecture, because the input is a sequence with many elements, and the output is also a sequence with many elements.

Now let's look at a different example. Let's say you want to address sentiment classification. Here, x might be a piece of text, such as a movie review that says, "There is nothing to like in this movie," so x is a sequence, and y might be 0 or 1 (is this a positive review or a negative review?), or a number from 1 to 5 (is this a one-star, two-star, three-, four-, or five-star review?). In this case, we can simplify the neural network architecture as follows. We input the words one at a time, x1, x2, and so on; if the input text was "There is nothing to like in this movie," those words would be the input. Then, rather than producing an output at every single time step, we just have the RNN read in the entire sentence and output y at the last time step, once it has taken in the entire sentence. This is a many-to-one architecture, because it has many inputs, reading in many words, and it outputs just one number.

For the sake of completeness, there is also a one-to-one architecture. This one is maybe less interesting; it's more like a standard neural network, where you have some input x and some output y. This would be the type of neural network that we covered in the first two courses in this sequence.

In addition to many-to-one, you can also have a one-to-many architecture. An example of a one-to-many neural network architecture would be music generation.
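To make the shapes concrete, here is a minimal NumPy sketch, not the course's notebook code, of a single shared RNN cell used in both ways: emitting a prediction at every time step (many-to-many with Tx equal to Ty) or only after the last word has been read (many-to-one). The dimension names n_x, n_a, n_y, the tanh activation, and the softmax output are assumptions for illustration.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def rnn_cell(x_t, a_prev, Wax, Waa, Wya, ba, by):
    """One time step: new activation a<t> and prediction y hat<t> from x<t> and a<t-1>."""
    a_t = np.tanh(Waa @ a_prev + Wax @ x_t + ba)
    y_t = softmax(Wya @ a_t + by)
    return a_t, y_t

def rnn_forward(xs, params, many_to_many=True):
    """xs: list of Tx input vectors x<1>..x<Tx>.
    many_to_many=True  -> return y hat<1>..y hat<Tx> (e.g. named entity recognition)
    many_to_many=False -> return only y hat<Tx>      (e.g. sentiment classification)"""
    Wax, Waa, Wya, ba, by = params
    a = np.zeros((Waa.shape[0], 1))            # a<0> initialized to zeros
    outputs = []
    for x_t in xs:
        a, y_hat = rnn_cell(x_t, a, Wax, Waa, Wya, ba, by)
        outputs.append(y_hat)
    return outputs if many_to_many else outputs[-1]

# Toy usage with random parameters (dimensions chosen arbitrarily):
n_x, n_a, n_y, Tx = 4, 8, 5, 7
rng = np.random.default_rng(0)
params = (rng.normal(size=(n_a, n_x)), rng.normal(size=(n_a, n_a)),
          rng.normal(size=(n_y, n_a)), np.zeros((n_a, 1)), np.zeros((n_y, 1)))
xs = [rng.normal(size=(n_x, 1)) for _ in range(Tx)]
print(len(rnn_forward(xs, params, many_to_many=True)))    # Tx predictions, one per word
print(rnn_forward(xs, params, many_to_many=False).shape)  # a single (n_y, 1) prediction
```

The point of the sketch is that the recurrent cell itself is identical in both architectures; the only difference is whether you keep the prediction at every time step or only at the last one.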
In fact, you'll get to implement this yourself in one of the programming exercises for this course, where your goal is to have a neural network output a set of notes corresponding to a piece of music. The input x could be just an integer telling you what genre of music you want, or what the first note of the music should be; and if you don't want to input anything, x could be a null input, or it could always be a vector of zeros. For this, the neural network architecture would be: you input x, have your RNN output the first value, then, with no further inputs, output the second value, then the third value, and so on, until you've synthesized the last note of the musical piece. If you want, you can have the later steps take an input of zeros as well. One technical note, which you'll see in a later video: when you're actually generating sequences, you often take the first synthesized output and feed it in as the input to the next time step as well, so the network architecture actually ends up looking like that.

So we've talked about many-to-many, many-to-one, one-to-many, as well as one-to-one. It turns out there's one more interesting example of many-to-many which is worth describing: the case where the input and output lengths are different. In the many-to-many example you saw just now, the input length and the output length have to be exactly the same. But for an application like machine translation, the number of words in the input sentence, say a French sentence, and the number of words in the output sentence, say the translation into English, can be different. So here's an alternative neural network architecture, where you have the network first read in the entire input, say the French sentence you want to translate into English, and, having done that, then output the translation. With this architecture, Tx and Ty can be different lengths. And again, you could draw in the a0 as well. This neural network architecture has two distinct parts: an encoder, which takes the input, say a French sentence, and a decoder, which, having read in the sentence, outputs the translation into a different language. So this is another example of a many-to-many architecture. By the end of this week, you'll have a good understanding of all the components needed to build these types of architectures. Technically, there's one other architecture, which we'll talk about only in week four: attention-based architectures, which maybe aren't cleanly captured by any of the diagrams we've drawn so far.

To summarize the wide range of RNN architectures: there is one-to-one, although if it's one-to-one we could just get rid of the recurrence, because this is just a standard generic neural network and you really don't need an RNN for it. There is one-to-many, with music generation or sequence generation as an example. There's many-to-one, with sentiment classification as an example, where you might read as input all the text of a movie review and then try to figure out whether they liked the movie or not. There's many-to-many, which is the named entity recognition example we've been using, where Tx is equal to Ty. And finally, there's the other version of many-to-many, where for applications like machine translation, Tx and Ty no longer have to be the same.
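Here is a similarly hedged NumPy sketch of that two-part, encoder-decoder style many-to-many architecture: the encoder consumes x1 through xTx and keeps only its final activation, and the decoder then generates Ty outputs with no further external inputs, feeding each prediction back in as the next step's input, which is why Tx and Ty can differ. The parameter names and the greedy argmax feedback are illustrative assumptions; how to sample such outputs is discussed later in the course.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def encoder_decoder_forward(xs, Ty, enc, dec):
    Wax_e, Waa_e, ba_e = enc
    Wax_d, Waa_d, Wya_d, ba_d, by_d = dec

    # Encoder: read the whole input sequence, keep only the final activation a<Tx>.
    a = np.zeros((Waa_e.shape[0], 1))
    for x_t in xs:
        a = np.tanh(Waa_e @ a + Wax_e @ x_t + ba_e)

    # Decoder: start from the encoder's final activation and a "null" input,
    # then feed each predicted token back in as the next step's input.
    y_prev = np.zeros((Wax_d.shape[1], 1))
    outputs = []
    for _ in range(Ty):
        a = np.tanh(Waa_d @ a + Wax_d @ y_prev + ba_d)
        y_hat = softmax(Wya_d @ a + by_d)
        outputs.append(y_hat)
        y_prev = np.eye(y_hat.shape[0])[:, [np.argmax(y_hat)]]  # one-hot of the prediction
    return outputs

# Toy usage: a length-6 input sequence mapped to a length-4 output sequence.
n_x, n_a, n_y, Tx, Ty = 4, 8, 5, 6, 4
rng = np.random.default_rng(1)
enc = (rng.normal(size=(n_a, n_x)), rng.normal(size=(n_a, n_a)), np.zeros((n_a, 1)))
dec = (rng.normal(size=(n_a, n_y)), rng.normal(size=(n_a, n_a)),
       rng.normal(size=(n_y, n_a)), np.zeros((n_a, 1)), np.zeros((n_y, 1)))
xs = [rng.normal(size=(n_x, 1)) for _ in range(Tx)]
print(len(encoder_decoder_forward(xs, Ty, enc, dec)))  # Ty predictions from Tx inputs
```

Setting Tx to 0 or 1 in the same sketch recovers the one-to-many case, such as music generation: the decoder loop alone generates the sequence from a single seed input.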
So now you know most of the building blocks for building pretty much all of these neural networks, except that there are some subtleties with sequence generation, which is what we'll discuss in the next video. I hope you saw from this video that, using the basic building blocks of an RNN, there's already a wide range of models you can put together. But as I mentioned, there are some subtleties to sequence generation, which you'll also get to implement yourself in this week's programming exercise, where you implement a language model and hopefully generate some fun sequences, or some fun pieces of text. So what I want to do in the next video is go deeper into sequence generation. Let's see the details in the next video.