Quick Guide & Tips

💻   Accessing the Utils File and Helper Functions

In each notebook, on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

In the left sidebar, you will see all of the notebook files for the lesson, including any helper files (such as a utils file) whose functions are used in the notebook.
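
For example, here is a minimal, hypothetical sketch, assuming the lesson's helper file is named utils.py and sits next to the notebook (the actual file and function names vary by lesson):

```python
# Hypothetical sketch: the helper file name (utils.py) is an assumption for
# illustration; check the left sidebar for the file the lesson actually uses.
import utils   # imports the helper module that sits next to the notebook

help(utils)    # prints the helper functions the module provides
```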


💻   Downloading Notebooks

In each notebook, on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps shown in the previous section ("File" => "Open"), click on the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then, from Quality, choose the quality that works best for your internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set aside specific times in your day for studying and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read the material, run the notebooks, or watch the videos. Engage actively by taking notes, summarizing what you learn, teaching the concepts to someone else, or applying the knowledge in your own projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit the DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all of your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lesson list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


🎞   Lecture Transcript: Train / Dev / Test Sets

Welcome to this course on the practical aspects of deep learning. Perhaps by now you've learned how to implement a neural network. This week, you'll learn the practical aspects of how to make your neural network work well, ranging from hyperparameter tuning, to how to set up your data, to how to make sure your optimization algorithm runs quickly, so that your learning algorithm learns in a reasonable amount of time. In this first week, we'll first talk about how to set up your machine learning problem, then we'll talk about regularization, and then we'll talk about some tricks for making sure your neural network implementation is correct. With that, let's get started.

Making good choices in how you set up your training, development, and test sets can make a huge difference in helping you quickly find a good, high-performance neural network. When training a neural network, you have to make a lot of decisions, such as how many layers your neural network will have, how many hidden units you want each layer to have, what the learning rate is, and which activation functions you want to use for the different layers. When you're starting on a new application, it's almost impossible to correctly guess the right values for all of these, and for other hyperparameter choices, on your first attempt.

So in practice, applying machine learning is a highly iterative process: you often start with an idea, such as building a neural network with a certain number of layers and a certain number of hidden units, maybe on certain data sets, and then you just have to code it up and try it. By running your code, you run an experiment and get back a result that tells you how well this particular network or configuration works. Based on the outcome, you might then refine your ideas, change your choices, and keep iterating to try to find a better and better neural network.

Today, deep learning has found great success in a lot of areas, ranging from natural language processing, to computer vision, to speech recognition, to many applications on structured data. Structured data includes everything from advertising to web search, which isn't just internet search engines; it also includes, for example, shopping websites, or really any website that wants to deliver great search results when you enter terms into a search bar. It extends to computer security, to logistics, such as figuring out where to send drivers to pick up and drop off things, and to much more.

What I've seen is that sometimes a researcher with a lot of experience in NLP might try to do something in computer vision, or a researcher with a lot of experience in speech recognition might jump in and try to do something on advertising, or someone from security might want to do something on logistics. And what I've seen is that intuitions from one domain or application area often do not transfer to other application areas. The best choices may depend on the amount of data you have, the number of input features, your computer configuration, whether you're training on GPUs or CPUs, exactly what configuration of GPUs and CPUs, and many other things. So for a lot of applications, even very experienced deep learning people find it almost impossible to correctly guess the best choice of hyperparameters the very first time.
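
(Illustrative aside, not part of the lecture: the kinds of up-front choices described above might be collected in a configuration like the one below. The names and values are assumptions for illustration, first guesses to iterate on, not recommendations.)

```python
# Hypothetical starting configuration for a new application. Every value here is a
# first guess that would be revised through experimentation, not a recommendation.
initial_config = {
    "num_layers": 3,               # how many hidden layers the network has
    "hidden_units": [64, 32, 16],  # number of units in each hidden layer
    "learning_rate": 0.01,         # step size used by the optimizer
    "activation": "relu",          # activation function for the hidden layers
}
```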

And so today, applied deep learning is a very iterative process, where you just have to go around this cycle many times to hopefully find a good choice of network for your application. One of the things that determines how quickly you can make progress is how efficiently you can go around this cycle, and setting up your data sets well, in terms of your train, development, and test sets, can make you much more efficient at that.

If this is your training data (let's draw it as a big box), then traditionally you might take all the data you have and carve off some portion of it to be your training set and some portion to be your holdout cross-validation set. This is sometimes also called the development set, and for brevity I'm just going to call it the dev set, but all of these terms mean roughly the same thing. Then you might carve out some final portion to be your test set. The workflow is that you keep training algorithms on your training set and use your dev set, or holdout cross-validation set, to see which of many different models performs best on the dev set. Then, after having done this long enough, when you have a final model that you want to evaluate, you take the best model you have found and evaluate it on your test set in order to get an unbiased estimate of how well your algorithm is doing.

In the previous era of machine learning, it was common practice to take all your data and split it maybe 70/30, the 70/30 train/test split people often talk about if you don't have an explicit dev set, or maybe 60/20/20: 60 percent train, 20 percent dev, and 20 percent test. Several years ago, this was widely considered best practice in machine learning. If you have maybe 100 examples in total, maybe 1,000 examples, maybe up to 10,000 examples, these sorts of ratios were perfectly reasonable rules of thumb.

But in the modern big data era, where, for example, you might have a million examples in total, the trend is that your dev and test sets have been becoming a much smaller percentage of the total. Remember, the goal of the dev set, the development set, is that you're going to test different algorithms on it and see which algorithm works better. So the dev set just needs to be big enough for you to evaluate, say, two different algorithm choices or ten different algorithm choices and quickly decide which one is doing better, and you might not need a whole 20 percent of your data for that. For example, if you have a million training examples, you might decide that just having 10,000 examples in your dev set is more than enough to evaluate which of two algorithms does better. In a similar vein, the main goal of your test set is, given your final classifier, to give you a pretty confident estimate of how well it's doing, and again, if you have a million examples, 10,000 examples may be more than enough to evaluate a single classifier and give you a good estimate of how well it's doing. So in this example, where you have a million examples, if you need just 10,000 for your dev set and 10,000 for your test set, then 10,000 is 1 percent of 1 million, so you'd have 98 percent train, 1 percent dev, and 1 percent test.
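
(Illustrative aside, not part of the lecture: a minimal NumPy sketch of the large-data split described above, assuming X and Y hold one example per row. The function and argument names are assumptions for illustration.)

```python
import numpy as np

def train_dev_test_split(X, Y, dev_size=10_000, test_size=10_000, seed=0):
    """Shuffle the examples and carve off fixed-size dev and test sets;
    everything that remains (about 98% for a million examples) is training data."""
    m = X.shape[0]                                    # total number of examples
    idx = np.random.default_rng(seed).permutation(m)  # shuffle before splitting
    dev_idx = idx[:dev_size]
    test_idx = idx[dev_size:dev_size + test_size]
    train_idx = idx[dev_size + test_size:]
    return (X[train_idx], Y[train_idx],
            X[dev_idx],   Y[dev_idx],
            X[test_idx],  Y[test_idx])
```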

I've also seen applications where, if you have even more than a million examples, you might end up with 99.5 percent train, 0.25 percent dev, and 0.25 percent test, or maybe 0.4 percent dev and 0.1 percent test. So just to recap, when setting up your machine learning problem, I'll often set it up into train, dev, and test sets. If you have a relatively small data set, the traditional ratios might be okay, but if you have a much larger data set, it's also fine to set your dev and test sets to be much smaller than 20 percent or even 10 percent of your data. We'll give more specific guidelines on the sizes of dev and test sets later in this specialization.

One other trend we're seeing in the era of modern deep learning is that more and more people train on mismatched train and test distributions. Let's say you're building an app that lets users upload a lot of pictures, and your goal is to find pictures of cats to show your users; maybe all your users are cat lovers. Maybe your training set comes from cat pictures downloaded off the internet, but your dev and test sets comprise cat pictures from users of your app. So maybe your training set has a lot of pictures crawled off the internet, but the dev and test sets are pictures uploaded by users. It turns out a lot of web pages have very high-resolution, very professional, very nicely framed pictures of cats, but maybe your users are uploading blurrier, lower-resolution images taken with a cell phone camera in more casual conditions. So these two distributions of data may be different.

The rule of thumb I'd encourage you to follow in this case is to make sure that the dev and test sets come from the same distribution. We'll say more about this particular guideline, but because you will be using the dev set to evaluate a lot of different models and trying really hard to improve performance on the dev set, it's nice if your dev set comes from the same distribution as your test set. And because deep learning algorithms have such a huge hunger for training data, one trend I'm seeing is that you might use all sorts of creative tactics, such as crawling web pages, in order to acquire a much bigger training set than you would otherwise have, even if part of the cost is that your training set data might not come from the same distribution as your dev and test sets. You'll find that so long as you follow this rule of thumb, progress in your machine learning algorithm will be faster, and I'll give a more detailed explanation for this particular rule of thumb later in this specialization as well.

Finally, it might be okay to not have a test set. Remember, the goal of the test set is to give you an unbiased estimate of the performance of your final network, of the network that you selected, but if you don't need that unbiased estimate, then it might be okay to not have a test set. So what you do, if you have only a dev set but not a test set, is train on the training set, try different model architectures, evaluate them on the dev set, and use that to iterate and try to get to a good model. Because you've fit your data to the dev set, this no longer gives you an unbiased estimate of performance, but if you don't need one, that might be perfectly fine. In the machine learning world, when you have just a train and a dev set but no separate test set, most people will call this the training set and they will call the dev set the test set.
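
(Illustrative aside, not part of the lecture: a minimal sketch of the rule of thumb that the dev and test sets should come from the same distribution. Web-crawled examples go only into the training set, while the dev and test sets are both carved from the user-uploaded app data. All variable names are assumptions for illustration.)

```python
import numpy as np

def split_with_matched_dev_test(X_app, Y_app, X_web, Y_web,
                                dev_size, test_size, seed=0):
    """Dev and test both come from the app (user-upload) distribution;
    web-crawled data is used only to enlarge the training set."""
    idx = np.random.default_rng(seed).permutation(X_app.shape[0])
    dev_idx = idx[:dev_size]
    test_idx = idx[dev_size:dev_size + test_size]
    rest_idx = idx[dev_size + test_size:]               # leftover app data helps training
    X_train = np.concatenate([X_web, X_app[rest_idx]])
    Y_train = np.concatenate([Y_web, Y_app[rest_idx]])
    return (X_train, Y_train,
            X_app[dev_idx],  Y_app[dev_idx],
            X_app[test_idx], Y_app[test_idx])
```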

But what they actually end up doing is using the test set as a holdout cross-validation set, which maybe isn't a completely great use of terminology, because they're then overfitting to the test set. So when a team tells you that they have only a train and a test set, I would be cautious and ask whether they really have a train/dev split, since they're overfitting to the test set. Culturally, it might be difficult to change some of these teams' terminology and get them to call it a train/dev set rather than a train/test set, even though calling it a train and development set would be more correct terminology. And this is actually okay practice if you don't need a completely unbiased estimate of the performance of your algorithm.

So having set up a train, dev, and test set will allow you to iterate more quickly. It will also allow you to more efficiently measure the bias and variance of your algorithm, so you can more efficiently select ways to improve it. Let's start to talk about that in the next video.
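
(Illustrative aside, not part of the lecture: a minimal sketch of the overall workflow, selecting among candidate models using only the dev set and touching the test set exactly once at the end. train_fn and eval_fn are assumed, caller-supplied helpers, not course code.)

```python
def select_and_report(candidate_configs, train_fn, eval_fn,
                      X_train, Y_train, X_dev, Y_dev, X_test, Y_test):
    """train_fn(config, X, Y) -> model and eval_fn(model, X, Y) -> accuracy
    are supplied by the caller (hypothetical helpers for this sketch)."""
    best_model, best_dev_acc = None, float("-inf")
    for config in candidate_configs:
        model = train_fn(config, X_train, Y_train)    # fit on the training set only
        dev_acc = eval_fn(model, X_dev, Y_dev)        # compare candidates on the dev set
        if dev_acc > best_dev_acc:
            best_model, best_dev_acc = model, dev_acc
    test_acc = eval_fn(best_model, X_test, Y_test)    # single, final test-set evaluation
    return best_model, best_dev_acc, test_acc
```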

Course Syllabus

Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization
Week 1: Practical Aspects of Deep Learning

Setting up your Machine Learning Application
  • Train / Dev / Test sets (Video, 12 mins)
  • Bias / Variance (Video, 8 mins)
  • Basic Recipe for Machine Learning (Video, 6 mins)

Regularizing your Neural Network
  • Clarification about Upcoming Regularization Video (Reading, 1 min)
  • Regularization (Video, 9 mins)
  • Why Regularization Reduces Overfitting? (Video, 7 mins)
  • Dropout Regularization (Video, 9 mins)
  • Clarification about Upcoming Understanding Dropout Video (Reading, 1 min)
  • Understanding Dropout (Video, 7 mins)
  • Other Regularization Methods (Video, 8 mins)

Setting Up your Optimization Problem
  • Normalizing Inputs (Video, 5 mins)
  • Vanishing / Exploding Gradients (Video, 6 mins)
  • Weight Initialization for Deep Networks (Video, 6 mins)
  • Numerical Approximation of Gradients (Video, 6 mins)
  • Gradient Checking (Video, 6 mins)
  • Gradient Checking Implementation Notes (Video, 5 mins)

Lecture Notes (Optional)
  • Lecture Notes W1 (Reading, 1 min)

Quiz
  • Practical Aspects of Deep Learning (Graded Quiz, 50 mins)

Programming Assignments
  • (Optional) Downloading your Notebook, Downloading your Workspace and Refreshing your Workspace (Reading, 5 mins)
  • Initialization (Graded Code Assignment, 3 hours)
  • Regularization (Graded Code Assignment, 3 hours)
  • Gradient Checking (Graded Code Assignment, 3 hours)

Heroes of Deep Learning (Optional)
  • Yoshua Bengio Interview (Video, 25 mins)