DeepLearning.AI
AI is the new electricity and will transform and improve nearly all areas of human lives.

Quick Guide & Tips

💻   Accessing Utils File and Helper Functions

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

You will see all the notebook files for the lesson, including any helper functions used in the notebook, in the left sidebar.


💻   Downloading Notebooks

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps in "Accessing Utils File and Helper Functions" above ("File" => "Open"), click the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then from the Quality option, choose the quality that works best for your internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set aside specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read or run notebooks or watch the material. Engage actively by taking notes, summarizing what you learn, teaching the concept to someone else, or applying the knowledge in your practical projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit the DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking the "Course Feedback" option at the bottom of the lessons list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


Welcome to Data I/O and Preprocessing with Python and SQL, the fourth course in the Data Analytics Professional Certification from DeepLearning.AI. In this course, you'll learn how to collect data through web scraping and APIs, allowing you to assemble your own datasets. Up until now in this program, you've been working with clean datasets that were provided to you. As a data analyst, you'll often need to collect your own data and process it so it's suitable for your analysis tasks. According to one consulting report (I don't know how accurate these things are, but they give some indication), poor-quality data costs businesses about $3 trillion per year in the United States alone. On the flip side, this means there are great opportunities to add value to a business through your data collection and pre-processing skills. In addition to learning how to go and get data, you'll tackle common data processing tasks used to get your data ready for analysis, like removing duplicates, handling outliers, and normalizing data to a common scale. With that, I'm thrilled to welcome back your instructor, Sean Barnes, who's here to share these materials with you.
Thanks, Andrew. It's really great to be here. As you know, a core challenge data analysts face is finding high-quality data and transforming it for their use case. Pre-processing is its own unique skill. Although it's so important, it's less flashy than analysis, so it can sometimes be a little underappreciated. However, it provides the foundation for all your insights. It's a common saying that 80% of a data analyst's work is pre-processing. In fact, my team recently developed a model to predict movie production timelines, and we spent several months on pre-processing. As someone who works with a ton of data in the field of AI, how are you seeing data collection and pre-processing evolve?
Both machine learning and many of the analytics tasks you perform in this program benefit significantly from good-quality data. This may mean sophisticated pre-processing or a step as simple as removing missing values. Now, with large language models, I've also seen teams do really clever work on synthetic data generation for a particular LLM when that data doesn't already exist or is too costly to collect from external sources. In many real-world analytics and AI systems, the code that performs the core machine learning task, like checking in a factory whether a phone is scratched, is only a small fraction of the overall code. And data collection and pre-processing will sometimes be, frankly, a lot more code than the core analysis part. So when you're getting started with a new analysis project, I'd encourage you to get your hands dirty exploring the data so you really understand it yourself.
That's really interesting. You learn a lot from immersing yourself in the data. In this course, you'll get started right away with web scraping, collecting data from websites using Python. In the next module, you'll collect data from the web using an API, a special type of web service that allows programs to communicate with each other. You'll use text and numerical processing techniques to get that data into an analyzable format. In the final two modules, you'll learn the fundamentals of databases, including writing queries with SQL to filter, group, and join your data. You'll analyze all kinds of real-world data, including music tracks, Lego sets, food safety inspections, tech jobs, and more. You'll also use a large language model to help plan tasks and code in both SQL and Python. Data collection and pre-processing require a lot of context that isn't always easy to provide to an LLM. In the movie business, for example, you work with a lot of unique media file types like OCF, IMF, and ProRes.
Whatever domain you're working in will have its own language that an LLM may not be deeply familiar with. So Andrew, where do you think AI fits into the data analytics toolkit in terms of data collection and pre-processing?
In software development, there's a concept of the 10x developer, meaning someone who has 10 times the impact of the average developer. This doesn't mean they write code 10 times faster or type 10 times faster. Rather, it means that their decision making and prioritization lead to much more efficient output and impact. I think that as more jobs become AI-enabled, there'll be more and more 10x professionals, with data analysts being one of the job functions that will benefit a lot from AI use. Even now, a data analyst using AI tools might easily, I think, have two times the impact of data analysts who are not yet as proficient with these tools. And even while you're using these wonderful AI tools, you still need to understand the problem context, design the output, and so on in order to effectively drive the technology. So the skills you learn in this course will allow you to accelerate your output and your productivity with AI. That's a lot of exciting things to learn, and with that, let's head to the next video to get started.
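As a rough illustration of the three pre-processing tasks named in the introduction (removing duplicates, handling outliers, and normalizing data to a common scale), here is a minimal pandas sketch. The toy data, the column names, the 1.5 × IQR clipping rule, and the choice of min-max scaling are all assumptions made for illustration; the course itself may use different data and techniques.

```python
import pandas as pd

# Hypothetical toy data: names and values are made up for illustration.
df = pd.DataFrame({
    "track": ["A", "B", "B", "C", "D", "E"],
    "plays": [10, 12, 12, 11, 13, 500],
})

# 1. Remove exact duplicate rows (the second "B" row).
df = df.drop_duplicates().reset_index(drop=True)

# 2. Handle outliers: clip values that fall more than 1.5 * IQR
#    outside the first and third quartiles (one common rule of thumb).
q1, q3 = df["plays"].quantile([0.25, 0.75])
iqr = q3 - q1
df["plays_clipped"] = df["plays"].clip(lower=q1 - 1.5 * iqr,
                                       upper=q3 + 1.5 * iqr)

# 3. Normalize to a common [0, 1] scale with min-max scaling.
lo, hi = df["plays_clipped"].min(), df["plays_clipped"].max()
df["plays_scaled"] = (df["plays_clipped"] - lo) / (hi - lo)

print(df)
```

Clipping keeps the outlying row in the dataset but limits its influence; dropping the row instead is an equally valid choice, depending on whether the extreme value is an error or a real observation.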
Module 1: Web scraping & text preprocessing
Introduction
  • Welcome to this course! (Video, 4 mins)
  • Generative AI in this course (Video, 2 mins)
  • Module 1 introduction (Video, 1 min)
Data input
  • The many sources of data (Video, 3 mins)
  • Data cleaning and processing (Video, 3 mins)
  • ETL and ELT (Video, 4 mins)
  • Lesson 1 quiz (Practice Quiz, 10 mins)
Web scraping
  • Introduction to web scraping (Video, 3 mins)
  • Scraping tables with Pandas (Video, 3 mins)
  • Module 1 lecture code (Code Example, 30 mins)
  • String methods: replace (Video, 4 mins)
  • Casting (Video, 2 mins)
  • Handling missing values (Video, 4 mins)
  • String methods: contains (Video, 3 mins)
  • String methods: split and strip (Video, 3 mins)
  • Lesson 2 quiz (Practice Quiz, 10 mins)
  • Practice Lab: Web Scraping with Pandas (Code Example, 30 mins)
Scraping with Beautiful Soup
  • Networking (Video, 3 mins)
  • Scraping webpages with requests (Video, 3 mins)
  • HTML (Video, 5 mins)
  • Planning HTML parsing (Video, 2 mins)
  • Parsing HTML with Beautiful Soup (Video, 4 mins)
  • DataFrame setup (Video, 4 mins)
  • Regular expressions (Video, 4 mins)
  • Writing regular expressions with LLMs (Video, 2 mins)
  • The ethics of web scraping (Video, 3 mins)
  • Lesson 3 quiz (Practice Quiz, 30 mins)
  • Practice Lab: Web Scraping with Beautiful Soup (Code Example, 30 mins)
  • Additional Web Scraping Practice (Reading, 10 mins)
Graded Quiz
  • Module 1 quiz (Graded Quiz, 30 mins)
Graded Lab
  • Analyzing Tech Industry Jobs and Companies (Graded Code Assignment, 1 hour 30 mins)
Lecture Notes (Optional)
  • Module 1 lecture notes (Reading, 1 min)