In this course, you'll explore three exploratory data analysis (EDA) practices: cleaning, joining, and validating data. You'll learn why these practices matter for data analysis, and you'll use Python to clean, validate, and join data.



Clean Your Data
This course is part of the Google Data Analysis with Python Specialization

Instructor: Google Career Certificates
What you'll learn
Explore the EDA practices of cleaning, validating and joining data
Details to know
5 assignments
September 2025
There are 5 modules in this course
Missing or duplicate data can appear in datasets for numerous reasons. The impact of missing values can vary depending on how many are present. In this module, you will learn strategies to address missing data entries, determine when deduplication is needed, and use common Python functions for handling duplicates.
What's included
4 videos, 1 reading, 1 assignment, 3 ungraded labs
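As a rough illustration of the strategies this module describes, the sketch below counts missing values, removes exact duplicate rows, and shows two common ways to handle the gaps that remain. It assumes the pandas library; the DataFrame, its column names, and its values are hypothetical and not taken from the course materials.

```python
import pandas as pd

# Hypothetical toy data: column names and values are made up for illustration.
df = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "city": ["Austin", "Boston", "Boston", None, "Denver"],
    "amount": [10.0, 25.0, 25.0, 40.0, None],
})

# Count missing values in each column.
missing_per_column = df.isna().sum()

# Flag rows that exactly repeat an earlier row, then drop them.
duplicate_mask = df.duplicated()
deduped = df.drop_duplicates()

# Option 1: drop any row that still has a missing value.
dropped = deduped.dropna()

# Option 2: keep every row and fill the numeric gap with the column median.
filled = deduped.assign(
    amount=deduped["amount"].fillna(deduped["amount"].median())
)

print(missing_per_column)
print(f"duplicate rows: {int(duplicate_mask.sum())}")
print(filled)
```

Whether dropping or filling is the better choice depends on how many values are missing and on what the analysis needs, so treat both options as starting points rather than rules.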
Outliers are data points that differ markedly from the rest of a dataset. A tactful approach to outliers recognizes the human stories and real-world effects they represent. In this module, you will learn the types of outliers and how to handle and visualize them.
What's included
2 videos, 2 readings, 1 assignment
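One common way to flag outliers is the interquartile range (IQR) rule, sketched below with pandas. The course description does not say which detection method it teaches, so the 1.5 x IQR threshold is an assumption here, and the sample values and the name `delivery_minutes` are hypothetical.

```python
import pandas as pd

# Hypothetical sample: six typical values and one extreme one.
values = pd.Series([12, 14, 15, 15, 16, 18, 95], name="delivery_minutes")

# Quartiles and interquartile range.
q1 = values.quantile(0.25)
q3 = values.quantile(0.75)
iqr = q3 - q1

# Assumed convention: points beyond 1.5 * IQR from the quartiles are outliers.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

is_outlier = (values < lower) | (values > upper)
outliers = values[is_outlier]

# One handling option: cap extreme values at the limits instead of deleting them.
capped = values.clip(lower=lower, upper=upper)

print(outliers)
print(capped)
```

Capping keeps every observation in the dataset, which can matter when each row represents a real person or event. A box plot of the same series is a typical way to see these limits visually.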
Data models typically work better with numerical inputs. To facilitate this, categorical data is encoded as numeric values for analysis. In this module, you will learn why this transformation is needed, what dummy variables are, and how to select the right encoding method.
What's included
2 videos, 2 readings, 1 assignment
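The two encoding approaches mentioned above can be sketched with pandas as follows. This is an illustration under assumptions: the columns `size` and `color` and their values are hypothetical, and the pairing of label encoding with ordered categories and dummy variables with unordered ones is a common rule of thumb, not necessarily the course's own guidance.

```python
import pandas as pd

# Hypothetical data with one ordered and one unordered categorical column.
df = pd.DataFrame({
    "size": ["small", "large", "medium", "small"],
    "color": ["red", "blue", "red", "green"],
})

# Label encoding: map ordered categories to integers that preserve the order.
size_order = ["small", "medium", "large"]
df["size_code"] = pd.Categorical(
    df["size"], categories=size_order, ordered=True
).codes

# Dummy variables: one 0/1 column per category, with no implied order.
dummies = pd.get_dummies(df["color"], prefix="color", dtype=int)
encoded = pd.concat([df, dummies], axis=1)

print(encoded)
```

Label encoding keeps the data compact but implies a ranking, so it suits categories such as sizes. Dummy variables avoid that implied ranking at the cost of adding one column per category.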
Input validation means thoroughly checking data for completeness and eliminating errors. In this module, you will learn why validation minimizes errors, how to detect improper inputs, and why it's essential for joining datasets.
What's included
2 videos, 1 assignment, 2 ungraded labs, 1 plugin
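To show why validation matters before a join, the sketch below checks two hypothetical tables and then merges them with pandas. The table names, columns, and the rule that amounts must be non-negative are assumptions made for this example.

```python
import pandas as pd

# Hypothetical tables: orders reference customers by customer_id.
orders = pd.DataFrame({
    "order_id": [101, 102, 103, 104],
    "customer_id": [1, 2, 2, 5],
    "amount": [20.0, -5.0, 35.0, 12.5],
})
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ana", "Ben", "Cy"],
})

# Validate the join key: each customer should appear only once.
if not customers["customer_id"].is_unique:
    raise ValueError("customer_id must be unique in the customers table")

# Detect improper inputs (assumed rule: amounts cannot be negative).
invalid_amounts = orders[orders["amount"] < 0]
valid_orders = orders[orders["amount"] >= 0]

# Left join; validate= raises an error if the key relationship is not
# many-to-one, and indicator= adds a _merge column showing match status.
joined = valid_orders.merge(
    customers,
    on="customer_id",
    how="left",
    validate="many_to_one",
    indicator=True,
)

# Orders whose customer_id has no match in the customers table.
unmatched = joined[joined["_merge"] == "left_only"]

print(invalid_amounts)
print(unmatched)
```

Without the uniqueness check, a duplicated key in `customers` would silently multiply order rows in the result. The `_merge` column makes unmatched keys visible so they can be investigated rather than ignored.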
Review everything you’ve learned and take the final assessment.
What's included
1 reading, 1 assignment