IBM

Data Analysis with Python

591,422 already enrolled

Gain insight into a topic and learn the fundamentals.
4.7

(19,407 reviews)

Intermediate level

Recommended experience

Flexible schedule
2 weeks at 10 hours a week
Learn at your own pace
94%
Most learners liked this course

What you'll learn

  • Construct Python programs to clean and prepare data for analysis by addressing missing values, formatting inconsistencies, normalization, and binning

  • Analyze real-world datasets through exploratory data analysis (EDA) using libraries such as Pandas, NumPy, and SciPy to uncover patterns and insights

  • Apply data operation techniques using dataframes to organize, summarize, and interpret data distributions, correlation analysis, and data pipelines

  • Develop and evaluate regression models using Scikit-learn, and use these models to generate predictions and support data-driven decision-making

Details to know

Shareable certificate

Add to your LinkedIn profile

Assessments

11 assignments¹

AI-graded (see disclaimer)
Taught in English

Build your subject-matter expertise

This course is available as part of
When you enroll in this course, you'll also be asked to select a specific program.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 6 modules in this course

In this module, you will develop foundational skills in Python-based data analysis by learning how to understand and prepare datasets, use essential Python packages, and import and export data for analysis. You’ll gain hands-on experience using tools like Pandas, NumPy, and SQLite to begin analyzing real-world datasets, including a laptop pricing dataset. You’ll also be provided with a cheat sheet that serves as a handy reference throughout the course.

What's included

6 videos, 1 reading, 2 assignments, 2 app items, 3 plugins
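
As a rough illustration of the import and export step this module describes, the sketch below round-trips a small table through CSV and SQLite with Pandas. The rows and column names are hypothetical stand-ins, not the course's actual laptop pricing dataset, and an in-memory buffer and database are used so the example runs without any files.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical rows standing in for a laptop pricing dataset.
laptops = pd.DataFrame({
    "manufacturer": ["Acer", "Dell", "HP"],
    "ram_gb": [8, 16, 8],
    "price": [650.0, 1200.0, 700.0],
})

# Export to CSV (an in-memory buffer here; a file path works the same way).
buffer = io.StringIO()
laptops.to_csv(buffer, index=False)

# Import the CSV text back into a new DataFrame.
buffer.seek(0)
from_csv = pd.read_csv(buffer)

# Write the table to an in-memory SQLite database and query it back.
conn = sqlite3.connect(":memory:")
laptops.to_sql("laptops", conn, index=False)
from_sql = pd.read_sql_query(
    "SELECT manufacturer, price FROM laptops "
    "WHERE ram_gb = 8 ORDER BY manufacturer",
    conn,
)
conn.close()

print(from_csv.head())
print(from_sql)
```

Passing `index=False` in both exports keeps the DataFrame's row index out of the CSV file and the SQL table, which avoids an extra unnamed column on re-import.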

In this module, you will enhance your data wrangling skills using Python by learning techniques to clean, transform, and prepare data for analysis. You’ll work with real-world datasets to handle missing values, format and normalize data, bin numerical values, and convert categorical variables. Through guided labs, you’ll apply these skills to both the Laptop and Used Car Pricing datasets. You will also receive a cheat sheet to support you as a quick reference throughout the learning process.

What's included

6 videos, 1 reading, 2 assignments, 2 app items, 1 plugin
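
The wrangling steps named in this module (handling missing values, normalizing, binning, and converting categorical variables) might look roughly like the following in Pandas. The data and column names are made up for illustration, and the specific choices here (mean imputation, scaling by the maximum, three equal-width bins) are assumptions rather than the course's prescribed approach.

```python
import numpy as np
import pandas as pd

# Hypothetical used-car rows with one missing horsepower value.
cars = pd.DataFrame({
    "horsepower": [100.0, np.nan, 150.0, 200.0],
    "length": [160.0, 170.0, 180.0, 200.0],
    "fuel": ["gas", "diesel", "gas", "gas"],
})

# Handle missing values: replace NaN with the column mean.
cars["horsepower"] = cars["horsepower"].fillna(cars["horsepower"].mean())

# Normalize: simple feature scaling so values fall in (0, 1].
cars["length_scaled"] = cars["length"] / cars["length"].max()

# Bin: split horsepower into three equal-width, labeled groups.
cars["hp_bin"] = pd.cut(
    cars["horsepower"], bins=3, labels=["low", "medium", "high"]
)

# Convert the categorical column into indicator (dummy) columns.
cars = pd.get_dummies(cars, columns=["fuel"])

print(cars)
```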

In this module, you will build essential skills in exploratory data analysis (EDA) using Python. You will learn to perform computations on the data to calculate basic descriptive statistical information, such as mean, median, mode, and quartile values, and use that information to better understand the distribution of the data. You will learn how to group data to better visualize patterns, use the Pearson correlation method to compare two continuous numerical variables, and apply the chi-square test to assess associations between categorical variables and interpret the results. Further, you will be provided with a cheat sheet that will serve as a quick reference for commonly used EDA functions and methods.

What's included

5 videos, 1 reading, 2 assignments, 2 app items, 3 plugins
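
A minimal sketch of the EDA techniques listed in this module, using Pandas and SciPy on a tiny invented table: descriptive statistics, grouping, a Pearson correlation between two numeric columns, and a chi-square test on two categorical columns. The values are hypothetical, and a sample this small is far too sparse for a meaningful chi-square result; it only shows the calls.

```python
import pandas as pd
from scipy import stats

# Hypothetical car data; not taken from the course datasets.
df = pd.DataFrame({
    "drive": ["fwd", "fwd", "rwd", "rwd", "fwd", "rwd"],
    "fuel": ["gas", "gas", "gas", "diesel", "diesel", "gas"],
    "engine_size": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "price": [10.0, 22.0, 29.0, 41.0, 48.0, 62.0],
})

# Descriptive statistics: count, mean, std, min, quartiles, max.
summary = df["price"].describe()

# Grouping: average price per drive type.
mean_by_drive = df.groupby("drive")["price"].mean()

# Pearson correlation between two continuous variables.
r, r_pvalue = stats.pearsonr(df["engine_size"], df["price"])

# Chi-square test of association between two categorical variables.
contingency = pd.crosstab(df["drive"], df["fuel"])
chi2, chi2_pvalue, dof, expected = stats.chi2_contingency(contingency)

print(summary)
print(mean_by_drive)
print(f"Pearson r = {r:.3f}, chi-square dof = {dof}")
```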

In this module, you will explore the fundamentals of model development in data analysis using Python. You’ll learn how to build, visualize, and evaluate different types of regression models, including simple linear, multiple linear, and polynomial regression models, along with pipelines to streamline your workflows. You’ll also interpret model performance using key metrics and visual tools such as kernel density estimation (KDE) plots. Hands-on labs will reinforce your learning with practical datasets like used car and laptop pricing. Additionally, the cheat sheet will serve as a quick reference for building and evaluating predictive models.

What's included

6 videos, 1 reading, 2 assignments, 2 app items, 2 plugins
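
To make the contrast between simple linear and polynomial regression concrete, here is a small Scikit-learn sketch on synthetic data. The quadratic target, the pipeline step names, and the choice of degree 2 are all assumptions made for this example; the course labs use their own datasets and settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic, noise-free quadratic relationship: y = 2x^2 + 1.
x = np.linspace(-3, 3, 30).reshape(-1, 1)
y = 2.0 * x[:, 0] ** 2 + 1.0

# Simple linear regression cannot capture the curvature.
linear = LinearRegression().fit(x, y)
linear_r2 = r2_score(y, linear.predict(x))

# A pipeline chains scaling, polynomial feature expansion, and the model.
poly = Pipeline([
    ("scale", StandardScaler()),
    ("poly", PolynomialFeatures(degree=2, include_bias=False)),
    ("model", LinearRegression()),
])
poly.fit(x, y)
poly_pred = poly.predict(x)

# Evaluate in-sample fit with R^2 and mean squared error.
poly_r2 = r2_score(y, poly_pred)
poly_mse = mean_squared_error(y, poly_pred)

print(f"linear R^2 = {linear_r2:.3f}")
print(f"polynomial R^2 = {poly_r2:.3f}, MSE = {poly_mse:.2e}")
```

Because the metrics here are computed on the training data, they say nothing about generalization; held-out evaluation is a separate step.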

In this module, you will refine your predictive modeling skills by learning how to evaluate, tune, and select models for optimal performance. You’ll explore concepts such as overfitting, underfitting, and hyperparameter tuning using grid search. You will also learn how ridge regression uses regularization to reduce standard errors and help prevent a regression model from overfitting. Through hands-on labs, you’ll apply these techniques to real datasets to build robust, generalizable models. A cheat sheet is included to guide you in choosing the right tools and metrics for model optimization.

What's included

4 videos, 1 reading, 2 assignments, 2 app items, 2 plugins
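
One way the ideas in this module could fit together is shown below: a ridge regression whose regularization strength `alpha` is chosen by grid search with cross-validation, then checked on held-out data. The synthetic data, the candidate `alpha` values, and the 4-fold setting are assumptions for this sketch only.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic data: 5 features, two of which have no effect on the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 5))
true_coef = np.array([3.0, -2.0, 0.0, 0.0, 1.0])
y = X @ true_coef + rng.normal(scale=0.5, size=80)

# Hold out a test set so the final score reflects unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Grid search over the ridge penalty using 4-fold cross-validation.
param_grid = {"alpha": [0.01, 0.1, 1.0, 10.0, 100.0]}
search = GridSearchCV(Ridge(), param_grid, cv=4, scoring="r2")
search.fit(X_train, y_train)

best_alpha = search.best_params_["alpha"]
test_r2 = search.score(X_test, y_test)  # R^2 of the refit best model

print(f"best alpha = {best_alpha}, test R^2 = {test_r2:.3f}")
```

A large gap between the cross-validated score and the test score would be one sign of overfitting; a low score on both would point toward underfitting.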

In this final module, you will apply the complete data analysis workflow, from importing and cleaning data to building and evaluating models on real-world datasets. You’ll complete a hands-on practice project and a peer-reviewed final project based on datasets related to insurance costs and house pricing. For the final project, you will take on the role of a Data Analyst at a real estate investment trust looking to invest in residential properties. You’ll work with a dataset containing detailed information on house prices and various property features, and your task will be to analyze the data and predict housing market values. These projects are designed to consolidate your skills and prepare you for real-world data analysis challenges. Finally, you will demonstrate comprehension and application of key data analysis concepts through a final exam.

What's included

5 readings, 1 assignment, 1 peer review, 2 app items, 1 plugin

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructor

Instructor ratings
4.6 (3,242 ratings)
Joseph Santarcangelo
IBM
36 courses, 2,193,412 learners

Offered by

IBM

Learner reviews

4.7

19,407 reviews

  • 5 stars

    76.37%

  • 4 stars

    18.23%

  • 3 stars

    3.65%

  • 2 stars

    0.93%

  • 1 star

    0.80%

¹ Some assignments in this course are AI-graded. For these assignments, your data will be used in accordance with Coursera's Privacy Notice.