Cleaning, Reshaping, and Expanding Datasets in Python

Offered By
Coursera Project Network
In this Guided Project, you will:

Clean datasets by dropping features that are extraneous, have low variance, or hold a single value, and by removing outliers

Reshape data in Python by reordering, combining, splitting, stacking, expanding, or squeezing dimensions

Implement Feature Scaling and Normalization
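As a rough illustration of the cleaning and scaling goals above, here is a minimal sketch using pandas. The toy data, the column names, and the choice of the 1.5 * IQR rule for outliers are assumptions made for this example and are not taken from the course materials. A single-valued column is the zero-variance case; a threshold on the column variances would extend the same idea to low-variance features.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0, 900.0],  # 900 is an outlier
    "weight_kg": [50.0, 60.0, 70.0, 80.0, 90.0, 75.0],
    "country": ["US"] * 6,        # single value: carries no information
    "row_note": list("abcdef"),   # extraneous free-text column
})

# 1. Drop the extraneous column, then any column with only one distinct value.
cleaned = df.drop(columns=["row_note"])
single_valued = [col for col in cleaned.columns if cleaned[col].nunique() <= 1]
cleaned = cleaned.drop(columns=single_valued)

# 2. Remove outliers with the 1.5 * IQR rule (one common convention, assumed here).
q1 = cleaned.quantile(0.25)
q3 = cleaned.quantile(0.75)
iqr = q3 - q1
in_range = ((cleaned >= q1 - 1.5 * iqr) & (cleaned <= q3 + 1.5 * iqr)).all(axis=1)
cleaned = cleaned[in_range]

# 3. Feature scaling: min-max scaling maps each column onto [0, 1].
minmax = (cleaned - cleaned.min()) / (cleaned.max() - cleaned.min())

# 4. Standardization: each column gets zero mean and unit variance.
standardized = (cleaned - cleaned.mean()) / cleaned.std(ddof=0)

print(cleaned)
print(minmax)
print(standardized)
```

Min-max scaling keeps the original shape of each distribution within fixed bounds, while standardization is usually preferred when a later model assumes roughly centered inputs.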

2 hours
Intermediate
No download needed
Split-screen video
English
Desktop only

It has been said that obtaining and cleaning data constitutes 80% of a data scientist's job. Whether we are correcting or replacing missing data, removing duplicate entries, or dealing with outliers, our datasets always require some level of cleaning and reshaping, and doing this work carefully improves the accuracy of our results. In this 2-hour project-based course, we will examine a variety of methods to clean and reshape any dataset. Note: This course works best for learners who are based in the North America region. We're currently working on providing the same experience in other regions.
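The missing-data and duplicate-entry cases mentioned above can be sketched in a few lines of pandas. This is only an illustrative example: the DataFrame, its column names, and the decision to fill numeric gaps with the median are assumptions, not the course's prescribed approach.

```python
import numpy as np
import pandas as pd

# Hypothetical records, invented for illustration only.
df = pd.DataFrame({
    "id": [1, 2, 2, 3, 4],
    "score": [10.0, 20.0, 20.0, np.nan, 50.0],
    "city": ["Austin", "Boston", "Boston", "Chicago", None],
})

# Count missing values per column to see where the gaps are.
missing_per_column = df.isna().sum()

# Remove rows that are exact duplicates of an earlier row.
deduped = df.drop_duplicates()

# Replace missing numeric values with the column median (one possible choice).
filled = deduped.assign(score=deduped["score"].fillna(deduped["score"].median()))

# Drop rows where a required text field is still missing.
complete = filled.dropna(subset=["city"])

print(missing_per_column)
print(complete)
```

Whether to fill or drop a missing value depends on the column: filling preserves rows at the cost of inventing a plausible value, while dropping keeps only observed data.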

Skills you will develop

Data Preparation, Data Pre-Processing, Data Cleansing, Data Reshaping

Learn step-by-step

In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:

  1. Inspect and Diagnose a Dataset

  2. Dealing with Missing and Extraneous Data

  3. Reshaping, Scaling, and Normalizing Datasets

  4. Merging Datasets

  5. Joining and Concatenating Datasets
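To give a feel for the reshaping, merging, joining, and concatenating steps listed above, the following sketch uses NumPy and pandas. The arrays, tables, and column names are hypothetical and chosen only for this example; the course itself may use different data and functions.

```python
import numpy as np
import pandas as pd

# --- Reshaping dimensions with NumPy ---
arr = np.arange(6)                        # shape (6,)
grid = arr.reshape(2, 3)                  # reorder into 2 rows by 3 columns
expanded = np.expand_dims(grid, axis=0)   # add a leading axis: shape (1, 2, 3)
squeezed = np.squeeze(expanded)           # drop length-1 axes: back to (2, 3)
stacked = np.vstack([grid, grid])         # stack row-wise: shape (4, 3)
first_half, second_half = np.split(arr, 2)  # split into two arrays of shape (3,)

# --- Merging on a shared key column ---
# Hypothetical tables, invented for illustration only.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"customer_id": [1, 1, 3, 4], "amount": [20.0, 35.0, 12.5, 99.0]})

# Inner merge keeps only customer_ids present in both tables.
inner = customers.merge(orders, on="customer_id", how="inner")
# Left merge keeps every customer; customers without orders get NaN amounts.
left = customers.merge(orders, on="customer_id", how="left")

# --- Joining on the index ---
regions = pd.DataFrame({"region": ["east", "west", "east"]}, index=[1, 2, 3])
joined = customers.set_index("customer_id").join(regions)

# --- Concatenating rows from tables with the same columns ---
more_customers = pd.DataFrame({"customer_id": [5], "name": ["Di"]})
all_customers = pd.concat([customers, more_customers], ignore_index=True)

print(inner)
print(left)
print(joined)
print(all_customers)
```

In pandas, `merge` matches rows on column values, `join` aligns on the index by default, and `concat` simply stacks tables along an axis without matching keys.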

How Guided Projects work

Your workspace is a cloud desktop right in your browser, no download required

In a split-screen video, your instructor guides you step-by-step

Frequently asked questions


More questions? Visit the Learner Help Center.