When you enroll in this course, you'll also be enrolled in this Professional Certificate.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate from Microsoft
There are 5 modules in this course
Transform raw data into valuable insights using R's powerful tidyverse tools. This beginner-friendly course introduces you to essential data cleaning and manipulation techniques, making complex data tasks approachable and practical. Learn how to clean messy data, handle missing values, and prepare datasets for analysis using Visual Studio Code labs and Copilot AI assistance.
Through hands-on practice, you'll master fundamental data cleaning skills while building confidence in:
- Organizing and structuring data effectively
- Handling common data issues
- Working with different data formats
- Using AI tools to enhance your workflow
- Creating reproducible data cleaning processes
Each concept is taught step by step with extensive examples and guided practice, ensuring you build a strong foundation in data manipulation skills.
In this module, you'll get hands-on experience with dplyr, the powerhouse package for data manipulation in R. We'll work with real retail sales data as you learn to filter, arrange, and transform your data with ease. By the end of this module, you'll be confidently writing clean, efficient code using the pipe operator and essential dplyr functions that professional data analysts use daily.
What's included
4 videos, 9 readings, 2 assignments, 3 ungraded labs
4 videos•Total 24 minutes
Welcome to Data Manipulation and Cleaning in R•2 minutes
Understanding Tidy Data•3 minutes
Hands-on with dplyr Basics•10 minutes
Data Transformation in Action•9 minutes
9 readings•Total 185 minutes
Course Navigation and Requirements•30 minutes
Creating Your GitHub Folder for Course 2•30 minutes
Getting Started with dplyr•15 minutes
Essential dplyr Functions Guide•15 minutes
Using R in your Visual Studio Code Lab•15 minutes
Connecting Copilot in your Visual Studio Code Labs•10 minutes
Understanding Data Transformation•30 minutes
Advanced Data Transformation Techniques•30 minutes
Module Overview•10 minutes
2 assignments•Total 60 minutes
dplyr Fundamentals Assessment•30 minutes
Data Transformation Post Lab Assessment•30 minutes
3 ungraded labs•Total 180 minutes
Hands-On Activity: Practice with Retail Data•60 minutes
Hands-On Activity: Data Transformation•60 minutes
Hands-On Activity: Advanced Data Transformation Practice•60 minutes
Reshaping Data with tidyr
Module 2•4 hours to complete
Data rarely comes in the format we need, and that's exactly what we'll tackle in this module. Using tidyr, you'll learn to reshape data like a pro, converting between wide and long formats and handling complex data structures. Through practical exercises with regional sales data, you'll master the tools needed to transform messy data into clean, analysis-ready formats.
What's included
3 videos, 4 readings, 2 assignments, 2 ungraded labs
3 videos•Total 17 minutes
Understanding Data Formats•3 minutes
Reshaping Data with tidyr•7 minutes
Implementing Column Operations•7 minutes
4 readings•Total 70 minutes
Wide and Long Data Formats Guide•15 minutes
Introduction to Column Operations•15 minutes
Advanced Column Manipulation•30 minutes
Module Overview•10 minutes
2 assignments•Total 60 minutes
Data Format Fundamentals•30 minutes
Column Operations Assessment•30 minutes
2 ungraded labs•Total 120 minutes
Data Format Transformation Practice•60 minutes
Column Operations Practice•60 minutes
String Manipulation with stringr
Module 3•6 hours to complete
Text data can be particularly challenging. In this module, you'll work with stringr to clean and standardize text data effectively. Using real product descriptions and customer data, you'll learn pattern matching and advanced string manipulation techniques that make text data cleaning a breeze. You'll see how combining stringr with dplyr creates robust solutions for complex data cleaning challenges.
What's included
2 videos, 7 readings, 2 assignments, 3 ungraded labs
2 videos•Total 19 minutes
String Functions in Action•9 minutes
Advanced String Functions Demo•10 minutes
7 readings•Total 130 minutes
Getting Started with String Manipulation•30 minutes
Regular Expressions Guide•15 minutes
Quantifiers and Patterns in Regex•20 minutes
Advanced String Operations Overview•30 minutes
Normalizing and Transforming Strings•10 minutes
Integration with dplyr•15 minutes
Module Overview•10 minutes
2 assignments•Total 60 minutes
String Manipulation Basics•30 minutes
Advanced String Operations Post Lab Assessment•30 minutes
3 ungraded labs•Total 180 minutes
Hands-On Activity: String Functions in Action•60 minutes
Advanced String Operations Practice•60 minutes
String Manipulation Practice•60 minutes
Handling Missing Values and Duplicates
Module 4•8 hours to complete
In this module, you'll learn approaches to handling missing values, outliers, and duplicates. Working with actual order and inventory data, you'll develop strategies for maintaining data quality. You'll discover how modern AI tools can help automate your cleaning processes, making your work more efficient and consistent.
What's included
4 videos, 8 readings, 3 assignments, 3 ungraded labs
4 videos•Total 26 minutes
Impact of Missing Data•3 minutes
Understanding Missing Values in R•7 minutes
Outlier and Duplicate Detection Demo•7 minutes
AI-Assisted Data Cleaning Demo•9 minutes
8 readings•Total 185 minutes
Missing Value Handling Guide•30 minutes
Advanced Imputation Methods for Comprehensive Data Analysis (Optional)•10 minutes
Missing Value Detection and Treatment•10 minutes
Understanding Outliers and Duplicates•30 minutes
Statistical Methods Guide•30 minutes
Introduction to AI in Data Cleaning•30 minutes
AI Tool Best Practices•30 minutes
Module Summary•15 minutes
3 assignments•Total 90 minutes
Missing Value Concepts•30 minutes
Outlier and Duplicate Handling•30 minutes
AI Tools Assessment•30 minutes
3 ungraded labs•Total 180 minutes
Missing Value Practice•60 minutes
Outlier and Duplicate Detection Practice•60 minutes
AI-Assisted Data Cleaning Practice•60 minutes
Final Project
Module 5•6 hours to complete
The comprehensive project simulates a real-world data cleaning scenario where you'll act as a data specialist tasked with standardizing a critical organizational dataset. You'll apply all the key skills learned throughout the course in a structured, step-by-step approach.
What's included
5 readings, 1 programming assignment, 2 ungraded labs
5 readings•Total 110 minutes
Project Overview•30 minutes
Sample project: Project Implementation•10 minutes
[Solution Reference] Final Project•10 minutes
Module Overview•30 minutes
Course Overview and next steps•30 minutes
1 programming assignment•Total 120 minutes
Final Project Implementation•120 minutes
2 ungraded labs•Total 105 minutes
Sample Project 1 - Practice•60 minutes
Sample Project 2 - Practice•45 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Our goal at Microsoft is to empower every individual and organization on the planet to achieve more.
In this next revolution of digital transformation, growth is being driven by technology. Our integrated cloud approach creates an unmatched platform for digital transformation. We address the real-world needs of customers by seamlessly integrating Microsoft 365, Dynamics 365, LinkedIn, GitHub, Microsoft Power Platform, and Azure to unlock business value for every organization—from large enterprises to family-run businesses. The backbone and foundation of this is Azure.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. It also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Certificate?
When you enroll in the course, you get access to all of the courses in the Certificate, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile.