This 90-minute guided project, "PySpark for Data Science: Customer Churn Prediction," teaches you how to use PySpark to build a machine learning model that predicts customer churn at a telecommunications company. It covers data loading, exploratory data analysis, data preprocessing, feature preparation, model training, evaluation, and deployment, all in PySpark. You will use the model to identify the factors that contribute to customer churn, giving the company actionable insights for reducing churn and increasing customer retention. Along the way, you'll gain hands-on experience with each step required to create a machine learning model in PySpark. Prerequisites are basic knowledge of machine learning and decision trees, plus familiarity with Python programming concepts such as loops, if statements, and lists.



Machine Learning with PySpark: Customer Churn Analysis

Instructor: Ahmad Varasteh
2,886 already enrolled
What you'll learn
- Use an AI-driven solution to solve a business problem
- Build a machine learning model with PySpark
- Apply data-cleansing techniques using PySpark

About this Guided Project
Learn step-by-step
In a video that plays in a split-screen with your work area, your instructor will walk you through these steps:
- Set up the project environment (11 min) 
- Exploratory Data Analysis Part I - Numerical Columns (10 min) 
- Exploratory Data Analysis Part II - Categorical Columns (10 min) 
- Preprocess and clean data (7 min) 
- Demonstrate your understanding of Data Exploration and Preprocessing (5 min) 
- Prepare the input data for your model Part I - Numerical Features (6 min) 
- Prepare the input data for your model Part II - Categorical Features (10 min) 
- Train your decision tree (9 min) 
- Evaluate your model (11 min) 
- Deploy your model (6 min) 
- Challenge Activity: Employee Attrition Prediction (6 min) 
Recommended experience
Basic knowledge of machine learning and decision trees, and of the Python programming language (basic concepts such as loops, if statements, and lists)
How you'll learn
- Skill-based, hands-on learning - Practice new skills by completing job-related tasks. 
- Expert guidance - Follow along with pre-recorded videos from experts using a unique side-by-side interface. 
- No downloads or installation required - Access the tools and resources you need in a pre-configured cloud workspace. 
- Available only on desktop - This Guided Project is designed for laptops or desktop computers with a reliable Internet connection, not mobile devices. 
Learner reviews
23 reviews
- 5 stars: 65.21%
- 4 stars: 30.43%
- 3 stars: 4.34%
- 2 stars: 0%
- 1 star: 0%
Reviewed on Jan 25, 2025
Good overview to understand basic codes for pyspark queries.
Reviewed on Feb 5, 2025
Excellent course covering basic ML methods and using intermediate techniques to solve ML related problems with PySpark
Reviewed on Jun 28, 2023
Explanation is very clear and easy to understand. Well structured. Many thanks.