Dartmouth College

Predictive Analytics

Instructors: Reed H. Harder, Vikrant S. Vaze

Gain insight into a topic and learn the fundamentals.
Intermediate level

5 weeks to complete
at 10 hours a week
Flexible schedule
Learn at your own pace

Build your subject-matter expertise

This course is part of the Data Analytics for Digital Transformation Specialization
When you enroll in this course, you'll also be enrolled in this Specialization.
  • Learn new concepts from industry experts
  • Gain a foundational understanding of a subject or tool
  • Develop job-relevant skills with hands-on projects
  • Earn a shareable career certificate

There are 9 modules in this course

Welcome to Predictive Analytics for Digital Transformation. This course is part of the Digital Transformation for Data Analytics Certificate. It is designed to equip you with the tools and knowledge to transform raw data into actionable insights. Whether you want to enhance organizational efficiency, improve customer experiences, or innovate within your field, this course provides the foundational skills to leverage predictive analytics effectively. Throughout this course, you will explore the theoretical underpinnings and practical applications of predictive analytics, starting with linear and logistic regression and advancing to more complex models and techniques. Using Python and cloud-based tools, you'll gain hands-on experience in building, training, and evaluating models that solve real-world business challenges. Topics include diagnosing model performance issues like overfitting and underfitting, selecting appropriate features, and working with skewed datasets. You’ll also explore advanced modeling techniques and cross-validation methods to ensure your models are generalizable and robust. Guided by Drs. Vikrant Vaze and Reed Harder, you’ll complete practical activities, reflection exercises, and case-based projects designed to simulate real-world scenarios. Along the way, you’ll learn to integrate analytics into digital transformation initiatives, empowering you to lead data-driven innovations in your industry. Whether you're a seasoned professional or new to the field, this course will challenge you to think critically, code effectively, and apply your skills to meaningful, data-centric problems.

What's included

2 videos, 10 readings, 2 assignments, 2 ungraded labs

You may have heard the analogies “Data is the new oil” and “Analytics is the combustion engine.” What is meant by these comparisons? In the digital transformation era, traditional companies seek to gather, refine, and mathematically study all kinds of available information, from customer demographics to operational metrics, to reimagine business models and processes for the 21st century. Indeed, quality data is the fuel that drives organizational decision-making! In this module, you will get hands-on practice with two key predictive analytics tools, regression and classification, and learn how to create mathematical models representative of business situations. We’ll conclude with instructions on implementing the models in code using Scikit-learn, a common Python library for machine learning.

What's included

2 videos, 5 readings, 2 assignments, 2 ungraded labs

In this unit, we build our first predictive model using linear regression, a fundamental and powerful method in supervised learning. To illustrate its application, we return to our airfare prediction example: an airline collects historical data to predict the average airfare for a new route based on its distance. We aim to determine the line that best fits the data, minimizing the difference between predicted and actual fares. This process introduces the model training objective, where we optimize parameters (e.g., slope and intercept) by minimizing an error function. We'll explore how the gradient descent algorithm, a versatile and iterative optimization method, achieves this. Linear regression is a cornerstone of digital transformation, enabling organizations to derive actionable insights from data. For example, the healthcare industry can predict patient outcomes based on variables like age, medical history, and treatment options, driving more personalized care. Similarly, businesses can forecast sales in retail based on historical purchasing trends, inventory levels, and seasonal factors, enabling smarter supply chain management. Industries are transforming operations, decision-making, and customer experiences by integrating models like this.
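
As a rough illustration of the training objective described above, the following Python sketch fits a line to airfare data with gradient descent, repeatedly nudging the slope and intercept in the direction that lowers the mean squared error. The data values, the function name `fit_line_gradient_descent`, the learning rate, and the iteration count are all assumptions made for this example and are not taken from the course materials.

```python
def fit_line_gradient_descent(x, y, learning_rate=0.1, n_iters=5000):
    """Fit y ~ intercept + slope * x by minimizing mean squared error."""
    n = len(x)
    slope, intercept = 0.0, 0.0
    for _ in range(n_iters):
        # Prediction error for every training example under the current line
        errors = [intercept + slope * xi - yi for xi, yi in zip(x, y)]
        # Partial derivatives of the mean squared error
        grad_intercept = (2.0 / n) * sum(errors)
        grad_slope = (2.0 / n) * sum(e * xi for e, xi in zip(errors, x))
        # Step against the gradient
        intercept -= learning_rate * grad_intercept
        slope -= learning_rate * grad_slope
    return slope, intercept


# Hypothetical data: distance in thousands of miles, fare in hundreds of dollars.
# The points lie exactly on fare = 1.0 + 1.0 * distance.
distances = [0.5, 1.0, 1.5, 2.0, 2.5]
fares = [1.5, 2.0, 2.5, 3.0, 3.5]

slope, intercept = fit_line_gradient_descent(distances, fares)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

Because these made-up points sit exactly on a line, the fitted slope and intercept both approach 1.0. With real, noisy fares the same loop would settle on the line with the smallest average squared error instead.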

What's included

3 videos, 6 readings, 2 assignments, 4 ungraded labs

In this unit, we deepen our understanding of predictive analytics by exploring more complex models and concepts that enhance decision-making through digital transformation. Building on our foundation in linear regression, we will expand into multivariate linear regression, where multiple features contribute to predictions, reflecting the multifaceted nature of real-world data. We will also introduce classification models, which predict discrete outcomes rather than continuous ones. Using practical examples like hospital readmission prediction, we’ll see how these models can address critical questions such as whether a patient is likely to be readmitted. Additional scenarios, such as predicting flight delays, customer behavior, or tumor diagnoses, will further demonstrate the power of classification. To effectively build and refine these models, we will introduce three essential concepts: feature transformation, feature selection, and overfitting. These techniques help answer pivotal questions about which features to include, how to transform data for optimal results, and how to avoid overly complex models that fail to generalize. We will understand the trade-offs and risks in creating robust supervised learning models by applying these ideas to previously explored examples, such as airfare prediction and hospital readmissions.
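
To make the idea of predicting a discrete outcome from several features more concrete, here is a minimal logistic regression sketch in plain Python, trained with gradient descent. The two features (prior admissions and length of stay), the data values, and the helper names `fit_logistic` and `predict_proba` are hypothetical and chosen only for illustration. The course itself may present the model differently.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, row):
    """Estimated probability that the label is 1 for one example."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, row)))


def fit_logistic(X, y, learning_rate=0.1, n_iters=2000):
    """Fit a logistic regression by gradient descent on the average log loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(n_iters):
        errors = [predict_proba(weights, bias, row) - yi for row, yi in zip(X, y)]
        # Gradient of the average log loss for each weight and for the bias
        grads = [sum(e * row[j] for e, row in zip(errors, X)) / n for j in range(d)]
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
        bias -= learning_rate * sum(errors) / n
    return weights, bias


# Hypothetical patients: [prior admissions, length of stay in weeks]
X = [[0, 0.5], [1, 0.3], [0, 1.0], [2, 1.5], [3, 1.0], [4, 2.0], [1, 2.0], [3, 0.4]]
y = [0, 0, 0, 1, 1, 1, 0, 1]  # 1 = readmitted (made-up labels)

weights, bias = fit_logistic(X, y)
print(predict_proba(weights, bias, [4, 2.0]))  # patient with many prior admissions
print(predict_proba(weights, bias, [0, 0.5]))  # patient with none
```

The output is a probability, so turning it into a yes/no prediction requires a decision threshold, commonly 0.5.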

What's included

4 videos, 7 readings, 2 assignments, 4 ungraded labs

In this unit, we bridge the gap between foundational predictive analytics and practical implementation in modern digital transformation contexts. We begin by exploring Predictive Analytics in Python, where we leverage Python’s powerful libraries to build, train, and evaluate regression and classification models. Through hands-on exercises, you’ll learn how to process data, apply linear and logistic regression, and visualize results effectively. Next, we extend our focus to Linear Regression on the Cloud, demonstrating how cloud platforms enable scalable, efficient training of regression models on large datasets. You’ll gain practical experience in using cloud-based tools and services to handle real-world data challenges, such as forecasting trends and optimizing resource allocation. We also delve into Logistic Regression on the Cloud, emphasizing its applications in predicting discrete outcomes. By hosting and training logistic regression models in a cloud environment, we unlock the ability to process high-volume, real-time data, essential for tasks like customer behavior prediction, fraud detection, and healthcare analytics. Throughout the unit, we’ll highlight the role of predictive analytics in digital transformation, showing how cloud computing and Python empower organizations to make data-driven decisions.

What's included

2 videos, 3 readings, 2 assignments, 3 ungraded labs

Now that you are able to translate various business situations into predictive analytics models, the next challenge is to choose which model will best perform for the task at hand. Model choices may vary depending on the nature of your project, such as the requirements and constraints of the stakeholders, the time and resources available, and the availability of data. In this module, we will introduce more advanced modeling techniques which will aid the effective use of different kinds of datasets, and allow you to evaluate and improve your models in a way that incorporates the risk and uncertainty that is inherent in any real-world situation. Hands-on practice in Python to implement these advanced models will enhance your coding skills.

What's included

3 videos, 6 readings, 2 assignments, 3 ungraded labs

In this unit, we bring together the key concepts and techniques of predictive analytics to build robust, generalizable models that address real-world challenges. We start by diagnosing two critical issues—overfitting and underfitting—which can significantly impact a model’s performance. Using diagnostic tools, we will explore how to systematically identify and mitigate these problems to enhance model accuracy and reliability. Next, we introduce cross-validation, a powerful method to ensure models perform well on unseen data. By dividing data into training, validation, and test sets, we’ll learn how to make informed decisions about features, model complexity, and regularization parameters. This approach ensures that the predictive analytics models we develop are optimized for generalizability, a key requirement for leveraging digital transformation technologies effectively. We also tackle the challenges posed by skewed datasets, especially in classification problems with binary labels. Through practical examples, we’ll understand why standard metrics like misclassification error may fall short in such scenarios. To address this, we’ll introduce more nuanced evaluation metrics—precision, recall, and F-score—and demonstrate how to balance these measures by adjusting decision thresholds. By the end of this unit, you’ll have a comprehensive understanding of how to diagnose and refine predictive analytics models.
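
The effect of a skewed dataset on evaluation can be sketched in a few lines of Python. The example below computes precision, recall, and F-score (here the F1 variant) at two decision thresholds. The labels, the model scores, and the function name `precision_recall_f1` are invented for illustration only.

```python
def precision_recall_f1(y_true, scores, threshold):
    """Compute precision, recall, and F1 when predicting 1 for score >= threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, preds) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, preds) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical skewed sample: only 2 of 10 examples are positive
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.45, 0.60, 0.55, 0.90]

# A model that always predicts 0 is right 80% of the time yet finds no positives
baseline_accuracy = y_true.count(0) / len(y_true)
print("always-negative accuracy:", baseline_accuracy)

for threshold in (0.5, 0.7):
    p, r, f = precision_recall_f1(y_true, scores, threshold)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}, F1={f:.2f}")
```

On this made-up sample, raising the threshold from 0.5 to 0.7 lifts precision from about 0.67 to 1.0 while recall drops from 1.0 to 0.5, which is the trade-off that threshold adjustment controls.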

What's included

3 videos, 6 readings, 2 assignments, 4 ungraded labs

In this unit, we take a significant step forward in predictive analytics by exploring the application of neural networks for both regression and classification tasks. Neural networks offer powerful capabilities for capturing complex patterns in data, making them an essential tool for digital transformation across industries. By leveraging the scalability and efficiency of cloud platforms, we will learn to build, train, and evaluate neural network models capable of addressing real-world challenges. We begin by revisiting familiar datasets, such as expanded versions of the readmission dataset for classification and the market size dataset for regression. Through these examples, we’ll explore how neural networks handle continuous and discrete predictions, enabling us to address diverse business problems. To optimize model performance, we’ll incorporate techniques like cross-validation to fine-tune hyperparameters, such as regularization, and understand how these choices impact the trade-off between underfitting and overfitting. The unit also introduces advanced diagnostics to evaluate model performance, using metrics such as mean absolute error and mean squared error for regression, and confusion matrices for classification tasks with skewed datasets. These metrics help us refine our models and ensure their robustness. Additionally, we will tackle common data preparation challenges, such as handling missing or erroneous data, merging datasets, and transforming categorical variables into usable formats. Finally, we delve into practical examples, such as predicting flight delays, to illustrate the end-to-end workflow of cleaning, processing, and modeling data with neural networks. By iterating through models of varying complexity—linear, quadratic, and cubic—we’ll identify how to balance complexity and generalizability to avoid overfitting. 
By the end of this unit, you will have a comprehensive understanding of how to implement and evaluate neural networks in cloud environments, preparing you to harness their full potential for data-driven decision-making.
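
The comparison of linear, quadratic, and cubic models mentioned above can be sketched without a neural network. The example below substitutes ordinary least-squares polynomial fitting in plain Python, and scores each degree with mean squared error and mean absolute error on a held-out validation set. All data values and helper names (`fit_polynomial`, `solve_linear_system`, and so on) are assumptions for illustration and do not come from the course labs.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_polynomial(x, y, degree):
    """Least-squares fit via the normal equations; returns c0..c_degree."""
    powers = [[xi ** p for p in range(degree + 1)] for xi in x]
    size = degree + 1
    A = [[sum(row[i] * row[j] for row in powers) for j in range(size)]
         for i in range(size)]
    b = [sum(row[i] * yi for row, yi in zip(powers, y)) for i in range(size)]
    return solve_linear_system(A, b)


def predict(coeffs, x):
    return [sum(c * xi ** p for p, c in enumerate(coeffs)) for xi in x]


def mse(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical training and validation data
x_train = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
y_train = [1.1, 1.0, 1.3, 1.9, 2.4, 3.1]
x_val = [0.1, 0.5, 0.9]
y_val = [1.0, 1.6, 2.8]

results = {}
for degree in (1, 2, 3):  # linear, quadratic, cubic
    coeffs = fit_polynomial(x_train, y_train, degree)
    results[degree] = {
        "train_mse": mse(y_train, predict(coeffs, x_train)),
        "val_mse": mse(y_val, predict(coeffs, x_val)),
        "val_mae": mae(y_val, predict(coeffs, x_val)),
    }
    print(degree, results[degree])
```

Training error can only stay the same or fall as the degree rises, so it is the validation error that signals when added complexity has stopped helping.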

What's included

2 videos, 2 readings, 2 assignments, 3 ungraded labs

The final unit of this course is a practicum that serves as a mini-capstone project, allowing you to consolidate your learning and demonstrate mastery of the tools and techniques introduced throughout the course. This project is your opportunity to apply predictive analytics, cloud-based tools, and data science methodologies to a practical business problem, providing actionable insights that align with digital transformation initiatives. You will select a dataset and problem of interest, either from your own professional or academic context or from one of the structured scenarios provided. Using your analytics toolbox, you will explore, analyze, and develop a data-driven solution to inform strategic and operational decisions. This hands-on project will challenge you to:
  • Frame a business problem in terms of predictive analytics.
  • Develop and evaluate models, leveraging tools like Scikit-learn, neural networks, or optimization techniques.
  • Diagnose model performance, validate results, and provide implementable recommendations.
  • Translate your findings into a technical report with a comprehensive executive summary tailored to stakeholders.

What's included

4 readings, 2 assignments, 2 ungraded labs

Earn a career certificate

Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.

Instructors

Reed H. Harder
Dartmouth College
6 Courses, 1,517 learners
Vikrant S. Vaze
Dartmouth College
5 Courses, 2,134 learners
