AI Workflow: Data Analysis and Hypothesis Testing

This is the second course in the IBM AI Enterprise Workflow Certification specialization. You are STRONGLY encouraged to complete these courses in order, as they are not independent courses but parts of a workflow in which each course builds on the previous ones.

In this course you will begin your work for a hypothetical streaming media company by doing exploratory data analysis (EDA). Best practices for data visualization, handling missing data, and hypothesis testing will be introduced as part of your work. You will learn techniques of estimation with probability distributions and how to extend these estimates to apply null hypothesis significance tests. You will apply what you learn through two hands-on case studies: data visualization and multiple testing using a simple pipeline.

By the end of this course you should be able to:
1. List several best practices concerning EDA and data visualization
2. Create a simple dashboard in Watson Studio
3. Describe strategies for dealing with missing data
4. Explain the difference between imputation and multiple imputation
5. Employ common distributions to answer questions about event probabilities
6. Explain the investigative role of hypothesis testing in EDA
7. Apply several methods for dealing with multiple testing

Who should take this course?
This course targets existing data science practitioners who have expertise in building machine learning models and want to deepen their skills in building and deploying AI in large enterprises. If you are an aspiring Data Scientist, this course is NOT for you, as you need real-world expertise to benefit from the content of these courses.

What skills should you have?
It is assumed that you have completed Course 1 of the IBM AI Enterprise Workflow specialization and have a solid understanding of the following topics prior to starting this course: a fundamental understanding of linear algebra; an understanding of sampling, probability theory, and probability distributions; knowledge of descriptive and inferential statistical concepts; a general understanding of machine learning techniques and best practices; a practiced understanding of Python and the packages commonly used in data science (NumPy, Pandas, matplotlib, scikit-learn); familiarity with IBM Watson Studio; and familiarity with the design thinking process.

Status: Statistical Analysis

Status: Pandas (Python Package)

Advanced · Course · 11 hours

Featured reviews

5.0 · Reviewed Apr 2, 2020

There should be more practical work and assignments, which would be more helpful for learners.

5.0 · Reviewed Jul 6, 2020

Very informative, and the labs for the hands-on sessions were useful.

5.0 · Reviewed Apr 15, 2026

This course provided a clear and structured introduction to data analysis and hypothesis testing within an AI workflow.

All reviews

Showing: 19 of 19

Olivier Roncalez

1.0

Reviewed May 6, 2020

Quizzes mark you as correct even if you're not, the answer keys are missing from notebooks, the material briefly glosses over important concepts with no depth at all. Were these issues addressed, this course would be excellent, but it sorely lacks because of it.

Jonathan Venezia

1.0

Reviewed May 27, 2020

Instructors are completely absent and ignore questions from students, vital course materials are missing, and there are typos everywhere. This series of courses from IBM has been terrible and is of much lower quality than other e-learning offerings.

Pralay Maity

5.0

Reviewed Apr 3, 2020

There should be more practical work and assignments, which would be more helpful for learners.

Mahjube Chavoshi

3.0

Reviewed May 18, 2020

Most of the content is in text format.

Youssef Nasri (ADNOC Refining - Site Integration)

5.0

Reviewed Apr 16, 2026

This course provided a clear and structured introduction to data analysis and hypothesis testing within an AI workflow.

Rangarajan me16s058

5.0

Reviewed Jul 7, 2020

Very informative, and the labs for the hands-on sessions were useful.

PERAM MAHENDRA REDDY

5.0

Reviewed Aug 22, 2024

Taking this course helped me a lot, thank you.

Vaibhav Kumar

5.0

Reviewed Sep 12, 2022

Excellent course

Rafail Mahammadli

5.0

Reviewed Oct 5, 2020

Great

Théophile Pace

4.0

Reviewed Apr 29, 2021

The part on the EDA is very insightful, I learned a lot about data manipulation in Pandas. The hypothesis testing was very short so I didn't expect much. Still interesting to know about multiple hypothesis testing. The part with IBM Watson is very out of context, I don't think it makes sense in the context of this class.

Thanks for the good quality material otherwise.

SALVADOR LINARES MORCILLO

4.0

Reviewed Sep 15, 2020

It is necessary to read the references given in each topic, because the topic content alone makes the material hard to understand.

Shoaib Qureshi

4.0

Reviewed Dec 13, 2020

Very detailed course.

Dang Ha Gia Huy

4.0

Reviewed Apr 17, 2025

Very useful for me.

BHAVANA gubbi

3.0

Reviewed Aug 30, 2020

This course is more helpful for math geeks, as most of the discussion in the 2nd week is completely oriented around maths. It is tough to follow the 2nd week module for someone like me who doesn't have a sound math background.

Brunello Bonanni

3.0

Reviewed Apr 24, 2021

The Watson Studio information is unclear or out of date.

I had to work locally!!

Would you please fix these points as soon as possible so that students can use the content on the cloud!

Pertti Viitamaki

3.0

Reviewed Aug 13, 2020

The last exercise would need some more explanation.

There are SO MANY misspellings in the texts by the way...

SUPARNA CHATTERJEE