Learn Data Science Fundamentals
Drive real world impact with a four-course introduction to data science.
About This Specialization
Follow the suggested order or choose your own.
Designed to help you practice and apply the skills you learn.
Highlight your new skills on your resume or LinkedIn.
- Beginner Specialization.
- No prior experience required.
Data Management and VisualizationCurrent session: Jun 26 — Jul 31.
- 4 weeks of study, 4-5 hours/week
About the CourseWhether being used to customize advertising to millions of website visitors or streamline inventory ordering at a small restaurant, data is becoming more integral to success. Too often, we’re not sure how use data to find answers to the questions that will make us more successful in what we do. In this course, you will discover what data is and think about what questions you have that can be answered by the data – even if you’ve never thought about data before. Based on existing data, you will learn to develop a research question, describe the variables and their relationships, calculate basic statistics, and present your results clearly. By the end of the course, you will be able to use powerful data analysis tools – either SAS or Python – to manage and visualize your data, including how to deal with missing data, variable groups, and graphs. Throughout the course, you will share your progress with others to gain valuable feedback, while also learning how your peers use data to answer their own questions.
Data Analysis ToolsCurrent session: Jun 26 — Jul 31.
About the CourseIn this course, you will develop and test hypotheses about your data. You will learn a variety of statistical tests, as well as strategies to know how to apply the appropriate one to your specific data and question. Using your choice of two powerful statistical software packages (SAS or Python), you will explore ANOVA, Chi-Square, and Pearson correlation analysis. This course will guide you through basic statistical principles to give you the tools to answer questions you have developed. Throughout the course, you will share your progress with others to gain valuable feedback and provide insight to other learners about their work.
Regression Modeling in PracticeUpcoming session: Jun 30 — Aug 7.
- 4 weeks, 4 - 5 hours per week
About the CourseThis course focuses on one of the most important tools in your data analysis arsenal: regression analysis. Using either SAS or Python, you will begin with linear regression and then learn how to adapt when two variables do not present a clear linear relationship. You will examine multiple predictors of your outcome and be able to identify confounding variables, which can tell a more compelling story about your results. You will learn the assumptions underlying regression analysis, how to interpret regression coefficients, and how to use regression diagnostic plots and other tools to evaluate the quality of your regression model. Throughout the course, you will share with others the regression models you have developed and the stories they tell you.
Machine Learning for Data AnalysisCurrent session: Jun 26 — Jul 31.
About the CourseAre you interested in predicting future outcomes using your data? This course helps you do just that! Machine learning is the process of developing, testing, and applying predictive algorithms to achieve this goal. Make sure to familiarize yourself with course 3 of this specialization before diving into these machine learning concepts. Building on Course 3, which introduces students to integral supervised machine learning concepts, this course will provide an overview of many additional concepts, techniques, and algorithms in machine learning, from basic classification to decision trees and clustering. By completing this course, you will learn how to apply, test, and interpret machine learning algorithms as alternative methods for addressing your research questions.
Data Analysis and Interpretation CapstoneUpcoming session: Aug 14 — Sep 18.
About the Capstone ProjectThe Capstone project will allow you to continue to apply and refine the data analytic techniques learned from the previous courses in the Specialization to address an important issue in society. You will use real world data to complete a project with our industry and academic partners. For example, you can work with our industry partner, DRIVENDATA, to help them solve some of the world's biggest social challenges! DRIVENDATA at www.drivendata.org, is committed to bringing cutting-edge practices in data science and crowdsourcing to some of the world's biggest social challenges and the organizations taking them on. Or, you can work with our other industry partner, The Connection (www.theconnectioninc.org) to help them better understand recidivism risk for people on parole seeking substance use treatment. For more than 40 years, The Connection has been one of Connecticut’s leading private, nonprofit human service and community development agencies. Each month, thousands of people are assisted by The Connection’s diverse behavioral health, family support and community justice programs. The Connection’s Institute for Innovative Practice was created in 2010 to bridge the gap between researchers and practitioners in the behavioral health and criminal justice fields with the goal of developing maximally effective, evidence-based treatment programs. A major component of the Capstone project is for you to be able to choose the information from your analyses that best conveys results and implications, and to tell a compelling story with this information. By the end of the course, you will have a professional quality report of your findings that can be shown to colleagues and potential employers to demonstrate the skills you learned by completing the Specialization.