There are 7 modules in this course
Welcome to the Capstone Project for Big Data! In this culminating project, you will build a big data ecosystem using tools and methods from the earlier courses in this specialization. You will analyze a data set simulating big data generated by a large number of users playing our imaginary game "Catch the Pink Flamingo". During the five-week Capstone Project, you will walk through the typical big data science steps of acquiring, exploring, preparing, analyzing, and reporting. In the first two weeks, we will introduce you to the data set and guide you through some exploratory analysis using tools such as Splunk and Open Office. Then we will move on to more challenging big data problems that require the more advanced tools you have learned, including KNIME, Spark's MLlib, and Gephi. Finally, during the fifth and final week, we will show you how to bring it all together to create engaging and compelling reports and slide presentations. As a result of our collaboration with Splunk, a software company focused on analyzing machine-generated big data, learners with the top projects will be eligible to present to Splunk and meet Splunk recruiters and engineering leadership.
This week we provide an overview of the Eglence, Inc. Pink Flamingo game, including the data the company has about the game and its users and the questions we might want that data to answer.
What's included
4 videos, 4 readings
4 videos•Total 18 minutes
Welcome to the Big Data Capstone Project•3 minutes
Welcome from Splunk: Rob Reed World Education Evangelist•3 minutes
A Summary of Catch the Pink Flamingo•7 minutes
A Conceptual Schema for Catch the Pink Flamingo•4 minutes
4 readings•Total 35 minutes
Planning, Preparation, and Review•10 minutes
A Game by Eglence Inc.: Catch The Pink Flamingo•10 minutes
Overview of the Catch the Pink Flamingo Data Model•10 minutes
Overview of Final Project Design•5 minutes
Acquiring, Exploring, and Preparing the Data
Module 2•4 hours to complete
Next, we begin working with the simulated game data by exploring and preparing the data for ingestion into big data analytics applications.
What's included
6 readings, 1 assignment, 1 peer review
6 readings•Total 140 minutes
Downloading the Game Data and Associated Scripts•10 minutes
Understanding the CSV Files Generated by the Scripts•20 minutes
Optional Review of Splunk•0 minutes
“Catch the Pink Flamingo” Data Exploration with Splunk•45 minutes
Aggregate Calculations Using Splunk•45 minutes
Filtering the Data With Splunk•20 minutes
1 assignment•Total 30 minutes
Data Exploration With Splunk•30 minutes
1 peer review•Total 60 minutes
Data Exploration Technical Appendix•60 minutes
Data Classification with KNIME
Module 3•5 hours to complete
This week we use KNIME to build a decision tree classifier that identifies big spenders in Catch the Pink Flamingo.
What's included
4 readings, 1 peer review
4 readings•Total 45 minutes
Review: Classification Using Decision Tree in KNIME•10 minutes
Review: Interpreting a Decision Tree in KNIME•10 minutes
Workflow Overview for Building a Decision Tree in KNIME•20 minutes
Description of combined_data.csv•5 minutes
1 peer review•Total 240 minutes
Classifying in KNIME to identify big spenders in Catch the Pink Flamingo•240 minutes
Clustering with Spark
Module 4•5 hours to complete
This week we cluster the game's client base with Spark and recommend actions based on the clustering analysis.
What's included
2 readings, 1 peer review, 3 discussion prompts
2 readings•Total 35 minutes
Informing business strategies based on client base•5 minutes
Practice with PySpark MLlib Clustering•30 minutes
1 peer review•Total 200 minutes
Recommending Actions from Clustering Analysis•200 minutes
3 discussion prompts•Total 40 minutes
Is there only “one way” to cluster a client base?•15 minutes
How many clusters?•10 minutes
What kind of criteria might provide actionable information for Eglence Inc.?•15 minutes
Graph Analytics of Simulated Chat Data With Neo4j
Module 5•3 hours to complete
This week we apply what we learned in the 'Graph Analytics With Big Data' course to simulated chat data from Catch the Pink Flamingo using Neo4j. We analyze player chat behavior to find ways of improving the game.
What's included
2 readings, 1 peer review
2 readings•Total 130 minutes
Understanding the Simulated Chat Data Generated by the Scripts•10 minutes
Graph Analytics of Catch the Pink Flamingo Chat Data Using Neo4j•120 minutes
1 peer review•Total 60 minutes
Graph Analytics With Chat Data Using Neo4j•60 minutes
Reporting and Presenting Your Work
Module 6•9 minutes to complete
What's included
1 video, 1 reading
1 video•Total 2 minutes
Week 5: Bringing It All Together•2 minutes
1 reading•Total 7 minutes
Final project preparation•7 minutes
Final Submission
Module 7•4 hours to complete
What's included
1 video, 1 reading, 2 peer reviews
1 video•Total 1 minute
Congratulations! Some Final Words...•1 minute
1 reading•Total 10 minutes
Part 2: Help us connect your video to your LinkedIn profile•10 minutes
UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory.
Learner reviews
Average rating: 4.4 (400 reviews)
5 stars: 66.25%
4 stars: 21.50%
3 stars: 5.75%
2 stars: 1.75%
1 star: 4.75%
VP, 5 stars, reviewed on May 30, 2019
I only realized how good this specialization was when I took a course from another university.
JK, 4 stars, reviewed on Jan 6, 2021
A lot more work and time than expected. Some issues with software tools as per expected.
RA, 5 stars, reviewed on May 15, 2018
This has been an excellent learning experience. The instructor and fellow members shared valuable information during the course of the learning and Capstone Project phase.
When will I have access to the lectures and assignments?
To access the course materials and assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade, but it also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.