When you enroll in this course, you'll also be enrolled in this Specialization.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate
There are 6 modules in this course
Once you’ve identified a big data issue to analyze, how do you collect, store and organize your data using Big Data solutions? In this course, you will experience various data genres and management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools. Through guided hands-on tutorials, you will become familiar with techniques using real-time and semi-structured data examples. Systems and tools discussed include: AsterixDB, HP Vertica, Impala, Neo4j, Redis, SparkSQL. This course provides techniques to extract value from existing untapped data sources and to discover new data sources.
At the end of this course, you will be able to:
* Recognize different data elements in your own work and in everyday life problems
* Explain why your team needs to design a Big Data Infrastructure Plan and Information System Design
* Identify the frequent data operations required for various types of data
* Select a data model to suit the characteristics of your data
* Apply techniques to handle streaming data
* Differentiate between a traditional Database Management System and a Big Data Management System
* Appreciate why there are so many data management systems
* Design a big data information system for an online game company
This course is for those new to data science. Completion of Intro to Big Data is recommended. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments. Refer to the specialization technical requirements for complete hardware and software specifications.
Hardware Requirements:
(A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free. How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
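As an alternative to the menu steps above, the core count and free disk space can be checked from a script. The following is a minimal sketch using only the Python standard library; the function name `check_hardware` and the result keys are illustrative assumptions, not part of the course materials. The 64-bit test is a heuristic based on the machine name, and installed RAM is not checked because the standard library has no portable call for it.

```python
import os
import platform
import shutil

# Minimums taken from the hardware requirements listed above.
MIN_CORES = 4
MIN_FREE_DISK_GB = 20


def check_hardware(path="."):
    """Return simple measurements and pass/fail flags for this machine.

    Hypothetical helper for illustration only. RAM is not checked.
    """
    cores = os.cpu_count() or 0  # logical cores; may exceed physical cores
    free_gb = shutil.disk_usage(path).free / 1e9  # free space on the drive holding `path`
    return {
        "cores": cores,
        "free_disk_gb": round(free_gb, 1),
        # Heuristic: common 64-bit machine names (x86_64, AMD64, arm64) end in "64".
        "is_64_bit": platform.machine().endswith("64"),
        "enough_cores": cores >= MIN_CORES,
        "enough_disk": free_gb >= MIN_FREE_DISK_GB,
    }


if __name__ == "__main__":
    for name, value in check_hardware().items():
        print(f"{name}: {value}")
```

Note that `os.cpu_count()` reports logical cores, so a dual-core processor with hyper-threading may show 4.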
Software Requirements:
This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge (except for data charges from your internet provider). Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+ or CentOS 6+; VirtualBox 5+.
Welcome to this course on big data modeling and management. Modeling and managing data is a central focus of all big data projects. In these lessons we introduce you to the concepts behind big data modeling and management and set the stage for the remainder of the course.
What's included
14 videos, 8 readings, 3 discussion prompts
14 videos•Total 63 minutes
Welcome to Big Data Modeling and Management•3 minutes
Why is this a New Course in the Big Data Specialization?•1 minute
Summary of Introduction to Big Data (Part 1)•6 minutes
Summary of Introduction to Big Data (Part 2)•5 minutes
Summary of Introduction to Big Data (Part 3)•6 minutes
Big Data Management "Must-Ask Questions"•1 minute
Data Ingestion•5 minutes
Data Storage•3 minutes
Data Quality•2 minutes
Data Operations•3 minutes
Data Scalability and Security•3 minutes
Energy Data Management Challenges at ConEd•5 minutes
Gaming Industry Data Management: Q&A with Apmetrix CTO Mark Caldwell•7 minutes
Flight Data Management at FlightStats: A Lecture by CTO Chad Berkley•13 minutes
8 readings•Total 80 minutes
Slides: Summary of Introduction to Big Data•10 minutes
Slides: Big Data Management•10 minutes
Reading on Storage Systems•10 minutes
Slides: Energy Data Management Challenges at ConEd•10 minutes
Slides: Flight Data Management at FlightStats•10 minutes
Downloading and Installing Docker Desktop Instructions•10 minutes
Instructions for Downloading Hands On Datasets•10 minutes
Basic terminal shell commands•10 minutes
3 discussion prompts•Total 30 minutes
Getting to know you: Tell us about yourself and why you are taking this course•10 minutes
Let's discuss: What area of big data management interests you most?•10 minutes
Let's discuss: What are the design criteria in the big data applications you have heard?•10 minutes
Big Data Modeling
Module 2•4 hours to complete
Module details
Modeling big data depends on many factors including data structure, which operations may be performed on the data, and what constraints are placed on the models. In these lessons you will learn the details about big data modeling and you will gain the practical skills you will need for modeling your own big data projects.
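The three factors named above (structure, operations, constraints) can be illustrated with a small sketch. It uses the Python standard library on made-up sample data; the records, field names, and threshold are assumptions for illustration and do not come from the course datasets.

```python
import csv
import io
import json

# Made-up sample data (illustrative only).
CSV_TEXT = "id,name,score\n1,Ada,90\n2,Lin,85\n"
JSON_TEXT = '{"id": 1, "name": "Ada", "tags": ["new", "vip"], "address": {"city": "Oslo"}}'

# Structure: in a relational model every row shares the same columns.
rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Operation: selection (filter rows) followed by projection (keep one column).
high_scorers = [row["name"] for row in rows if int(row["score"]) > 87]

# Constraint: ids must be unique, as a primary key would require.
ids = [row["id"] for row in rows]
ids_are_unique = len(ids) == len(set(ids))

# Structure: a semistructured model allows nesting and optional fields.
doc = json.loads(JSON_TEXT)
city = doc["address"]["city"]

print(high_scorers, ids_are_unique, city)
```

The CSV rows behave like a table, while the JSON document carries a nested object and a list that a flat table could not hold without extra columns or tables.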
Exploring the Relational Data Model of CSV Files•4 minutes
Exploring the Semistructured Data Model of JSON data•3 minutes
Exploring the Array Data Model of an Image•3 minutes
Exploring Sensor Data•4 minutes
12 readings•Total 120 minutes
Slides: What Is A Data Model?•10 minutes
Introduction to CSV Data (OpenOffice)•10 minutes
Introduction to CSV Data (Microsoft Excel)•10 minutes
Slides: What Is A Relational Data Model?•10 minutes
Slides: What is a Semistructured Data Model?•10 minutes
Exploring the Relational Data Model of Comma Separated Values (OpenOffice)•10 minutes
Exploring the Relational Data Model of Comma Separated Values (Excel)•10 minutes
Installing Python•10 minutes
Creating a Python Virtual Environment•10 minutes
Exploring the Semistructured Data Model of JSON data•10 minutes
Exploring the Array Data Model of an Image•10 minutes
Exploring Sensor Data•10 minutes
1 assignment•Total 30 minutes
Practical Quiz for Week 2 Hands-On Lectures•30 minutes
2 discussion prompts•Total 20 minutes
Let's discuss: Modeling data in your daily life•10 minutes
Let's discuss: Utilization of XML or JSON on the Internet•10 minutes
Big Data Modeling (Part 2)
Module 3•2 hours to complete
Module details
These lessons continue to shed light on big data modeling with specific approaches including vector space models, graph data models, and more.
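To give a feel for the vector space model mentioned above, here is a minimal sketch that turns text into term-count vectors and compares them with cosine similarity. It is a simplification under stated assumptions: raw term counts and whitespace tokenization are used, whereas search engines such as Lucene apply more elaborate weighting and analysis. The function names are illustrative.

```python
import math
from collections import Counter


def term_vector(text):
    """Map a text to a sparse vector of term counts (whitespace tokens, lowercased)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term vectors; 0.0 if either is empty."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    query = term_vector("big data models")
    doc_a = term_vector("graph data models for big data")
    doc_b = term_vector("weather sensor readings")
    print(cosine_similarity(query, doc_a))  # shares terms with the query
    print(cosine_similarity(query, doc_b))  # shares no terms with the query
```

Documents that share more terms with the query score closer to 1, and documents with no terms in common score 0.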
What's included
5 videos, 5 readings, 1 assignment
5 videos•Total 30 minutes
Vector Space Model•11 minutes
Graph Data Model•7 minutes
Other Data Models•4 minutes
Exploring the Lucene Search Engine's Vector Data Model•4 minutes
Exploring Graph Data Models with Gephi•3 minutes
5 readings•Total 50 minutes
Slides: Vector Space Model•10 minutes
Slides: Graph Data Model•10 minutes
Slides: Other Data Models•10 minutes
Exploring Vector Data Models with Lucene•10 minutes
Exploring Graph Data Models with Gephi•10 minutes
1 assignment•Total 30 minutes
Data Models Quiz•30 minutes
Working With Data Models
Module 4•2 hours to complete
Module details
Data models deal with many different types of data formats. Streaming data is becoming ubiquitous, and working with streaming data requires a different approach from working with static data. In these lessons you will gain practical hands-on experience working with different forms of streaming data including weather data and Twitter feeds.
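One way to see why streaming data calls for a different approach: a stream may never end, so results have to be updated one item at a time instead of computed over a stored dataset. The sketch below keeps a running mean using only constant memory. The `sensor_readings` generator is a hypothetical stand-in for a live feed, and its values are made up.

```python
def running_mean(stream):
    """Yield the mean of all values seen so far, without storing the values."""
    count = 0
    mean = 0.0
    for value in stream:
        count += 1
        mean += (value - mean) / count  # incremental update of the mean
        yield mean


def sensor_readings():
    """Hypothetical stand-in for a live sensor feed (made-up temperatures)."""
    for value in [20.0, 22.0, 21.0, 25.0]:
        yield value


if __name__ == "__main__":
    for current_mean in running_mean(sensor_readings()):
        print(current_mean)
```

With static data the same mean could be computed in one pass over a list, but that would require the whole dataset to be available before any result is produced.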
What's included
5 videos, 5 readings, 1 assignment, 1 discussion prompt
5 videos•Total 24 minutes
Data Model vs. Data Format•2 minutes
What is a Data Stream?•6 minutes
Why is Streaming Data different?•7 minutes
Understanding Data Lakes•6 minutes
Exploring Streaming Sensor Data•3 minutes
5 readings•Total 50 minutes
Slides: Data Model vs. Data Format•10 minutes
Slides: What is a Data Stream?•10 minutes
Slides: Why is Streaming Data Different?•10 minutes
Slides: Understanding Data Lakes•10 minutes
Exploring Streaming Sensor Data•10 minutes
1 assignment•Total 30 minutes
Data Formats and Streaming Data Quiz•30 minutes
1 discussion prompt•Total 10 minutes
Let's discuss: Streaming data applications•10 minutes
Big Data Management: The "M" in DBMS
Module 5•2 hours to complete
Module details
Managing big data requires a different approach to database management systems because of the wide variation in data structure, which does not lend itself to traditional DBMSs. There are many applications available to help with big data management. In these lessons we introduce you to some of these applications and provide insight into how and when they might be appropriate for your own big data management challenges.
What's included
7 videos, 2 readings, 1 assignment
7 videos•Total 71 minutes
DBMS-based and non-DBMS-based Approaches to Big Data•16 minutes
From DBMS to BDMS•10 minutes
Redis: An Enhanced Key-Value Store•8 minutes
Aerospike: a New Generation KV Store•9 minutes
Semistructured Data – AsterixDB•8 minutes
Solr: Managing Text•10 minutes
Relational Data – Vertica•10 minutes
2 readings•Total 20 minutes
Slides: DBMS-based and non-DBMS-based Approaches to Big Data•10 minutes
Slides: From DBMS to BDMS•10 minutes
1 assignment•Total 30 minutes
BDMS Quiz•30 minutes
Designing a Big Data Management System for an Online Game
Module 6•2 hours to complete
Module details
In these lessons we give you the opportunity to learn about big data modeling and management using a fictitious online game called "Catch the Pink Flamingo".
What's included
1 reading, 1 peer review, 2 discussion prompts
1 reading•Total 10 minutes
A Game by Eglence Inc.: Catch The Pink Flamingo•10 minutes
1 peer review•Total 120 minutes
Designing a Data Model for 'Catch the Pink Flamingo'•120 minutes
2 discussion prompts•Total 10 minutes
Let's discuss: Analytical tasks to make Catch the Pink Flamingo better•5 minutes
Let's discuss: Using the data model for Catch the Pink Flamingo•5 minutes
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.
Instructors
Instructor ratings
We asked all learners to give feedback on our instructors based on the quality of their teaching style.
UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory.
Learner reviews
4.4 (3,026 reviews)
5 stars: 58.53%
4 stars: 27.94%
3 stars: 9.05%
2 stars: 2.54%
1 star: 1.91%
Showing 3 of 3026
JW · 5 stars · Reviewed on May 7, 2019
I feel as though the assessment questions could have been more specific and the assessment criteria when marking could have been more precise. But other than that it was a great course.
PP · 4 stars · Reviewed on Mar 14, 2020
If you have the fundamental knowledge of database and json, the only valuable videos for you are in week 5. As it says in next course(course 3), the course 2 is not required.
YG · 4 stars · Reviewed on Oct 30, 2016
Overall relevant and clear presentations. Course material quite general, but I guess this is expected from an introductory-level course. Peer-reviewed assignment's instructions can be clearer.
When will I have access to the lectures and assignments?
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.