When you enroll in this course, you'll also be enrolled in this Specialization.
Learn new concepts from industry experts
Gain a foundational understanding of a subject or tool
Develop job-relevant skills with hands-on projects
Earn a shareable career certificate
There are 6 modules in this course
Interested in increasing your knowledge of the Big Data landscape? This course is for those new to data science and interested in understanding why the Big Data Era has come to be. It is for those who want to become conversant with the terminology and the core concepts behind big data problems, applications, and systems. It is for those who want to start thinking about how Big Data might be useful in their business or career. It provides an introduction to one of the most common frameworks, Hadoop, that has made big data analysis easier and more accessible -- increasing the potential for data to transform our world!
At the end of this course, you will be able to:
* Describe the Big Data landscape, with examples of real-world big data problems and the three key sources of Big Data: people, organizations, and sensors.
* Explain the V’s of Big Data (volume, velocity, variety, veracity, valence, and value) and why each impacts data collection, monitoring, storage, analysis and reporting.
* Get value out of Big Data by using a 5-step process to structure your analysis.
* Identify what are and what are not big data problems and be able to recast big data problems as data science questions.
* Provide an explanation of the architectural components and programming models used for scalable big data analysis.
* Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
* Install and run a program using Hadoop!
This course is for those new to data science. No prior programming experience is needed, although the ability to install applications and utilize a virtual machine is necessary to complete the hands-on assignments.
Hardware Requirements:
(A) Quad Core Processor (VT-x or AMD-V support recommended), 64-bit; (B) 8 GB RAM; (C) 20 GB disk free. How to find your hardware information: (Windows): Open System by clicking the Start button, right-clicking Computer, and then clicking Properties; (Mac): Open Overview by clicking on the Apple menu and clicking “About This Mac.” Most computers with 8 GB RAM purchased in the last 3 years will meet the minimum requirements. You will need a high-speed internet connection because you will be downloading files up to 4 GB in size.
Software Requirements:
This course relies on several open-source software tools, including Apache Hadoop. All required software can be downloaded and installed free of charge. Software requirements include: Windows 7+, Mac OS X 10.10+, Ubuntu 14.04+, or CentOS 6+; and VirtualBox 5+.
Welcome to the Big Data Specialization! We're excited for you to get to know us and we're looking forward to learning about you!
What's included
2 videos, 2 readings, 1 discussion prompt
2 videos•Total 3 minutes
Welcome to the Big Data Specialization•3 minutes
Tell us about yourself and learn about your classmates•0 minutes
2 readings•Total 12 minutes
By the end of this course you will be able to...•2 minutes
Optional: Watch this fun video about the San Diego Supercomputer Center!•10 minutes
1 discussion prompt•Total 10 minutes
Let's Discuss: Why are you taking this class?•10 minutes
Big Data: Why and Where
Module 2•4 hours to complete
Module details
Data -- it's been around (even digitally) for a while. What makes data "big" and where does this big data come from?
Slides: Big Data Generated By People: The Unstructured Challenge•10 minutes
Slides: Big Data Generated By People: How is it Being Used?•10 minutes
Slides: Organization-Generated Big Data: Structured But Often Siloed•10 minutes
Slides: Organization-Generated Big Data: Benefits•10 minutes
Slides: The Key - Integrating Diverse Data•10 minutes
1 assignment•Total 30 minutes
Why Big Data and Where Did it Come From?•30 minutes
2 discussion prompts•Total 20 minutes
Let's Discuss: What application area interests you?•10 minutes
Let's Discuss: Who are you providing data to?•10 minutes
Characteristics of Big Data and Dimensions of Scalability
Module 3•3 hours to complete
Module details
You may have heard of the "Big Vs". We'll give examples and descriptions of the commonly discussed 5. But, we want to propose a 6th V and we'll ask you to practice writing Big Data questions targeting this V -- value.
Getting Started: Characteristics Of Big Data•3 minutes
Characteristics of Big Data - Volume•6 minutes
Characteristics of Big Data - Variety•5 minutes
Characteristics of Big Data - Velocity•7 minutes
Characteristics of Big Data - Veracity•6 minutes
Characteristics of Big Data - Valence•3 minutes
The Sixth V: Value•4 minutes
9 readings•Total 90 minutes
What does astronomical scale mean?•10 minutes
A Small Definition of Big Data•10 minutes
Slides: Getting Started - Characteristics of Big Data•10 minutes
Slides: Characteristics of Big Data - Volume•10 minutes
Slides: Characteristics of Big Data - Variety•10 minutes
Slides: Characteristics of Big Data - Velocity•10 minutes
Slides: Characteristics of Big Data - Veracity•10 minutes
Slides: Characteristics of Big Data - Value•10 minutes
Slides: Characteristics of Big Data - Valence•10 minutes
1 assignment•Total 14 minutes
V for the V's of Big Data•14 minutes
2 discussion prompts•Total 20 minutes
Practice: Writing Big Data questions•10 minutes
Let's Discuss: Improving the Flamingo Game•10 minutes
Data Science: Getting Value out of Big Data
Module 4•4 hours to complete
Module details
We love science and we love computing, don't get us wrong. But the reality is we care about Big Data because it can bring value to our companies, our lives, and the world. In this module we'll introduce a 5 step process for approaching data science problems.
Data Science: Getting Value out of Big Data•6 minutes
Building a Big Data Strategy•9 minutes
How does big data science happen?: Five Components of Data Science•10 minutes
Asking the Right Questions•3 minutes
Steps in the Data Science Process•4 minutes
Step 1: Acquiring Data•6 minutes
Step 2-A: Exploring Data•4 minutes
Step 2-B: Pre-Processing Data•8 minutes
Step 3: Analyzing Data•8 minutes
Step 4: Communicating Results•5 minutes
Step 5: Turning Insights into Action•3 minutes
12 readings•Total 120 minutes
Five P's of Data Science•10 minutes
Slides: Getting Value Out of Big Data•10 minutes
Slides: Building a Big Data Strategy•10 minutes
Slides: The Five P's of Data Science•10 minutes
Slides: Asking the Right Questions•10 minutes
Slides: Steps in the Data Science Process•10 minutes
Slides: Step 1 - Acquiring Data•10 minutes
Slides: Step 2A-Exploring Data•10 minutes
Slides: Step 2B-Preprocessing Data•10 minutes
Slides: Step 3-Data Analysis•10 minutes
Slides: Step 4-Communicating Results•10 minutes
Slides: Step 5-Turning Insights Into Action•10 minutes
1 assignment•Total 30 minutes
Data Science 101•30 minutes
2 discussion prompts•Total 20 minutes
Let's Discuss: Thinking more deeply about the Ps•10 minutes
Let's Discuss: Building a Team•10 minutes
Foundations for Big Data Systems and Programming
Module 5•1 hour to complete
Module details
Big Data requires new programming frameworks and systems. For this course, we don't require programming knowledge or experience -- but we do want to give you a grounding in some of the key concepts.
What's included
4 videos, 4 readings, 1 assignment
4 videos•Total 19 minutes
Getting Started: Why worry about foundations?•1 minute
What is a Distributed File System?•7 minutes
Scalable Computing over the Internet•4 minutes
Programming Models for Big Data•7 minutes
4 readings•Total 40 minutes
Slides: Getting Started-Why Worry About Foundations?•10 minutes
Slides: What is a Distributed File System?•10 minutes
Slides: Scalable Computing Over the Internet•10 minutes
Slides: Programming Models for Big Data•10 minutes
1 assignment•Total 20 minutes
Foundations for Big Data•20 minutes
Systems: Getting Started with Hadoop
Module 6•6 hours to complete
Module details
Let's look at some details of Hadoop and MapReduce. Then we'll go "hands on" and actually perform a simple MapReduce task using a Docker container. Pay attention: we'll guide you in "learning by doing" as you diagram a MapReduce task for a Peer Review assignment.
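For readers who have not seen the MapReduce programming model before, the following is a minimal sketch of the idea in plain Python. It is not the course's hands-on exercise and it does not use Hadoop; it only imitates the map, shuffle, and reduce phases in a single process, using word counting as an assumed example task. All function names (map_phase, shuffle_phase, reduce_phase, word_count) are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Illustrative input only; not data from the course.
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
    # prints {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```

In a real Hadoop deployment the map and reduce steps would run in parallel across many machines, with input stored in HDFS and resources managed by YARN; this sketch runs everything in one process only to show how the data flows between phases.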
UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory.
"To be able to take courses at my own pace and rhythm has been an amazing experience. I can learn whenever it fits my schedule and mood."
Jennifer J.
Learner since 2020
"I directly applied the concepts and skills I learned from my courses to an exciting new project at work."
Larry W.
Learner since 2021
"When I need courses on topics that my university doesn't offer, Coursera is one of the best places to go."
Chaitanya A.
"Learning isn't just about being better at your job: it's so much more than that. Coursera allows me to learn without limits."
Learner reviews
4.6
11,005 reviews
5 stars: 70.10%
4 stars: 23.62%
3 stars: 4.17%
2 stars: 1.01%
1 star: 1.07%
JT · 5 stars · Reviewed on Aug 30, 2016
This is a great introduction for Big Data. It helps me to revisit what I learned from the meetups and webinars, then put the fundamental knowledge and information in a solid foundation. Thank you.
SS · 5 stars · Reviewed on Sep 14, 2019
It is a comprehensive introduction to big data which covers significant components with enough content that can be absorbed at this stage. A very good kick-start and excited for the next course ahead.
AR · 5 stars · Reviewed on Mar 30, 2020
One of the best courses to start learning new cutting-edge technology and to get deeper insights into Big Data. Thanks to the great instructors for amazing explanations of each module and e-materials.
When will I have access to the lectures and assignments?
To access the course materials, assignments and to earn a Certificate, you will need to purchase the Certificate experience when you enroll in a course. You can try a Free Trial instead, or apply for Financial Aid. The course may offer 'Full Course, No Certificate' instead. This option lets you see all course materials, submit required assessments, and get a final grade. This also means that you will not be able to purchase a Certificate experience.
What will I get if I subscribe to this Specialization?
When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile.
Is financial aid available?
Yes. In select learning programs, you can apply for financial aid or a scholarship if you can’t afford the enrollment fee. If financial aid or a scholarship is available for your learning program selection, you’ll find a link to apply on the description page.