About this Course
4.4
1,218 ratings
269 reviews
Specialization

Course 3 of 6 in the Big Data Specialization

100% online

Start instantly and learn on your own schedule.
Flexible deadlines

Reset deadlines in accordance with your schedule.
Beginner Level

Hours to complete

Approx. 19 hours to complete

Suggested: 7 hours/week...
Available languages

English

Subtitles: English

Skills you will gain

Big Data, MongoDB, Splunk, Apache Spark

Syllabus - What you will learn from this course

Week 1
1 hour to complete

Welcome to Big Data Integration and Processing

Welcome to the third course in the Big Data Specialization. This week you will be introduced to basic concepts in big data integration and processing. You will be guided through installing the Cloudera VM, downloading the data sets to be used for this course, and learning how to run the Jupyter server. ...
3 videos (Total 18 min), 6 readings
Videos:
  • Summary of Big Data Modeling and Management (7 min)
  • Why is Big Data Processing Different? (8 min)
Readings:
  • Slides: Summary & Why Is Big Data Processing Different (10 min)
  • Downloading and Installing the Cloudera VM Instructions (Windows) (10 min)
  • Downloading and Installing the Cloudera VM Instructions (Mac) (10 min)
  • Software Installation Frequently Asked Questions (FAQ) (10 min)
  • Instructions for Downloading Hands On Datasets (10 min)
  • Instructions for Starting Jupyter (10 min)
1 hour to complete

Retrieving Big Data (Part 1)

This module covers the various aspects of data retrieval and relational querying. You will also be introduced to the Postgres database. ...
5 videos (Total 40 min), 2 readings
Videos:
  • What is Data Retrieval? Part 2 (7 min)
  • Querying Two Relations (8 min)
  • Subqueries (8 min)
  • Querying Relational Data with Postgres (6 min)
Readings:
  • Slides: What is Data Retrieval? (10 min)
  • Querying Relational Data with Postgres (20 min)
Week 2
2 hours to complete

Retrieving Big Data (Part 2)

This module covers the various aspects of data retrieval for NoSQL data, as well as data aggregation and working with data frames. You will be introduced to MongoDB and Aerospike, and you will learn how to use Pandas to retrieve data from them....
5 videos (Total 50 min), 3 readings, 2 quizzes
Videos:
  • Aggregation Functions (9 min)
  • Querying Aerospike (6 min)
  • Querying Documents in MongoDB (11 min)
  • Exploring Pandas DataFrames (5 min)
Readings:
  • Slides: Querying Data Part 2 (10 min)
  • Querying Documents in MongoDB (10 min)
  • Exploring Pandas DataFrames (20 min)
Practice exercises:
  • Retrieving Big Data Quiz (20 min)
  • Postgres, MongoDB, and Pandas (20 min)
Week 3
3 hours to complete

Big Data Integration

In this module you will be introduced to data integration tools including Splunk and Datameer, and you will gain some practical insight into how information integration processes are carried out. ...
11 videos (Total 83 min), 4 readings, 2 quizzes
Videos:
  • A Data Integration Scenario (13 min)
  • Integration for Multichannel Customer Analytics (6 min)
  • Big Data Management and Processing Using Splunk and Datameer (1 min)
  • Why Splunk? (3 min)
  • Connected Cars with Ford's OpenXC and Splunk (3 min)
  • Big Data Management and Processing using Datameer (15 min)
  • Installing Splunk Enterprise on Windows (2 min)
  • Installing Splunk Enterprise on Linux (4 min)
  • Exploring Splunk Queries (5 min)
  • Optional: Creating Pivot Reports in Splunk (8 min)
Readings:
  • Slides: Information Integration (10 min)
  • Downloading Splunk Enterprise (10 min)
  • Exploring Splunk Queries (20 min)
  • Optional: Instructions for Splunk Pivot Tutorial (10 min)
Practice exercises:
  • Information Integration - Quiz (14 min)
  • Hands-On With Splunk (15 min)
Week 4
3 hours to complete

Processing Big Data

This module introduces learners to big data pipelines and workflows, as well as the processing and analysis of big data using Apache Spark. ...
9 videos (Total 74 min), 4 readings, 2 quizzes
Videos:
  • Some High-Level Processing Operations in Big Data Pipelines (8 min)
  • Aggregation Operations in Big Data Pipelines (5 min)
  • Typical Analytical Operations in Big Data Pipelines (10 min)
  • Overview of Big Data Processing Systems (7 min)
  • The Integration and Processing Layer (8 min)
  • Introduction to Apache Spark (8 min)
  • Getting Started with Spark (10 min)
  • WordCount in Spark (8 min)
Readings:
  • Big Data Processing Pipelines Slides (10 min)
  • Big Data Workflow Management (10 min)
  • Slides for Big Data Processing Tools and Systems (10 min)
  • WordCount in Spark (20 min)
Practice exercises:
  • Pipeline and Tools (18 min)
  • WordCount in Spark (8 min)
4.4 (269 reviews)
  • Career direction: 33% started a new career after completing these courses
  • Career benefit: 27% got a tangible career benefit from this course
  • Career promotion: 20% got a pay increase or promotion

Top Reviews

By AA, Mar 6th 2018

It was a good course. It could have been better if some Spark examples were also provided in other languages like Java; people without a Python background may find it difficult.

By DC, Oct 8th 2017

Very interactive course. Theoretical classes are nicely drafted. Hands-on exercises are interesting, and some are challenging too. Overall, a very interesting course. Happy learning.

Instructors

Ilkay Altintas

Chief Data Science Officer
San Diego Supercomputer Center

Amarnath Gupta

Director, Advanced Query Processing Lab
San Diego Supercomputer Center (SDSC)

About University of California San Diego

UC San Diego is an academic powerhouse and economic engine, recognized as one of the top 10 public universities by U.S. News and World Report. Innovation is central to who we are and what we do. Here, students learn that knowledge isn't just acquired in the classroom—life is their laboratory....

About the Big Data Specialization

Drive better business decisions with an overview of how big data is organized, analyzed, and interpreted. Apply your insights to real-world problems and questions.

Do you need to understand big data and how it will impact your business? This Specialization is for you. You will gain an understanding of what insights big data can provide through hands-on experience with the tools and systems used by big data scientists and engineers. Previous programming experience is not required! You will be guided through the basics of using Hadoop with MapReduce, Spark, Pig and Hive. By following along with provided code, you will experience how one can perform predictive modeling and leverage graph analytics to model problems. This specialization will prepare you to ask the right questions about data, communicate effectively with data scientists, and do basic exploration of large, complex datasets. In the final Capstone Project, developed in partnership with data software company Splunk, you’ll apply the skills you learned to do basic analyses of big data....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.