About this Course

Course 4 of 6 in the Cloud Computing Specialization

Start working towards your degree

Try out lectures, course readings, and self-paced assignments from the Master in Computer Science degree

100% online

Start instantly and learn on your own schedule.

Flexible deadlines

Reset deadlines in accordance with your schedule.

Approx. 15 hours to complete

Suggested: There are about 3-4 hours of video lectures per week. Each week's quiz takes about 30 minutes. ...

English

Subtitles: English, Korean

Skills you will gain

Graphs, Distributed Computing, Big Data, Machine Learning

Syllabus - What you will learn from this course

Week 1
3 hours to complete

Course Orientation

You will become familiar with the course, your classmates, and our learning environment. The orientation will also help you obtain the technical skills required for the course....
1 video (Total 26 min), 4 readings, 1 quiz
4 readings
Syllabus (10m)
About the Discussion Forums (10m)
Updating Your Profile (10m)
Social Media (10m)
1 practice exercise
Orientation Quiz (10m)
2 hours to complete

Module 1: Spark, Hortonworks, HDFS, CAP

In Module 1, we introduce you to the world of Big Data applications. We start by introducing you to Apache Spark, a common framework used for many different tasks throughout the course. We then introduce some Big Data distro packages, the HDFS file system, and finally the idea of batch-based Big Data processing using the MapReduce programming paradigm. ...
13 videos (Total 108 min), 1 reading, 1 quiz
13 videos
1.1.2 Apache Spark (11m)
1.1.3 Spark Example: Log Mining (9m)
1.1.4 Spark Example: Logistic Regression (7m)
1.1.5 RDD Fault Tolerance (4m)
1.1.6 Interactive Spark (4m)
1.1.7 Spark Implementation (4m)
1.2.1 Introduction to Distros (3m)
1.2.2 Hortonworks (23m)
1.2.3 Cloudera CDH (2m)
1.2.4 MapR Distro (2m)
1.3.1 HDFS Introduction (15m)
1.3.2 YARN and MESOS (9m)
1 reading
Module 1 Overview (10m)
1 practice exercise
Module 1 Quiz (30m)
Week 2
6 hours to complete

Module 2: Large Scale Data Storage

In this module, you will learn about large scale data storage technologies and frameworks. We start by exploring the challenges of storing large data in distributed systems. We then discuss in-memory key/value storage systems, NoSQL distributed databases, and distributed publish/subscribe queues. ...
24 videos (Total 303 min), 1 reading, 1 quiz
24 videos
2.1.1 Introduction to MapReduce with Spark (3m)
2.1.2 MapReduce: Motivation (15m)
2.1.3 MapReduce Programming Model with Spark (9m)
2.1.4 MapReduce Example: Word Count (9m)
2.1.5 MapReduce Example: Pi Estimation & Image Smoothing (15m)
2.1.6 MapReduce Example: Page Rank (13m)
2.1.7 MapReduce Summary (4m)
2.2.1 Eventual Consistency – Part 1 (10m)
2.2.2 Eventual Consistency – Part 2 (20m)
2.2.3 Consistency Trade-Offs (4m)
2.2.4 ACID and BASE (19m)
2.2.5 Zookeeper and Paxos: Introduction (10m)
2.2.6 Paxos (17m)
2.2.7 Zookeeper (16m)
2.3.1 Cassandra Introduction (27m)
2.3.2 Redis (7m)
2.3.3 Redis Demonstration (14m)
2.4.1 HBase Usage API (15m)
2.4.2 HBase Internals - Part 1 (17m)
2.4.3 HBase Internals - Part 2 (9m)
2.4.4 Spark SQL (8m)
2.5.5 Spark SQL Demo (8m)
2.5.1 Kafka (17m)
1 reading
Module 2 Overview (10m)
1 practice exercise
Module 2 Quiz (30m)
Week 3
4 hours to complete

Module 3: Streaming Systems

This module introduces you to real-time streaming systems, also known as Fast Data. We discuss Apache Storm at length, then cover Apache Spark Streaming and the Lambda and Kappa architectures. Finally, we contrast all these technologies as a streaming ecosystem. ...
18 videos (Total 216 min), 1 reading, 1 quiz
18 videos
3.1.1 Streaming Introduction (9m)
3.1.2 "Big Data Pipelines: The Rise of Real-Time" (7m)
3.1.3 Storm Introduction: Protocol Buffers & Thrift (15m)
3.1.4 A Storm Word Count Example (3m)
3.1.5 Writing the Storm Word Count Example (10m)
3.1.6 Storm Usage at Yahoo (3m)
3.2.1 Anchoring and Spout Replay (17m)
3.2.2 Trident: Exactly Once Processing (10m)
3.3.1 Inside Apache Storm (9m)
3.3.2 The Structure of a Storm Cluster (4m)
3.3.3 Using Thrift in Storm (10m)
3.3.4 How Storm Schedulers Work (12m)
3.3.5 Scaling Storm to 4000 Nodes (14m)
3.3.6 Q&A with Bobby Evans (Yahoo) on Storm (32m)
3.4.1 Spark Streaming (18m)
3.4.2 Lambda and Kappa Architecture (4m)
3.4.3 Streaming Ecosystem (24m)
1 reading
Module 3 Overview (10m)
1 practice exercise
Module 3 Quiz (30m)
Week 4
4 hours to complete

Module 4: Graph Processing and Machine Learning

In this module, we discuss the applications of Big Data. In particular, we focus on two topics: graph processing, where massive graphs (such as the web graph) are processed for information, and machine learning, where massive amounts of data are used to train models with techniques such as clustering and frequent pattern mining. We also introduce you to deep learning, where large data sets are used to train neural networks with effective results. ...
18 videos (Total 173 min), 1 reading, 1 quiz
18 videos
4.1.2 Pregel - Part 1 (7m)
4.1.3 Pregel - Part 2 (11m)
4.1.4 Pregel - Part 3 (6m)
4.1.5 Giraph Introduction (6m)
4.1.6 Giraph Example (4m)
4.1.7 Spark GraphX (15m)
4.2.1 Big Data Machine Learning Introduction (13m)
4.2.2 Mahout: Introduction (8m)
4.2.3 Mahout kmeans (5m)
4.2.4 Mahout: Naïve Bayes (9m)
4.2.5 Mahout: fpm (6m)
4.2.6 Spark Naïve Bayes (2m)
4.2.7 Spark fpm (2m)
4.2.8 Spark ML/MLlib (11m)
4.2.9 Introduction to Deep Learning (20m)
4.2.10 Deep Neural Network Systems (17m)
4.3.1 Closing Remarks (1m)
1 reading
Module 4 Overview (10m)
1 practice exercise
Module 4 Quiz (30m)
4.2
32 Reviews

Top Reviews

By UN, Apr 10th 2018

My understanding of Big Data technologies was really enhanced by this course. I have decided to pursue more of these underlying technologies after this course. Good job

By MS, Nov 27th 2017

Very good introduction of application concepts of cloud data computing. Thank You!

Instructors

Reza Farivar

Data Engineering Manager at Capital One, Adjunct Research Assistant Professor of Computer Science
Department of Computer Science

Roy H. Campbell

Professor of Computer Science
Department of Computer Science

Get a head start on your degree

This course is part of the 100% online Master in Computer Science from University of Illinois at Urbana-Champaign. Start an open course or Specialization today to watch courses featuring iMBA faculty and complete self-paced assignments. When you complete each course, you’ll earn a certificate that you can add to your LinkedIn and resume. If you apply and are admitted to the full program, your courses count towards your degree learning.

About University of Illinois at Urbana-Champaign

The University of Illinois at Urbana-Champaign is a world leader in research, teaching and public engagement, distinguished by the breadth of its programs, broad academic excellence, and internationally renowned faculty and alumni. Illinois serves the world by creating knowledge, preparing students for lives of impact, and finding solutions to critical societal needs. ...

About the Cloud Computing Specialization

The Cloud Computing Specialization takes you on a tour through cloud computing systems. We start in the middle layer with Cloud Computing Concepts, covering core distributed systems concepts used inside clouds, move to the upper layer of Cloud Applications, and finally to the lower layer of Cloud Networking. We conclude with a project that allows you to apply the skills you've learned throughout the courses. The first four courses in this Specialization form the lecture component of courses in our online Master of Computer Science Degree in Data Science. You can apply to the degree program either before or after you begin the Specialization....
Cloud Computing

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page - from there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

More questions? Visit the Learner Help Center.