About this Course

88,681 recent views
Shareable Certificate
Earn a Certificate upon completion
100% online
Start instantly and learn on your own schedule.
Flexible deadlines
Reset deadlines in accordance with your schedule.
Beginner Level
Approx. 17 hours to complete
English
Subtitles: French, Portuguese (Brazilian), Russian, English, Spanish

Skills you will gain

Apache Hive, Apache Impala, Data Analysis, Big Data, SQL

Offered by

Cloudera

Syllabus - What you will learn from this course

Week 1

3 hours to complete

Orientation to SQL on Big Data

9 videos (Total 47 min), 5 readings, 2 quizzes
9 videos
Review and Preparation (4 min)
Using the Hue Query Editors (7 min)
Running SQL Utility Statements (6 min)
Running SQL SELECT Statements (5 min)
Understanding Different SQL Interfaces (4 min)
Overview of Beeline and Impala Shell (2 min)
Using Beeline (8 min)
Using Impala Shell (3 min)
5 readings
Instructions for Downloading and Installing the Exercise Environment (30 min)
Troubleshooting the VM (5 min)
(Optional) What about Spark SQL? (10 min)
Expectations for Learners (10 min)
(Optional) Using Other SQL Engines (10 min)
2 practice exercises
Week 1 Core Quiz (30 min)
Week 1 Honors Quiz (5 min)
Week 2

3 hours to complete

SQL SELECT Essentials

16 videos (Total 83 min), 4 readings, 2 quizzes
16 videos
SQL SELECT Building Blocks (2 min)
Introduction to the SELECT List (7 min)
Expressions and Operators (7 min)
Data Types (6 min)
Column Aliases (5 min)
Built-In Functions (7 min)
Data Type Conversion (5 min)
The DISTINCT Keyword (5 min)
Introduction to the FROM Clause (3 min)
Identifiers (7 min)
Formatting SELECT Statements (4 min)
Using Beeline in Non-Interactive Mode (5 min)
Using Impala Shell in Non-Interactive Mode (4 min)
Formatting the Output of Beeline and Impala Shell (4 min)
Saving Hive and Impala Query Results to a File (5 min)
4 readings
Order of Operations (5 min)
Division and Modulo Operators (15 min)
Common String Functions (15 min)
Case (In)Sensitivity in SQL (10 min)
2 practice exercises
Week 2 Core Quiz (30 min)
Week 2 Honors Quiz (5 min)
Week 3

3 hours to complete

Filtering Data

14 videos (Total 85 min), 6 readings, 2 quizzes
14 videos
About the Datasets (4 min)
Introduction to the WHERE Clause (2 min)
Using Expressions in the WHERE Clause (9 min)
Comparison Operators (9 min)
Data Types and Precision (4 min)
Logical Operators (7 min)
Other Relational Operators (4 min)
Understanding Missing Values (8 min)
Handling Missing Values (6 min)
Conditional Functions (9 min)
Using Variables with Beeline and Impala Shell (7 min)
Calling Beeline and Impala Shell from Scripts (6 min)
Querying Hive and Impala in Scripts and Applications (2 min)
6 readings
Data Reference (5 min)
(Optional) Unicode Characters (10 min)
Working with Literal Strings (15 min)
Missing Values with Logical Operators (10 min)
Missing Values in String Columns (5 min)
(Optional Exercise) Change VM Desktop Color (30 min)
2 practice exercises
Week 3 Core Quiz (30 min)
Week 3 Honors Quiz (5 min)
Week 4

3 hours to complete

Grouping and Aggregating Data

15 videos (Total 82 min), 6 readings, 2 quizzes
15 videos
Introduction to Aggregation (2 min)
Common Aggregate Functions (2 min)
Using Aggregate Functions in the SELECT Statement (8 min)
Introduction to the GROUP BY Clause (6 min)
Choosing an Aggregate Function and Grouping Column (4 min)
Grouping Expressions (6 min)
Grouping and Aggregation, Together and Separately (5 min)
NULL Values in Grouping and Aggregation (4 min)
The COUNT Function (7 min)
Tips for Applying Grouping and Aggregation (7 min)
Filtering on Aggregates (2 min)
The HAVING Clause (8 min)
Understanding Hive and Impala Version Differences (10 min)
Understanding Hue Version Differences (2 min)
6 readings
COUNT(*) and SUM(1) (5 min)
Interpreting Aggregates: Populations and Samples (10 min)
The least and greatest Functions (5 min)
Why Aggregate Expressions Ignore NULL Values (5 min)
(Optional) Shortcuts for Grouping (10 min)
How Grouping and Aggregation Can Mislead (10 min)
2 practice exercises
Week 4 Core Quiz (30 min)
Week 4 Honors Quiz (10 min)

Reviews

TOP REVIEWS FROM ANALYZING BIG DATA WITH SQL

About the Modern Big Data Analysis with SQL Specialization

This Specialization teaches the essential skills for working with large-scale data using SQL. Maybe you are new to SQL and want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.

Most courses that teach SQL focus on traditional relational databases, but today, more and more of the data that’s being generated is too big to be stored there, and it’s growing too quickly to be efficiently stored in commercial data warehouses. Instead, it’s increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and highly scalable.

To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines, like Hive, Impala, Presto, and Drill. These are open source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.

This Specialization is designed to provide excellent preparation for the Cloudera Certified Associate (CCA) Data Analyst certification exam. You can earn this certification credential by taking a hands-on practical exam using the same SQL engines that this Specialization teaches: Hive and Impala....
