About this Course
34,007 recent views

100% online

Start instantly and learn on your own schedule.

Flexible deadlines

Reset deadlines according to your schedule.

Beginner Level

Approx. 23 hours to complete

Suggested: 10 hours/week...

English

Subtitles: English

What you will learn

  • Understand the basics of SELECT statements

  • Understand how and why to filter results

  • Explore grouping and aggregation to answer analytic questions

  • Work with sorting and limiting results

Skills you will gain

  • Apache Hive
  • Apache Impala
  • Data Analysis
  • Big Data
  • SQL

Learners taking this Course are
  • Machine Learning Engineers
  • Data Scientists
  • Data Engineers
  • Business Analysts
  • Financial Analysts

Syllabus - What you will learn from this course

Week 1
3 hours to complete

Orientation to SQL on Big Data

9 videos (Total 47 min), 5 readings, 2 quizzes

9 videos
  • Review and Preparation (4m)
  • Using the Hue Query Editors (7m)
  • Running SQL Utility Statements (6m)
  • Running SQL SELECT Statements (5m)
  • Understanding Different SQL Interfaces (4m)
  • Overview of Beeline and Impala Shell (2m)
  • Using Beeline (8m)
  • Using Impala Shell (3m)

5 readings
  • Instructions for Downloading and Installing the Exercise Environment (30m)
  • Troubleshooting the VM (5m)
  • (Optional) What about Spark SQL? (10m)
  • Expectations for Learners (10m)
  • (Optional) Using Other SQL Engines (10m)

2 practice exercises
  • Week 1 Core Quiz (30m)
  • Week 1 Honors Quiz (5m)

Week 2
3 hours to complete

SQL SELECT Essentials

16 videos (Total 83 min), 4 readings, 2 quizzes

16 videos
  • SQL SELECT Building Blocks (2m)
  • Introduction to the SELECT List (7m)
  • Expressions and Operators (7m)
  • Data Types (6m)
  • Column Aliases (5m)
  • Built-In Functions (7m)
  • Data Type Conversion (5m)
  • The DISTINCT Keyword (5m)
  • Introduction to the FROM Clause (3m)
  • Identifiers (7m)
  • Formatting SELECT Statements (4m)
  • Using Beeline in Non-Interactive Mode (5m)
  • Using Impala Shell in Non-Interactive Mode (4m)
  • Formatting the Output of Beeline and Impala Shell (4m)
  • Saving Hive and Impala Query Results to a File (5m)

4 readings
  • Order of Operations (5m)
  • Division and Modulo Operators (15m)
  • Common String Functions (15m)
  • Case (In)Sensitivity in SQL (10m)

2 practice exercises
  • Week 2 Core Quiz (30m)
  • Week 2 Honors Quiz (5m)

Week 3
3 hours to complete

Filtering Data

14 videos (Total 85 min), 6 readings, 2 quizzes

14 videos
  • About the Datasets (4m)
  • Introduction to the WHERE Clause (2m)
  • Using Expressions in the WHERE Clause (9m)
  • Comparison Operators (9m)
  • Data Types and Precision (4m)
  • Logical Operators (7m)
  • Other Relational Operators (4m)
  • Understanding Missing Values (8m)
  • Handling Missing Values (6m)
  • Conditional Functions (9m)
  • Using Variables with Beeline and Impala Shell (7m)
  • Calling Beeline and Impala Shell from Scripts (6m)
  • Querying Hive and Impala in Scripts and Applications (2m)

6 readings
  • Data Reference (5m)
  • (Optional) Unicode Characters (10m)
  • Working with Literal Strings (15m)
  • Missing Values with Logical Operators (10m)
  • Missing Values in String Columns (5m)
  • (Optional Exercise) Change VM Desktop Color (30m)

2 practice exercises
  • Week 3 Core Quiz (30m)
  • Week 3 Honors Quiz (5m)

Week 4
3 hours to complete

Grouping and Aggregating Data

15 videos (Total 82 min), 6 readings, 2 quizzes

15 videos
  • Introduction to Aggregation (2m)
  • Common Aggregate Functions (2m)
  • Using Aggregate Functions in the SELECT Statement (8m)
  • Introduction to the GROUP BY Clause (6m)
  • Choosing an Aggregate Function and Grouping Column (4m)
  • Grouping Expressions (6m)
  • Grouping and Aggregation, Together and Separately (5m)
  • NULL Values in Grouping and Aggregation (4m)
  • The COUNT Function (7m)
  • Tips for Applying Grouping and Aggregation (7m)
  • Filtering on Aggregates (2m)
  • The HAVING Clause (8m)
  • Understanding Hive and Impala Version Differences (10m)
  • Understanding Hue Version Differences (2m)

6 readings
  • COUNT(*) and SUM(1) (5m)
  • Interpreting Aggregates: Populations and Samples (10m)
  • The least and greatest Functions (5m)
  • Why Aggregate Expressions Ignore NULL Values (5m)
  • (Optional) Shortcuts for Grouping (10m)
  • How Grouping and Aggregation Can Mislead (10m)

2 practice exercises
  • Week 4 Core Quiz (30m)
  • Week 4 Honors Quiz (10m)

Instructor

Ian Cook

Senior Curriculum Developer
Cloudera

About Cloudera

At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises. ...

About the Modern Big Data Analysis with SQL Specialization

This Specialization teaches the essential skills for working with large-scale data using SQL. Maybe you are new to SQL and you want to learn the basics. Or maybe you already have some experience using SQL to query smaller-scale data with relational databases. Either way, if you are interested in gaining the skills necessary to query big data with modern distributed SQL engines, this Specialization is for you.

Most courses that teach SQL focus on traditional relational databases, but today, more and more of the data that’s being generated is too big to be stored there, and it’s growing too quickly to be efficiently stored in commercial data warehouses. Instead, it’s increasingly stored in distributed clusters and cloud storage. These data stores are cost-efficient and infinitely scalable. To query these huge datasets in clusters and cloud storage, you need a newer breed of SQL engine: distributed query engines, like Hive, Impala, Presto, and Drill. These are open source SQL engines capable of querying enormous datasets. This Specialization focuses on Hive and Impala, the most widely deployed of these query engines.

This Specialization is designed to provide excellent preparation for the Cloudera Certified Associate (CCA) Data Analyst certification exam. You can earn this certification credential by taking a hands-on practical exam using the same SQL engines that this Specialization teaches: Hive and Impala....

Frequently Asked Questions

  • Once you enroll for a Certificate, you’ll have access to all videos, quizzes, and programming assignments (if applicable). Peer review assignments can only be submitted and reviewed once your session has begun. If you choose to explore the course without purchasing, you may not be able to access certain assignments.

  • When you enroll in the course, you get access to all of the courses in the Specialization, and you earn a certificate when you complete the work. Your electronic Certificate will be added to your Accomplishments page. From there, you can print your Certificate or add it to your LinkedIn profile. If you only want to read and view the course content, you can audit the course for free.

    • Windows, macOS, or Linux operating system (iPads and Android tablets will not work)
    • 64-bit operating system (32-bit operating systems will not work)
    • 8 GB RAM or more
    • 25 GB free disk space or more
    • Intel VT-x or AMD-V virtualization support enabled (on Mac computers with Intel processors, this is always enabled; on Windows and Linux computers, you might need to enable it in the BIOS)
    • For Windows XP computers only: You must have an unzip utility such as 7-Zip or WinZip installed (Windows XP’s built-in unzip utility will not work)

More questions? Visit the Learner Help Center.