In this course, you'll learn how to manage big datasets, load them into clusters and cloud storage, and apply structure to the data so that you can query it with distributed SQL engines such as Apache Hive and Apache Impala. You'll also learn how to choose data types, storage systems, and file formats based on the tools you'll use and the performance you need.
Use different tools to browse existing databases and tables in big data systems
Use different tools to explore files in distributed big data filesystems and cloud storage
Create and manage big data databases and tables using Apache Hive and Apache Impala
Describe and choose among different data types and file formats for big data systems
- Data Management
- Distributed File Systems
- Cloud Storage
- Big Data
- SQL
At Cloudera, we believe that data can make what is impossible today, possible tomorrow. We empower people to transform complex data into clear and actionable insights. Cloudera delivers an enterprise data cloud for any data, anywhere, from the Edge to AI. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.
Orientation to Data in Clusters and Cloud Storage
Defining Databases, Tables, and Columns
Data Types and File Types
Managing Datasets in Clusters and Cloud Storage
Very good material and the labs using the VM are wonderful hands-on experience.
Absolutely amazing! Have learnt a lot from this course!
Very good course with lots of relevant skills and information learned. The hands-on assignment has some decent challenging parts to it too!
This is one of those systematic specializations that make a harder, otherwise overwhelming subject easy to navigate, follow, and learn.
About the Modern Big Data Analysis with SQL Specialization
This Specialization teaches the essential skills for working with large-scale data using SQL.
