Learn to build data pipelines on the Databricks Lakehouse Platform — from architecture concepts to hands-on Spark and Delta Lake. This beginner course starts with why the lakehouse pattern replaced separate data warehouses and data lakes, then moves directly into the Databricks workspace where you'll configure compute, write PySpark and SQL queries, and manage data with Unity Catalog's three-level namespace.

Databricks Lakehouse Fundamentals

This course is part of the Enterprise AI and Data Engineering with Databricks Specialization.

Instructor: Noah Gift
What you'll learn
Write PySpark and Spark SQL queries that exploit lazy evaluation, the Catalyst optimizer, and broadcast join hints
Schedule end-to-end data pipelines as multi-task Databricks Jobs with dashboards and alerting
Build and query Delta Lake tables with ACID transactions, schema enforcement, time travel, and MERGE-based incremental ETL
Details to know

4 assignments
March 2026

Build your subject-matter expertise
- Learn new concepts from industry experts
- Gain a foundational understanding of a subject or tool
- Develop job-relevant skills with hands-on projects
- Earn a shareable career certificate

There are 4 modules in this course
This module introduces the lakehouse paradigm and the Databricks platform. You'll learn about the structure of lakehouse architecture, explore the Databricks workspace and its core tools, and understand how compute and storage work together.
What's included
6 videos, 7 readings, 1 assignment
This module covers notebooks and hands-on data manipulation using PySpark. You'll create and organize notebooks, load data from the Catalog, and write PySpark transformations to select, filter, aggregate, and join datasets.
What's included
6 videos, 4 readings, 1 assignment
This module introduces Delta Lake, where you'll create Delta tables, perform transactional operations like updates, deletes, and merges, use time travel to query previous versions, and see how Delta Lake connects to governance and automation features.
What's included
6 videos, 4 readings, 1 assignment
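The transactional operations and time travel described in the module above can be sketched in Spark SQL. The table names `customers` and `customer_updates` are hypothetical; on Databricks, `CREATE TABLE` produces a Delta table by default:

```sql
-- Delta is the default table format on Databricks
CREATE TABLE customers (id INT, email STRING, updated_at TIMESTAMP);

-- Upsert incoming changes in one ACID transaction:
-- update matching rows, insert new ones
MERGE INTO customers AS t
USING customer_updates AS s
  ON t.id = s.id
WHEN MATCHED THEN
  UPDATE SET t.email = s.email, t.updated_at = s.updated_at
WHEN NOT MATCHED THEN
  INSERT (id, email, updated_at) VALUES (s.id, s.email, s.updated_at);

-- Time travel: query an earlier version of the table
SELECT * FROM customers VERSION AS OF 1;

-- Inspect the transaction log that makes time travel possible
DESCRIBE HISTORY customers;
```

Every write appends a new version to the Delta transaction log, which is what `VERSION AS OF` and `DESCRIBE HISTORY` read from.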
Build an end-to-end lakehouse data pipeline integrating every concept from the course. Starting from raw data files, you will construct a complete medallion architecture (bronze → silver → gold) with Delta Lake, implement incremental MERGE logic, and orchestrate the pipeline as a scheduled Databricks Job. Six hands-on lab notebooks guide you through the project using the course GitHub repository.
What's included
1 reading, 1 assignment
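The capstone's orchestration step — running bronze, silver, and gold notebooks in order on a schedule — can be expressed as a multi-task Job definition in the style of the Databricks Jobs API 2.1. The job name, notebook paths, and cron expression below are illustrative assumptions, not the course's actual configuration:

```json
{
  "name": "medallion-pipeline",
  "schedule": {
    "quartz_cron_expression": "0 0 2 * * ?",
    "timezone_id": "UTC"
  },
  "tasks": [
    {
      "task_key": "bronze_ingest",
      "notebook_task": { "notebook_path": "/Repos/course/bronze" }
    },
    {
      "task_key": "silver_clean",
      "depends_on": [ { "task_key": "bronze_ingest" } ],
      "notebook_task": { "notebook_path": "/Repos/course/silver" }
    },
    {
      "task_key": "gold_aggregate",
      "depends_on": [ { "task_key": "silver_clean" } ],
      "notebook_task": { "notebook_path": "/Repos/course/gold" }
    }
  ]
}
```

The `depends_on` entries encode the bronze → silver → gold dependency graph, so each layer runs only after the previous one succeeds.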
Earn a career certificate
Add this credential to your LinkedIn profile, resume, or CV. Share it on social media and in your performance review.