DP-3011 Implementing a Data Analytics Solution with Azure Databricks
- 1 Day Course
- Language: English
Introduction:
This course explores how data professionals prepare and analyze data with Azure Databricks, using Apache Spark’s distributed computing capabilities. Students learn to manage data quality and versioning with Delta Lake, build automated pipelines with Delta Live Tables, and implement governance through Unity Catalog. The course also covers workflow orchestration for production deployments and collaborative development in Python and SQL notebooks, so that prepared data can be served at scale.
Course Outline:
1 – Explore Azure Databricks
- Get started with Azure Databricks
- Identify Azure Databricks workloads
- Understand key concepts
- Data governance using Unity Catalog and Microsoft Purview
- Module assessment
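Unity Catalog governs access through a three-level namespace (catalog.schema.table). As a flavor of what the governance topic covers, here is a minimal sketch in Databricks SQL; the catalog, schema, and group names are placeholders, not part of the course materials:

```sql
-- Hypothetical names: sales_analytics and data_analysts are placeholders.
CREATE CATALOG IF NOT EXISTS sales_analytics;
CREATE SCHEMA IF NOT EXISTS sales_analytics.curated;

-- Grant an account-level group read access through the namespace hierarchy.
GRANT USE CATALOG ON CATALOG sales_analytics TO `data_analysts`;
GRANT USE SCHEMA ON SCHEMA sales_analytics.curated TO `data_analysts`;
GRANT SELECT ON SCHEMA sales_analytics.curated TO `data_analysts`;
```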
2 – Perform data analysis with Azure Databricks
- Ingest data with Azure Databricks
- Data exploration tools in Azure Databricks
- Data analysis using DataFrame APIs
- Module assessment
3 – Use Apache Spark in Azure Databricks
- Get to know Spark
- Create a Spark cluster
- Use Spark in notebooks
- Use Spark to work with data files
- Visualize data
- Module assessment
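Working with data files in a notebook can be as direct as querying a file in place with SQL. A sketch with a placeholder path and columns; Parquet carries its own schema, so no table definition is required first:

```sql
-- Path and column names are placeholders for illustration only.
SELECT pickup_zone, count(*) AS trips
FROM parquet.`/mnt/raw/taxi/yellow_2024.parquet`
GROUP BY pickup_zone
ORDER BY trips DESC
LIMIT 10;
```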
4 – Manage data with Delta Lake
- Get started with Delta Lake
- Manage ACID transactions
- Implement schema enforcement
- Data versioning and time travel in Delta Lake
- Data integrity with Delta Lake
- Module assessment
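The Delta Lake topics above map to a few lines of SQL. A hedged sketch with an illustrative table name: a CHECK constraint for data integrity, and time travel to query an earlier version of the table:

```sql
-- Table name is illustrative; Delta is the default table format on Databricks.
CREATE TABLE IF NOT EXISTS sales.orders (order_id BIGINT, amount DOUBLE);

-- Data integrity: reject rows that violate the rule.
ALTER TABLE sales.orders ADD CONSTRAINT amount_positive CHECK (amount > 0);

-- Time travel: query a past version, and inspect the commit history.
SELECT * FROM sales.orders VERSION AS OF 3;
DESCRIBE HISTORY sales.orders;
```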
5 – Build data pipelines with Delta Live Tables
- Explore Delta Live Tables
- Data ingestion and integration
- Real-time processing
- Module assessment
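A Delta Live Tables pipeline declares tables and data-quality expectations rather than imperative steps. A minimal SQL sketch, with placeholder source path and table names; the expectation drops rows that fail the quality rule:

```sql
-- Ingest streaming JSON files with Auto Loader; path is a placeholder.
CREATE OR REFRESH STREAMING LIVE TABLE raw_orders
AS SELECT * FROM cloud_files('/mnt/landing/orders', 'json');

-- Downstream table with a data-quality expectation.
CREATE OR REFRESH LIVE TABLE clean_orders (
  CONSTRAINT valid_amount EXPECT (amount > 0) ON VIOLATION DROP ROW
)
AS SELECT order_id, CAST(amount AS DOUBLE) AS amount
FROM STREAM(LIVE.raw_orders);
```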
6 – Deploy workloads with Azure Databricks Workflows
- What are Azure Databricks Workflows?
- Understand key components of Azure Databricks Workflows
- Explore the benefits of Azure Databricks Workflows
- Deploy workloads using Azure Databricks Workflows
- Module assessment
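A Workflows deployment is a job definition: tasks, dependencies between them, and a schedule. A sketch of a Jobs API-style JSON payload; the job name, notebook paths, and schedule are all placeholders:

```json
{
  "name": "nightly-orders-pipeline",
  "tasks": [
    {
      "task_key": "ingest",
      "notebook_task": { "notebook_path": "/Repos/team/ingest" }
    },
    {
      "task_key": "transform",
      "depends_on": [ { "task_key": "ingest" } ],
      "notebook_task": { "notebook_path": "/Repos/team/transform" }
    }
  ],
  "schedule": {
    "quartz_cron_expression": "0 0 2 * * ?",
    "timezone_id": "UTC"
  }
}
```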
Enroll in this course: $941.76