Data Engineering on Microsoft Azure (DP-203/DP-203T00)
In this course, the student will learn how to implement and manage data engineering workloads on Microsoft Azure, using services such as Azure Synapse Analytics, Azure Data Lake Storage Gen2, Azure Stream Analytics, and Azure Databricks. The course focuses on common data engineering tasks such as orchestrating data transfer and transformation pipelines, working with data files in a data lake, creating and loading relational data warehouses, capturing and aggregating streams of real-time data, and tracking data assets and lineage.
After completing this course, you'll be able to design, build, and manage end-to-end data engineering solutions on Microsoft Azure. You'll gain practical experience in orchestrating data pipelines, transforming data for analytics, and applying best practices in security and monitoring. Specifically, you'll learn how to:
- Implement Azure data storage solutions, including Data Lake and Synapse
- Build and manage data pipelines for batch and streaming workloads
- Transform and analyze data using Spark and Databricks
- Secure data environments with Azure authentication and encryption
- Monitor, troubleshoot, and optimize data solutions for performance
This course is ideal for data engineers, data architects, business intelligence professionals, and anyone responsible for designing or implementing data solutions in the Microsoft Azure ecosystem. Data analysts and data scientists seeking hands-on experience with Azure tools will also benefit.
Prerequisites:
- Understanding of cloud computing and core data concepts
- Experience working with data solutions
- Recommended: AZ-900T00 (Microsoft Azure Fundamentals) and/or DP-900T00 (Microsoft Azure Data Fundamentals)
Course outline:
- Introduction to data engineering on Azure
  - Data engineering concepts and practices
  - Microsoft Azure for data engineering
- Azure Data Lake Storage Gen2
  - Features and setup
  - Comparison with Blob Storage
  - Use in analytics and big data processing
- Azure Synapse Analytics
  - Overview, use cases, and architecture
  - Serverless SQL pools for querying and transforming data
  - Lake databases and external objects
  - Pipelines and Spark notebooks in Synapse
- Managing and securing Synapse
  - Authentication, users, and permissions
  - Compute scaling, workload management, and monitoring
  - Data security: Conditional Access, encryption, row- and column-level security, dynamic data masking
- Relational data warehouses
  - Schema design and table creation
  - Loading staging, dimension, and fact tables
  - Query and optimization strategies
- Real-time and hybrid analytics
  - Azure Synapse Link for Cosmos DB and SQL
  - Azure Stream Analytics ingestion, event processing, and windowing
  - Real-time visualization with Power BI
- Azure Databricks
  - Overview, workloads, and governance (Unity Catalog, Microsoft Purview)
  - Data ingestion, exploration, and analysis with Spark and DataFrames
  - Visualization with Spark notebooks
  - Delta Lake: ACID transactions, schema enforcement, versioning
  - Delta Live Tables for real-time data pipelines
  - Databricks Workflows for deploying workloads
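To give a flavor of the warehouse-loading topic in the outline (loading staging, dimension, and fact tables), here is a minimal plain-Python sketch of a dimension load with surrogate keys. In the course this would be done with T-SQL in a dedicated SQL pool; the function and column names below are hypothetical illustrations, not course material.

```python
# Conceptual sketch of a dimension load: assign surrogate keys to
# incoming staging rows, reusing the key for business keys already
# present (a type 1 slowly changing dimension: attributes overwrite).
# All names and data here are hypothetical.

def load_dimension(dim, staging_rows, key_attr="customer_id"):
    """dim maps business key -> (surrogate_key, row); mutates and returns dim."""
    next_sk = max((sk for sk, _ in dim.values()), default=0) + 1
    for row in staging_rows:
        bk = row[key_attr]
        if bk not in dim:          # new member: assign a fresh surrogate key
            dim[bk] = (next_sk, row)
            next_sk += 1
        else:                      # existing member: keep key, overwrite attributes
            sk, _ = dim[bk]
            dim[bk] = (sk, row)
    return dim
```

For example, loading `{"customer_id": "C1", "city": "Oslo"}` and later `{"customer_id": "C1", "city": "Bergen"}` keeps C1's surrogate key stable while updating the city, so fact rows referencing that key stay valid.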
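The Stream Analytics windowing topic can be illustrated with a tumbling window, the simplest window type: fixed-size, non-overlapping intervals. The sketch below models the idea in plain Python over (timestamp, value) pairs; in the course itself windowing is expressed in Stream Analytics SQL, and the names here are hypothetical.

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Count events per non-overlapping window of `window_seconds`.

    Each event is a (timestamp_in_seconds, value) pair; every event
    falls into exactly one window, keyed by the window's start time.
    """
    counts = defaultdict(int)
    for ts, _value in events:
        window_start = (ts // window_seconds) * window_seconds
        counts[window_start] += 1
    return dict(counts)

# Events at t=1, 4, 12, 14, 29 with a 10-second tumbling window:
# windows [0,10), [10,20), [20,30) receive 2, 2, and 1 events.
```

The defining property shown here is that windows never overlap, so each event is counted exactly once; hopping and sliding windows relax that.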
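Two of the Delta Lake ideas listed in the outline, schema enforcement and versioning (time travel), can be sketched with a toy in-memory table. This is not the Delta Lake API, just an illustration of the behavior under assumed, simplified rules (exact column-name match; every append produces a new readable version).

```python
class ToyDeltaTable:
    """Toy model of a Delta-style table: enforced schema, versioned snapshots."""

    def __init__(self, schema):
        self.schema = set(schema)   # expected column names
        self.versions = [[]]        # versions[v] is the full snapshot at version v

    def append(self, rows):
        # Schema enforcement: reject any row whose columns don't match.
        for row in rows:
            if set(row) != self.schema:
                raise ValueError(f"schema mismatch: {sorted(row)}")
        self.versions.append(self.versions[-1] + list(rows))

    def read(self, version=None):
        # Time travel: read the latest snapshot, or any earlier version.
        return self.versions[-1 if version is None else version]
```

With a table of columns `id` and `amount`, two appends yield versions 1 and 2; `read(1)` still returns the one-row snapshot after `read()` shows two rows, and appending `{"id": 3}` (missing `amount`) is rejected.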