Data Engineering on Google Cloud Platform Training
According to Google Cloud, data-driven companies are 23 times more likely to acquire customers and 19 times more likely to be profitable. But building the infrastructure for data success requires both the right skills and the right cloud platform.
Data Engineering on Google Cloud Platform provides practical training in building scalable data pipelines, managing batch and streaming data, and applying machine learning to large datasets. Through hands-on labs, you’ll work directly with Google Cloud Platform (GCP) tools such as BigQuery, Dataflow, Cloud Composer, and Kubeflow to design solutions that drive business intelligence, improve agility, and support real-time decision-making.
This training helps professionals grow their cloud skills and prepare for the Google Cloud Professional Data Engineer certification. It’s ideal for anyone pursuing a data engineer role, especially those who work with cloud data, big data, or real-time data processing.
- Design and implement scalable data pipelines on Google Cloud
- Analyze massive datasets using BigQuery, SQL, and machine learning
- Build both batch and streaming data pipelines with Dataflow and Pub/Sub
- Use Dataproc and Spark to manage big data workloads efficiently
- Automate and deploy AI workflows using BigQuery ML, AutoML, and Kubeflow
- Basic proficiency with a common query language such as SQL.
- Experience with data modeling and ETL (extract, transform, load) activities.
- Experience with developing applications using a common programming language such as Python.
- Familiarity with machine learning and/or statistics.
- Define the role of a data engineer on GCP
- Explore challenges in data processing and pipeline development
- Get started with BigQuery and its capabilities
- Compare data lake and data warehouse models
- Hands-on lab: Analyze data using BigQuery
- Understand architecture for data lakes on Google Cloud
- Store structured and unstructured data in Cloud Storage
- Optimize storage costs with tiered storage classes and Cloud Functions
- Secure and manage data access
- Hands-on lab: Load taxi data into Cloud SQL
- Learn modern data warehouse architecture
- Perform advanced queries in BigQuery
- Use schemas, arrays, and nested fields
- Optimize partitioning and performance
- Hands-on lab: Work with JSON and BigQuery
- Compare ETL, ELT, and EL processes
- Improve data quality with built-in tools
- Execute batch data operations in BigQuery
- Demo: Improve pipeline quality using ELT
- Explore Hadoop vs. Dataproc
- Migrate from HDFS to Cloud Storage (GCS)
- Tune Spark clusters for performance
- Run big data jobs using Apache Spark
- Hands-on lab: Spark processing on Cloud Dataproc
- Build Dataflow pipelines for batch and streaming
- Use templates, side inputs, and autoscaling
- Hands-on lab: Build and run Dataflow pipelines
- Create visual pipelines in Data Fusion
- Use Cloud Composer and Apache Airflow
- Schedule and monitor DAGs
- Hands-on lab: Orchestrate a data pipeline
- Understand streaming vs. batch processing
- Identify tools and use cases for real-time analytics
- Use Cloud Pub/Sub for streaming messaging
- Understand Pub/Sub architecture and security controls
- Hands-on lab: Stream data to Pub/Sub
- Expand pipelines to support streaming use cases
- Monitor and troubleshoot live streams
- Hands-on lab: Real-time data processing
- Ingest live data into BigQuery
- Analyze patterns using dashboards
- Leverage Cloud Bigtable for fast I/O
- Hands-on lab: Build a streaming pipeline
- Use advanced SQL and GIS features
- Tune complex queries for efficiency
- Optional: Partition tables by date
- Understand AI in analytics workflows
- Compare machine learning tools in GCP
- Prepare data for model development
- Use the Natural Language API and the Vision API
- Hands-on lab: Analyze unstructured text
- Use Jupyter notebooks in Google Cloud
- Analyze BigQuery data with pandas
- Visualize results with Python
- Hands-on lab: Build reports in notebooks
- Build scalable ML workflows
- Use pipeline templates from AI Hub
- Hands-on lab: Train and monitor models
- Train models using SQL with BigQuery ML
- Compare regression and classification types
- Demo: Predict taxi fares using BigQuery ML
- Create models using AutoML Tables, AutoML Vision, and AutoML Natural Language
- Evaluate model performance with minimal coding