Accelerating Data Engineering Pipelines
Guaranteed to Run
Price
$500.00
Duration
1 Day
Delivery Methods
Virtual Instructor Led Private Group
Delivery
Virtual
EST
Course Description
In this workshop, we’ll explore how GPUs can accelerate data pipelines and how advanced data engineering tools and techniques can deliver significant performance gains. Faster pipelines produce fresher dashboards and machine learning (ML) models, so users can have the most current information at their fingertips.
Course Objectives
- How data moves within a computer, and how to strike the right balance between CPU, DRAM, disk storage, and GPUs.
- How different file formats can be read and manipulated by hardware.
- How to scale an ETL pipeline with multiple GPUs using NVTabular.
- How to build an interactive Plotly dashboard where users can filter on millions of data points in less than a second.
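As a small illustration of the file-format objective above, the sketch below contrasts a row-oriented text layout (CSV) with a column-oriented binary layout. It uses only the Python standard library and runs on the CPU. The sample data and function names are hypothetical and are not taken from the course materials.

```python
import csv
import io
from array import array

# Hypothetical sample data: three rows with an integer id and a float value.
ROWS = [(1, 10.5), (2, 20.0), (3, 30.25)]


def to_csv_text(rows):
    """Row-oriented text: each record is stored together as one line."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["id", "value"])
    writer.writerows(rows)
    return buf.getvalue()


def sum_value_from_csv(text):
    """Summing one column still means parsing every field of every row."""
    reader = csv.DictReader(io.StringIO(text))
    return sum(float(record["value"]) for record in reader)


def to_columns(rows):
    """Column-oriented binary: each column is one contiguous typed buffer."""
    ids = array("q", (r[0] for r in rows))     # signed 64-bit integers
    values = array("d", (r[1] for r in rows))  # 64-bit floats
    return {"id": ids, "value": values}


def sum_value_from_columns(columns):
    """Only the 'value' buffer is read, and no text parsing is needed."""
    return sum(columns["value"])


csv_text = to_csv_text(ROWS)
columns = to_columns(ROWS)
csv_total = sum_value_from_csv(csv_text)
col_total = sum_value_from_columns(columns)
```

Both totals are the same, but the columnar version reads a single fixed-width buffer, which is the kind of layout that hardware can scan in bulk.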
Who Should Attend?
Experienced Python Developers
Course Prerequisites
- Intermediate knowledge of Python (list comprehension, objects)
- Familiarity with pandas is a plus
- Introductory statistics (mean, median, mode)
Course Content
Module 1: Course Introduction
Overview of course goals and structure
Module 2: Data on the Hardware Level
Understanding how data is stored and processed at the hardware layer
Module 3: ETL with NVTabular
Applying GPU-accelerated ETL (Extract, Transform, Load) workflows using NVTabular
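The extract, transform, load pattern named above can be sketched in plain Python. This minimal sketch does not use NVTabular or a GPU. It only shows the shape of two common tabular transforms, encoding categories as integer ids and standardizing a numeric column. The records, field names, and function names are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical raw records; the field names are illustrative only.
RAW = [
    {"city": "Austin", "fare": 10.0},
    {"city": "Boston", "fare": 20.0},
    {"city": "Austin", "fare": 30.0},
]


def extract(records):
    """Extract: pull each field into its own column."""
    return {
        "city": [r["city"] for r in records],
        "fare": [r["fare"] for r in records],
    }


def categorify(column):
    """Transform: map each distinct value to an integer id, in first-seen order."""
    mapping = {}
    ids = [mapping.setdefault(value, len(mapping)) for value in column]
    return ids, mapping


def normalize(column):
    """Transform: rescale to zero mean and unit (population) standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no spread, so every value maps to 0.0.
        return [0.0 for _ in column]
    return [(x - mu) / sigma for x in column]


def load(columns):
    """Load: hand the processed columns on; here they are simply returned."""
    return dict(columns)


cols = extract(RAW)
city_ids, city_mapping = categorify(cols["city"])
result = load({"city": city_ids, "fare": normalize(cols["fare"])})
```

A GPU-accelerated pipeline applies the same kinds of steps to whole columns at once instead of looping over values in Python.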
Module 4: Data Visualization
Techniques and tools for visualizing large datasets efficiently
Module 5: Final Project – Data Detective
Hands-on project applying concepts to analyze and visualize data