Fundamentals of Accelerated Computing with CUDA Python

Guaranteed to Run

Price

$500.00

Duration

1 Day

Delivery Methods

Virtual Instructor-Led, Private Group

Delivery

Virtual

EST

Course Description

This workshop teaches you the fundamental tools and techniques for running GPU-accelerated Python applications using CUDA® GPUs and the Numba compiler. You’ll work through dozens of hands-on coding exercises and, at the end of the training, implement a new workflow to accelerate a fully functional linear algebra program originally designed for CPUs, observing impressive performance gains. After the workshop ends, you’ll have additional resources to help you create new GPU-accelerated applications on your own.

Course Objectives

GPU-accelerate NumPy ufuncs with a few lines of code.
Configure code parallelization using the CUDA thread hierarchy.
Write custom CUDA device kernels for maximum performance and flexibility.
Use memory coalescing and on-device shared memory to increase CUDA kernel bandwidth.

Who Should Attend?

Developers who use Python

Course Prerequisites

Basic Python competency, including familiarity with variable types, loops, conditional statements, functions, and array manipulations
NumPy competency, including the use of ndarrays and ufuncs
No previous knowledge of CUDA programming is required

Course Content

Module 1: Course Introduction

Module 2: Introduction to CUDA Python with Numba

Module 3: Custom CUDA Kernels in Python with Numba

Module 4: Multidimensional Grids and Shared Memory

Module 5: Final Review
