Concurrent Deep Learning Workloads on NVIDIA GPUs

Deep learning GPU servers that execute latency-sensitive inference requests from clients often seek to run training tasks alongside inference when resources are idle, in order to improve overall system utilization. We empirically derive the thread block scheduler’s behavior under such concurrent workloads for NVIDIA’s Pascal, Volta, and Turing microarchitectures. In contrast to past studies that suggest the scheduler uses a round-robin policy to assign thread blocks to streaming multiprocessors (SMs), we find that the scheduler chooses the next SM based on the SM’s local resource availability. We show how this scheduling policy can lead to significant, and seemingly counter-intuitive, performance degradation; for example, a decrease of one thread per block resulted in a 3.58X increase in execution time for one kernel in our experiments. We then investigate the performance of current concurrency mechanisms on NVIDIA’s new Ampere microarchitecture under deep learning workloads. We demonstrate that fluctuating resource requirements and kernel runtimes make it difficult to execute such workloads on current NVIDIA hardware while maintaining consistently high utilization and low, predictable turnaround times. Moreover, we conclude that the lack of sufficiently flexible preemption policies, robust task prioritization mechanisms, and contention-aware thread block scheduling techniques limits the effectiveness of NVIDIA’s concurrency mechanisms. We estimate that block-level, contention-aware preemption can achieve 1.5X speedups in turnaround time with comparable utilization and improved predictability, as long as preemption overhead remains under 1-2 ms.
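
The contrast the abstract draws between round-robin assignment and availability-based assignment can be illustrated with a toy model. The sketch below is not the scheduler described in the thesis: it assumes a single resource (free threads per SM) and uses "pick the SM with the most free threads" as a stand-in for an availability-based rule. The names `SM`, `pick_round_robin`, `pick_most_available`, and `place_blocks` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class SM:
    """Toy streaming multiprocessor tracking one resource (assumption)."""
    free_threads: int


def pick_round_robin(sms, need, cursor):
    """Return the first SM at or after `cursor` (cyclically) that fits the block."""
    n = len(sms)
    for step in range(n):
        i = (cursor + step) % n
        if sms[i].free_threads >= need:
            return i
    return None  # no SM can host the block right now


def pick_most_available(sms, need):
    """Return the SM with the most free threads, if the block fits there.

    This rule is an assumed stand-in for an availability-based policy.
    """
    best = max(range(len(sms)), key=lambda i: sms[i].free_threads)
    return best if sms[best].free_threads >= need else None


def place_blocks(free_threads, block_sizes, policy):
    """Place blocks one by one and return the chosen SM index for each."""
    sms = [SM(f) for f in free_threads]
    cursor = 0
    placement = []
    for need in block_sizes:
        if policy == "round_robin":
            i = pick_round_robin(sms, need, cursor)
            if i is not None:
                cursor = (i + 1) % len(sms)  # advance past the chosen SM
        else:
            i = pick_most_available(sms, need)
        if i is not None:
            sms[i].free_threads -= need  # reserve the block's threads
        placement.append(i)
    return placement


if __name__ == "__main__":
    free = [512, 2048, 1024]  # SMs partly occupied by another workload
    blocks = [256] * 4
    print(place_blocks(free, blocks, "round_robin"))
    print(place_blocks(free, blocks, "most_available"))
```

In this toy setup, round-robin spreads the four blocks across all three SMs, while the availability-based rule sends every block to the least-loaded SM. Real hardware tracks several per-SM resources (registers, shared memory, block slots), so the model only shows why the two policies can produce different placements under a concurrent workload.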

Identifier
  • etd-21846
Year
  • 2021
Date created
  • 2021-05-04
Last modified
  • 2023-12-05

Permanent link to this page: https://digital.wpi.edu/show/pn89d976b