Evaluation of DVFS techniques on modern HPC processors and accelerators for energy-aware applications
Calore, Enrico; Gabbana, Alessandro; Schifano, Sebastiano Fabio; Tripiccione, Raffaele
2017
Abstract
Energy efficiency is becoming increasingly important for computing systems, in particular for large-scale High Performance Computing (HPC) facilities. In this work, we evaluate, from a user perspective, the use of Dynamic Voltage and Frequency Scaling (DVFS) techniques, assisted by the power and energy monitoring capabilities of modern processors, to tune applications for energy efficiency. We run selected kernels and a full HPC application on two high-end processors widely used in the HPC context, namely an NVIDIA K80 GPU and an Intel Haswell CPU. We evaluate the available trade-offs between energy-to-solution and time-to-solution, attempting function-by-function frequency tuning. Finally, we estimate the benefits obtainable by running the full code on an HPC multi-GPU node, relative to the default clock frequency governors. We instrument our code to accurately monitor power consumption and execution time without the need for any additional hardware, and we enable it to change CPU and GPU clock frequencies while running. We analyze our results on the different architectures using a simple energy-performance model and derive a number of energy-saving strategies that can be easily adopted on recent high-end HPC systems for generic applications.

File | Access | Description | Type | Size | Format |
---|---|---|---|---|---|
1703.02788.pdf | Open access | arXiv preprint | Pre-print | 803.29 kB | Adobe PDF |
Documents in SFERA are protected by copyright and all rights are reserved, unless otherwise indicated.