http://dx.doi.org/10.1007/978-3-662-44917-2_13
 

Document Type

Conference Paper

Department/Unit

Department of Computer Science

Title

Benchmarking the memory hierarchy of modern GPUs

Language

English

Abstract

Memory access efficiency is a key factor for fully exploiting the computational power of Graphics Processing Units (GPUs). However, many details of the GPU memory hierarchy are not released by the vendors. We propose a novel fine-grained benchmarking approach and apply it to two popular GPUs, namely Fermi and Kepler, to expose the previously unknown characteristics of their memory hierarchies. Specifically, we investigate the structures of different cache systems, such as the data cache, the texture cache, and the translation lookaside buffer (TLB). We also investigate the impact of bank conflicts on shared memory access latency. Our benchmarking results offer a better understanding of the mysterious GPU memory hierarchy, which can help with software optimization and the modelling of GPU architectures. Our source code and experimental results are publicly available. © 2014 IFIP International Federation for Information Processing.

Publication Date

2014

Source Publication Title

Network and Parallel Computing: 11th IFIP WG 10.3 International Conference, NPC 2014, Ilan, Taiwan, September 18-20, 2014. Proceedings

Start Page

144

End Page

156

Conference Location

Ilan, Taiwan

Publisher

Springer Berlin Heidelberg

ISBN (print)

9783662449165

ISBN (electronic)

9783662449172
