Document Type

Journal Article

Department/Unit

Department of Computer Science

Title

Speeding up k-means algorithm by GPUs

Language

English

Abstract

Cluster analysis plays a critical role in a wide variety of applications, but it now faces a computational challenge due to continuously increasing data volumes. Parallel computing is one of the most promising ways to overcome this challenge. In this paper, we aim to parallelize k-Means, one of the most popular clustering algorithms, using widely available Graphics Processing Units (GPUs). In contrast to existing GPU-based k-Means algorithms, we observe that data dimensionality is an important factor that should be taken into account when parallelizing k-Means on GPUs. In particular, we use two different strategies, one for low-dimensional data sets and one for high-dimensional data sets, in order to make the best use of the power of GPUs. For low-dimensional data sets, we exploit GPU on-chip registers to significantly decrease data access latency. For high-dimensional data sets, we design a novel algorithm that simulates matrix multiplication and exploits both GPU on-chip registers and on-chip shared memory to achieve a high compute-to-memory-access ratio. As a result, our GPU-based k-Means algorithm is three to eight times faster than the best reported GPU-based algorithm. © 2010 IEEE.
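
As a rough illustration of why the k-Means assignment step resembles matrix multiplication, the sketch below uses the identity ||x - c||^2 = ||x||^2 - 2(x . c) + ||c||^2, so that finding the nearest centroid reduces to computing dot products between points and centroids. This is a plain CPU sketch in Python, not the authors' GPU algorithm; the function names `assign_clusters` and `update_centroids` are hypothetical and chosen only for this example.

```python
# Illustrative CPU sketch only; names are hypothetical, not from the paper.

def assign_clusters(points, centroids):
    """Return, for each point, the index of its nearest centroid.

    Uses ||x - c||^2 = ||x||^2 - 2 * (x . c) + ||c||^2. The ||x||^2 term is
    the same for every centroid, so it is dropped from the comparison.
    """
    # Squared norms of the centroids are computed once and reused.
    centroid_sq = [sum(v * v for v in c) for c in centroids]
    labels = []
    for x in points:
        best_index, best_score = 0, float("inf")
        for j, c in enumerate(centroids):
            # This dot product is one entry of the points-by-centroids product.
            dot = sum(a * b for a, b in zip(x, c))
            score = centroid_sq[j] - 2.0 * dot
            if score < best_score:
                best_index, best_score = j, score
        labels.append(best_index)
    return labels


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of its assigned points."""
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for x, j in zip(points, labels):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += x[d]
    # Keep the old centroid if a cluster received no points.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(k)
    ]
```

Alternating these two functions until the labels stop changing gives the standard k-Means iteration. On a GPU, the dot products in the inner loop are the part that would map naturally onto a tiled, matrix-multiplication-style kernel.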

Keywords

Cluster, CUDA, GPGPU, K-means

Publication Date

2013

Source Publication Title

Journal of Computer and System Sciences

Volume

79

Issue

2

Start Page

216

End Page

229

Publisher

Elsevier

DOI

10.1016/j.jcss.2012.05.004

Link to Publisher's Edition

http://dx.doi.org/10.1016/j.jcss.2012.05.004

ISSN (print)

0022-0000

This document is currently not available here.
