Document Type

Conference Paper

Department/Unit

Department of Computer Science

Title

Identifying recurrent and unknown performance issues

Language

English

Abstract

© 2014 IEEE. For a large-scale software system, especially an online service system, when a performance issue occurs, it is desirable to check whether this issue has occurred before. If there are past similar issues, a known remedy could be applied. Otherwise, a new troubleshooting process may have to be initiated. The symptom of a performance issue can be characterized by a set of metrics. Due to the sophisticated nature of software systems, manual diagnosis of performance issues based on metric data is typically expensive and laborious. In this paper, we propose a Hidden Markov Random Field (HMRF) based approach to automatic identification of recurrent and unknown performance issues. We formulate the problem of issue identification as an HMRF-based clustering problem. Our approach incorporates the learning of metric discretization thresholds and the optimization of issue clustering. Based on the learned thresholds and cluster centroids, we can achieve accurate identification of recurrent issues and unknown issues. Experimental evaluations on an open benchmark and a large-scale industrial production system show that our approach is effective and outperforms the related state-of-the-art approaches.

Keywords

automated diagnosis, duplication detection, issue identification, metrics, performance

Publication Date

2014

Source Publication Title

14th IEEE International Conference on Data Mining

Start Page

320

End Page

329

Conference Location

Shenzhen, China

Publisher

IEEE

DOI

10.1109/ICDM.2014.96

Link to Publisher's Edition

http://dx.doi.org/10.1109/ICDM.2014.96

ISSN (print)

1550-4786

ISBN (print)

9781479943036

