Department of Mathematics
A semi-supervised approach to projected clustering with applications to microarray data
Recent studies have suggested that extremely low-dimensional projected clusters exist in real datasets. Here, we propose a new algorithm for identifying them. It combines object clustering with dimension selection, and allows domain knowledge to be supplied to guide the clustering process. Theoretical and experimental results show that even a small amount of input knowledge can help detect clusters with only 1% of the relevant dimensions. We also show that this semi-supervised algorithm can perform knowledge-guided selective clustering when there are multiple meaningful object groupings. The algorithm is also shown to be effective in analysing a microarray dataset.
Source Publication Title
International Journal of Data Mining and Bioinformatics
Yip, Kevin Y, Lin Cheung, David W Cheung, Liping Jing, and Michael K Ng. "A semi-supervised approach to projected clustering with applications to microarray data." International Journal of Data Mining and Bioinformatics 3.3 (2009): 229-259.