Document Type
Conference Paper
Department/Unit
Department of Computer Science
Title
Mining local data sources for learning global cluster models
Language
English
Abstract
Distributed data mining is becoming increasingly important, as there are many cases where physical sharing of data is prohibited, e.g., due to huge data volume or data privacy. In this paper, we are interested in learning a global cluster model by exploring data in distributed sources. A methodology based on periodic model exchange and merging is proposed and applied to the analysis of hyperlinked Web pages. In addition, we have tested a number of variations of the basic idea, including putting more emphasis on the privacy concern and testing the effect of having different numbers of distributed sources. Experimental results show that the proposed distributed learning scheme is effective, with accuracy close to that of the case where all the data are physically shared for learning.
Keywords
Data mining, Machine learning, Data analysis, Data privacy, Web pages, Testing, Frequency, Computer science, Machine learning algorithms, Training data
Publication Date
9-2004
Source Publication Title
Proceedings / IEEE/WIC/ACM International Conference on Web Intelligence, 2004, WI 2004 20 - 24 Sept. 2004, [Beijing, China]
Editors
Zhong, Ning ; Tirri, Henry ; Yao, Yiyu ; Zhou, Lizhu ; Liu, Jiming ; Cercone, Nick
Conference Location
Beijing, China
Publisher
IEEE
Peer Reviewed
Yes
Copyright
Copyright © 2004 by The Institute of Electrical and Electronics Engineers, Inc.
Funder
This research work is partially supported by Hong Kong Baptist University under FRG/03-04/II-20.
DOI
10.1109/WI.2004.10044
ISBN (print)
9780769521008
Recommended Citation
Lam, Chak-Man, Xiao-Feng Zhang, and William K. Cheung. "Mining local data sources for learning global cluster models." Proceedings / IEEE/WIC/ACM International Conference on Web Intelligence, 2004, WI 2004 20 - 24 Sept. 2004, [Beijing, China] (2004).