Document Type

Conference Paper

Department/Unit

Department of Computer Science

Title

Learning hidden Markov model topology based on KL divergence for information extraction

Language

English

Abstract

Information extraction systems based on rule-based pattern matching have long been used to locate information embedded in documents. To improve extraction generalization, hidden Markov models (HMMs) have recently been adopted for modeling temporal variations of the target patterns, with promising results. In this paper, a state-merging method is adopted for learning the HMM topology using a localized Kullback-Leibler (KL) divergence. The proposed system has been applied to a set of domain-specific job advertisements, and preliminary experiments show promising results.

Keywords

Hidden Markov Model, Information Extraction, Kullback-Leibler, Target Pattern, Information Extraction System

Publication Date

5-2004

Source Publication Title

Advances in Knowledge Discovery and Data Mining: 8th Pacific-Asia Conference, PAKDD 2004, Sydney, Australia, May 26-28, 2004, Proceedings

Editors

Dai, Honghua ; Srikant, Ramakrishnan ; Zhang, Chengqi

Start Page

590

End Page

594

Series Title

Lecture Notes in Computer Science, 3056; Lecture Notes in Artificial Intelligence

Conference Location

Sydney, Australia

Publisher

Springer

Peer Reviewed

Yes

Copyright

© Springer-Verlag Berlin Heidelberg 2004

DOI

10.1007/978-3-540-24775-3_70

ISBN (print)

9783540220640

ISBN (electronic)

9783540247753
