Contextual document clustering

Vladimir Dobrynin
David Patterson
Niall Rooney

In this paper we present a novel algorithm for document clustering. The approach is based on distributional clustering: subject-related words that have a narrow context are identified and used as metatags for that subject. These contextual words form the basis for creating thematic clusters of documents. As in other research on document clustering, we evaluate the quality of the approach on document categorization problems and show that it outperforms the information-theoretic sequential information bottleneck method.
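The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that a word's "context" is the pooled word distribution of the documents containing it, that "narrow" means low entropy of that distribution, and that documents are assigned to the closest selected context by Jensen-Shannon divergence. All function names and the `min_df` parameter are hypothetical.

```python
import math
from collections import Counter


def word_contexts(docs):
    """Map each word to the pooled word distribution of the documents containing it."""
    counts = {}
    for doc in docs:
        tf = Counter(doc)
        for word in tf:
            counts.setdefault(word, Counter()).update(tf)
    contexts = {}
    for word, c in counts.items():
        total = sum(c.values())
        contexts[word] = {w: n / total for w, n in c.items()}
    return contexts


def entropy(dist):
    """Shannon entropy in bits; used here as the (assumed) measure of context narrowness."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two sparse distributions."""
    keys = set(p) | set(q)
    m = {w: 0.5 * (p.get(w, 0.0) + q.get(w, 0.0)) for w in keys}

    def kl(a):
        return sum(a[w] * math.log2(a[w] / m[w]) for w in a if a[w] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def cluster(docs, k, min_df=2):
    """Pick the k narrowest-context words, then label each non-empty document
    with the selected word whose context is closest to the document."""
    df = Counter(w for doc in docs for w in set(doc))
    contexts = word_contexts(docs)
    candidates = [w for w in contexts if df[w] >= min_df]
    # Ties are broken alphabetically so the result is deterministic.
    candidates.sort(key=lambda w: (entropy(contexts[w]), w))
    seeds = candidates[:k]

    labels = []
    for doc in docs:
        tf = Counter(doc)
        total = sum(tf.values())
        p = {w: n / total for w, n in tf.items()}
        labels.append(min(seeds, key=lambda s: (js_divergence(p, contexts[s]), s)))
    return seeds, labels


docs = [["cat", "dog"], ["cat", "dog"], ["stock", "bond"], ["stock", "bond"]]
seeds, labels = cluster(docs, k=2)
```

On this toy corpus the two selected words act as cluster labels, and each tokenized document is attached to the word whose context it most resembles. A realistic setup would also filter very rare and very frequent words before selecting contexts.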

Original language	English
Pages (from-to)	167-180
Number of pages	14
Journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	2997
State	Published - 1 Dec 2004

Scopus subject areas

Theoretical Computer Science
Computer Science (all)

ID: 36369783