The ability to perform exploratory search and retrieve relevant documents from a large collection of domain-specific documents is an important requirement both in medicine and in other fields. In this paper, we present an unsupervised distributional clustering technique called SOPHIA. SOPHIA provides a semantically meaningful visual clustering of the document corpus in conjunction with an intuitive interactive search facility. We assess the effectiveness of SOPHIA's cluster-based information retrieval on OHSUMED, a MEDLINE test collection.

Original language: English
Pages (from-to): 256-265
Number of pages: 10
Journal: IEEE Transactions on Information Technology in Biomedicine
Issue number: 2
State: Published - 1 Jun 2005

    Research areas

  • Clustering, Information retrieval, MEDLINE

    Scopus subject areas

  • Biotechnology
  • Computer Science Applications
  • Electrical and Electronic Engineering

ID: 36369275