Standard

SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning. / Patterson, David; Rooney, Niall; Galushka, Mykola; Dobrynin, Vladimir; Smirnova, Elena.

In: Knowledge-Based Systems, Vol. 21, No. 5, 01.07.2008, p. 404-414.

Research output: Contribution to journal › Article › peer-review

Harvard

Patterson, D, Rooney, N, Galushka, M, Dobrynin, V & Smirnova, E 2008, 'SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning', Knowledge-Based Systems, vol. 21, no. 5, pp. 404-414. https://doi.org/10.1016/j.knosys.2008.02.006

APA

Patterson, D., Rooney, N., Galushka, M., Dobrynin, V., & Smirnova, E. (2008). SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning. Knowledge-Based Systems, 21(5), 404-414. https://doi.org/10.1016/j.knosys.2008.02.006

Vancouver

Patterson D, Rooney N, Galushka M, Dobrynin V, Smirnova E. SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning. Knowledge-Based Systems. 2008 Jul 1;21(5):404-414. https://doi.org/10.1016/j.knosys.2008.02.006

Author

Patterson, David; Rooney, Niall; Galushka, Mykola; Dobrynin, Vladimir; Smirnova, Elena. / SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning. In: Knowledge-Based Systems. 2008; Vol. 21, No. 5. pp. 404-414.

BibTeX

@article{92637304da7c4838b94c48df63a721ff,
title = "SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning",
abstract = "In this paper, we present a novel textual case-based reasoning system called SOPHIA-TCBR which provides a means of clustering semantically related textual cases where individual clusters are formed through the discovery of narrow themes which then act as attractors for related cases. During this process, SOPHIA-TCBR automatically discovers appropriate case and similarity knowledge. It then is able to organize the cases within each cluster by forming a minimum spanning tree, based on their semantic similarity. SOPHIA's capability as a case-based text classifier is benchmarked against the well known and widely utilised k-Means approach. Results show that SOPHIA either equals or outperforms k-Means based on 2 different case-bases, and as such is an attractive approach for case-based classification. We demonstrate the quality of the knowledge discovery process by showing the high level of topic similarity between adjacent cases within the minimum spanning tree. We show that the formation of the minimum spanning tree makes it possible to identify a kernel region within the cluster, which has a higher level of similarity between cases than the cluster in its entirety, and that this corresponds directly to a higher level of topic homogeneity. We demonstrate that the topic homogeneity increases as the average semantic similarity between cases in the kernel increases. Finally having empirically demonstrated the quality of the knowledge discovery process in SOPHIA, we show how it can be competently applied to case-based retrieval.",
keywords = "Clustering, Knowledge discovery, Textual case-based reasoning",
author = "David Patterson and Niall Rooney and Mykola Galushka and Vladimir Dobrynin and Elena Smirnova",
year = "2008",
month = jul,
day = "1",
doi = "10.1016/j.knosys.2008.02.006",
language = "English",
volume = "21",
pages = "404--414",
journal = "Knowledge-Based Systems",
issn = "0950-7051",
publisher = "Elsevier",
number = "5",
}

RIS

TY - JOUR

T1 - SOPHIA-TCBR

T2 - A knowledge discovery framework for textual case-based reasoning

AU - Patterson, David

AU - Rooney, Niall

AU - Galushka, Mykola

AU - Dobrynin, Vladimir

AU - Smirnova, Elena

PY - 2008/7/1

Y1 - 2008/7/1

N2 - In this paper, we present a novel textual case-based reasoning system called SOPHIA-TCBR which provides a means of clustering semantically related textual cases where individual clusters are formed through the discovery of narrow themes which then act as attractors for related cases. During this process, SOPHIA-TCBR automatically discovers appropriate case and similarity knowledge. It then is able to organize the cases within each cluster by forming a minimum spanning tree, based on their semantic similarity. SOPHIA's capability as a case-based text classifier is benchmarked against the well known and widely utilised k-Means approach. Results show that SOPHIA either equals or outperforms k-Means based on 2 different case-bases, and as such is an attractive approach for case-based classification. We demonstrate the quality of the knowledge discovery process by showing the high level of topic similarity between adjacent cases within the minimum spanning tree. We show that the formation of the minimum spanning tree makes it possible to identify a kernel region within the cluster, which has a higher level of similarity between cases than the cluster in its entirety, and that this corresponds directly to a higher level of topic homogeneity. We demonstrate that the topic homogeneity increases as the average semantic similarity between cases in the kernel increases. Finally having empirically demonstrated the quality of the knowledge discovery process in SOPHIA, we show how it can be competently applied to case-based retrieval.

AB - In this paper, we present a novel textual case-based reasoning system called SOPHIA-TCBR which provides a means of clustering semantically related textual cases where individual clusters are formed through the discovery of narrow themes which then act as attractors for related cases. During this process, SOPHIA-TCBR automatically discovers appropriate case and similarity knowledge. It then is able to organize the cases within each cluster by forming a minimum spanning tree, based on their semantic similarity. SOPHIA's capability as a case-based text classifier is benchmarked against the well known and widely utilised k-Means approach. Results show that SOPHIA either equals or outperforms k-Means based on 2 different case-bases, and as such is an attractive approach for case-based classification. We demonstrate the quality of the knowledge discovery process by showing the high level of topic similarity between adjacent cases within the minimum spanning tree. We show that the formation of the minimum spanning tree makes it possible to identify a kernel region within the cluster, which has a higher level of similarity between cases than the cluster in its entirety, and that this corresponds directly to a higher level of topic homogeneity. We demonstrate that the topic homogeneity increases as the average semantic similarity between cases in the kernel increases. Finally having empirically demonstrated the quality of the knowledge discovery process in SOPHIA, we show how it can be competently applied to case-based retrieval.

KW - Clustering

KW - Knowledge discovery

KW - Textual case-based reasoning

UR - http://www.scopus.com/inward/record.url?scp=44149087861&partnerID=8YFLogxK

U2 - 10.1016/j.knosys.2008.02.006

DO - 10.1016/j.knosys.2008.02.006

M3 - Article

AN - SCOPUS:44149087861

VL - 21

SP - 404

EP - 414

JO - Knowledge-Based Systems

JF - Knowledge-Based Systems

SN - 0950-7051

IS - 5

ER -
