Research output: Contribution to journal › Article
Automatic stop list generation for clustering recognition results of call center recordings. / Popova, S.; Krivosheeva, T.; Korenevsky, M.
In: Lecture Notes in Computer Science, Vol. 8773, 2014, p. 137-144.
TY - JOUR
T1 - Automatic stop list generation for clustering recognition results of call center recordings
AU - Popova, S.
AU - Krivosheeva, T.
AU - Korenevsky, M.
PY - 2014
Y1 - 2014
N2 - The paper deals with the problem of automatic stop list generation for processing recognition results of call center recordings, in particular for the purpose of clustering. We propose and test a supervised, domain-dependent method of automatic stop list generation. The method is based on finding words whose removal increases the dissimilarity between documents in different clusters and decreases the dissimilarity between documents within the same cluster. This approach is shown to be efficient for clustering recognition results of recordings of varying quality, both on datasets that contain the same topics as the training dataset and on datasets containing other topics.
AB - The paper deals with the problem of automatic stop list generation for processing recognition results of call center recordings, in particular for the purpose of clustering. We propose and test a supervised, domain-dependent method of automatic stop list generation. The method is based on finding words whose removal increases the dissimilarity between documents in different clusters and decreases the dissimilarity between documents within the same cluster. This approach is shown to be efficient for clustering recognition results of recordings of varying quality, both on datasets that contain the same topics as the training dataset and on datasets containing other topics.
KW - ASR
KW - Clustering
KW - Stop list generation
KW - Stop Words
M3 - Article
VL - 8773
SP - 137
EP - 144
JO - Lecture Notes in Computer Science
JF - Lecture Notes in Computer Science
SN - 0302-9743
ER -
ID: 5746798
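
The selection criterion described in the abstract (keep a word as a stop word if removing it pushes documents in different clusters apart and pulls documents in the same cluster together) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words representation, the cosine dissimilarity, the additive score, and the function names `cosine_dissimilarity`, `mean_dissimilarities` and `stop_word_candidates` are all assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cosine_dissimilarity(a, b):
    """Return 1 - cosine similarity of two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat an empty document as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def mean_dissimilarities(bags, labels):
    """Return (mean within-cluster, mean between-cluster) dissimilarity."""
    within, between = [], []
    for i, j in combinations(range(len(bags)), 2):
        d = cosine_dissimilarity(bags[i], bags[j])
        (within if labels[i] == labels[j] else between).append(d)

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return mean(within), mean(between)


def stop_word_candidates(docs, labels):
    """Rank words whose removal improves cluster separation.

    Assumed score (illustrative, not taken from the paper):
    gain in between-cluster dissimilarity plus drop in
    within-cluster dissimilarity after removing the word.
    """
    bags = [Counter(doc.lower().split()) for doc in docs]
    base_within, base_between = mean_dissimilarities(bags, labels)
    vocabulary = set().union(*bags)
    scores = {}
    for word in vocabulary:
        reduced = [
            Counter({w: c for w, c in bag.items() if w != word})
            for bag in bags
        ]
        within, between = mean_dissimilarities(reduced, labels)
        scores[word] = (between - base_between) + (base_within - within)
    # Keep only words with a positive score, best candidates first.
    return sorted((w for w, s in scores.items() if s > 0),
                  key=lambda w: -scores[w])


docs = ["uh bill payment", "bill payment", "uh internet down", "internet down"]
labels = ["billing", "billing", "tech", "tech"]
print(stop_word_candidates(docs, labels))  # ['uh']
```

In this toy labelled set, the filler word "uh" occurs in one document of each cluster, so removing it makes the two clusters more distinct and each cluster more compact, while removing any topic word has the opposite effect. The sketch recomputes all pairwise dissimilarities for every vocabulary word, which is only practical for small training sets.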