Research output: Contribution to journal › Article
Corpus-Based Information Extraction and Opinion Mining for the Restaurant Recommendation System. / Pronoza, E.; Yagunova, E.; Volskaya, S.
In: Lecture Notes in Computer Science, Vol. 8791, 2014, p. 272-284.
TY - JOUR
T1 - Corpus-Based Information Extraction and Opinion Mining for the Restaurant Recommendation System
AU - Pronoza, E.
AU - Yagunova, E.
AU - Volskaya, S.
PY - 2014
Y1 - 2014
N2 - In this paper, a corpus-based information extraction and opinion mining method is proposed. Our domain is restaurant reviews, and our information extraction and opinion mining module is a part of a Russian knowledge-based recommendation system. Our method is based on thorough corpus analysis and automatic selection of machine learning models and feature sets. We also pay special attention to the verification of statistical significance. According to the results of the research, Naive Bayes models perform well at classifying sentiment with respect to a restaurant aspect, while Logistic Regression is good at deciding on the relevance of a user’s review. The approach proposed can be used in similar domains, for example, hotel reviews, with data represented by colloquial non-structured texts (in contrast with the domain of technical products, books, etc.) and for other languages with rich morphology and free word order.
AB - In this paper, a corpus-based information extraction and opinion mining method is proposed. Our domain is restaurant reviews, and our information extraction and opinion mining module is a part of a Russian knowledge-based recommendation system. Our method is based on thorough corpus analysis and automatic selection of machine learning models and feature sets. We also pay special attention to the verification of statistical significance. According to the results of the research, Naive Bayes models perform well at classifying sentiment with respect to a restaurant aspect, while Logistic Regression is good at deciding on the relevance of a user’s review. The approach proposed can be used in similar domains, for example, hotel reviews, with data represented by colloquial non-structured texts (in contrast with the domain of technical products, books, etc.) and for other languages with rich morphology and free word order.
KW - Information extraction
KW - Opinion mining
KW - Restaurant recommendation system
KW - Machine learning
U2 - 10.1007/978-3-319-11397-5_21
DO - 10.1007/978-3-319-11397-5_21
M3 - Article
VL - 8791
SP - 272
EP - 284
JO - Lecture Notes in Computer Science
JF - Lecture Notes in Computer Science
SN - 0302-9743
ER -
ID: 5730060