Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review
This article addresses the problem of hotel deduplication. Obvious approaches, such as comparing names or locations, fail because hotel descriptions differ across databases. The most accurate solution is to rely on professionally trained content managers, but this is expensive, so an automatic solution is needed. We propose a method to improve a hypothesis that a pair of hotels is identical, and compare its performance with alternative solutions. The proposed method satisfies the business requirements set for the precision and recall of the hotel deduplication task. The method is based on a machine learning approach that uses several unique features, including features built with the help of computer vision algorithms.
Original language | English |
---|---|
Title of host publication | Knowledge Engineering and Semantic Web - 7th International Conference, KESW 2016, Proceedings |
Editors | Axel-Cyrille Ngonga Ngomo, Petr Křemen |
Publisher | Springer Nature |
Pages | 230-240 |
Number of pages | 11 |
ISBN (Print) | 9783319458793 |
State | Published - 2016 |
Event | 7th International Conference on Knowledge Engineering and Semantic Web, KESW 2016 - Prague, Czech Republic; Duration: 21 Sep 2016 → 23 Sep 2016 |
Publication series | Communications in Computer and Information Science |
---|---|
Volume | 649 |
ISSN (Print) | 1865-0929 |
Conference | 7th International Conference on Knowledge Engineering and Semantic Web, KESW 2016 |
---|---|
Country/Territory | Czech Republic |
City | Prague |
Period | 21/09/16 → 23/09/16 |