Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
DetIE: Multilingual Open Information Extraction Inspired by Object Detection. / Vasilkovsky, Michael; Alekseev, Anton; Malykh, Valentin; Shenbin, Ilya; Tutubalina, Elena; Salikhov, Dmitriy; Stepnov, Mikhail; Chertok, Andrey; Nikolenko, Sergey.
Proceedings of the 36th AAAI Conference on Artificial Intelligence, 36: AAAI-22 Technical Tracks 10. 2022. p. 11412-11420.
TY - GEN
T1 - DetIE: Multilingual Open Information Extraction Inspired by Object Detection
AU - Vasilkovsky, Michael
AU - Alekseev, Anton
AU - Malykh, Valentin
AU - Shenbin, Ilya
AU - Tutubalina, Elena
AU - Salikhov, Dmitriy
AU - Stepnov, Mikhail
AU - Chertok, Andrey
AU - Nikolenko, Sergey
PY - 2022
Y1 - 2022
N2 - State-of-the-art neural methods for open information extraction (OpenIE) usually extract triplets (or tuples) iteratively, in an autoregressive or predicate-based manner, so as not to produce duplicates. In this work, we propose a different approach to the problem that can be equally or more successful. Namely, we present a novel single-pass method for OpenIE inspired by object detection algorithms from computer vision. We use an order-agnostic loss based on bipartite matching that forces unique predictions, and a Transformer-based encoder-only architecture for sequence labeling. The proposed approach is faster and shows superior or similar performance in comparison with state-of-the-art models on standard benchmarks, in terms of both quality metrics and inference time. Our model sets a new state-of-the-art performance of 67.7% F1 on CaRB evaluated as OIE2016, while being 3.35x faster at inference than the previous state of the art. We also evaluate the multilingual version of our model in the zero-shot setting for two languages and introduce a strategy for generating synthetic multilingual data to fine-tune the model for each specific language. In this setting, we show a 15% performance improvement on multilingual Re-OIE2016, reaching 75% F1 for both Portuguese and Spanish. Code and models are available at https://github.com/sberbank-ai/DetIE.
KW - natural language processing
KW - open information extraction
KW - information extraction
KW - multilingual models
KW - deep learning
KW - object detection
UR - https://aaai.org/papers/11412-detie-multilingual-open-information-extraction-inspired-by-object-detection/
M3 - Conference contribution
SP - 11412
EP - 11420
BT - Proceedings of the 36th AAAI Conference on Artificial Intelligence
VL - 36
T2 - 36th AAAI Conference on Artificial Intelligence
Y2 - 22 February 2022 through 1 March 2022
ER -
ID: 95169671