Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
SENTIMENT ANALYSIS IN ARABIC: LINGUISTIC ISSUES. / Bernikova, O.; Redkin, O.
5th International Multidisciplinary Scientific Conference on Social Sciences and Arts SGEM 2018. STEF92 Technology Ltd., 2018. pp. 407-412.
TY - GEN
T1 - SENTIMENT ANALYSIS IN ARABIC: LINGUISTIC ISSUES
AU - Bernikova, O.
AU - Redkin, O.
PY - 2018/3
Y1 - 2018/3
N2 - The paper deals with the linguistic peculiarities of sentiment analysis of documents in Arabic. Automatic identification of the emotive component in large corpora is highly relevant today. At the same time, the theory of emotions in linguistics has not yet been sufficiently developed, so there is an urgent need to improve computational methods of sentiment analysis for sociolinguistic research. In this study, we try to determine the patterns inherent in the Arabic language that must be taken into account in Big Data processing. To this end, we used a high-frequency word list compiled from a corpus of texts totalling 1 million word usages. We then analyzed 1000 units within a one-dimensional emotional space ("positive" - "negative"). The results show that the share of emotional vocabulary relative to neutral vocabulary is about 18%. The most representative part of speech in the emotive dictionary is the verb (34%); nouns and verbal nouns (masdars) are represented in roughly equal proportions (24% and 25%, respectively), while adjectives constitute only 16%. It is often difficult to identify a particular sentiment, because it depends on the context (for example, a word such as "discipline" can be used in a variety of contexts, as can the verb "to happen"). One third of the analyzed emotive vocabulary is negative and two thirds is positive. Positive vocabulary is most often expressed by adjectives. These conclusions may be useful for linguistic research in general and for the development of automated data processing technologies in particular.
KW - Arabic
KW - computer linguistics
KW - sentiment analysis
KW - vocabulary
UR - http://dx.doi.org/10.5593/sgemsocial2018H/31/S10.051
UR - https://sgemworld.at/ssgemlib/spip.php?article5564
UR - http://www.mendeley.com/research/sentiment-analysis-arabic-linguistic-issues
U2 - 10.5593/sgemsocial2018H/31/S10.051
DO - 10.5593/sgemsocial2018H/31/S10.051
M3 - Conference contribution
SN - 978-619-7408-32-4
SP - 407
EP - 412
BT - 5th International Multidisciplinary Scientific Conference on Social Sciences and Arts SGEM 2018
PB - STEF92 Technology Ltd.
ER -
ID: 29082919