Modeling and visualization of media in Arabic

Research output

9 Citations (Scopus)

Abstract

This paper proposes a novel method for analyzing Arabic-language media using new quantitative characteristics. A sequence of daily newspaper issues is represented as histograms of occurrences of informative terms. The closeness of the histograms is evaluated via a rank correlation coefficient, treating the terms as ordinal data ordered by their frequencies. A new characteristic is introduced to quantify the relationship of an issue with numerous earlier ones. A newspaper is thus represented as a time series of the values of this characteristic, which are affected by the current social situation. The change points of this process may indicate fluctuations in the social behavior of the corresponding society, as reflected in changes in the linguistic content. Moreover, the similarity measure built from this characteristic makes it possible to accurately derive groups of homogeneous issues without any additional information. The methodology is evaluated on sequential issues of an Egyptian newspaper, Al-Ahraam, and a Lebanese newspaper, Al-Akhbaar. The results show that the proposed approach is well able to expose changes in the linguistic content and to connect them with changes in the structure of society and the relationships within it. The method can be suitably extended to media in any alphabetic language. (C) 2016 Elsevier Ltd. All rights reserved.
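The pipeline summarized in the abstract (term histograms, rank correlation between issues, and a per-issue characteristic computed against earlier issues) can be illustrated with a minimal Python sketch. The abstract does not name the rank correlation coefficient or define the characteristic, so two assumptions are made here: Spearman's coefficient stands in for the rank correlation, and the characteristic is taken to be the mean correlation of an issue with its `window` immediate predecessors. All function names, the toy vocabulary, and the toy issues are hypothetical and are not taken from the paper.

```python
from collections import Counter


def term_histogram(tokens, vocabulary):
    """Count how often each vocabulary term occurs in one issue."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant histogram carries no rank information
    return cov / (var_x * var_y) ** 0.5


def mean_rank_dependency(histograms, window):
    """ASSUMED characteristic: mean correlation of each issue with its
    `window` immediate predecessors (not the paper's stated definition)."""
    series = []
    for t in range(window, len(histograms)):
        previous = histograms[t - window:t]
        series.append(sum(spearman(histograms[t], h) for h in previous) / window)
    return series


# Toy usage with a hypothetical vocabulary and four hypothetical issues.
vocabulary = ["election", "economy", "protest", "football"]
issues = [
    "election election economy football".split(),
    "election economy economy football".split(),
    "election election economy protest".split(),
    "protest protest protest economy".split(),
]
histograms = [term_histogram(tokens, vocabulary) for tokens in issues]
print(mean_rank_dependency(histograms, window=2))
```

A sharp drop in the resulting series would be a candidate change point in the sense of the abstract; detecting such points and clustering issues by the derived similarity are outside this sketch.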

Original language: English
Pages (from-to): 439-453
Number of pages: 15
Journal: Journal of Informetrics
Volume: 10
Issue number: 2
DOI: 10.1016/j.joi.2016.02.008
Publication status: Published - May 2016

Fingerprint

Linguistics
Visualization
Newspaper
Time series
Structure of society
Social situation
Social behavior
Fluctuation
Methodology
Ability
Language
Values
Group

Cite this

@article{5d02ab77186d415fb6be5a53b9a9c919,
title = "Modeling and visualization of media in Arabic",
abstract = "This paper proposes a novel method for analyzing Arabic-language media using new quantitative characteristics. A sequence of daily newspaper issues is represented as histograms of occurrences of informative terms. The closeness of the histograms is evaluated via a rank correlation coefficient, treating the terms as ordinal data ordered by their frequencies. A new characteristic is introduced to quantify the relationship of an issue with numerous earlier ones. A newspaper is thus represented as a time series of the values of this characteristic, which are affected by the current social situation. The change points of this process may indicate fluctuations in the social behavior of the corresponding society, as reflected in changes in the linguistic content. Moreover, the similarity measure built from this characteristic makes it possible to accurately derive groups of homogeneous issues without any additional information. The methodology is evaluated on sequential issues of an Egyptian newspaper, Al-Ahraam, and a Lebanese newspaper, Al-Akhbaar. The results show that the proposed approach is well able to expose changes in the linguistic content and to connect them with changes in the structure of society and the relationships within it. The method can be suitably extended to media in any alphabetic language. (C) 2016 Elsevier Ltd. All rights reserved.",
keywords = "Media quantization, Visualization, Arabic text segmentation, N-GRAMS",
author = "Zeev Volkovich and Oleg Granichin and Oleg Redkin and Olga Bernikova",
year = "2016",
month = "5",
doi = "10.1016/j.joi.2016.02.008",
language = "English",
volume = "10",
pages = "439--453",
journal = "Journal of Informetrics",
issn = "1751-1577",
publisher = "Elsevier",
number = "2",
}

TY - JOUR

T1 - Modeling and visualization of media in Arabic

AU - Volkovich, Zeev

AU - Granichin, Oleg

AU - Redkin, Oleg

AU - Bernikova, Olga

PY - 2016/5

Y1 - 2016/5

AB - This paper proposes a novel method for analyzing Arabic-language media using new quantitative characteristics. A sequence of daily newspaper issues is represented as histograms of occurrences of informative terms. The closeness of the histograms is evaluated via a rank correlation coefficient, treating the terms as ordinal data ordered by their frequencies. A new characteristic is introduced to quantify the relationship of an issue with numerous earlier ones. A newspaper is thus represented as a time series of the values of this characteristic, which are affected by the current social situation. The change points of this process may indicate fluctuations in the social behavior of the corresponding society, as reflected in changes in the linguistic content. Moreover, the similarity measure built from this characteristic makes it possible to accurately derive groups of homogeneous issues without any additional information. The methodology is evaluated on sequential issues of an Egyptian newspaper, Al-Ahraam, and a Lebanese newspaper, Al-Akhbaar. The results show that the proposed approach is well able to expose changes in the linguistic content and to connect them with changes in the structure of society and the relationships within it. The method can be suitably extended to media in any alphabetic language. (C) 2016 Elsevier Ltd. All rights reserved.

KW - Media quantization

KW - Visualization

KW - Arabic text segmentation

KW - N-GRAMS

U2 - 10.1016/j.joi.2016.02.008

DO - 10.1016/j.joi.2016.02.008

M3 - Article

VL - 10

SP - 439

EP - 453

JO - Journal of Informetrics

JF - Journal of Informetrics

SN - 1751-1577

IS - 2

ER -