Research output: Contribution to journal › Article › Peer-reviewed
Extending the applicability of the Zipf’s laws to the sequences of byte data. / Сергеев, Сергей Львович; Блеканов, Иван Станиславович; Ежов, Федор Валерьевич; Тарасов, Никита Андреевич.
In: Вестник Санкт-Петербургского университета. Прикладная математика. Информатика. Процессы управления, Vol. 20, No. 3, 2024, pp. 391–403.
TY - JOUR
T1 - Extending the applicability of the Zipf’s laws to the sequences of byte data
AU - Сергеев, Сергей Львович
AU - Блеканов, Иван Станиславович
AU - Ежов, Федор Валерьевич
AU - Тарасов, Никита Андреевич
PY - 2024
Y1 - 2024
N2 - Zipf’s law has been shown to hold in many settings. From its origin as a statistical phenomenon in natural language to its later adaptations in economic, social, and many other fields, it has been shown to work almost universally. In all of these cases, authors discuss the applicability of Zipf’s law in terms of semantically complex structures. We take this notion a step further and show how the law can be applied to data analysis, in particular to sequences of byte data obtained from various sources. We show that, using a basic chunking methodology, Zipf’s law holds for many different types of raw byte sequences. In particular, the law holds in all cases for the “middle point” of the data, where it is present with a degree of certainty of more than 90%. We conclude by discussing the implications and potential use cases of these findings.
AB - Zipf’s law has been shown to hold in many settings. From its origin as a statistical phenomenon in natural language to its later adaptations in economic, social, and many other fields, it has been shown to work almost universally. In all of these cases, authors discuss the applicability of Zipf’s law in terms of semantically complex structures. We take this notion a step further and show how the law can be applied to data analysis, in particular to sequences of byte data obtained from various sources. We show that, using a basic chunking methodology, Zipf’s law holds for many different types of raw byte sequences. In particular, the law holds in all cases for the “middle point” of the data, where it is present with a degree of certainty of more than 90%. We conclude by discussing the implications and potential use cases of these findings.
KW - Zipf’s laws
KW - byte data
KW - chunking
KW - frequency analysis
UR - https://applmathjournal.spbu.ru/article/view/19038
UR - https://www.mendeley.com/catalogue/cf07abe2-23c0-3640-a2cc-36612fab6694/
U2 - 10.21638/spbu10.2024.307
DO - 10.21638/spbu10.2024.307
M3 - Article
VL - 20
SP - 391
EP - 403
JO - ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. ПРИКЛАДНАЯ МАТЕМАТИКА. ИНФОРМАТИКА. ПРОЦЕССЫ УПРАВЛЕНИЯ
JF - ВЕСТНИК САНКТ-ПЕТЕРБУРГСКОГО УНИВЕРСИТЕТА. ПРИКЛАДНАЯ МАТЕМАТИКА. ИНФОРМАТИКА. ПРОЦЕССЫ УПРАВЛЕНИЯ
SN - 1811-9905
IS - 3
ER -
ID: 126974789