The aim of this work is to create a matrix of feature objects for machine learning problems. The tasks are to search for data sources, develop a preprocessing algorithm, process statistical data on communities, and form a matrix of feature objects for further use in a clustering problem. Methods used: data analysis, computation of descriptive statistics, and the Python libraries Pandas and NumPy. The novelty of the study lies in solving the problem of finding relevant data on social network communities and in developing an algorithm that forms a matrix of feature objects for further use by researchers. Results: services that provide social network statistics were analyzed, and program code implementing the algorithm for generating the matrix of feature objects was developed. The practical significance lies in processing relevant statistical data and in using the matrix of feature objects in machine learning problems.
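The abstract does not list the preprocessing steps, so the following is only a minimal sketch of how a matrix of feature objects might be formed with Pandas and NumPy. The column names, the sample values, and the helper `build_feature_matrix` are hypothetical, and the chosen steps (dropping duplicate communities, filling gaps with the median, standardizing each feature) are assumptions, not the authors' algorithm.

```python
import numpy as np
import pandas as pd


def build_feature_matrix(stats: pd.DataFrame, id_column: str = "community_id"):
    """Return (ids, X): community ids and a standardized feature matrix.

    Hypothetical helper: rows of X are communities (objects),
    columns of X are numeric statistics (features).
    """
    # Keep one row per community and use its id as the row label.
    table = stats.drop_duplicates(subset=id_column).set_index(id_column)
    # Only numeric columns become features.
    features = table.select_dtypes(include="number").astype(float)
    # Assumed gap-filling rule: replace missing values with the column median.
    features = features.fillna(features.median())
    # Standardize each feature; a constant column keeps a divisor of 1.
    std = features.std(ddof=0).replace(0.0, 1.0)
    scaled = (features - features.mean()) / std
    return scaled.index.to_numpy(), scaled.to_numpy()


# Invented sample data, for illustration only.
raw = pd.DataFrame({
    "community_id": [101, 102, 103, 103],
    "subscribers": [1200, 560, np.nan, np.nan],
    "posts_per_day": [4.0, 1.5, 2.0, 2.0],
    "avg_likes": [35.0, 12.0, 20.0, 20.0],
})

print(raw.describe())  # descriptive statistics before preprocessing
ids, X = build_feature_matrix(raw)
print(ids)
print(X.shape)
```

The resulting array `X` has one row per community and one column per feature, which is the input shape that most clustering routines expect.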
Original language: English
Title of host publication: Data Science and Algorithms in Systems
Subtitle of host publication: Proceedings of 6th Computational Methods in Systems and Software 2022
Editors: R. Silhavy, P. Silhavy, Z. Prokopova
Publisher: Springer Nature
Pages: 990-1001
Number of pages: 12
Volume: 2
ISBN (Electronic): 978-3-031-21438-7
ISBN (Print): 978-3-031-21437-0
State: Published - 2023
Event: 6th Computational Methods in Systems and Software 2022 - Prague, Czech Republic
Duration: 13 Oct 2022 - 15 Oct 2022
Conference number: 6
https://comesyso.openpublish.eu/

Publication series

Name: LNNS
Volume: 597

Conference

Conference: 6th Computational Methods in Systems and Software 2022
Abbreviated title: CoMeSySo2022
Country/Territory: Czech Republic
City: Prague
Period: 13/10/22 - 15/10/22

    Research areas

  • Matplotlib, NumPy, Pandas, Social media marketing, internet marketing, Machine learning, statistical modeling, Mathematical modeling, Data processing

ID: 103176560