This paper presents a method for constructing a knowledge graph from patent data that facilitates the identification of hidden relationships between patents and organizes information for subsequent analysis. The method extracts key textual fields from patent documents, vectorizes them using state-of-the-art transformer models, and builds a graph in which nodes represent individual documents and edges reflect their semantic proximity. A clustering algorithm groups the patents, ensuring high internal coherence within clusters and reducing the original graph to a compact representation. The resulting clusters are summarized using language models, enabling automatic extraction of significant terms for cluster descriptions. Experiments on a large corpus of patent data demonstrate the efficacy of the proposed approach, as confirmed by partitioning quality metrics. The proposed method improves the interpretation of patent information by exposing implicit relationships and structural patterns, which is important for analyzing scientific achievements and managing intellectual property.
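The graph-construction and grouping steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors stand in for transformer embeddings, the similarity threshold of 0.9 is arbitrary, and connected components are used as a simple substitute for the clustering algorithm, which the abstract does not name. All function and variable names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_graph(vectors, threshold):
    """Connect documents whose cosine similarity is at least `threshold`."""
    edges = {doc: set() for doc in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            edges[a].add(b)
            edges[b].add(a)
    return edges


def connected_components(edges):
    """Group nodes into connected components (a stand-in for clustering)."""
    seen, clusters = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(edges[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


# Hypothetical toy "embeddings"; real ones would come from a transformer model.
toy_vectors = {
    "patent_A": [0.9, 0.1, 0.0],
    "patent_B": [0.8, 0.2, 0.1],
    "patent_C": [0.0, 0.1, 0.9],
    "patent_D": [0.1, 0.0, 0.8],
}

graph = build_graph(toy_vectors, threshold=0.9)  # threshold chosen arbitrarily
clusters = connected_components(graph)
```

In this toy setting the nodes are documents and the edges encode semantic proximity, as in the abstract; each resulting group could then be passed to a language model for summarization.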
Original language: English
Title of host publication: Computational Science and Its Applications – ICCSA 2025 Workshops
Pages: 219–230
Number of pages: 12
State: Published - 28 Jun 2025
Event: Computational Science and Its Applications – ICCSA 2025 Workshops - Istanbul, Turkey
Duration: 30 Jun 2025 to 3 Jul 2025
http://iccsa.org

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Publisher: Springer Nature
Volume: 15894
ISSN (Print): 0302-9743

Conference

Conference: Computational Science and Its Applications – ICCSA 2025 Workshops
Abbreviated title: ICCSA
Country/Territory: Turkey
City: Istanbul
Period: 30/06/25 to 3/07/25

    Research areas

  • Clustering, Knowledge Graph, Patent Data, Text Vectorization

ID: 138833426