We propose a way to combine formal grammars and artificial neural networks for biological sequence processing. Formal grammars encode the secondary structure of the sequence, and neural networks deal with mutations and noise. In contrast to the classical approach, in which probabilistic grammars are used for secondary structure modeling, we propose to use arbitrary (non-probabilistic) grammars, which simplifies grammar creation. Instead of modeling the structure of the whole sequence, we create a grammar which describes only features of the secondary structure. We then use undirected matrix-based parsing to extract features: the fact that some substring can be derived from some nonterminal is a feature. After that, we use a dense neural network to process the features. In this paper, we describe in detail all the parts of our recipe: the grammar, the parsing algorithm, and the network architecture. We discuss possible improvements and future work. Finally, we provide the results of tRNA and 16S rRNA processing, which show the applicability of our idea to real problems.
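
The feature-extraction step described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy grammar (nonterminal `S` for perfectly nested complementary pairs), the names `parse_table` and `features`, and the use of a plain CYK-style loop as a stand-in for the matrix-based parser the paper describes. It shows only how "substring i..j is derivable from a nonterminal" might be turned into a binary vector that a dense network could then consume.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (not the paper's grammar):
# S derives perfectly nested complementary pairs, e.g. "gc" or "agcu".
TERMINAL_RULES = {"a": {"A"}, "u": {"U"}, "c": {"C"}, "g": {"G"}}
BINARY_RULES = {
    ("A", "U"): {"S"}, ("U", "A"): {"S"}, ("C", "G"): {"S"}, ("G", "C"): {"S"},
    ("A", "SU"): {"S"}, ("U", "SA"): {"S"}, ("C", "SG"): {"S"}, ("G", "SC"): {"S"},
    ("S", "U"): {"SU"}, ("S", "A"): {"SA"}, ("S", "G"): {"SG"}, ("S", "C"): {"SC"},
}


def parse_table(seq):
    """Return table where table[i][j] is the set of nonterminals deriving seq[i:j]."""
    n = len(seq)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Substrings of length 1 come from the terminal rules.
    for i, ch in enumerate(seq):
        table[i][i + 1] |= TERMINAL_RULES.get(ch, set())
    # Longer substrings are built from every split point k (CYK-style).
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= BINARY_RULES.get((left, right), set())
    return table


def features(seq, nonterminal="S"):
    """Flatten 'seq[i:j] is derivable from nonterminal' into a 0/1 vector."""
    n = len(seq)
    table = parse_table(seq)
    return [
        int(nonterminal in table[i][j])
        for i in range(n)
        for j in range(i + 1, n + 1)
    ]
```

For a sequence of length n the vector has n(n+1)/2 entries, one per non-empty substring; in this sketch the whole string does not need to be derivable for individual entries to be set, which mirrors the idea of describing only features of the secondary structure rather than the whole sequence.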

Original language: English
Title of host publication: BIOINFORMATICS 2019 - 10th International Conference on Bioinformatics Models, Methods and Algorithms, Proceedings; Part of 12th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2019
Editors: Elisabetta De Maria, Hugo Gamboa, Ana Fred
Publisher: SciTePress
Pages: 234-241
Number of pages: 8
ISBN (Electronic): 9789897583537
State: Published - 1 Jan 2019
Event: 10th International Conference on Bioinformatics Models, Methods and Algorithms, BIOINFORMATICS 2019 - Part of 12th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2019 - Prague, Czech Republic
Duration: 22 Feb 2019 to 24 Feb 2019

Conference

Conference: 10th International Conference on Bioinformatics Models, Methods and Algorithms, BIOINFORMATICS 2019 - Part of 12th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2019
Country/Territory: Czech Republic
City: Prague
Period: 22/02/19 to 24/02/19

    Scopus subject areas

  • Biomedical Engineering
  • Electrical and Electronic Engineering

    Research areas

  • Dense Neural Network, DNN, Formal Grammars, Genomic Sequences, Machine Learning, Parsing, Proteomic Sequences, Secondary Structure

ID: 48534701