Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
The composition of dense neural networks and formal grammars for secondary structure analysis. / Grigorev, Semyon; Lunina, Polina.
BIOINFORMATICS 2019 - 10th International Conference on Bioinformatics Models, Methods and Algorithms, Proceedings; Part of 12th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2019. ed. / Elisabetta De Maria; Hugo Gamboa; Ana Fred. SciTePress, 2019. p. 234-241.
TY - GEN
T1 - The composition of dense neural networks and formal grammars for secondary structure analysis
AU - Grigorev, Semyon
AU - Lunina, Polina
PY - 2019/1/1
Y1 - 2019/1/1
N2 - We propose a way to combine formal grammars and artificial neural networks for biological sequence processing. Formal grammars encode the secondary structure of the sequence, and neural networks deal with mutations and noise. In contrast to the classical approach, in which probabilistic grammars are used for secondary structure modeling, we propose to use arbitrary (not probabilistic) grammars, which simplifies grammar creation. Instead of modeling the structure of the whole sequence, we create a grammar which only describes features of the secondary structure. Then we use undirected matrix-based parsing to extract features: the fact that some substring can be derived from some nonterminal is a feature. After that, we use a dense neural network to process the features. In this paper, we describe in detail all the parts of our recipe: a grammar, a parsing algorithm, and a network architecture. We discuss possible improvements and future work. Finally, we provide the results of tRNA and 16S rRNA processing, which show the applicability of our idea to real problems.
KW - Dense Neural Network
KW - DNN
KW - Formal Grammars
KW - Genomic Sequences
KW - Machine Learning
KW - Parsing
KW - Proteomic Sequences
KW - Secondary Structure
UR - http://www.scopus.com/inward/record.url?scp=85064687958&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85064687958
SP - 234
EP - 241
BT - BIOINFORMATICS 2019 - 10th International Conference on Bioinformatics Models, Methods and Algorithms, Proceedings; Part of 12th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2019
A2 - De Maria, Elisabetta
A2 - Gamboa, Hugo
A2 - Fred, Ana
PB - SciTePress
T2 - 10th International Conference on Bioinformatics Models, Methods and Algorithms, BIOINFORMATICS 2019 - Part of 12th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2019
Y2 - 22 February 2019 through 24 February 2019
ER -
ID: 48534701
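
The feature-extraction step described in the abstract (recording, for each substring, whether it can be derived from a given nonterminal) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy grammar, the nonterminal names (`Stem`, `Loop`, `NA`, `XU`, ...), and the functions `parse_table` and `feature_vector` are hypothetical and are not taken from the paper. The sketch also uses a plain CYK-style table fill as a simple stand-in for the matrix-based parsing algorithm the authors name. The resulting 0/1 vector is the kind of input a dense neural network could then process.

```python
# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# "Stem" derives a hairpin: nested complementary pairs around an unpaired loop.
TERMINAL_RULES = {
    "A": {"NA", "Loop"},
    "C": {"NC", "Loop"},
    "G": {"NG", "Loop"},
    "U": {"NU", "Loop"},
}

# Each triple (head, left, right) encodes the rule: head -> left right
BINARY_RULES = [
    ("Loop", "Loop", "Loop"),
    ("Stem", "NA", "XU"), ("XU", "Stem", "NU"), ("XU", "Loop", "NU"),
    ("Stem", "NU", "XA"), ("XA", "Stem", "NA"), ("XA", "Loop", "NA"),
    ("Stem", "NC", "XG"), ("XG", "Stem", "NG"), ("XG", "Loop", "NG"),
    ("Stem", "NG", "XC"), ("XC", "Stem", "NC"), ("XC", "Loop", "NC"),
]


def parse_table(seq):
    """Return table where table[i][j] is the set of nonterminals
    that derive the substring seq[i..j] (both ends inclusive)."""
    n = len(seq)
    table = [[set() for _ in range(n)] for _ in range(n)]
    # Substrings of length 1 come from the terminal rules.
    for i, ch in enumerate(seq):
        table[i][i] |= TERMINAL_RULES.get(ch, set())
    # Longer substrings are built from two adjacent shorter ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            for k in range(i, j):
                left, right = table[i][k], table[k + 1][j]
                for head, b, c in BINARY_RULES:
                    if b in left and c in right:
                        table[i][j].add(head)
    return table


def feature_vector(seq, nonterminal="Stem"):
    """Flatten the upper triangle of the parse table into a 0/1 vector:
    1 if seq[i..j] is derivable from `nonterminal`, else 0."""
    table = parse_table(seq)
    n = len(seq)
    return [
        1 if nonterminal in table[i][j] else 0
        for i in range(n)
        for j in range(i, n)
    ]


if __name__ == "__main__":
    print(feature_vector("GAAAC"))
```

Because the grammar describes only a local feature (a stem) rather than the whole sequence, every substring is checked independently, so a sequence with noise around a hairpin still yields the cells that mark the hairpin itself.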