Standard

Predicting the quantum yield of ¹O₂ generation for pteridines and fluoroquinolones using machine learning. / Чеботаев, Платон Платонович; Буглак, Андрей Андреевич.

In: Physical Chemistry Chemical Physics, Vol. 27, No. 44, 07.10.2025, p. 23722-23740.

Research output: Contribution to journal › Review article › peer-review

BibTeX

@article{c20ce0f2db684227bb4ae49b42e12593,
title = "Predicting the quantum yield of {$^{1}$O$_{2}$} generation for pteridines and fluoroquinolones using machine learning",
abstract = "Fluoroquinolones (FQs) are a family of antibiotic drugs well-known for their high photochemical activity: upon UV-vis excitation FQs may produce singlet oxygen and/or lose the fluorine atom. Pterins (Ptrs) are a class of organic photosensitizers with a quantum yield of singlet oxygen generation up to 47\% even in the absence of heavy atoms (metals and halogens). The similarities between the electronic absorption spectra of FQs and Ptrs led us to examine their photochemistry in a single study: both FQs and Ptrs are aza-bicyclic compounds. In this paper, we describe and compare machine learning methods suitable for predicting the photosensitizing ability of FQs and Ptrs. We investigated the singlet oxygen generation quantum yield ($\Phi_{\Delta}$) for 48 pterins and fluoroquinolones in deuterated water. To build machine learning (ML) models, we selected a dataset containing more than 5000 molecular descriptors, including both quantum chemical and molecular graph theory derivatives, the number of which was reduced using a genetic algorithm (GA). For model development and refinement, we employed multiple linear regression (MLR), support vector regression (SVR), random forest regression (RFR), gradient boosting (GBR) and extreme gradient boosting (XGBoost) techniques. The obtained models demonstrate high predictive performance ($R^{2}_{\mathrm{train}} > 0.97$, $q^{2} > 0.84$), with SVR achieving the best results for the test set ($R^{2}_{\mathrm{test}} = 0.975$), and XGBoost showing the best overall robustness. Interpretability analysis revealed that descriptor relevance varied across models. Descriptors such as conjugated maximum bond length (CMBL) and long-range polarizability (TDB09p) consistently emerged as key contributors to $\Phi_{\Delta}$. Interpretability analyses (SHAP and ALE) confirmed their mechanistic relevance, highlighting the complementary contributions of linear and nonlinear models. These results demonstrate the feasibility of unified modeling of structurally diverse photosensitizers and provide actionable insights for the rational design and virtual screening of new compounds for photodynamic therapy.",
author = "Чеботаев, {Платон Платонович} and Буглак, {Андрей Андреевич}",
year = "2025",
month = oct,
day = "7",
doi = "10.1039/d5cp02333e",
language = "English",
volume = "27",
pages = "23722--23740",
journal = "Physical Chemistry Chemical Physics",
issn = "1463-9076",
publisher = "Royal Society of Chemistry",
number = "44",

}

RIS

TY - JOUR

T1 - Predicting the quantum yield of ¹O₂ generation for pteridines and fluoroquinolones using machine learning

AU - Чеботаев, Платон Платонович

AU - Буглак, Андрей Андреевич

PY - 2025/10/7

Y1 - 2025/10/7

N2 - Fluoroquinolones (FQs) are a family of antibiotic drugs well-known for their high photochemical activity: upon UV-vis excitation FQs may produce singlet oxygen and/or lose the fluorine atom. Pterins (Ptrs) are a class of organic photosensitizers with a quantum yield of singlet oxygen generation up to 47% even in the absence of heavy atoms (metals and halogens). The similarities between the electronic absorption spectra of FQs and Ptrs led us to examine their photochemistry in a single study: both FQs and Ptrs are aza-bicyclic compounds. In this paper, we describe and compare machine learning methods suitable for predicting the photosensitizing ability of FQs and Ptrs. We investigated the singlet oxygen generation quantum yield (ΦΔ) for 48 pterins and fluoroquinolones in deuterated water. To build machine learning (ML) models, we selected a dataset containing more than 5000 molecular descriptors, including both quantum chemical and molecular graph theory derivatives, the number of which was reduced using a genetic algorithm (GA). For model development and refinement, we employed multiple linear regression (MLR), support vector regression (SVR), random forest regression (RFR), gradient boosting (GBR) and extreme gradient boosting (XGBoost) techniques. The obtained models demonstrate high predictive performance (R²_train > 0.97, q² > 0.84), with SVR achieving the best results for the test set (R²_test = 0.975), and XGBoost showing the best overall robustness. Interpretability analysis revealed that descriptor relevance varied across models. Descriptors such as conjugated maximum bond length (CMBL) and long-range polarizability (TDB09p) consistently emerged as key contributors to ΦΔ. Interpretability analyses (SHAP and ALE) confirmed their mechanistic relevance, highlighting the complementary contributions of linear and nonlinear models. These results demonstrate the feasibility of unified modeling of structurally diverse photosensitizers and provide actionable insights for the rational design and virtual screening of new compounds for photodynamic therapy.

UR - https://www.mendeley.com/catalogue/d5a7826f-c7ab-383e-90d7-20a63c2b1062/

U2 - 10.1039/d5cp02333e

DO - 10.1039/d5cp02333e

M3 - Review article

VL - 27

SP - 23722

EP - 23740

JO - Physical Chemistry Chemical Physics

JF - Physical Chemistry Chemical Physics

SN - 1463-9076

IS - 44

ER -

ID: 143781485