Full Paper Computational Investigations on Inhibitors of Mycobacterium tuberculosis Shikimate Kinase: Machine Learning, Docking, Molecular Dynamics and Free Energy Calculations Santos, Anderson J. A. B. dos Netz, Paulo A. Abstract in English: Shikimate kinase emerges as an intriguing macromolecular target for the development of novel pharmaceutical agents for the treatment of tuberculosis. This study aimed to develop a neural network (NN) for the discovery of potential inhibitors of Mycobacterium tuberculosis shikimate kinase and to conduct molecular docking and molecular dynamics (MD) simulations. The NN model pointed to a set of 810 molecules with anti-tuberculosis activity, 86% of which also showed positive outcomes in docking calculations. Among these, 54 molecules exhibited docking scores ranging from -9 to -9.8 kcal mol⁻¹. Subsequently, a subset of molecules was selected for molecular dynamics studies and molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) calculations. Furthermore, the molecules with higher affinity were found to share a similar electronic profile, as evidenced by the analysis of global descriptors (electronic chemical potential, hardness, and electrophilicity). The molecules displaying the lowest binding Gibbs free energy (ΔG_binding) values, and therefore the highest affinity, were identified as CHEMBL1229147, CHEMBL4095667, and CHEMBL120640. |
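For reference, the three global descriptors named in the abstract above are often approximated from frontier orbital energies in conceptual density functional theory. The relations below are a sketch of one common convention. The abstract does not state which definitions or level of theory the authors used, so the symbols \varepsilon_HOMO and \varepsilon_LUMO and the choice of convention are assumptions.

```latex
\begin{align}
\mu    &\approx \frac{\varepsilon_{\mathrm{HOMO}} + \varepsilon_{\mathrm{LUMO}}}{2}
       && \text{(electronic chemical potential)} \\
\eta   &\approx \varepsilon_{\mathrm{LUMO}} - \varepsilon_{\mathrm{HOMO}}
       && \text{(chemical hardness)} \\
\omega &= \frac{\mu^{2}}{2\eta}
       && \text{(electrophilicity index)}
\end{align}
```

Some authors define the hardness with an additional factor of 1/2, in which case numerical values of the hardness and the electrophilicity index are not directly comparable across studies.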
Full Paper Deep Reinforcement Learning and Structure-Based Approaches in the de novo Design of a New Potential Inhibitor of F13 Protein from Monkeypox Virus Alencar Filho, Edilson B. Oliveira Neto, Rosalvo F. Santos, Vanessa C. Ferreira, Allysson L. S. Abstract in English: Monkeypox (MPOX) is a zoonotic infectious disease caused by the monkeypox virus (MPXV) and has recently emerged as a significant concern for public health organizations globally. In 2022, the World Health Organization (WHO) reported thousands of laboratory-confirmed cases, mobilizing the scientific community to control this outbreak given its emergency nature. Tecovirimat (TPOXX), a drug primarily recognized for the treatment of smallpox, has also been recommended for managing MPOX. It works by inhibiting the viral F13 protein (VP37), a critical component in the replication cycle of the virus. The possibility of viral drug resistance, the intrinsic chemical complexity of this molecule, and the limited availability of therapeutic alternatives highlight the urgent need to explore and identify new effective compounds. In this paper, we propose combining modern machine learning techniques (deep reinforcement learning) with structure-based drug design (SBDD) approaches (molecular docking and dynamics) for the de novo design of molecular scaffolds with affinity for the F13 protein, lower structural complexity than TPOXX, and easy synthetic accessibility, contributing to the search for therapeutic alternatives for MPOX. |
Full Paper Computational Modeling and Biological Evaluation of Benzophenone Derivatives as Antileishmanial Agents Farias, Bárbara F. Ferreira, Miller S. Miranda, Daniel O. Nunes, Tayná R. Pereira, Natália F. Espuri, Patrícia F. Januario, Jaqueline P. Colombo, Fábio A. Marques, Marcos J. Zanin, João L. B. Soares, Marisi G. Souza, Thiago B. de Carvalho, Diogo T. Chagas-Paula, Daniela A. Dias, Danielle F. Abstract in English: Leishmaniasis is a neglected tropical disease whose limited therapeutic options are characterized by high toxicity, adverse side effects, and growing resistance to existing treatments. In this study, machine learning (ML) methods were employed to design and evaluate benzophenone and xanthone derivatives as potential antileishmanial agents. A dataset of 73 compounds was curated, and Quantitative Structure-Activity Relationship (QSAR) models were developed using artificial neural networks (ANN), Random Forest (RF), and J48 decision tree classifiers. The ANN model achieved the highest accuracy (86.2%) in predicting antileishmanial activity, validated through in vitro assays. Among 14 newly synthesized benzophenones, compounds 5 and 7 demonstrated significant biological activity with half-maximal inhibitory concentration (IC50) values of 10.19 and 14.35 μM, respectively, and favorable selectivity indices compared to the reference drugs pentamidine and amphotericin B. Structural analysis highlighted the importance of thiosemicarbazone and 4-methyl groups, alongside electronegative substituents at position 11, in enhancing activity. This study underscores the potential of computational tools to streamline the discovery of novel, effective, and selective antileishmanial agents. |
Full Paper Drug Repurposing for Trypanosomiasis: Using Machine Learning Models and Polypharmacology to Identify Multitarget Candidates Domingues, Karime Zeraik A. Cobre, Alexandre de F. Fachi, Mariana M. Lazo, Raul Edison L. Ferreira, Luana M. Pontarolo, Roberto Abstract in English: Chagas disease and African sleeping sickness are neglected tropical diseases (NTDs) caused by Trypanosoma parasites, with current treatments facing challenges such as toxicity and resistance. This study integrates machine learning and Quantitative Structure-Activity Relationship (QSAR) models to repurpose Food and Drug Administration (FDA)-approved drugs as potential treatments for these diseases. A dataset of 21,608 compounds with inhibitory activity against Trypanosoma cruzi and Trypanosoma brucei was analyzed using PubChem fingerprints. Random Forest and Extreme Gradient Boosting models were trained and applied to screen the ZINC-22 database for new therapeutic options. Posaconazole was predicted as the top candidate for multitarget activity against both Trypanosoma species, followed by pentamidine, a drug already approved for sleeping sickness. Additionally, 40 other drug candidates were identified by the models (pIC50 > 6 and coefficient of variation < 0.05), mainly antineoplastics (32%) and antifungals (19%). This approach demonstrates the potential of computational techniques in accelerating the discovery of drug candidates for neglected infectious diseases. |
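The selection rule quoted in the abstract above (pIC50 > 6 and coefficient of variation < 0.05) can be sketched as a simple filter. This is a minimal illustration, not the authors' pipeline: the abstract does not say how the coefficient of variation was computed, so the code assumes it is the population standard deviation divided by the mean of several model predictions for the same compound. The function name `consensus_filter` and the compound names and values are hypothetical placeholders.

```python
from statistics import mean, pstdev


def consensus_filter(predictions, min_pic50=6.0, max_cv=0.05):
    """Return (name, mean pIC50, CV) for compounds passing both thresholds.

    predictions maps a compound name to a list of predicted pIC50 values,
    one per model (an assumption; the abstract does not specify this).
    """
    selected = []
    for name, values in predictions.items():
        avg = mean(values)
        cv = pstdev(values) / avg  # coefficient of variation (assumed definition)
        if avg > min_pic50 and cv < max_cv:
            selected.append((name, avg, cv))
    # Highest mean predicted potency first
    return sorted(selected, key=lambda row: row[1], reverse=True)


# Hypothetical predictions, for illustration only
example = {
    "drug_a": [7.2, 7.0, 7.1],  # potent, models agree
    "drug_b": [6.5, 5.2, 7.4],  # potent on average, models disagree
    "drug_c": [5.1, 5.0, 5.2],  # models agree, but below the potency cutoff
}
hits = consensus_filter(example)
print([name for name, _, _ in hits])
```

Only `drug_a` passes both thresholds here: `drug_b` is rejected because its predictions vary too much, and `drug_c` because its mean pIC50 is below 6.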
Full Paper Discrimination between COVID-19 Positive and Negative Blood Sera Using an Unmodified Disposable Impedimetric Sensor and Multivariate Analysis Cruz, Ingrid G. B. L. Sales, Flávia R. P. Fragoso, Wallace D. Castellano, Lúcio R. C. Beltrão, Fabyan E. L. Cardoso, Talita N. Oliveira, Maísa S. de Lemos, Sherlan G. Abstract in English: The present study introduces a direct approach for classifying blood serum samples as either positive or negative for coronavirus disease (COVID-19) by associating the electrochemical impedance data of the sample with multivariate analysis. The hypothesis is that the systematic alterations in blood composition resulting from a severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection give rise to a distinct impedance spectrum when infected serum is subjected to analysis. A total of 201 serum samples were analyzed using the gold standard method, reverse transcription-polymerase chain reaction (RT-PCR), which served to train and validate the classification models. Two variations of discriminant analysis (partial least squares discriminant analysis (PLS-DA) and principal component analysis-discriminant analysis (PCA-DA)) and a one-class modeling approach (soft independent modeling of class analogies (SIMCA)) were used to classify impedance data in different formats (as complex or real numbers). PCA-DA applied to imaginary impedance spectra was found to be the most effective strategy, achieving sensitivity, specificity, and precision of 94, 94, and 91%, respectively, with classification error rates as low as 6%. These findings are encouraging and could facilitate the development of an inexpensive and reliable screening method for COVID-19. |
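The figures of merit reported in the abstract above (sensitivity, specificity, precision, and classification error rate) all follow from the four counts of a binary confusion matrix. The sketch below shows those standard definitions. The counts used are illustrative and are not taken from the study, and the function name `classification_metrics` is a hypothetical helper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "precision": tp / (tp + fp),        # positive predictive value
        "error_rate": (fp + fn) / total,    # fraction misclassified
    }


# Illustrative counts only (not the study's data)
metrics = classification_metrics(tp=45, fp=5, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Precision depends on the ratio of positive to negative samples, so it can differ from sensitivity and specificity even when those two are equal.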
Full Paper Machine Learning Prediction of the Most Intense Peak of the Absorption Spectra of Organic Molecules Souza, Rubens C. Duarte, Julio C. Goldschmidt, Ronaldo R. Borges Jr., Itamar Abstract in English: Accurate knowledge of electronic molecular properties of excited states is fundamental for understanding the behavior of functional materials for organic electronics and sensors. In this work, we focus on determining the properties of the most intense peak in the electronic absorption spectra of organic molecules. For this purpose, we employed the quantum chemistry QM-symex dataset, which has approximately 173,000 organic molecules and time-dependent density functional theory (TD-DFT) data of the first ten electronic absorption transitions. Each one is identified by its Cartesian coordinates. From data in the original QM-symex, we built a new dataset named QM symex-modif that contains molecules in simplified molecular input line entry system (SMILES) format and properties related to the main electronic transition. We then employed twenty machine learning (ML) algorithms to investigate oscillator strengths, excitation energies, transition orbitals, and the highest occupied molecular orbitals (HOMOs). As inputs for the ML algorithms, we used several chemical descriptors for each molecule generated in the RDKit tool employing the corresponding SMILES format. The generated input descriptors significantly improved the accuracy of the ML predictions for these key photophysical properties. 
Low mean absolute errors (MAEs) were obtained for the test set composed of 45,056 molecules, namely, an MAE of 0.035 for oscillator strengths, 0.09 eV for excitation energies, 1.24 and 0.62 for the initial and final transition molecular orbital (MO) numbers (i.e., for each molecule, their position in the MO listing), respectively, and 0.014 for HOMO numbers, with coefficient of determination (R²) values consistently exceeding 0.94, thus demonstrating the accuracy of the models. Additionally, a Shapley additive explanation (SHAP) analysis was carried out to evaluate the importance of the input parameters for the investigated ML models. We found several interesting relationships involving the input parameters. In particular, molecular weight holds significant importance in our ML models for determining the target HOMO numbers and the transition orbitals. |
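The two figures of merit quoted in the abstract above, the mean absolute error and the coefficient of determination, can be computed as in the sketch below. The function names and the sample values are illustrative assumptions and are not taken from the QM-symex data.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative excitation energies in eV (made-up values)
y_true = [3.0, 4.0, 5.0, 6.0]
y_pred = [3.1, 3.9, 5.2, 5.8]

mae = mean_absolute_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"MAE = {mae:.3f} eV, R^2 = {r2:.3f}")
```

The MAE carries the units of the target property, while the coefficient of determination is dimensionless, which is why the abstract reports an MAE per property but a single threshold for the coefficient of determination.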
Full Paper Machine Learning to Treat Data for the Design and Improvement of Electrochemical Sensors: Application for a Cancer Biomarker Redín, Gisela Ibáñez Braz, Daniel C. Gonçalves, Débora Oliveira Jr., Osvaldo N. Abstract in English: Label-free immunosensors based on screen-printed carbon electrodes offer a promising platform for the detection of cancer biomarkers. Herein, we explore the use of machine learning techniques to improve the performance of these immunosensors. We evaluate the influence of various redox probes on the analytical response in detecting the cancer biomarker protein p53. Ascorbic acid (AA) was found to be the optimal redox probe, exhibiting a sensitivity of 0.26 ng mL⁻¹, attributed to its strong affinity to proteins through hydrogen bonds and electrostatic interactions. We also extracted analytical information from the voltammograms, such as shifts in peak potential and changes in peak width, to construct datasets for supervised machine learning. Using different algorithms, including logistic regression, linear discriminant analysis, K-nearest neighbor, Gaussian Naive-Bayes, decision trees, and support vector machine, we identified positive samples spiked with p53 in artificial urine and saliva samples. Through a comparison of immunosensors with distinct molecular architectures, we determined that redox probe selection plays a critical role in performance, more significant than that of modifying the working electrodes. Furthermore, immunosensors with inferior inherent detection ability can achieve performance comparable to those with superior analytical characteristics when feature selection and machine learning algorithms are applied to the voltammograms. These findings illustrate the significance of extracting additional information from differential pulse voltammograms beyond peak current intensity. 
Furthermore, using machine learning techniques allows one to design biosensors capable of distinguishing biomarkers even in complex samples. |
Short Report Effect of the Alkyl Side Chain of Antitrypanosomal Cinnamate, p-Coumarate, and Ferulate n-Alkyl Esters Using Multivariate Analysis and Computer-Aided Drug Design Silva, Matheus L. Baldim, João L. Costa-Silva, Thais A. Amaral, Maiara Romanelli, Maiara M. Levatti, Erica V. C. Tempone, Andre G. Lago, João Henrique G. Abstract in English: In the present work, three series of cinnamic (1), p-coumaric (2) and ferulic (3) esters containing different side-chains such as ethyl (1a-3a), n-propyl (1b-3b), n-butyl (1c-3c), n-pentyl (1d-3d), n-hexyl (1e-3e), and n-heptyl (1f-3f) were prepared and tested for activity against trypomastigote forms of the parasite Trypanosoma cruzi and for toxicity against NCTC cells. The results indicated that the presence of p-coumaric or ferulic moieties associated with C4-C7 linear side-chains plays an important role in the bioactivity against T. cruzi, since compounds 2c-2f and 3d-3f were found to be the most active derivatives, with half maximal effective concentration (EC50) values ranging from 12.8 to 1.7 μM, superior to that determined for the positive control benznidazole (EC50 = 16.4 μM). Additionally, machine learning and multivariate statistical analyses identified molecular features correlated with biological activity, emphasizing the importance of side-chain length and lipophilicity and highlighting the significance of the molecular structure of phenylpropanoid derivatives in the activity against T. cruzi. |