The simplified molecular input-line entry system (SMILES) and the IUPAC International Chemical Identifier (InChI) were examined as representations of molecular structure for quantitative structure-activity relationships (QSAR) that predict the inhibitory activity of styrylquinoline derivatives against human immunodeficiency virus type 1 (HIV-1). The best model built with optimal SMILES-based descriptors has n=26, r2=0.6330, q2=0.5812, s=0.502, F=41 for the training set and n=10, r2=0.7493, =0.6235, =0.537, s=0.541, F=24 for the validation set. The best model built with optimal InChI-based descriptors has n=26, r2=0.8673, q2=0.8456, s=0.302, F=157 for the training set and n=10, r2=0.8562, =0.7715, =0.819, s=0.329, F=48 for the validation set. Thus, the InChI-based model is preferable. Both the SMILES-based and the InChI-based approaches were checked with five random splits into training and test sets. InChI-based optimal descriptors give more robust predictions of anti-HIV-1 activity (pEC50) than the analogous descriptors calculated from SMILES.
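As a rough consistency check on the statistics quoted above, the Fisher F value of a linear model can be recomputed from n and r2 alone. The sketch below assumes a regression on a single optimal descriptor (k=1); the abstract does not state this, so it is an assumption. The function name `f_statistic` and the `reported` table are illustrative only, and the numbers in the table are copied from the abstract.

```python
def f_statistic(r2: float, n: int, k: int = 1) -> float:
    """Fisher F for a linear model with k descriptors fitted to n compounds.

    k=1 (a single optimal descriptor) is an assumption, not stated in the abstract.
    """
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# (label, n, r2, F as reported in the abstract)
reported = [
    ("SMILES, training", 26, 0.6330, 41),
    ("SMILES, validation", 10, 0.7493, 24),
    ("InChI, training", 26, 0.8673, 157),
    ("InChI, validation", 10, 0.8562, 48),
]

for label, n, r2, f_reported in reported:
    f_calc = f_statistic(r2, n)
    print(f"{label}: F = {f_calc:.1f} (reported {f_reported})")
```

Under the single-descriptor assumption, each recomputed value rounds to the F reported for that set. This agrees with the quoted figures but does not by itself confirm the form of the model.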
- Anti-HIV-1 inhibitory activity
- Optimal descriptor
ASJC Scopus subject areas
- Molecular Medicine