High-dimensional descriptor selection and computational QSAR modeling for antitumor activity of ARC-111 analogues based on support vector regression (SVR)

Abstract

To design ARC-111 analogues with improved efficiency, we built quantitative structure-activity relationship (QSAR) models for the antitumor activity of 22 ARC-111 analogues against RPMI8402 tumor cells. First, an optimized support vector regression (SVR) model, built on descriptors taken from the literature and refined with the worst descriptor elimination multi-roundly (WDEM) method, generalized to the test set about as well as the artificial neural network (ANN) model. Second, seven and 11 more effective descriptors were selected from 2,923 features using the high-dimensional descriptor selection nonlinearly (HDSN) and WDEM methods, and the SVR models built on these descriptors (SVR3 and SVR4) gave better evaluation measures and more precise predictions for the test set. An interpretability system for the better-performing SVR models was also established. Our analysis offers some useful parameters for designing ARC-111 analogues with enhanced antitumor activity.
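
The abstract does not spell out the WDEM procedure, so the following is only a minimal sketch of the general idea its name suggests: greedy backward elimination, where each round drops the descriptor whose removal most reduces a cross-validated error, and the loop stops when no removal helps. All names here (`backward_eliminate`, `loo_nn_mse`, the toy data) are hypothetical. To stay dependency-free, the scorer is a leave-one-out 1-nearest-neighbour regressor standing in for SVR; in practice one would pass a scorer that returns the cross-validated error of an SVR model instead.

```python
from typing import Callable, List, Tuple

Matrix = List[List[float]]
Scorer = Callable[[Matrix, List[float]], float]


def loo_nn_mse(X: Matrix, y: List[float]) -> float:
    """Leave-one-out mean squared error of a 1-nearest-neighbour regressor.

    ASSUMPTION: this is a stand-in scorer, not SVR. Lower is better.
    """
    total = 0.0
    for i, xi in enumerate(X):
        best_j, best_d = -1, float("inf")
        for j, xj in enumerate(X):
            if j == i:
                continue
            d = sum((a - b) ** 2 for a, b in zip(xi, xj))
            if d < best_d:
                best_d, best_j = d, j
        total += (y[i] - y[best_j]) ** 2
    return total / len(X)


def backward_eliminate(
    X: Matrix, y: List[float], names: List[str], score: Scorer = loo_nn_mse
) -> Tuple[List[str], float]:
    """Drop one descriptor per round until no single removal lowers the error."""
    keep = list(range(len(names)))

    def project(cols: List[int]) -> Matrix:
        return [[row[c] for c in cols] for row in X]

    best = score(project(keep), y)
    while len(keep) > 1:
        # Error obtained when each remaining descriptor is left out in turn.
        trials = [(score(project([c for c in keep if c != d]), y), d) for d in keep]
        trial_err, worst = min(trials)
        if trial_err >= best:  # no removal improves the model: stop
            break
        best = trial_err
        keep.remove(worst)
    return [names[c] for c in keep], best


# Toy data (made up for illustration): activity equals the first descriptor,
# the second descriptor is unrelated noise.
X_toy = [[0.0, 5.0], [1.0, 0.0], [2.0, 4.0], [3.0, 1.0], [4.0, 3.0], [5.0, 2.0]]
y_toy = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
names_toy = ["signal", "noise"]

baseline = loo_nn_mse(X_toy, y_toy)
selected, err = backward_eliminate(X_toy, y_toy, names_toy)
print(selected, baseline, err)
```

On this toy set the noise descriptor is eliminated in the first round because leaving it out lowers the leave-one-out error. The same loop works unchanged with any other scorer, which is why the model-specific part is kept behind the `score` argument.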

Keywords

ARC-111 analogues, QSAR, Support vector regression, High-dimensional descriptor selection nonlinearly (HDSN) method, Worst descriptor elimination multi-roundly (WDEM) method, RPMI8402