Structure-Based Activity Prediction

Structure-based activity prediction of HIV-1 reverse transcriptase inhibitors

We have developed a fast and robust computational method for predicting antiviral activity in the automated de novo design of HIV-1 reverse transcriptase inhibitors. This structure-based approach assumes a linear relation between activity and interaction energy, uses discrete orientation sampling, and relies on localized interaction energy terms. The localization makes it possible to analyze mutations of the protein target and to separate inhibition from nonspecific binding to the enzyme. We apply the method to predicting the pIC50 of HIV-1 reverse transcriptase inhibitors. The model predicts the activity of an arbitrary compound with a q2 of 0.681 and an average absolute error of 0.66 log units, and it is fast enough for high-throughput computational applications.
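The linear relation between activity and localized interaction energy can be illustrated with a minimal sketch. It is not the published model: the residue labels (res_a, res_b, res_c), the energies and the pIC50 values are made up, and a plain least-squares line through the summed energies is assumed as the form of the relation.

```python
from statistics import mean

def summed_energy(per_residue, selected):
    """Sum the localized interaction energies over the selected residues."""
    return sum(per_residue[name] for name in selected)

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return 0.0, my  # no spread in x: fall back to the mean activity
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def mean_abs_error(predicted, observed):
    """Average absolute error in log units."""
    return mean(abs(p - o) for p, o in zip(predicted, observed))

# Hypothetical per-residue interaction energies (arbitrary units) for four
# compounds, and hypothetical observed pIC50 values. None of this is real data.
compounds = [
    {"res_a": -2.0, "res_b": -4.0, "res_c": -1.0},
    {"res_a": -3.0, "res_b": -5.0, "res_c": -0.5},
    {"res_a": -4.0, "res_b": -6.0, "res_c": -2.0},
    {"res_a": -5.0, "res_b": -7.0, "res_c": -0.2},
]
observed_pic50 = [5.0, 6.0, 7.0, 8.0]

# Only residues assumed to matter for inhibition contribute energy terms;
# res_c stands in for a residue that binds the ligand without inhibiting.
selected = ["res_a", "res_b"]

x = [summed_energy(c, selected) for c in compounds]
slope, intercept = fit_line(x, observed_pic50)
predicted = [slope * xi + intercept for xi in x]
print("slope:", slope, "intercept:", intercept)
print("mean absolute error:", mean_abs_error(predicted, observed_pic50))
```

Because each residue contributes its own energy term, dropping or changing one term is a simple way to probe the effect of a mutation at that position within such a sketch.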

Reprints of the paper describing this study can be requested at info@molmo.be or from JMedChem directly. See also our publications for related studies.

final drug design model
Final self-consistent model. The training set is contoured by measurement density, and the external reference is shown as square markers. Training set q2 = 0.681; the solid lines indicate a range of ±1 log unit.



NNRTI binding site
Non-nucleoside reverse transcriptase inhibitor (NNRTI) binding site, with the residues involved in the inhibition model. The light gray area indicates the protein side chains and the dark gray area the backbone atoms. The thin lines show residues that do not contribute energy terms to the prediction. The molecule in the pocket is a typical potent NNRTI.



validation for overfitting
Test for overfitting of the predictive NNRTI model. Shown is the 'best' model found by the genetic algorithm after the correlation between the computed interaction matrix and pIC50 was destroyed by randomizing the observation column (training set q2 = 0.011 after this randomization).
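The randomization test described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the published procedure: the descriptor values and activities are made up, and a squared Pearson correlation (r_squared) stands in for the model score. In the study the whole genetic algorithm search is rerun on the scrambled column; here any scoring function can be passed in through the hypothetical `score` parameter.

```python
import random
from statistics import mean

def r_squared(x, y):
    """Squared Pearson correlation, used here as a simple stand-in score."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)

def y_randomization(x, y, score=r_squared, n_rounds=20, seed=0):
    """Score the descriptors against shuffled copies of the observations."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_rounds):
        shuffled = list(y)  # copy, so the caller's observations stay intact
        rng.shuffle(shuffled)
        scores.append(score(x, shuffled))
    return scores

# Made-up summed interaction energies and activities, for illustration only.
energies = [-6.0, -7.5, -8.0, -9.5, -10.0, -11.5, -12.0, -13.5]
activities = [5.1, 5.6, 6.2, 6.6, 7.1, 7.4, 8.0, 8.6]

original = r_squared(energies, activities)
scrambled = y_randomization(energies, activities)
print(f"original score: {original:.3f}")
print(f"mean score after scrambling: {mean(scrambled):.3f}")
```

A model that still scores well once the observation column has been scrambled is fitting noise; a score near zero, as reported in the caption, suggests the original correlation is not an artifact of the search.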




Inhibition and substrate recognition - a computational approach applied to HIV protease

We have developed a computational approach in which an inhibitor's strength is determined from its interaction energy with a limited set of amino acid residues of the inhibited protein. We applied this method to HIV protease. The method uses a consensus structure built from X-ray crystallographic data, and all inhibitors are docked into this consensus structure. Because not every ligand-protein interaction causes inhibition, we implemented a genetic algorithm to determine the relevant set of residues. The algorithm optimizes the q2 between the sum of interaction energies and the observed inhibition constants. The best resulting predictive model has a q2 of 0.63. External validation, which examines the predictivity for compounds not used in deriving the model, gives a prediction accuracy between 0.9 and 1.5 log10 units. Out of the 198 residues in the whole protein, the best internally predictive model defines a subset of 20 residues and the best externally predictive model a subset of 9 residues. These residues are distributed over the subsites of the enzyme. This approach provides insight into which interactions are important for inhibiting HIV protease, and it allows quantitative prediction of inhibitor strength.
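The residue selection step can be sketched as a small genetic algorithm over boolean residue masks. This is a sketch under assumptions, not the published implementation: q2 is taken to be the leave-one-out cross-validated r2 that is conventional in QSAR (the text does not state the scheme), and the population size, mutation rate, one-point crossover and truncation selection are arbitrary illustrative choices. All function names are hypothetical.

```python
import random
from statistics import mean

def loo_q2(x, y):
    """Leave-one-out q2 = 1 - PRESS / SS_total for the line y = a*x + b."""
    my = mean(y)
    ss_total = sum((yi - my) ** 2 for yi in y)
    if ss_total == 0:
        return float("-inf")
    press = 0.0
    for i in range(len(x)):
        xt, yt = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        mx, myt = mean(xt), mean(yt)
        sxx = sum((v - mx) ** 2 for v in xt)
        sxy = sum((v - mx) * (w - myt) for v, w in zip(xt, yt))
        slope = sxy / sxx if sxx else 0.0
        press += (y[i] - (myt + slope * (x[i] - mx))) ** 2
    return 1.0 - press / ss_total

def fitness(mask, energies, y):
    """q2 between the energy summed over the masked residues and activity."""
    if not any(mask):
        return float("-inf")  # an empty residue set cannot predict anything
    x = [sum(e for e, keep in zip(row, mask) if keep) for row in energies]
    return loo_q2(x, y)

def evolve(energies, y, pop_size=30, generations=40, p_mut=0.05, seed=0):
    """Search residue masks; needs at least two residues per compound."""
    rng = random.Random(seed)
    n = len(energies[0])
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, energies, y), reverse=True)
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    best = max(pop, key=lambda m: fitness(m, energies, y))
    return best, fitness(best, energies, y)

# Synthetic example: 12 compounds, 6 residues, activity driven by residues
# 1 and 4 only. The numbers are generated, not measured.
demo_rng = random.Random(1)
demo_energies = [[demo_rng.uniform(-5, 0) for _ in range(6)] for _ in range(12)]
demo_activity = [4.0 - 0.5 * (row[1] + row[4]) for row in demo_energies]
best_mask, best_q2 = evolve(demo_energies, demo_activity)
print("selected residues:", [i for i, keep in enumerate(best_mask) if keep])
print("q2:", round(best_q2, 3))
```

Selecting on an internal q2 alone can favour larger residue sets than an external test supports, which is in line with the difference between the 20-residue and 9-residue models reported above.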

Reprints of the paper describing this study can be requested at info@molmo.be. See also our publications for related studies.

protease flowchart
Overview of the steps in the genetic algorithm for feature selection.



protease model
HIV protease coloured according to its subsites. The residues that define a model are rendered as surfaces. The upper enzyme shows the 20 residues from the best internally predictive model. The 9 residues from the best externally predictive model are depicted in the enzyme below.