QSAR and Drug Design

David R. Bevan

Department of Biochemistry and Anaerobic Microbiology
Virginia Polytechnic Institute and State University
Blacksburg, VA 24061-0308 USA
E-mail: drbevan@vt.edu



Quantitative structure-activity relationships (QSAR) represent an attempt to correlate structural or property descriptors of compounds with activities. These physicochemical descriptors, which include parameters to account for hydrophobicity, topology, electronic properties, and steric effects, are determined empirically or, more recently, by computational methods. Activities used in QSAR include chemical measurements and biological assays. QSAR are currently applied in many disciplines, with many applications pertaining to drug design and environmental risk assessment.

The Early Years

QSAR date back to the 19th century. In 1863, A.F.A. Cros at the University of Strasbourg observed that toxicity of alcohols to mammals increased as the water solubility of the alcohols decreased [1]. In the 1890s, Hans Horst Meyer of the University of Marburg and Charles Ernest Overton of the University of Zurich, working independently, noted that the toxicity of organic compounds depended on their lipophilicity [1,2].

Linear Free Energy Relationships

Little additional development of QSAR occurred until the work of Louis Hammett (1894-1987), who correlated electronic properties of organic acids and bases with their equilibrium constants and reactivity. Consider the dissociation of benzoic acid:

Hammett observed that adding substituents to the aromatic ring of benzoic acid had an orderly and quantitative effect on the dissociation constant. For example, a nitro group in the meta position increases the dissociation constant, because the nitro group is electron-withdrawing, thereby stabilizing the negative charge that develops. Consider now the effect of a nitro group in the para position:

The equilibrium constant is even larger than for the nitro group in the meta position, indicating even greater electron-withdrawal.

Now consider the case in which an ethyl group is in the para position:

In this case, the dissociation constant is lower than for the unsubstituted compound, indicating that the ethyl group is electron-donating, thereby destabilizing the negative charge that arises upon dissociation.

Hammett also observed that substituents have a similar effect on the dissociation of other organic acids and bases. Consider the dissociation of phenylacetic acids:

Electron-withdrawal by the nitro group increases dissociation, with the effect being less for the meta than for the para substituent, just as was observed with benzoic acid. The electron-donating ethyl group decreases the equilibrium constant, as would be expected.

Data for these equilibria typically are graphed as illustrated below:

Figure 1: Example of a graph for a linear free energy relationship. $K_0$ and $K'_0$ represent equilibrium constants for unsubstituted compounds, and $K$ and $K'$ represent those for substituted compounds. Values for the abscissa are calculated from the dissociation constants of unsubstituted and substituted benzoic acid. Values for the ordinate are obtained from another organic acid or base with identical patterns of substitution, in this case phenylacetic acid.

Because this relationship is linear, the following equation can be written:

$$\log\frac{K'}{K'_0} = \rho \log\frac{K}{K_0}$$

where $\rho$ is the slope of the line. The values for the abscissa in Figure 1 are always those for benzoic acid and are given the symbol $\sigma$. Therefore, we can write:

$$\log\frac{K'}{K'_0} = \rho\sigma$$

$\rho$, the slope of the line, is a proportionality constant pertaining to a given equilibrium. It relates the effect of substituents on that equilibrium to the effect of those substituents on the benzoic acid equilibrium. That is, if the effect of substituents is proportionally greater than on the benzoic acid equilibrium, then $\rho > 1$; if the effect is less than on the benzoic acid equilibrium, $\rho < 1$. By definition, $\rho$ for benzoic acid is equal to 1.

$\sigma$ is a descriptor of the substituents. The magnitude of $\sigma$ gives the relative strength of the electron-withdrawing or -donating properties of the substituents. $\sigma$ is positive if the substituent is electron-withdrawing and negative if it is electron-donating.
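The role of $\rho$ as a slope can be illustrated with a short least-squares fit. The Python sketch below uses synthetic data only: the $\sigma$ values 0, 0.71, and -0.13 are the ones quoted elsewhere in this article for hydrogen, nitro, and methyl, whereas the reaction constant of 0.5 and the function name fit_rho are assumptions made purely for illustration. The line is forced through the origin because the unsubstituted compound has $\sigma = 0$ and a log ratio of 0 by definition.

```python
def fit_rho(sigmas, log_k_ratios):
    """Least-squares slope through the origin for log(K'/K0') = rho * sigma."""
    numerator = sum(s * y for s, y in zip(sigmas, log_k_ratios))
    denominator = sum(s * s for s in sigmas)
    if denominator == 0:
        raise ValueError("need at least one substituent with nonzero sigma")
    return numerator / denominator


# Synthetic example: generate log(K'/K0') from an assumed rho, then recover it.
ASSUMED_RHO = 0.5  # hypothetical reaction constant, not a measured value
sigmas = [0.0, 0.71, -0.13]
log_k_ratios = [ASSUMED_RHO * s for s in sigmas]

rho = fit_rho(sigmas, log_k_ratios)
print(f"fitted rho = {rho:.2f}")
```

With real measurements the points would scatter about the line, and the fitted slope would be the estimate of $\rho$ for that equilibrium.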

These relationships as developed by Hammett are termed linear free energy relationships. Recall the equation relating free energy to an equilibrium constant:

$$\Delta G^\circ = -RT \ln K = -2.303\,RT \log K$$

That is, the free energy is proportional to the logarithm of the equilibrium constant. These linear free energy relationships are termed "extrathermodynamic". Although they can be stated in terms of thermodynamic parameters, no thermodynamic principle states that the relationships should be true.
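The connection to free energy can be made explicit with a short derivation. It is a sketch that assumes the standard relation between the standard free energy change and the equilibrium constant, with $R$ the gas constant and $T$ the absolute temperature, and that both reaction series are measured at the same temperature. The subscript 0 marks the unsubstituted compound and the prime marks the second reaction series, following the notation of Figure 1.

```latex
\begin{align*}
\Delta G^\circ &= -RT \ln K = -2.303\,RT \log K \\
\log\frac{K}{K_0} &= -\frac{\Delta G^\circ - \Delta G^\circ_0}{2.303\,RT} \\
\log\frac{K'}{K'_0} = \rho \log\frac{K}{K_0}
\quad &\Longrightarrow \quad
\Delta G^{\circ\prime} - \Delta G^{\circ\prime}_0
  = \rho \left( \Delta G^\circ - \Delta G^\circ_0 \right)
\end{align*}
```

In words, the substituent-induced change in free energy for one reaction series is proportional to the corresponding change for the benzoic acid series, which is the sense in which the free energy relationship is linear.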

To develop a better understanding of these relationships, it is instructive to consider some values of $\rho$ and $\sigma$. Values of $\rho$ are provided below:

In the aniline and phenol equilibria, the hydrogen ion that is dissociating is one atom removed from the phenyl ring, whereas in the benzoic acid equilibrium it is two atoms removed. Thus, substituents are able to exert a greater effect on the dissociation in aniline and phenol than in benzoic acid and the value of $\rho > 1$. In phenylacetic and phenylpropionic acids, the hydrogen ion dissociating is three and four atoms removed, respectively, from the phenyl ring. Substituents are able to exert a lesser effect on the equilibrium than on the benzoic acid equilibrium and $\rho < 1$.

Some illustrative values of $\sigma$ for substituents in the meta and para positions are given below:

By definition, $\sigma$ for hydrogen is 0. The positive values of $\sigma$ for the nitro group indicate that it is electron-withdrawing. In understanding the magnitudes of the $\sigma$ values for the nitro group in meta vs. para positions, consider the mechanisms of electron withdrawal or donation. For a nitro group in the meta position, electron-withdrawal is due to an inductive effect produced by the electronegativity of the constituent atoms. If only induction were operative, one would expect the electron-withdrawing effect of a nitro group in the para position to be less than in the meta position. The larger $\sigma$ value for a para-substituted nitro group results from the combination of both inductive and resonance effects. For chlorine, the electronegativity of the atom produces an inductive electron-withdrawing effect, with the magnitude of the effect in the para position being less than in the meta position. For chlorine, only the inductive effect is possible. The methoxy group can be electron-donating or -withdrawing, depending on the position of substitution. In the meta position, the electronegativity of the oxygen produces an inductive electron-withdrawing effect. In the para position, only a small inductive effect would be expected. Moreover, an electron-donating resonance effect occurs for the methoxy group in the para position, giving an overall electron-donating effect. Tables of $\sigma$ values for numerous substituents have been published [3,4]. In some cases, the $\sigma$ values are generally applicable to many different equilibria. In other cases, $\sigma$ values have been derived for specific equilibria, which is particularly true when one considers $\sigma$ values for ortho substituents.

Applications of the Hammett Equation

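Illustrative examples of the application of the Hammett relationship will be presented. The first is the prediction of the pKa of ionization equilibria. Recall the relationship

$$\log\frac{K}{K_0} = \rho\sigma$$

which for benzoic acid, where $\rho = 1$ and pKa = -log K, is

$$\mathrm{p}K_a = \mathrm{p}K_{a,0} - \sum\sigma$$

Here $\mathrm{p}K_{a,0}$ is the pKa of unsubstituted benzoic acid, and the sum runs over all substituents on the ring.

Consider the substituted benzoic acid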

Given $\sigma$ = 0.71 for nitro groups and $\sigma$ = -0.13 for methyl groups, we calculate pKa = 2.91, which compares favorably with the experimental value of 2.97.
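The arithmetic behind this estimate can be sketched in Python. Two inputs below are assumptions rather than values stated in the text: a pKa of 4.20 for unsubstituted benzoic acid, and a substitution pattern of two nitro groups plus one methyl group, chosen because it reproduces the quoted result of 2.91. The sketch also assumes that substituent effects are additive, and the function name predict_pka is illustrative.

```python
def predict_pka(pka_parent, rho, sigmas):
    """Hammett estimate: pKa = pKa(parent) - rho * sum(sigma)."""
    return pka_parent - rho * sum(sigmas)


PKA_BENZOIC_ACID = 4.20  # assumed literature value for the parent acid
RHO_BENZOIC_ACID = 1.0   # rho is 1 for benzoic acid by definition
SIGMA_NITRO = 0.71       # value quoted in the text
SIGMA_METHYL = -0.13     # value quoted in the text

# Assumed pattern: two nitro groups and one methyl group on the ring.
pka = predict_pka(
    PKA_BENZOIC_ACID,
    RHO_BENZOIC_ACID,
    [SIGMA_NITRO, SIGMA_NITRO, SIGMA_METHYL],
)
print(f"predicted pKa = {pka:.2f}")
```

The summed substituent constant is 1.29, so the estimate is 4.20 - 1.29 = 2.91 under these assumptions.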

The second example illustrates the applicability of Hammett's electronic descriptors in a QSAR relating the inhibition of bacterial growth by a series of sulfonamides,

where X represents various substituents [5,6]. A QSAR was developed based on the $\sigma$ values of the substituents,

where C is the minimum concentration of compound that inhibited growth of E. coli. From this relationship, we see that electron-withdrawing substituents favor inhibition of growth.

Hansch Analysis

QSAR based on Hammett's relationship utilize electronic properties as the descriptors of structures. Difficulties were encountered when investigators attempted to apply Hammett-type relationships to biological systems, indicating that other structural descriptors were necessary.

Robert Muir, a botanist at Pomona College, was studying the biological activity of compounds that resembled indoleacetic acid and phenoxyacetic acid, which function as plant growth regulators. In attempting to correlate the structures of the compounds with their activities, he consulted his colleague in chemistry, Corwin Hansch. Using Hammett sigma parameters to account for the electronic effect of substituents did not lead to meaningful QSAR. However, Hansch recognized the importance of the lipophilicity, expressed as the octanol-water partition coefficient, on biological activity [7]. We now recognize this parameter to provide a measure of the bioavailability of compounds, which will determine, in part, the amount of the compound that gets to the target site.

Relationships were developed to correlate a structural parameter (i.e., lipophilicity) with activity. In some cases, a univariate relationship correlating structure and activity was adequate. The form of the equation is:

$$\log\frac{1}{C} = a \log P + b$$

where C is the molar concentration of compound that produces a standard response (e.g., LD50, ED50), P is the octanol-water partition coefficient, and a and b are fitted constants. With other data, it was observed that correlations were improved by combining Hammett's electronic parameters and Hansch's measure of lipophilicity using an equation such as

$$\log\frac{1}{C} = a\pi + b\sigma + c$$

where $\sigma$ is the Hammett substituent parameter and $\pi$ is defined analogously to $\sigma$. That is,

$$\pi = \log P_X - \log P_H$$

where $P_X$ and $P_H$ are the partition coefficients of the substituted and unsubstituted compounds, respectively.

In yet other cases, parabolic relationships between biological response and hydrophobicity were observed that could be fit by including a $(\log P)^2$ term in the QSAR. One interpretation to account for this term is that many membranes must be traversed for compounds to get to the target site, and those with greatest hydrophobicity will become localized in the membranes they encounter initially. Thus, an optimum hydrophobicity may be found in some test systems.
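For a parabolic model, the optimum hydrophobicity follows from setting the derivative with respect to log P to zero. The Python sketch below assumes a QSAR of the form log(1/C) = a(log P)^2 + b log P + c with a < 0. The coefficient names and the numbers in the usage lines are hypothetical and are not fitted values from any dataset.

```python
def optimum_log_p(a, b):
    """Return log P at the maximum of a*(log P)**2 + b*log P + c.

    The derivative 2*a*log P + b vanishes at log P = -b / (2*a); a must be
    negative for this stationary point to be a maximum.
    """
    if a >= 0:
        raise ValueError("a must be negative for the parabola to have a maximum")
    return -b / (2 * a)


def hansch_parabolic(log_p, a, b, c):
    """Evaluate log(1/C) for the parabolic model."""
    return a * log_p ** 2 + b * log_p + c


# Hypothetical coefficients chosen only to illustrate the calculation.
a, b, c = -0.5, 2.0, 1.0
log_p_opt = optimum_log_p(a, b)
print(log_p_opt, hansch_parabolic(log_p_opt, a, b, c))
```

Compounds with log P on either side of this optimum would be predicted to be less active, which matches the membrane-trapping interpretation given above.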

QSAR are now developed using a variety of parameters as descriptors of the structural properties of molecules. Hammett sigma values are often used for electronic parameters, but quantum mechanically derived electronic parameters also may be used. Other descriptors to account for the shape, size, lipophilicity, polarizability, and other structural properties also have been devised. A QSAR database has been established at Pomona College that summarizes over 6000 datasets of biological and chemical QSAR.

Drug Design

Researchers have attempted for many years to develop drugs based on QSAR. Easy access to computational resources was not available when these efforts began, so attempts consisted primarily of statistical correlations of structural descriptors with biological activities. However, as access to high-speed computers and graphics workstations became commonplace, the field evolved into what is often termed rational drug design or computer-assisted drug design.

We will discuss the application of QSAR to drug design, some examples of which relied primarily on statistical correlation and some, on computer-based visualization and modeling. An early example of QSAR in drug design involves a series of 1-(X-phenyl)-3,3-dialkyl triazenes.

These compounds were of interest for their anti-tumor activity, but they also were mutagenic. QSAR was applied to understand how the structure might be modified to reduce the mutagenicity without significantly decreasing the anti-tumor activity. Mutagenic activity was evaluated in the Ames test, and from those data, the following QSAR was developed:

where C is the molar concentration required to give 30 revertants per $10^8$ bacteria and $\sigma^+$ is a "through resonance" electronic parameter [8,9]. From the equation, it is seen that factors that favor mutagenicity are increased lipophilicity and electron-donating substituents.

Studies of the anti-tumor activity were done against L1210 leukemia in mice. From the data, the following QSAR was developed:

where C is the molar concentration of compound producing a 40% increase in life span of mice, MR is molar refractivity, which is a measure of molecular volume, and EsR is a steric parameter for the R group [10]. Based on these equations, mutagenicity is more sensitive than anti-tumor activity to the electronic effects of the substituents. Thus, electron-withdrawing substituents were examined, as illustrated in the example below:

By substituting a sulfonamide group at the para position, the anti-tumor activity was reduced 1.2-fold, whereas the mutagenicity was reduced by about 400-fold.
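Because these QSAR are written in terms of log(1/C), a fold change in potency corresponds to a difference on the logarithmic scale. The short Python sketch below converts the two fold changes quoted above into log units. The function name is illustrative.

```python
import math


def fold_to_log_units(fold_change):
    """Convert a fold reduction in activity to the drop in log(1/C)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return math.log10(fold_change)


antitumor_drop = fold_to_log_units(1.2)     # about 0.08 log units
mutagenicity_drop = fold_to_log_units(400)  # about 2.6 log units
print(f"{antitumor_drop:.2f} {mutagenicity_drop:.2f}")
```

On this scale the sulfonamide substitution costs less than a tenth of a log unit of anti-tumor activity while removing about 2.6 log units of mutagenicity.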

Computer-Assisted Design

Computer-assisted drug design (CADD), also called computer-assisted molecular design (CAMD), represents more recent applications of computers as tools in the drug design process. In considering this topic, it is important to emphasize that computers cannot substitute for a clear understanding of the system being studied. That is, a computer is only an additional tool to gain better insight into the chemistry and biology of the problem at hand.

In most current applications of CADD, attempts are made to find a ligand (the putative drug) that will interact favorably with a receptor that represents the target site. Binding of ligand to the receptor may include hydrophobic, electrostatic, and hydrogen-bonding interactions. In addition, solvation energies of the ligand and receptor site also are important because partial to complete desolvation must occur prior to binding.

This approach to CADD optimizes the fit of a ligand in a receptor site. However, optimum fit in a target site does not guarantee that the desired activity of the drug will be enhanced or that undesired side effects will be diminished. Moreover, this approach does not consider the pharmacokinetics of the drug.

The approach used in CADD is dependent upon the amount of information that is available about the ligand and receptor. Ideally, one would have 3-dimensional structural information for the receptor and the ligand-receptor complex from X-ray diffraction or NMR. The ideal is seldom realized. In the opposite extreme, one may have no experimental data to assist in building models of the ligand and receptor, in which case computational methods must be applied without the constraints that the experimental data would provide.

Based on the information that is available, one can apply either ligand-based or receptor-based molecular design methods. The ligand-based approach is applicable when the structure of the receptor site is unknown, but when a series of compounds has been identified that exert the activity of interest. For the approach to be most effective, one should have structurally similar compounds with high activity, with no activity, and with a range of intermediate activities. In recognition site mapping, an attempt is made to identify a pharmacophore, which is a template derived from the structures of these compounds. It is represented as a collection of functional groups in three-dimensional space that is complementary to the geometry of the receptor site.

In applying this approach, conformational analysis will be required, the extent of which will be dependent on the flexibility of the compounds under investigation. One strategy is to find the lowest energy conformers of the most rigid compounds and superimpose them. Conformational searching on the more flexible compounds is then done while applying distance constraints derived from the structures of the more rigid compounds. Ultimately, all of the structures are superimposed to generate the pharmacophore. This template may then be used to develop new compounds with functional groups in the desired positions. In applying this strategy, one must recognize that one is assuming that it is the minimum energy conformers that will bind most favorably in the receptor site. In fact, there is no a priori reason to exclude higher energy conformers as the source of activity.
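The distance-constraint step can be sketched as follows. This Python example is a minimal illustration and not a production pharmacophore tool: the feature names, the coordinates, and the 0.5 angstrom tolerance are all hypothetical, and the check simply compares the inter-feature distances of a flexible conformer with those of a rigid reference compound.

```python
import math
from itertools import combinations


def feature_distances(features):
    """Map each pair of feature names to the distance between them."""
    return {
        (a, b): math.dist(features[a], features[b])
        for a, b in combinations(sorted(features), 2)
    }


def satisfies_constraints(conformer, reference, tolerance=0.5):
    """True if every inter-feature distance matches the reference within tolerance."""
    ref = feature_distances(reference)
    conf = feature_distances(conformer)
    return all(abs(conf[pair] - ref[pair]) <= tolerance for pair in ref)


# Hypothetical coordinates (in angstroms) for three pharmacophore features.
rigid_reference = {
    "donor": (0.0, 0.0, 0.0),
    "acceptor": (3.0, 0.0, 0.0),
    "ring": (0.0, 4.0, 0.0),
}
conformer_a = {
    "donor": (1.0, 1.0, 0.0),
    "acceptor": (4.2, 1.0, 0.0),
    "ring": (1.0, 5.1, 0.0),
}
conformer_b = {
    "donor": (0.0, 0.0, 0.0),
    "acceptor": (6.0, 0.0, 0.0),
    "ring": (0.0, 4.0, 0.0),
}

print(satisfies_constraints(conformer_a, rigid_reference))
print(satisfies_constraints(conformer_b, rigid_reference))
```

Comparing distances rather than raw coordinates makes the check independent of how each molecule is positioned in space, so no superposition is needed at this filtering stage. Conformers that pass would then be superimposed on the rigid compounds to build the template.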

The receptor-based approach to CADD applies when a reliable model of the receptor site is available, as from X-ray diffraction, NMR, or homology modeling. With the availability of the receptor site, the problem is to design ligands that will interact favorably at the site, which is a docking problem.

An Example of CADD: Carbonic anhydrase

Carbonic anhydrase catalyzes the reaction $\mathrm{CO_2 + H_2O \rightleftharpoons HCO_3^- + H^+}$, the hydration of some aldehydes and ketones, and the hydrolysis of alkyl and aryl esters. It is a zinc-containing enzyme of about 30,000 daltons, and the three-dimensional structure has been characterized by X-ray diffraction. Physiologically, carbonic anhydrase is involved in gastric, urinary, pancreatic, lacrimal, and cerebrospinal secretions. Inhibitors of carbonic anhydrase include aromatic and heterocyclic sulfonamides, and some of these compounds have found application as diuretics.

Both traditional QSAR and computer graphical methods have been applied to the development of sulfonamides and other compounds as inhibitors of carbonic anhydrase. For example, Hansch et al. [11] developed a QSAR based on the binding constants of 29 phenylsulfonamides to the enzyme. The equation that was derived was the following:

where K is the binding constant, I1 = 1 if X is meta and 0 otherwise, and I2 = 1 if X is ortho and 0 otherwise.

The negative coefficients of I1 and I2 suggest that they account for unfavorable steric effects when substituents are in the meta or ortho positions. Binding is favored by electron-withdrawing substituents, which is consistent with the hypothesis that the ionized form of -SO2NH2 binds to the zinc in the active site of carbonic anhydrase [12].
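Indicator variables such as I1 and I2 are simple 0/1 encodings of a structural feature. A minimal Python sketch of the encoding described above follows. The function name and the string labels for ring positions are illustrative choices.

```python
def position_indicators(position):
    """Return (I1, I2) for a substituent position on the phenyl ring.

    I1 is 1 for meta substituents and I2 is 1 for ortho substituents;
    both are 0 otherwise (for example, para).
    """
    position = position.lower()
    if position not in {"ortho", "meta", "para"}:
        raise ValueError(f"unknown ring position: {position!r}")
    i1 = 1 if position == "meta" else 0
    i2 = 1 if position == "ortho" else 0
    return i1, i2


for pos in ("para", "meta", "ortho"):
    print(pos, position_indicators(pos))
```

In a regression, each indicator contributes its coefficient only for compounds where it equals 1, so a negative coefficient lowers the predicted binding for that substitution pattern relative to para.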

Interactive computer graphics were also applied to better understand the interaction of carbonic anhydrase inhibitors with the enzyme, as illustrated in Figure 2 [11].

Figure 2: Active site of carbonic anhydrase containing the inhibitor MTS [(4S-trans)-4-(methylamino)-5,6-dihydro-6-methyl-4H-thieno (2,3-B) thiopyran-sulfonamide-7, 7-dioxide]. The image was prepared from the PDB file 1cin.pdb.

The active site is a cavity approximately 12 Angstroms deep with a zinc atom (magenta) near the bottom of the cavity. The active site is divided into a hydrophilic half (blue) and a hydrophobic half (red). In the complex, the inhibitor appears to be bound such that the sulfonamide moiety occupies the fourth coordination site of the zinc atom, with the other three sites being occupied by histidine residues. For subsequent discussion, note that the active site is much larger than is required to accommodate an inhibitor of this size.

Receptor-based drug design incorporates a number of molecular modeling techniques, one of which is docking. The Kuntz research group [13] applied their DOCK program to the identification of compounds that may inhibit carbonic anhydrase. Structures of two of the candidates are shown below.

These molecules are considerably larger than the arylsulfonamides that traditionally are used as carbonic anhydrase inhibitors. In fact, no arylsulfonamides were identified as potential inhibitors in this study. These results probably arise because scoring of candidates was based on the size and shape of the molecules. These large candidates can engage in a greater number of favorable interactions within the large carbonic anhydrase active site than can the smaller arylsulfonamides. More recent versions of DOCK allow scoring based on force fields, which include both van der Waals and electrostatic interactions [14]. These results with DOCK illustrate the potential for programs such as this one to search objectively for ligands that are complementary to receptor sites, thereby assisting researchers in identifying potential drugs that may be considerably different from existing drugs. As yet, the efficacy as drugs of these candidates identified by DOCK has not been demonstrated.
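To make the idea of force-field scoring concrete, the sketch below sums a Lennard-Jones 12-6 term and a Coulomb term over ligand-receptor atom pairs. It is a generic illustration in Python and does not reproduce the DOCK implementation. The well depth, the optimal contact distance, the constant dielectric, and the atom tuples are all assumptions, and a single pair of van der Waals parameters is used for every atom pair for simplicity. The factor 332.06 is the usual conversion that yields kcal/mol for charges in electron units and distances in angstroms.

```python
import math

COULOMB_CONSTANT = 332.06  # kcal*angstrom/(mol*e^2)


def pair_energy(r, epsilon, r_min, q1, q2, dielectric=1.0):
    """Lennard-Jones 12-6 plus Coulomb energy (kcal/mol) for one atom pair."""
    ratio6 = (r_min / r) ** 6
    van_der_waals = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
    electrostatic = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    return van_der_waals + electrostatic


def interaction_score(ligand_atoms, receptor_atoms, epsilon=0.1, r_min=3.5):
    """Sum pair energies; each atom is (x, y, z, partial_charge)."""
    total = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for rx, ry, rz, rq in receptor_atoms:
            r = math.dist((lx, ly, lz), (rx, ry, rz))
            total += pair_energy(r, epsilon, r_min, lq, rq)
    return total


# Hypothetical two-atom example: opposite partial charges 3.5 angstroms apart.
ligand = [(0.0, 0.0, 0.0, -0.5)]
receptor = [(3.5, 0.0, 0.0, 0.5)]
score = interaction_score(ligand, receptor)
print(f"{score:.2f} kcal/mol")
```

More negative scores indicate more favorable interactions. Unlike a purely shape-based score, the electrostatic term rewards complementary charges, so a small polar ligand is not automatically outscored by a larger one that simply fills the site.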

Applications of Other Modeling Techniques

Once potential drugs have been identified by the methods described above, other molecular modeling techniques may then be applied. For example, geometry optimization may be used to "relax" the structures and to identify low energy orientations of drugs in receptor sites. Molecular dynamics may assist in exploring the energy landscape, and free energy simulations can be used to compute the relative binding free energies of a series of putative drugs.


1.   Borman, S. (1990) New QSAR Techniques Eyed for Environmental Assessments. Chem. Eng. News, 68: 20-23.

2.   Lipnick, R.L. (1986) Charles Ernest Overton: Narcosis Studies and a Contribution to General Pharmacology. Trends Pharmacol. Sci., 7: 161-164.

3.   Hansch, C., Leo, A., and Taft, R.W. (1991) A Survey of Hammett Substituent Constants and Resonance and Field Parameters. Chem. Rev., 91: 165-195.

4.   Hansch, C., Leo, A., and Hoekman, D. (1995) Exploring QSAR - Hydrophobic, Electronic, and Steric Constants. American Chemical Society, Washington, D.C.

5.   Seydel, J.K. (1966) Prediction of in Vitro Activity of Sulfonamides, Using Hammett Constants or Spectrophotometric Data of the Basic Amines for Calculation. Mol. Pharmacol., 2: 259-265.

6.   Hansch, C. (1974) Drug Research or the Luck of the Draw. J. Chem. Ed., 51: 360-365.

7.   Hansch, C. (1969) A Quantitative Approach to Biochemical Structure-Activity Relationships. Acct. Chem. Res. 2: 232-239.

8.   Venger, B.H., Hansch, C., Hatheway, G.J., and Amrein, Y.U. (1979) Ames Test of 1-(X-Phenyl)-3,3-dialkyltriazenes. A Quantitative Structure-Activity Study. J. Med. Chem., 22: 473-476.

9.   Hansch, C. (1984-85) The QSAR Paradigm in the Design of Less Toxic Molecules. Drug Metab. Rev., 15: 1279-1294.

10.  Hatheway, G.J., Hansch, C., Kim, K.H., Milstein, S.R., Schmidt, C.L., Smith, R.N., and Quinn, F.R. (1978) Antitumor 1-(X-Aryl)-3,3-dialkyltriazenes. 1. Quantitative Structure-Activity Relationships vs. L1210 Leukemia in Mice. J. Med. Chem., 21: 563-574.

11.  Hansch, C., McClarin, J., Klein, T., and Langridge, R. (1985) A Quantitative Structure-Activity Relationship and Molecular Graphics Study of Carbonic Anhydrase Inhibitors. Mol. Pharmacol., 27: 493-498.

12.  Kumar, K., King, R.W., and Carey, P.R. (1974) Carbonic Anhydrase - Aromatic Sulfonamide Complexes, A Resonance Raman Study. FEBS Lett., 48: 283-287.

13.  DesJarlais, R.L., Sheridan, R.P., Seibel, G.L., Dixon, J.S., Kuntz, I.D., and Venkataraghavan, R. (1988) Using Shape Complementarity as an Initial Screen in Designing Ligands for a Receptor Binding Site of Known Three-Dimensional Structure. J. Med. Chem., 31: 722-729.

14.  Meng, E.C., Shoichet, B.K., and Kuntz, I.D. (1992) Automated Docking with Grid-Based Energy Evaluation. J. Comput. Chem., 13: 505-524.