Drug Designing – A Review - Interactivepharm

ABOUT AUTHOR:
Muhammed Mujahed
Master’s of Science in Biotechnology.
SRTM University.

INTRODUCTION:
Drug design is an integrated, developing discipline that portends an era of the ‘tailored drug’. It involves the study of the effects of biologically active compounds on the basis of molecular interactions, in terms of molecular structure or the physico-chemical properties involved. It studies the processes by which drugs produce their effects, how they react with the protoplasm to elicit a particular pharmacological effect or response, and how they are modified or detoxified, metabolized, or eliminated by the organism.

Disposition of drugs in individual regions of biosystems is one of the main factors determining the place, mode, and intensity of their action. The biological activity may be “positive”, as in drug design, or “negative”, as in toxicology. Thus drug design involves either total innovation of a lead or optimization of an already available lead. These concepts are the building stones upon which the edifice of drug design is built.

The drug is most commonly an organic small molecule that activates or inhibits the function of a biomolecule such as a protein, which in turn results in a therapeutic benefit to the patient. In the most basic sense, drug design involves the design of small molecules that are complementary in shape and charge to the biomolecular target with which they interact and therefore will bind to it. Drug design frequently, but not necessarily, relies on computer modeling techniques. This type of modeling is often referred to as computer-aided drug design. Finally, drug design that relies on knowledge of the three-dimensional structure of the biomolecular target is known as structure-based drug design.

REFERENCE ID: PHARMATUTOR-ART-2048

Principles of Drug Design

Lipinski’s Rule of Five
Lipinski’s rule of five, also known as Pfizer’s rule of five or simply the Rule of Five (RO5), is a rule of thumb for evaluating druglikeness, that is, for determining whether a chemical compound with a certain pharmacological or biological activity has properties that would make it a likely orally active drug in humans. The rule was formulated by Christopher A. Lipinski in 1997, based on the observation that most orally administered drugs are relatively small and lipophilic molecules.

The rule describes molecular properties important for a drug’s pharmacokinetics in the human body, including its absorption, distribution, metabolism, and excretion (“ADME”). However, the rule does not predict whether a compound is pharmacologically active.

The rule is important to keep in mind during drug discovery, when a pharmacologically active lead structure is optimized step-wise to increase the activity and selectivity of the compound while ensuring that the drug-like physicochemical properties described by Lipinski’s rule are maintained. Candidate drugs that conform to the RO5 tend to have lower attrition rates during clinical trials and hence have an increased chance of reaching the market.

Components of the rule:
Lipinski’s rule states that, in general, an orally active drug has no more than one violation of the following criteria:

Not more than 5 hydrogen bond donors (nitrogen or oxygen atoms with one or more hydrogen atoms)
Not more than 10 hydrogen bond acceptors (nitrogen or oxygen atoms)
A molecular mass less than 500 daltons
An octanol-water partition coefficient log P not greater than 5

Note that all the numbers are multiples of five, which is the origin of the rule’s name. As with many other rules of thumb (such as Baldwin’s rules for ring closure), there are many exceptions to Lipinski’s rule.
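The four criteria can be written as a simple check. The following is a minimal Python sketch, assuming the four property values have already been computed or measured elsewhere; the class name, function names, and the example values are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class Properties:
    """Precomputed properties of one compound (values supplied by the user)."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_mass: float  # in daltons
    log_p: float           # octanol-water partition coefficient


def ro5_violations(p):
    """Return a list describing each rule-of-five criterion the compound violates."""
    violations = []
    if p.h_bond_donors > 5:
        violations.append("more than 5 hydrogen bond donors")
    if p.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen bond acceptors")
    if p.molecular_mass >= 500:
        violations.append("molecular mass not less than 500 daltons")
    if p.log_p > 5:
        violations.append("log P greater than 5")
    return violations


def passes_ro5(p):
    """The rule tolerates at most one violation."""
    return len(ro5_violations(p)) <= 1


# Hypothetical compound, for illustration only.
example = Properties(h_bond_donors=2, h_bond_acceptors=5,
                     molecular_mass=350.0, log_p=2.5)
print(passes_ro5(example), ro5_violations(example))
```

A passing result says nothing about pharmacological activity; as noted above, the rule addresses only properties relevant to oral pharmacokinetics.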

Pharmacokinetics of drug design:

• Drugs must be polar – to be soluble in aqueous conditions and to interact with molecular targets

• Drugs must be ‘fatty’ – to cross cell membranes and to avoid rapid excretion

• Drugs must have both hydrophilic and lipophilic characteristics

• Many drugs are weak bases with pKa values of 6–8.
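Why a pKa of 6–8 gives a drug both characters can be illustrated with the Henderson-Hasselbalch relation, which is standard acid-base chemistry rather than something stated in the list above. The sketch below, in Python, computes the fraction of a weak base that is protonated (ionized, more water-soluble) at a given pH. The pH of 7.4 is an assumed physiological value and the function name is hypothetical.

```python
def fraction_protonated(pka, ph):
    """Fraction of a weak base present as its protonated (ionized) form.

    From Henderson-Hasselbalch: [BH+]/[B] = 10 ** (pKa - pH),
    so the fraction [BH+] / ([BH+] + [B]) = 1 / (1 + 10 ** (pH - pKa)).
    """
    return 1.0 / (1.0 + 10 ** (ph - pka))


ASSUMED_PH = 7.4  # assumption: approximate pH of blood plasma

for pka in (6.0, 7.0, 8.0):
    ionized = fraction_protonated(pka, ASSUMED_PH)
    print(f"pKa {pka}: {ionized:.0%} ionized, {1 - ionized:.0%} neutral")
```

For pKa values in this range, appreciable amounts of both the ionized form and the neutral (membrane-permeable) form coexist near the assumed pH.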

APPROACHES FOR DRUG DESIGNING:

The various approaches used in drug design include the following:
1) Random screening of synthetic compounds or chemicals and natural products by bioassay procedures.
2) Preparation of novel compounds based on the known structures of biologically active natural substances of plant and animal origin, i.e., the lead skeleton.
3) Preparation of structural analogs of the lead with increasing biological activity.
4) Application of the bioisosteric principle.

The current trend in drug design is to develop new clinically effective agents through structural modification of the lead nucleus. The lead is a prototype compound that has the desired biological or pharmacological activity but may have many undesirable characteristics, such as high toxicity, other biological activities, insolubility, or metabolism problems. Such organic leads, once identified, are easy to exploit; that process is rather straightforward. The real test lies in identifying the bioactive positions on the basic skeleton of such leads.

Methods of lead Discovery:

Random Screening:
In the absence of known drugs and other compounds with desired activity, a random screen is a valuable approach. Random screening involves no intellectualization; all compounds are tested in the bioassay without regard to their structures.

The two major classes of materials screened are synthetic chemicals and natural products (microbial, plant, and marine).

Nonrandom (or Targeted or Focused) Screening:

Nonrandom screening, also called targeted or focused screening, is a narrower approach than random screening. In this case, compounds having a vague resemblance to weakly active compounds uncovered in a random screen, or compounds containing different functional groups than the leads, may be tested selectively. By the late 1970s, the National Cancer Institute’s random screen was modified to a nonrandom screen because of budgetary and manpower restrictions. Also, the single tumor screen was changed to a variety of tumor screens because it was realized that cancer is not just a single disease.

Drug Metabolism Studies:
During drug metabolism studies, metabolites (drug degradation products generated in vivo) that are isolated are screened to determine whether the observed activity derives from the drug candidate or from a metabolite. For example, the anti-inflammatory drug sulindac (Clinoril) is not itself the active agent; its metabolic reduction product is responsible for the activity.

Clinical Observations
Sometimes a drug candidate will exhibit more than one pharmacological activity during clinical trials; that is, it may produce a side effect. This compound can then be used as a lead (or, with luck, as a drug) for the secondary activity. In 1947 an antihistamine, dimenhydrinate (Dramamine), was tested at the allergy clinic at Johns Hopkins University and was found also to be effective in relieving a patient who suffered from car sickness; a further study proved its effectiveness in the treatment of seasickness and airsickness. It then became the most widely used drug for the treatment of all forms of motion sickness.

Rational Approaches to Lead Discovery
None of the above approaches to lead discovery involves a major rational component. The lead is just found by screening techniques, as a by-product of drug metabolism studies, or from clinical investigations. Is it possible to design a compound having a particular activity?
Rational approaches to drug design have now become the major routes to lead discovery. The first step is to identify the cause of the disease state. Many diseases, or at least the symptoms of diseases, arise from an imbalance (either excess or deficiency) of particular chemicals in the body, from the invasion of a foreign organism, or from aberrant cell growth. The effects of the imbalance can be corrected by antagonism or agonism of a receptor or by inhibition of a particular enzyme; inhibition of a foreign organism’s enzymes, or interference with DNA biosynthesis or function, are important approaches to treating diseases arising from microorganisms and aberrant cell growth.

Once the relevant biochemical system is identified, the initial lead compounds become the natural receptor ligands or enzyme substrates. For example, the lead compounds for the contraceptives (+)-norgestrel (Ovral) and 17α-ethynylestradiol (Activella) were the steroidal hormones progesterone and 17β-estradiol. Whereas the steroid hormones progesterone and 17β-estradiol show weak and short-lasting effects, the oral contraceptives norgestrel and 17α-ethynylestradiol exert strong progestational activity of long duration.

The rational approaches are directed at lead discovery. It is not possible, with much accuracy, to foretell toxicity and side effects, anticipate transport characteristics, or predict the metabolic fate of a drug. Once a lead is identified, its structure can be modified until an effective drug is obtained.

OPTIMIZATION OF THE LEAD:
Once a lead compound is identified, it is easy to exploit; this process is rather straightforward. Various approaches are employed to improve the desired pharmacological properties of the lead nucleus. Important among them are the following.

Identification of the Active Part: The Pharmacophore
Any drug molecule consists of both essential and non-essential parts. The essential part governs the pharmacodynamic (drug-receptor interaction) properties, while the non-essential part influences the pharmacokinetic features. The relevant groups on a molecule that interact with a receptor are known as bioactive functional groups; they are responsible for the activity. The schematic representation of the nature of such bioactive functional groups, along with their interatomic distances, is known as the pharmacophore.

Once such pharmacophore is identified, structural modifications can be done to improve pharmacokinetic properties of the drug.

By determining which are the pharmacophoric groups and which are the auxophoric groups on your lead compound, and, of the auxophoric groups, which are interfering with lead compound binding and which are not detrimental to binding, you will know which groups must be excised and which you can retain or modify as needed. One approach in lead modification that helps make this determination is to cut away sections of the lead molecule and measure the effects of those modifications on potency. Consider this artificial example of how this might be done. Assume that the addictive analgesics in the structure below, morphine (R = R′ = H), codeine (R = CH3, R′ = H), and heroin (R = R′ = COCH3), are the lead compounds, and we want to know which groups are pharmacophoric and which are auxophoric.

The morphine family of analgesics binds to the μ opioid receptors. The pharmacophore is known and is shown as the darkened part in the structure above. A decrease in potency on removal of a group suggests that it may have been pharmacophoric; an increase in potency means it was auxophoric and interfering with proper binding; and essentially no change in potency means that it is auxophoric but not interfering with binding.
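The three-way reading of potency changes described above can be sketched as a small decision function. This Python sketch is illustrative only: the function name, the use of a relative change, and the 10% tolerance for "essentially no change" are assumptions, not values from the text.

```python
def classify_removed_group(potency_before, potency_after, tolerance=0.10):
    """Interpret the potency change observed after excising a group.

    potency_before / potency_after: potency of the lead and of the truncated
    analog, in the same (arbitrary) units, higher meaning more potent.
    tolerance: assumed relative change treated as "essentially no change".
    """
    if potency_before <= 0:
        raise ValueError("potency_before must be positive")
    relative_change = (potency_after - potency_before) / potency_before
    if relative_change < -tolerance:
        return "possibly pharmacophoric"
    if relative_change > tolerance:
        return "auxophoric, interfering with binding"
    return "auxophoric, not interfering with binding"


# Hypothetical measurements, for illustration only.
print(classify_removed_group(100.0, 10.0))
print(classify_removed_group(100.0, 300.0))
print(classify_removed_group(100.0, 102.0))
```

In practice the threshold would depend on assay variability, so the tolerance is best treated as a parameter to be set per bioassay.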

Functional Group Modification
The activity of a drug can be correlated with its structure in terms of the contribution of its functional groups to the lipophilicity, electronic, and steric features of the drug skeleton. Hence, by selecting the proper functional group, one can govern the drug distribution pattern and avoid the occurrence of side effects.

The importance of functional group modification is demonstrated by the structure below. The antibacterial agent carbutamide (R = NH2) was found to have an antidiabetic side effect; however, it could not be used as an antidiabetic drug because of its antibacterial activity, which could lead to bacterial resistance. The amino group of carbutamide was replaced by a methyl group to give tolbutamide (R = CH3; Orinase), and in so doing the antibacterial activity was separated from the antidiabetic activity.

In some cases, an experienced medicinal chemist knows which functional group will elicit a particular effect. Chlorothiazide (Aldoclor) is an antihypertensive agent that has a strong diuretic effect as well. It was known from sulfanilamide work that the sulfonamide side chain can give diuretic (increased urine excretion) activity.

Consequently, diazoxide (Hyperstat) was prepared as an antihypertensive drug without diuretic activity.

Obviously, a relationship exists between the molecular structure of a compound and its activity. This phenomenon was first realized about 135 years ago.

Structure–Activity Relationship studies:
The physiological action of a molecule is a function of its chemical constitution. This observation is the basis of SAR studies. SAR studies usually involve the interpretation of activity in terms of the structural features of a drug molecule. Generalised conclusions then can be made after examining a sufficient number of drug analogs.

An excellent example of this approach came from the development of the sulfonamide antibacterial agents (sulfa drugs). After a number of analogs of the lead compound sulfanilamide (R = H; AVC) were prepared, clinical trials determined that compounds of this general structure exhibited diuretic and antidiabetic activities as well as antimicrobial activity. Compounds with each type of activity were eventually shown to possess certain structural features in common. On the basis of the biological results of more than 10,000 compounds, several SAR generalizations were made. The antimicrobial agents have the structure shown below (R = SO2NHR′ or SO3H).

In the above structure, (1) the amino and sulfonyl groups on the benzene ring should be para; (2) the anilino amino group may be unsubstituted (as shown) or may have a substituent that is removed in vivo; (3) replacement of the benzene ring by other ring systems, or the introduction of additional substituents on it, decreases the potency or abolishes the activity; (4) R may be any of the alternatives shown below, but the potency is reduced in most cases; (5) monosubstitution on the sulfonamide nitrogen (R = SO2NHR′) results in more potent compounds, and the potency increases with heteroaromatic substitution; and (6) disubstitution on the sulfonamide nitrogen (R = SO2NR′2), in general, leads to inactive compounds.

HOMOLOGATION:
Variation of the substituent can be used to increase or decrease the polarity, alter the pKa, and change the electronic properties of a molecule. Exploration of a homologous series is one of the most often used methods to induce these changes in a very gradual manner. A homologous series is a group of compounds that differ by a constant unit, generally a CH2 group.

Usually, increasing the length of a saturated carbon side chain from one carbon (CH₃) to five to nine carbons (pentyl to nonyl) produces an increase in pharmacological effect. A further increase results in a decrease in activity. This is probably due either to an increase in lipophilicity beyond the optimum value or to a decrease in the concentration of free drug.

CYCLIZATION OF THE SIDE-CHAIN:
A change in potency or in the activity spectrum can be brought about by transforming an alkyl side chain into cyclic analogs. For example, chlorpromazine (i) has more neuroleptic activity than its cyclic analog (ii). Similarly, compound (iii) has antidepressant rather than neuroleptic activity, while in compound (iv) the antiemetic activity is greatly enhanced.

Sometimes bridging of two carbon atoms (secondary cyclization) also leads to an increase in potency. Examples include thebaine derivatives.

Bioisosterism
The purpose of molecular modification is usually to improve potency, selectivity, and duration of action, and to reduce toxicity.

Bioisosteres are substituents or groups that have chemical or physical similarities and that produce broadly similar biological properties. Bioisosterism is an important lead modification approach that has been shown to be useful for attenuating toxicity or modifying the activity of a lead, and it may have a significant role in altering the pharmacokinetics of a lead. There are classical isosteres and nonclassical isosteres.

Atoms, ions, or functional groups in which the peripheral layers of electrons can be considered identical are known as classical bioisosteres. Nonclassical bioisosteres do not have the same number of atoms and do not fit the steric and electronic rules of classical isosteres, but they do produce a similarity in biological activity.

Classical bioisosteres:

Examples include

Nonclassical bioisosteres:

Examples include

Size, shape, electronic distribution, lipid solubility, water solubility, pKa, chemical reactivity, and hydrogen bonding are the parameters that influence the potency, selectivity, and duration of action of a drug. Bioisosterism is effective because it affects all of these parameters to a greater or lesser extent. In the design of bioisosteres, the biochemical mode of action may play an important role; for example, aspirin acts by acetylating the cyclo-oxygenase enzyme. Isosteres of aspirin are inactive because they cannot release the acetyl group at all (X = CH₂) or at an adequate rate (X = S, NH).

APPLICATION OF BIOISOSTERISM IN DRUG-DESIGN:

a) An important compound from the catecholamine series is phenylephrine, in which the phenolic hydroxyl group takes part in H-bonding with the bioactive site on the receptor. The hydroxyl group can be replaced by another group capable of H-bonding. Hence, the alkylsulphonamido derivative of phenylephrine was found to retain activity.

b) A classic example of a ring versus a noncyclic structure is diethylstilbestrol and 17β-estradiol.

Diethylstilbestrol has about the same potency as naturally occurring estradiol. The central double bond of diethylstilbestrol is highly important for the correct orientation of the phenolic and ethyl groups at the receptor site.

c) Bioisosteric analogs in neuroleptic category include

d) Bioisosteric analogs in anti-inflammatory category include

e) Bioisosterism in Antihistaminic agents

f) The non-thiazide category of diuretic agents has been developed by replacing ring SO2 by carbonyl group. e.g., Quinazolinone derivatives.

g) Metoclopramide shares features of both anticholinergic and antidopaminergic agents. It is in fact used as an antiemetic agent.

Metoclopramide

h) Pirenzepine, an antimuscarinic agent, possesses structural similarity to the tricyclic antidepressant agents. However, it lacks antidepressant activity because of its poor ability to penetrate the CNS. Hence, other tricyclic antidepressant agents (e.g., doxepin and trimipramine) are undergoing clinical investigation for antiulcer activity.

The table above contains a variety of bioisosteres (both classical and nonclassical) that are either used clinically or used as investigational compounds.

In spite of the great success of the classical methods of drug design, their unpredictability and the tremendous amount of wasted effort expended have necessitated the development of more rational methods with higher predictive capability, in an effort to establish drug design as a science rather than an art.

The search for a new drug has become a risky affair because of the monetary cost, the time involved (from 7 to 10 years), and the high rate of failure. These factors have compelled medicinal chemists to find new ways of putting existing drugs to better use.

Types Of Drug Designing:

There are two major types of drug design. The first is referred to as ligand-based drug design and the second, structure-based drug design.

1) Ligand-based Drug Design:
Ligand-based drug design (or indirect drug design) relies on knowledge of other molecules that bind to the biological target of interest. These other molecules may be used to derive a pharmacophore model that defines the minimum necessary structural characteristics a molecule must possess in order to bind to the target. In other words, a model of the biological target may be built based on the knowledge of what binds to it, and this model in turn may be used to design new molecular entities that interact with the target. Alternatively, a quantitative structure-activity relationship (QSAR), that is, a correlation between calculated properties of molecules and their experimentally determined biological activity, may be derived. These QSAR relationships may in turn be used to predict the activity of new analogs.

Quantitative structure–activity relationship models (QSAR models) are regression or classification models used in the chemical and biological sciences and engineering. Like other regression models, QSAR regression models relate a set of “predictor” variables (X) to the potency of the response variable (Y), while classification QSAR models relate the predictor variables to a categorical value of the response variable. In QSAR modeling, the predictors consist of physico-chemical properties or theoretical molecular descriptors of chemicals; the QSAR response variable could be a biological activity of the chemicals. QSAR models first summarize a supposed relationship between chemical structures and biological activity in a data set of chemicals. Second, QSAR models predict the activities of new chemicals. Related terms include quantitative structure–property relationships (QSPR), in which a chemical property is modeled as the response variable. In a nutshell, QSAR and QSPR try to discern the relationship between the molecular descriptors that describe the physicochemical properties of a set of compounds of interest and their respective biological activity or chemical property.

For example, biological activity can be expressed quantitatively as the concentration of a substance required to give a certain biological response. Additionally, when physicochemical properties or structures are expressed by numbers, one can find a mathematical relationship, or quantitative structure-activity relationship, between the two. The mathematical expression, if carefully validated, can then be used to predict the modeled response of other chemical structures, provided that the applicability domain (AD) is carefully verified.

A QSAR has the form of a mathematical model:

Activity = f(physicochemical properties and/or structural properties) + Error

The error includes model error (bias) and observational variability, that is, the variability in observations even for a correct model.
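As a concrete instance of this form, f can be taken to be a straight line in a single descriptor and fitted by ordinary least squares, with the residuals playing the role of the error term. The Python sketch below makes that simplifying assumption; the descriptor (log P), the activity values, and the function name are hypothetical illustration data, not measurements.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: one descriptor and one activity value per compound.
log_p = [1.0, 2.0, 3.0, 4.0]
activity = [2.1, 2.9, 4.2, 4.8]

slope, intercept = fit_line(log_p, activity)

# Error term: what the fitted f leaves unexplained for each compound.
residuals = [yi - (slope * xi + intercept) for xi, yi in zip(log_p, activity)]

print(f"Activity = {slope:.2f} * logP + {intercept:.2f} + Error")
print("residuals:", [round(r, 3) for r in residuals])
```

Real QSAR models typically use many descriptors and other regression or classification methods; the single-descriptor line is used here only to keep the structure of the equation visible.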

SAR and the SAR paradox:
The basic assumption of all molecule-based hypotheses is that similar molecules have similar activities. This principle is also called the Structure–Activity Relationship (SAR). The underlying problem is therefore how to define a small difference at the molecular level, since each kind of activity, e.g. reaction ability, biotransformation ability, solubility, target activity, and so on, might depend on a different difference. Good examples were given in the bioisosterism reviews by Patani/LaVoie and Brown.

In general, one is more interested in finding strong trends. Created hypotheses usually rely on a finite amount of chemical data. Thus, the induction principle should be respected to avoid overfitted hypotheses and overfitted, useless interpretations of structural/molecular data. The SAR paradox refers to the fact that not all similar molecules have similar activities.

Types

Fragment based (group contribution)
The structure (and hence the activity) of a molecule could be defined as the sum of its individual atoms, but it is better defined for QSAR purposes as the sum of its chemical fragments. Analogously, the “partition coefficient” (a measurement of differential solubility and itself a component of SAR predictions) can be predicted either by atomic methods (known as “XLogP” or “ALogP”) or by chemical fragment methods (known as “CLogP” and other variations). It has been shown that the logP of a compound can be determined by the sum of its fragments; fragment-based methods are generally accepted as better predictors than atomic-based methods. Fragmentary logP values have been determined statistically, based on empirical data for known logP values. This method gives mixed results and is generally not trusted to have an accuracy of more than ±0.1 units.
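The idea of predicting logP as a sum of fragment contributions can be sketched in a few lines of Python. The contribution values in the table below are made-up placeholders chosen for readability; they are not published fragment constants and are not taken from CLogP or any other method, and the function name is hypothetical.

```python
# Placeholder contributions, NOT real fragment constants (illustration only).
ASSUMED_FRAGMENT_CONTRIBUTIONS = {
    "CH3": 0.5,
    "CH2": 0.5,
    "OH": -1.0,
    "phenyl": 2.0,
}


def estimate_log_p(fragment_counts, contributions=ASSUMED_FRAGMENT_CONTRIBUTIONS):
    """Estimate logP as the sum of (count * contribution) over all fragments."""
    total = 0.0
    for fragment, count in fragment_counts.items():
        if fragment not in contributions:
            raise KeyError(f"no contribution defined for fragment {fragment!r}")
        total += count * contributions[fragment]
    return total


# A molecule described as one CH3, two CH2, and one OH fragment.
print(estimate_log_p({"CH3": 1, "CH2": 2, "OH": 1}))
```

Actual fragment methods also apply correction terms for interactions between neighbouring fragments, which this additive sketch omits.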

Group or fragment-based QSAR is also known as GQSAR. GQSAR allows flexibility to study various molecular fragments of interest in relation to the variation in biological response. The molecular fragments could be substituents at various substitution sites in a congeneric set of molecules, or could be defined by pre-defined chemical rules in the case of a non-congeneric set. GQSAR also considers cross-term fragment descriptors, which can help identify key fragment interactions that determine variation in activity. Lead discovery using Fragnomics is an emerging paradigm. In this context FB-QSAR proves to be a promising strategy for fragment library design and for fragment-to-lead identification endeavours.

3D-QSAR
3D-QSAR refers to the application of force field calculations requiring three-dimensional structures, e.g. based on protein crystallography or molecule superimposition. It uses computed potentials, e.g. the Lennard-Jones potential, rather than experimental constants and is concerned with the overall molecule rather than a single substituent. It examines the steric fields (shape of the molecule), the hydrophobic regions (water-soluble surfaces), and the electrostatic fields. [13]
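The Lennard-Jones potential mentioned above has the standard 12-6 form V(r) = 4ε[(σ/r)^12 − (σ/r)^6], where ε is the well depth and σ is the separation at which the potential is zero. The Python sketch below evaluates it for a single pair of atoms; the parameter values are in reduced (arbitrary) units and are assumptions for illustration, not force-field constants.

```python
def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones potential energy between two atoms at distance r."""
    if r <= 0:
        raise ValueError("distance r must be positive")
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


# Reduced units: epsilon = sigma = 1 (illustrative assumption).
EPSILON, SIGMA = 1.0, 1.0
r_min = 2 ** (1 / 6) * SIGMA  # separation at which the potential is lowest

for r in (0.95, 1.0, r_min, 1.5, 2.5):
    print(f"r = {r:.3f}  V = {lennard_jones(r, EPSILON, SIGMA):+.4f}")
```

In a field-based method, a value of this kind would be computed between a probe atom and the molecule at each point of a grid, giving the steric field that the statistical model then uses.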

The resulting data space is then usually reduced by a subsequent feature extraction step (see also dimensionality reduction). The learning method that follows can be any machine learning method, e.g. support vector machines. An alternative approach uses multiple-instance learning, encoding molecules as sets of data instances, each of which represents a possible molecular conformation. A label or response is assigned to each set corresponding to the activity of the molecule, which is assumed to be determined by at least one instance in the set (i.e. some conformation of the molecule).

On June 18, 2011, the CoMFA patent dropped any restriction on the use of GRID and PLS technologies, and the RCMD team (rcmd.it) opened a 3D QSAR web server (3d-qsar.com) based on the 3-D QSAutogrid/R engine. 3-D QSAutogrid/R covers all the main features of CoMFA and GRID/GOLPE, with the addition of multiprobe/multiregion variable selection (MPGRS), which simplifies the interpretation of the 3-D QSAR map. The methodology is based on the integration of the molecular interaction fields as calculated by AutoGrid with the R statistical environment, which can be easily coupled with many free graphical molecular interfaces such as UCSF-Chimera, AutoDock Tools, JMol, and others.

Chemical descriptor based:
In this approach, descriptors quantifying various electronic, geometric, or steric properties of a catalyst are computed and used to develop a QSAR. This approach differs from the fragment (or group contribution) approach in that the descriptors are computed for the system as a whole rather than from the properties of individual fragments. It differs from the 3D-QSAR approach in that the descriptors are computed from scalar quantities (e.g., energies, geometric parameters) rather than from 3D fields. An example of this approach is the QSARs developed for olefin polymerization by half-sandwich compounds.

Modeling
The literature often shows that chemists have a preference for partial least squares (PLS) methods, since they apply feature extraction and induction in one step.

Data mining approach
Computer SAR models typically calculate a relatively large number of features. Because those lack structural interpretation ability, the preprocessing steps face a feature selection problem (i.e., which structural features should be interpreted to determine the structure-activity relationship). Feature selection can be accomplished by visual inspection (qualitative selection by a human); by data mining; or by molecule mining.

A typical data mining based prediction uses, e.g., support vector machines, decision trees, or neural networks to induce a predictive learning model.

Molecule mining approaches, a special case of structured data mining approaches, apply a similarity matrix based prediction or an automatic fragmentation scheme into molecular substructures. Furthermore there exist also approaches using maximum common subgraph searches or graph kernels.

Evaluation of the quality of QSAR models
QSAR modeling produces predictive models derived from the application of statistical tools that correlate the biological activity (including desirable therapeutic effects and undesirable side effects) of chemicals (drugs/toxicants/environmental pollutants), or their physico-chemical properties in QSPR models, with descriptors representative of molecular structure and/or properties. QSARs are being applied in many disciplines, for example risk assessment, toxicity prediction, and regulatory decisions [22], in addition to drug discovery and lead optimization. Obtaining a good-quality QSAR model depends on many factors, such as the quality of the input data, the choice of descriptors, and the statistical methods used for modeling and for validation. Any QSAR modeling should ultimately lead to statistically robust and predictive models capable of making accurate and reliable predictions of the modeled response of new compounds.

For validation of QSAR models usually various strategies are adopted:

internal validation or cross-validation;
external validation by splitting the available data set into training set for model development and prediction set for model predictivity check;
blind external validation by application of model on new external data and
data randomization or Y-scrambling for verifying the absence of chance correlation between the response and the modeling descriptors.

The success of any QSAR model depends on the accuracy of the input data, the selection of appropriate descriptors and statistical tools, and, most importantly, validation of the developed model. Validation is the process by which the reliability and relevance of a procedure are established for a specific purpose; for QSAR models, validation must address mainly the robustness, prediction performance, and applicability domain of the models. Leave-one-out cross-validation generally leads to an overestimation of predictive capacity, and even with external validation, no one can be sure whether the selection of training and test sets was manipulated to maximize the predictive capacity of the model being published. Aspects of QSAR model validation that need attention include the methods of selecting training set compounds, the setting of training set size, and the impact of variable selection for training set models on the quality of prediction. Development of novel validation parameters for judging the quality of QSAR models is also important.
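Leave-one-out cross-validation, the internal validation scheme mentioned above, can be sketched as follows. This Python sketch assumes a single-descriptor linear model and reports the commonly used cross-validated coefficient q² = 1 − PRESS / SS_total; the data values and function names are hypothetical and chosen only to make the example run.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(x, y):
    """Leave-one-out cross-validated q2 for the single-descriptor linear model.

    Each compound is left out in turn, the model is refitted on the rest,
    and the left-out activity is predicted. PRESS is the sum of squared
    prediction errors; SS_total is taken about the mean of all activities.
    """
    n = len(x)
    mean_y = sum(y) / n
    press = 0.0
    for i in range(n):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        predicted = slope * x[i] + intercept
        press += (y[i] - predicted) ** 2
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_total


# Hypothetical descriptor and activity values, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.2, 2.8, 4.1, 4.9, 6.3]
print(f"q2 = {loo_q2(descriptor, activity):.3f}")
```

Because every compound still influences model selection, a high q² from this procedure is best read together with external validation and Y-scrambling, as the text cautions.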

Application:

Biological
The biological activity of molecules is usually measured in assays to establish the level of inhibition of particular signal transductionor metabolic pathways. Chemicals can also be biologically active by being toxic. Drug discovery often involves the use of QSAR to identify chemical structures that could have good inhibitory effects on specific targets and have low toxicity(non-specific activity). Of special interest is the prediction of partition coefficient log P, which is an important measure used in identifying “druglikeness” according to Lipinski’s Rule of Five.

While many quantitative structure activity relationship analyses involve the interactions of a family of molecules with an enzyme or receptor binding site, QSAR can also be used to study the interactions between the structural domains of proteins. Protein-protein interactions can be quantitatively analyzed for structural variations resulted from site-directed mutagenesis.

It is part of the machine learning method to reduce the risk of a SAR paradox, especially taking into account that only a finite amount of data is available (see also MVUE). In general, all QSAR problems can be divided into two parts: coding and learning.

2) Structure-based Drug Design:
Structure-based drug design (or direct drug design) relies on knowledge of the three-dimensional structure of the biological target obtained through methods such as X-ray crystallography or NMR spectroscopy.^[5] If an experimental structure of a target is not available, it may be possible to create a homology model of the target based on the experimental structure of a related protein. Using the structure of the biological target, candidate drugs that are predicted to bind with high affinity and selectivity to the target may be designed using interactive graphics and the intuition of a medicinal chemist. Alternatively, various automated computational procedures may be used to suggest new drug candidates.

As experimental methods such as X-ray crystallography and NMR develop, the amount of information concerning 3D structures of biomolecular targets has increased dramatically. In parallel, information about the structural dynamics and electronic properties of ligands has also increased. This has encouraged the rapid development of structure-based drug design. Current methods for structure-based drug design can be divided roughly into two categories. The first category is about “finding” ligands for a given receptor, which is usually referred to as database searching. In this case, a large number of potential ligand molecules are screened to find those fitting the binding pocket of the receptor. This approach is a form of virtual screening. The key advantage of database searching is that it saves the synthetic effort needed to obtain new lead compounds. The second category is about “building” ligands, which is usually referred to as receptor-based drug design. In this case, ligand molecules are built up within the constraints of the binding pocket by assembling small pieces in a stepwise manner. These pieces can be either individual atoms or molecular fragments. The key advantage of such a method is that novel structures, not contained in any database, can be suggested.

Active site identification
Active site identification is the first step in this program. It analyzes the protein to find the binding pocket, derives key interaction sites within the binding pocket, and then prepares the necessary data for the ligand fragment link step. The basic inputs for this step are the 3D structure of the protein and a pre-docked ligand in PDB format, as well as their atomic properties. Both ligand and protein atoms need to be classified into one of four basic atomic types:

Hydrophobic atom: All carbons in hydrocarbon chains or in aromatic groups.
H-bond donor: Oxygen and nitrogen atoms bonded to hydrogen atom(s).
H-bond acceptor: Oxygen and sp² or sp hybridized nitrogen atoms with lone electron pair(s).
Polar atom: Oxygen and nitrogen atoms that are neither H-bond donor nor H-bond acceptor, sulfur, phosphorus, halogen, metal, and carbon atoms bonded to hetero-atom(s).
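The four atom types above can be sketched as a small classification function. This is only an illustration under simplifying assumptions: the `Atom` record and its fields are hypothetical, carbons are treated as hydrophobic whenever they are not bonded to a hetero-atom, and the donor rule is checked before the acceptor rule so that each atom receives exactly one type.

```python
from dataclasses import dataclass


@dataclass
class Atom:
    element: str                       # e.g. "C", "N", "O", "S", "Cl"
    attached_hydrogens: int = 0        # number of hydrogens bonded to the atom
    hybridization: str = "sp3"         # "sp", "sp2" or "sp3"
    bonded_to_heteroatom: bool = False


def classify(atom):
    """Assign one of the four atom types listed in the text."""
    if atom.element == "C":
        # Carbons bonded to hetero-atoms count as polar, others as hydrophobic.
        return "polar" if atom.bonded_to_heteroatom else "hydrophobic"
    if atom.element in ("O", "N"):
        if atom.attached_hydrogens > 0:
            return "donor"
        if atom.element == "O" or atom.hybridization in ("sp", "sp2"):
            return "acceptor"
        return "polar"  # O or N that is neither donor nor acceptor
    # Sulfur, phosphorus, halogens and metals.
    return "polar"


types = [classify(a) for a in (Atom("C"), Atom("N", attached_hydrogens=1), Atom("O"))]
```

A real implementation would derive these fields from the PDB input and the bonding pattern rather than taking them as given.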

The space inside the ligand binding region is then probed with virtual atoms of the four types above, so that the chemical environment of every spot in the ligand binding region is known. This shows which kinds of chemical fragments can be placed at the corresponding spots in the ligand binding region of the receptor.

Ligand fragment link

Flow chart for structure-based drug design

When we want to plant “seeds” into the different regions defined in the previous section, we need a fragment database to choose fragments from. The term “fragment” is used here to describe the building blocks used in the construction process. The rationale of this algorithm lies in the fact that organic structures can be decomposed into basic chemical fragments. Although the diversity of organic structures is infinite, the number of basic fragments is rather limited.

Before the first fragment, i.e. the seed, is put into the binding pocket and other fragments are added one by one, it is useful to identify potential problems. First, the number of possible fragment combinations is huge. A small perturbation of a previous fragment's conformation can cause a great difference in the subsequent construction process. At the same time, in order to find the lowest binding energy on the potential energy surface (PES) between the planted fragments and the receptor pocket, the scoring function would have to be calculated for every conformational change of the fragments in every possible fragment combination. Since this requires a large amount of computation, shortcuts are needed to reduce the computing power used and let the program work more efficiently.

When a ligand is inserted into the pocket site of a receptor, the groups on the ligand that bind tightly to the receptor should have the highest priority in finding their lowest-energy conformation. This allows us to put several seeds into the program at the same time, optimize the conformation of those seeds that form significant interactions with the receptor, and then connect those seeds into a continuous ligand in a manner that gives the rest of the ligand the lowest energy. The pre-placed seeds ensure high binding affinity, and their optimal conformation determines the manner in which the ligand will be built, thus determining the overall structure of the final ligand. This strategy efficiently reduces the calculation burden of fragment construction. On the other hand, it restricts the possible fragment combinations, which reduces the number of possible ligands that can be derived from the program. The two strategies above, adding fragments one by one to a seed and connecting several pre-placed seeds, are used in most structure-based drug design programs and are described as “Grow” and “Link” respectively. The two strategies are often combined in order to make the construction result more reliable.
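The size of the search problem described above can be made concrete with a toy count. The sketch below compares scoring every combination of fragments and conformations with scoring one position at a time; the function names and the numbers are hypothetical and do not describe the internals of any particular design program.

```python
def exhaustive_evaluations(n_fragments, n_conformations, n_positions):
    """Scoring calls if every fragment/conformation choice at every
    position is combined with every choice at all other positions."""
    return (n_fragments * n_conformations) ** n_positions


def stepwise_evaluations(n_fragments, n_conformations, n_positions):
    """Scoring calls if each position is optimized in turn while the
    fragments already placed are kept fixed."""
    return n_fragments * n_conformations * n_positions


# Hypothetical sizes: 50 fragment types, 10 conformations each, 5 positions.
full = exhaustive_evaluations(50, 10, 5)    # 31_250_000_000_000
greedy = stepwise_evaluations(50, 10, 5)    # 2_500
```

The stepwise count is far smaller, but, as the text notes, fixing earlier choices also limits which ligands can ever be produced.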

Rational drug discovery
In contrast to traditional methods of drug discovery, which rely on trial-and-error testing of chemical substances on cultured cells or animals and on matching the apparent effects to treatments, rational drug design begins with a hypothesis that modulation of a specific biological target may have therapeutic value. In order for a biomolecule to be selected as a drug target, two essential pieces of information are required. The first is evidence that modulation of the target will have therapeutic value. This knowledge may come from, for example, disease linkage studies that show an association between mutations in the biological target and certain disease states. The second is that the target is “druggable”. This means that it is capable of binding to a small molecule and that its activity can be modulated by the small molecule.

Once a suitable target has been identified, the target is normally cloned and expressed. The expressed target is then used to establish a screening assay. In addition, the three-dimensional structure of the target may be determined.

The search for small molecules that bind to the target begins by screening libraries of potential drug compounds. This may be done by using the screening assay (a “wet screen”). In addition, if the structure of the target is available, a virtual screen of candidate drugs may be performed. Ideally the candidate drug compounds should be “drug-like”, that is, they should possess properties that are predicted to lead to oral bioavailability, adequate chemical and metabolic stability, and minimal toxic effects. Several methods are available to estimate druglikeness, such as Lipinski’s Rule of Five, along with a range of scoring methods such as lipophilic efficiency. Several methods for predicting drug metabolism have been proposed in the scientific literature; one example is SPORCalc. Because of the complexity of the drug design process, serendipity and bounded rationality still play a role. These challenges arise from the large chemical space describing potential new drugs without side effects.
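The two druglikeness measures named above can be sketched in a few lines. The thresholds below are the commonly stated Rule of Five criteria (molecular mass under 500 daltons, log P not greater than 5, no more than 5 hydrogen bond donors and no more than 10 hydrogen bond acceptors), and lipophilic efficiency is taken as pIC50 minus log P. The function names and the example compound are hypothetical.

```python
def rule_of_five_violations(mol_weight, log_p, h_donors, h_acceptors):
    """Count how many of the four Rule of Five criteria are violated."""
    checks = [
        mol_weight >= 500,   # molecular mass should be under 500 daltons
        log_p > 5,           # log P should not exceed 5
        h_donors > 5,        # no more than 5 hydrogen bond donors
        h_acceptors > 10,    # no more than 10 hydrogen bond acceptors
    ]
    return sum(checks)


def lipophilic_efficiency(pic50, log_p):
    """Lipophilic efficiency: potency corrected for lipophilicity."""
    return pic50 - log_p


# Hypothetical candidate compound.
violations = rule_of_five_violations(mol_weight=350.0, log_p=2.5,
                                     h_donors=2, h_acceptors=5)
lipe = lipophilic_efficiency(pic50=8.0, log_p=2.5)
```

In practice the descriptor values would come from a cheminformatics toolkit or from measurement; here they are passed in directly to keep the sketch self-contained.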

Computer-aided drug design
Computer-aided drug design uses computational chemistry to discover, enhance, or study drugs and related biologically active molecules. The most fundamental goal is to predict whether a given molecule will bind to a target and if so how strongly. Molecular mechanics or molecular dynamics are most often used to predict the conformation of the small molecule and to model conformational changes in the biological target that may occur when the small molecule binds to it. Semi-empirical, ab initio quantum chemistry methods, or density functional theory are often used to provide optimized parameters for the molecular mechanics calculations and also provide an estimate of the electronic properties (electrostatic potential, polarizability, etc.) of the drug candidate that will influence binding affinity.

Molecular mechanics methods may also be used to provide semi-quantitative predictions of the binding affinity. Knowledge-based scoring functions may also be used to provide binding affinity estimates. These methods use linear regression, machine learning, neural nets or other statistical techniques to derive predictive binding affinity equations by fitting experimental affinities to computationally derived interaction energies between the small molecule and the target.^[15]^[16]
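The fitting step described above can be illustrated with the simplest case, a one-variable linear regression of experimental affinity on a computed interaction energy. This is a minimal sketch: the function name and the training data are hypothetical, and real scoring functions combine several energy terms.

```python
def fit_scoring_function(energies, affinities):
    """Least-squares fit of affinity = slope * energy + intercept.
    Returns a function that predicts affinity from interaction energy."""
    n = len(energies)
    e_bar = sum(energies) / n
    a_bar = sum(affinities) / n
    sxy = sum((e - e_bar) * (a - a_bar) for e, a in zip(energies, affinities))
    sxx = sum((e - e_bar) ** 2 for e in energies)
    slope = sxy / sxx
    intercept = a_bar - slope * e_bar
    return lambda energy: slope * energy + intercept


# Hypothetical training set: computed interaction energies (arbitrary units)
# and measured affinities (for example pKd) for four known ligands.
energies = [-12.0, -10.0, -8.0, -6.0]
affinities = [9.0, 8.0, 7.0, 6.0]

predict = fit_scoring_function(energies, affinities)
estimate = predict(-9.0)  # predicted affinity for a new, untested ligand
```

Because the training data here are exactly linear, the fit reproduces them perfectly; with real data the residual scatter is what limits such estimates to semi-quantitative accuracy.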

Ideally the computational method should be able to predict affinity before a compound is synthesized and hence in theory only one compound needs to be synthesized. The reality however is that present computational methods are imperfect and provide at best only qualitatively accurate estimates of affinity. Therefore in practice it still takes several iterations of design, synthesis, and testing before an optimal molecule is discovered. On the other hand, computational methods have accelerated discovery by reducing the number of iterations required and in addition have often provided more novel small molecule structures.

Drug design with the help of computers may be used at any of the following stages of drug discovery:

hit identification using virtual screening (structure- or ligand-based design)
hit-to-lead optimization of affinity and selectivity (structure-based design, QSAR, etc.)
lead optimization: optimization of other pharmaceutical properties while maintaining affinity

Flowchart of a Usual Clustering Analysis for Structure-Based Drug Design
To compensate for the limited accuracy of the binding affinities predicted by current scoring functions, protein-ligand interaction and compound 3D structure information are used in the analysis. For structure-based drug design, several post-screening analyses focusing on protein-ligand interactions have been developed to improve enrichment and mine potential candidates effectively:

Consensus scoring

Selecting candidates by voting of multiple scoring functions
May lose the relationship between protein-ligand structural information and scoring criterion

Geometric analysis

Comparing protein-ligand interactions by visually inspecting individual structures
Becomes intractable as the number of complexes to be analyzed increases

Cluster analysis

Represent and cluster candidates according to protein-ligand 3D information
Needs meaningful representation of protein-ligand interactions.
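Of the three post-screening analyses listed above, consensus scoring is the easiest to sketch. In the example below each scoring function votes for its top-ranked candidates, and a candidate is kept if it collects enough votes. The function name, the scoring function labels and the scores are hypothetical, and a higher score is assumed to mean better predicted binding.

```python
def consensus_select(scores_by_function, top_n, min_votes):
    """Keep candidates ranked in the top_n by at least min_votes
    scoring functions. Higher scores are assumed to be better."""
    votes = {}
    for scores in scores_by_function.values():
        ranked = sorted(scores, key=scores.get, reverse=True)
        for name in ranked[:top_n]:
            votes[name] = votes.get(name, 0) + 1
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical scores from three scoring functions for four docked ligands.
scores = {
    "score_a": {"lig1": 9.1, "lig2": 7.4, "lig3": 8.8, "lig4": 5.0},
    "score_b": {"lig1": 6.5, "lig2": 8.9, "lig3": 8.1, "lig4": 4.2},
    "score_c": {"lig1": 7.7, "lig2": 6.0, "lig3": 9.3, "lig4": 8.5},
}

selected = consensus_select(scores, top_n=2, min_votes=2)  # ["lig3"]
```

The vote count discards how and where each ligand binds, which is the loss of structural information that the list above attributes to consensus scoring.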

Examples
A particular example of rational drug design involves the use of three-dimensional information about biomolecules obtained from such techniques as X-ray crystallography and NMR spectroscopy. Computer-aided drug design in particular becomes much more tractable when there is a high-resolution structure of a target protein bound to a potent ligand. This approach to drug discovery is sometimes referred to as structure-based drug design. The first unequivocal example of the application of structure-based drug design leading to an approved drug is the carbonic anhydrase inhibitor dorzolamide, which was approved in 1995.

Conclusion:
Drug design is the creative process of finding new remedies based on knowledge of a biological target. This review discusses the principles of drug design, various approaches to drug design, lead discovery, lead modification and various types of drug discovery. Bioisosterism is an important lead modification approach that has been shown to be useful for attenuating toxicity or modifying the activity of a lead, and it may play a significant role in altering the pharmacokinetics of a lead. Drug discovery by laboratory experiments is time-consuming and very expensive compared with computational methods.

References:
1) S. Bandyopadhyay, “Active Site Driven Ligand Design: An Evolutionary Approach”, Journal of Bioinformatics and Computational Biology, Vol. 3, No. 5, pp. 1053–1070, 2005.
2) M. I. Ecemis, J. H. Wikel, C. Bingham, and E. Bonabeau, “A Drug Candidate Design Environment Using Evolutionary Computation”, IEEE Transactions on Evolutionary Computation, Vol. 12, pp. 591–603, October 2008.
3) Vogel’s Drug Discovery and Evaluation
4) Screening Methods in Pharmacology by N. S. Parmar
5) Foye’s Medicinal Chemistry
6) Medicinal Chemistry by S. N. Pandeya
7) pubmed.com
8) Mendeley academic research papers