
The importance of triaging in determining the quality of output from high-throughput screening

    Philip Jones*, Stuart McElroy, Angus Morrison & Andrew Pannifer

    European Screening Centre Newhouse, University of Dundee, Biocity Scotland, Bo'Ness Road, Newhouse, Lanarkshire ML1 5UH, UK

    *Author for correspondence: p.s.jones@dundee.ac.uk

    Published Online: https://doi.org/10.4155/fmc.15.121

    “Quality means doing it right when no one is looking”. ― Henry Ford

    The discovery of new medicines is a complex and expensive process. For small-molecule medicines, using a high-quality starting point can have a significant impact on the outcome of the discovery efforts in terms of both speed and quality. High-throughput screening (HTS) is a mature and effective method for identifying chemical starting points; however, because of the need for infrastructure, in terms of robotics and compound libraries, this approach has historically been concentrated in major pharma groups. In addition, interpretation of the output and the design of the subsequent steps to validate that output are vital in avoiding increasingly well-understood traps that may be overlooked by the unwary. Within major pharma, HTS is a mature discipline that has been embedded since the late 1980s; however, in the wider drug discovery community there is a much lower awareness of the issues associated with this approach. There has been significant growth in academic drug discovery in recent years and the availability of HTS has been identified as a gap. Wider access to HTS expertise and facilities will greatly increase the opportunity to leverage the research that is being carried out in academia and to develop small-molecule medicines that will ultimately benefit patients.

    The value of high-quality leads

    The discovery of new medicines is a complex and expensive process. The annual output of new medicines has varied significantly over the past 20 years, from a peak in 1996 to a trough in 2007 and a 17-year high in 2014 [1]. These statistics have been used to justify massive changes in investment in drug discovery, in both positive and negative directions, over that period despite the fact that, particularly for early stage discovery, there can be a 10–15-year lag before changes in the discovery strategy affect the output of approved medicines. What appears to be incontrovertible is that the cost per compound launched has risen consistently to the point where estimates now run at greater than US$2 billion per medicine [2]. In 2010, Paul et al. published an analysis of the drivers of the cost [3]. The study revealed that the most significant driver in the discovery phase was the cost of lead optimization. The effectiveness and efficiency of a lead optimization program is critically dependent on the quality of the lead compounds upon which it is initiated. It follows that a program based upon a high-quality lead will most likely progress to a clinical candidate more rapidly and result in a clinical program that is less likely to fail for compound-related reasons. Conversely, lead optimization programs initiated on poor-quality leads (usually because of a strong biological rationale or a high competitive threat) will likely spend longer in lead optimization and, if they do progress to the clinical phase, will require more effort to resolve remaining issues, for example, with pharmaceutics. Therefore, the availability of high-quality leads has the potential to significantly improve productivity. In this context a high-quality lead balances activity, ligand efficiency, dependency on toxicophoric elements, consideration of metabolic sensitivity and, critically for target-based approaches, evidence of target engagement.
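
    As an illustration of how activity can be put in the context of molecular size and lipophilicity, the following is a minimal sketch of two commonly used efficiency metrics. The formulas are widely used definitions rather than ones given in this article, and the function names, the 1.37 kcal/mol conversion factor (approximately 2.303RT at around 300 K) and the example values are assumptions chosen for illustration only.

```python
import math


def pic50_from_ic50_nm(ic50_nm: float) -> float:
    """Convert an IC50 given in nanomolar to pIC50 (-log10 of the molar IC50)."""
    return -math.log10(ic50_nm * 1e-9)


def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """Approximate ligand efficiency in kcal/mol per heavy (non-hydrogen) atom.

    Assumption: binding free energy is approximated as 1.37 * pIC50.
    """
    return 1.37 * pic50 / heavy_atoms


def lipophilic_ligand_efficiency(pic50: float, logp: float) -> float:
    """Lipophilic ligand efficiency: potency corrected for lipophilicity."""
    return pic50 - logp


# Hypothetical hit: IC50 of 1 micromolar, 20 heavy atoms, calculated logP of 2.5.
potency = pic50_from_ic50_nm(1000.0)
le = ligand_efficiency(potency, heavy_atoms=20)
lle = lipophilic_ligand_efficiency(potency, logp=2.5)
print(f"pIC50={potency:.2f} LE={le:.2f} LLE={lle:.2f}")
```

    Tracking such values alongside raw potency makes it easier to see whether gains in activity are being bought only with added size or lipophilicity.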

    Lead generation using HTS

    Multiple approaches exist to generate lead compounds. Classically, natural products, endogenous ligands and substrates have provided a rich source of starting points to modify and optimize, eventually providing landmark medicines. More recently, fragment-based approaches have produced exciting results; however, the technical ability to rapidly and reliably screen increasingly large numbers of compounds has led to the rise of HTS as a central method [4].

    The ability of HTS to generate high-quality output is critically dependent on at least four factors:

    • Infrastructure: including the facilities to store, access and distribute a compound library or libraries under carefully controlled environmental conditions; liquid handling so that compounds can be distributed in screening plates flexibly and sparingly to allow appropriate screening experiments to be carried out; robotics to allow plate-based screens to be carried out at high throughput; reader technology to record the outcome of screens. A broad range of equipment is required to accommodate the range of validated high-throughput technologies currently available. In addition, access to biophysical techniques is particularly valuable for confirming target engagement and a robust IT infrastructure is required to ensure data can be flexibly processed and stored with a high degree of integrity.

    • Compounds: a high-quality compound collection of sufficient size is required. Quality of the individual component compounds in the library is measured in terms of sample integrity (purity and identity), reasonable physicochemical properties and the exclusion of undesirable functionality. For diversity-based screening the chemical space sampled by the library should be broad although, more importantly and more difficult to assess, the biological space interrogated should be extensive. Ideally, related compounds should be present in the collection so that any hit is not a singleton (see below). Lead-like properties are important to provide head-room for the increases in molecular weight, and particularly lipophilicity, that typically accompany optimization. Molecules with known liabilities should be avoided: these include drug liabilities, for example, toxicophores, and also screening liabilities, for example, functionality associated with frequent hitting, assay interference or PAINS (pan assay interference compounds). However, generalizing here requires caution – see below.

    • Assays: robust, miniaturized assays that provide reproducible results are a prerequisite. It must be recognized that the requirements to run an HTS should not compromise the ability to identify the optimal compounds, for example, enzymatic assays being run at steady state and with careful consideration of the substrate and product concentrations. The target and assay dependency on known interference mechanisms (e.g., redox cycling, fluorescence) should be assessed and mitigated or circumvented using careful assay development prior to screening.

    • Triage expertise: it must be recognized that the production of quality starting points from the output of an HTS requires a selection of further assays to be carried out, creating what is known as the screening cascade [5]. Reasons for ‘irrelevant positives’ observed in the initial screen must be identified and experiments devised to eliminate them. Mechanisms for irrelevant positives include compounds that interfere with the assay technology, for example, absorbing or fluorescing at the assay wavelengths or compounds that interfere with the assay components in a pharmacologically irrelevant manner, for example, aggregators or denaturants. Ideally, a robust series of tests that identify genuine target engagement and allow a positive selection mechanism should be employed. However, methods for identifying false positives can also be used to deprioritize compounds with a high probability of acting in a nonspecific manner. Techniques for identifying false negatives can also be implemented and the compounds identified reassessed. The appropriate application of suitable methods requires considerable expertise and experience.

    Irrelevant positives

    In a traditional target-based lead optimization project, modifications are made to a lead compound resulting in changes in biological activity, which are, in turn, interpreted as changes in the binding of the compounds to the target. While changes in binding mode are well known [6], this central assumption can generally be relied upon within a chemical series with a known mechanism of binding. In the lead discovery space, where HTS is being used to screen a range of compounds specifically designed to interact with a broad range of biological targets, it is not possible to conclude that a response in an initial assay reflects binding to the biological target. Further evidence is required to confirm this conclusion. Many terms have been used to describe compounds that do not bind in a useful manner, for example, frequent hitters, PAINS and false positives, and these occur for a range of reasons. Most high-throughput biological assays use a surrogate marker or ‘label’ to measure a response. The mechanisms by which the response (most frequently light-based) is generated can be direct or involve a multitude of components. Typically, all these components can be affected by test samples within a large library of chemically diverse compounds. For example, AlphaScreen is a commonly used assay technology that utilizes the generation of singlet oxygen species – if the test compound interacts with singlet oxygen rather than the relevant biological target then potentially the assay will generate an irrelevant result (not a false positive, as the effect is genuine but irrelevant). Alternatively, some molecules demonstrate the ability to form aggregates, which can interact with components of the assay, notably the target protein, to inhibit their function. Therefore, modulation of the response will be observed although, once again, the compound responsible will not be a suitable starting point for a drug discovery program.

    Redox compounds are a notorious class of interference compound [7], since they promote oxidation, particularly of cysteine residues. For processes that rely on cysteines in the active site, this will clearly be a problem, but structural cysteines can also be affected. Metal-chelating compounds will potentially affect metal-dependent enzymes, although this may not be a reason to eliminate the compound per se. More insidiously, metal contaminants in screening samples, a byproduct of the synthetic procedures, can interfere with the function of an array of biological targets [8].

    Mechanisms for interference have become better understood over recent years and it is highly likely that new mechanisms will be observed in the future. In attempts to rapidly identify irrelevant compounds and eliminate them from progression, structural filters have been developed, for example, PAINS filters, which identify structural motifs that inhibit structurally and functionally different targets where the activity is measured using the same technology [9]. Important messages about interference compounds have emerged as a result of this work, highlighting potential risks and traps to a wider community. It is important, however, to recognize that these risks are context dependent, for example, light-based interference will only occur if there is an overlap between the wavelengths of absorbance/emission of the ligand and the assay technology. Conversely, inclusion of compounds in screening collections is often justified on the basis that they are known drugs. Some antifungal azoles fall into this class but, at the high concentrations utilized in screening, have been shown to be aggregators [10]. 2-aminothiazoles are also an interesting class of compounds. This substructure occurs in several known drugs although it is a motif that is often regarded with suspicion. Recent work identifies ‘promiscuous amino-thiazoles’, demonstrated by SPR to bind to multiple diverse targets [11]. Finally, ene-rhodanines are one of the most infamous of PAINS substructures. Multiple modes of action have been identified (e.g., photoactivity, alkylation) and are thought to be at least partially responsible for the activity observed in a broad range of assays. A recent paper by Schofield et al. [12] identifies a further mechanism whereby hydrolysis of the ene-rhodanine occurs in the assay and the resultant carboxy thiol generates a specific interaction with the target, as demonstrated by X-ray crystallography.

    It is therefore important to consider all aspects of the screening process and to have a well thought out screening cascade when identifying hits, as there is also a significant danger of eliminating tractable chemotypes based on pre-existing prejudices.
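
    The context dependence of light-based interference can be sketched as a simple overlap test between the wavelength bands of a compound and those of the assay readout. This is an illustrative sketch only: the `Band` type, the function names and all wavelength values are hypothetical, and real spectra are continuous rather than simple intervals.

```python
from typing import Iterable, NamedTuple


class Band(NamedTuple):
    """A wavelength interval in nanometres (hypothetical simplification)."""
    low_nm: float
    high_nm: float


def bands_overlap(a: Band, b: Band) -> bool:
    """Two closed intervals overlap if each starts before the other ends."""
    return a.low_nm <= b.high_nm and b.low_nm <= a.high_nm


def light_interference_risk(compound: Iterable[Band], assay: Iterable[Band]) -> bool:
    """Flag a compound only if one of its absorbance/emission bands
    overlaps an excitation or emission band used by the assay."""
    assay_bands = list(assay)
    return any(bands_overlap(c, a) for c in compound for a in assay_bands)


# Hypothetical compound absorbing in the near UV.
compound_bands = [Band(320.0, 380.0)]
# Two hypothetical assay formats with different readout wavelengths.
uv_excited_assay = [Band(340.0, 360.0), Band(440.0, 470.0)]
red_shifted_assay = [Band(610.0, 630.0), Band(660.0, 680.0)]

print(light_interference_risk(compound_bands, uv_excited_assay))
print(light_interference_risk(compound_bands, red_shifted_assay))
```

    The same compound is flagged in one assay format and not in the other, which is why a blanket structural exclusion can discard chemotypes that would be tractable in a different assay context.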

    Triage design

    A high-quality hit series can be defined as a selection of related compounds with proven chemical structure, with activity at the target of interest, emerging structure–activity relationships (SAR) from a series of analogs, and with favorable properties for further optimization.

    As indicated above, there are many mechanisms by which a positive signal can be observed in a primary screen but which are not indicative of an interesting compound. An effective way to eliminate these irrelevant positives is to design a series of experiments tailored to the specific program. This requires access to a series of technologies and expertise in several disciplines including biochemistry, biophysics, cheminformatics and medicinal chemistry. A suitable screening cascade may comprise the following activities following the initial screen:

    • Selection of statistically relevant actives in the primary screen.

    • Repeating the initial screen with this selection to eliminate statistical false positives. Statistical false positives are an inevitable result of screening several hundred thousand compounds. Rescreening the initial actives and confirming the result is a prerequisite for progression of a compound.

    • Deselection and/or orthogonal assays. Deselection assays use the same assay technology but a different or no molecular target – a positive in this second assay indicates a high probability that a compound is only interfering with the assay technology and should be deprioritized. More preferably, an orthogonal assay measures the activity or binding of the compounds on the same target using a different technology. Activity in an orthogonal assay raises the priority of a hit.

    • Evidence of direct target engagement is a key property of a high-quality hit. Biophysical techniques, for example, surface plasmon resonance, microscale thermophoresis and nuclear magnetic resonance, are especially valuable in determining this property. However, the throughput of each technique must be balanced against the number of compounds to be tested, so these methods need to be carefully positioned in the overall triage.

    • Determine a dose–response curve. This will give an estimate of potency but, importantly, the steepness of the dose–response curve can indicate whether interference mechanisms, for example, aggregation, are having an effect [13].

    • Chemical analysis of the sample. A well-curated and maintained compound set should minimize the problems associated with impure or incorrect structures but there are still multiple opportunities for problems to arise. LC–MS probably provides the optimal compromise of throughput and quality of information at this stage.

    • Compounds reaching this point in a triage benefit from visual inspection of the structures by experienced medicinal chemists. Structural issues are often context dependent and so an awareness of potential problems and the requirements of the program is essential.

    • At all stages the triage will benefit from chemoinformatic support. The physicochemical properties of a compound are important for putting the activity in context. Various measures exist, but molecular weight and lipophilicity are probably the simplest and most important, although many other properties could be considered individually or as a derived desirability score, for example, ligand efficiency and lipophilic ligand efficiency [14]. Construction of activity models from the primary activity data enables identification and rescue of possible false negatives and also confirmation of positives. Analyzing the actives against models derived from the large amount of publicly available data, for example, ChEMBL, will potentially give preliminary indications of selectivity or toxicity issues. Structural flags identify potential issues, allow rapid assessment of numbers of structures too large for visual inspection and are less prone to the subjectivity of the analyst. Clustering methods enable structurally related compounds to be brought together, which may identify early indications of SARs.

    • Once the hit compounds have been prioritized, series may be identified or singletons may stand out. In either situation resynthesis of authentic samples is a prerequisite before further work is carried out. Despite the care taken in analyzing screening samples, small amounts of impurities (or large amounts invisible to LC–MS) may have been present, or structural changes consistent with the LC–MS may have occurred. Preparation of an authentic sample therefore provides both corroboration of the assigned structure and a source of material for further characterization.

    • Once the activities have been confirmed on a fresh sample an analog program can be initiated. There are many examples where even modest changes in structure lead to ablation of activity and these so-called ‘flagpoles’ of activity do not represent a good starting point for further work. Ideally a series will demonstrate progressive changes in activity with modest structural changes. The goals of this analog program will depend on the overall profile to be achieved and the nature of the starting point. It is likely that increases in activity will be a goal but without compromising measured and calculated physicochemical properties, so tracking the ligand efficiency will be important. Selectivity could be a goal at this point. Cell-based activity is usually an early test of proof-of-concept. Potentially for very promising compounds profiling in in vitro DMPK assays will give another measure of the quality of the hit/hit series.

    • Biostructural information is extremely valuable for guiding the optimization process and an excellent indicator of target engagement. However, the studies must begin early if they are going to impact on the hit identification stage, most likely as the initial HTS is being developed.
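
    The early steps of such a cascade can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used by the authors: the robust z-score, the threshold of 3, the Hill slope cut-off of 2.0, the priority labels and all names and values are hypothetical choices made for the example.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Optional


def robust_z_scores(values: List[float]) -> List[float]:
    """Score each value against the median, scaled by the median
    absolute deviation (1.4826 * MAD approximates a standard deviation)."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; cannot scale the scores")
    return [(v - med) / (1.4826 * mad) for v in values]


def select_actives(inhibition: Dict[str, float], threshold: float = 3.0) -> List[str]:
    """Primary screen: keep compounds whose response is a statistical outlier."""
    ids = list(inhibition)
    scores = robust_z_scores([inhibition[i] for i in ids])
    return [i for i, z in zip(ids, scores) if z >= threshold]


@dataclass
class HitEvidence:
    confirmed_on_retest: bool                     # repeat of the primary assay
    active_in_deselection: bool                   # same technology, no/different target
    active_in_orthogonal: Optional[bool] = None   # same target, different technology
    hill_slope: Optional[float] = None            # from the dose-response fit


def prioritize(e: HitEvidence, steep_slope: float = 2.0) -> str:
    """Apply the cascade in order and return a priority label."""
    if not e.confirmed_on_retest:
        return "drop"           # statistical false positive
    if e.active_in_deselection:
        return "deprioritize"   # probably interfering with the assay technology
    if e.hill_slope is not None and e.hill_slope > steep_slope:
        return "flag"           # steep curve: check for aggregation
    if e.active_in_orthogonal:
        return "high"           # activity confirmed by a second technology
    return "medium"


# Hypothetical percent-inhibition values from a primary screen.
screen = {f"cpd{n}": v for n, v in enumerate([1, -2, 3, 0, -1, 2, -3, 1, 0, 80], start=1)}
actives = select_actives(screen)
print(actives, prioritize(HitEvidence(True, False, True, 1.0)))
```

    Later steps in the cascade, such as biophysical confirmation, LC–MS analysis, inspection by medicinal chemists and resynthesis, depend on experimental data and expert judgment and are deliberately left out of this sketch.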

    HTS in the academic arena

    The last 15 years have seen considerable growth of drug discovery in the academic arena. It has been reported that a large percentage of USA academic drug discovery centers were established between 2003 and 2008 and there is now an increasing global network of centers [15]. HTS provides leads for a significant percentage of USA academic drug discovery [16], whereas in the UK it has been reported that the limited availability of HTS resources has restricted the approach [17], although a very recent analysis indicates that there are a number of centers available [18].

    Ideally a high-quality compound library should be available together with the infrastructure to screen it and the expertise and facilities to follow-up the resultant output. As can be seen from the description above, the range of required disciplines and techniques is large and the activities need to be carried out in a coordinated manner so that a tailored triage can maximize the value of the screen and the hit series that emerge have sufficient data to justify further investment.

    Pharma companies, recognizing the potential of scientific programs initiated in the academic sector, are offering opportunities to screen their compound sets and access their expertise in HTS follow-up, for example, the GSK Discovery Fast Track Challenge. Projects such as the Innovative Medicines Initiative-funded European Lead Factory (ELF) are also addressing this gap. Bringing together high-quality collections has been proposed to offer a number of advantages to the combined library [19]. The ELF has assembled a Joint European Compound Library (JECL) of over 300,000 compounds originating from seven pharma companies [20] in addition to a growing number (currently approximately 50,000) of novel high-quality compounds specifically prepared for the project [Karawajczyk A et al. Expansion of chemical space for collaborative lead generation and drug discovery: the ELF perspective (2015), Submitted]. The European Screening Centre at ELF has the infrastructure to carefully store, distribute and screen the JECL. In addition, it has the screening and chemistry expertise to follow up the hits and capabilities in biophysical and biostructural techniques to provide tailored triages of the type described above. Screening of the JECL and follow-up is available to academic and SME groups funded by the Innovative Medicines Initiative.

    HTS has proven to be an effective method to generate excellent starting points for innovative medicines; however, its application is complex and comes with many traps that may catch out the uninitiated. Access to the infrastructure and expertise to run and, critically, follow-up the output from HTSs has the potential to translate innovative biology from our academic sector to new chemical matter that will ultimately result in important new medicines.

    Financial & competing interests disclosure

    P Jones, S McElroy, A Morrison and A Pannifer are funded by a grant provided by Innovative Medicines Initiative Joint Undertaking Grant Agreement 115489. The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

    No writing assistance was utilized in the production of this manuscript.

    References

    • 1 Mullard A. 2014 FDA drug approvals. Nat. Rev. Drug Discov. 14(2), 77–81 (2015).
    • 2 Scannell JW, Blanckley A, Boldon H, Warrington B. Diagnosing the decline in pharmaceutical R&D. Nat. Rev. Drug Discov. 11(3), 191–200 (2012).
    • 3 Paul SM, Mytelka DS, Dunwiddie CT et al. How to improve R&D productivity: the pharmaceutical industry's grand challenge. Nat. Rev. Drug Discov. 9(3), 203–214 (2010).
    • 4 Macarron R, Banks MN, Bojanic D et al. Impact of high-throughput screening in biomedical research. Nat. Rev. Drug Discov. 10(3), 188–195 (2011).
    • 5 Dahlin JL, Walters MA. The essential roles of chemistry in high-throughput screening triage. Future Med. Chem. 6(11), 1265–1290 (2014).
    • 6 Kuhnert M, Koster H, Bartholomaus R et al. Tracing binding modes in hit-to-lead optimization: chameleon-like poses of aspartic protease inhibitors. Angew. Chem. Int. Ed. Engl. 54(9), 2849–2853 (2015).
    • 7 Soares KM, Blackmon N, Shun TY et al. Profiling the NIH small molecule repository for compounds that generate H2O2 by redox cycling in reducing environments. Assay Drug Dev. Technol. 8(2), 152–174 (2010).
    • 8 Hermann JC, Chen Y, Wartchow C et al. Metal impurities cause false positives in high throughput screening campaigns. ACS Med. Chem. Lett. 4(2), 197–200 (2013).
    • 9 Baell J, Walters MA. Chemical con artists foil drug discovery. Nature 513(7519), 481–483 (2014).
    • 10 Seidler J, McGovern SL, Doman TN, Shoichet BK. Identification and prediction of promiscuous aggregating inhibitors among known drugs. J. Med. Chem. 46(21), 4477–4486 (2003).
    • 11 Devine SM, Mulcair MD, Debono CO et al. Promiscuous 2-aminothiazoles (PrATs): a frequent hitting scaffold. J. Med. Chem. 58(3), 1205–1214 (2015).
    • 12 Brem J, van Berkel SS, Aik W et al. Rhodanine hydrolysis leads to potent thioenolate mediated metallo-β-lactamase inhibition. Nat. Chem. 6, 1084–1090 (2014).
    • 13 Shoichet BK. Interpreting steep dose-response curves in early inhibitor discovery. J. Med. Chem. 49(25), 7274–7277 (2006).
    • 14 Hann MM, Keseru G. Finding the sweet spot: the role of nature and nurture in medicinal chemistry. Nat. Rev. Drug Discov. 11(5), 355–365 (2012).
    • 15 Academic Drug Discovery Consortium. www.addconsortium.org
    • 16 Frye S, Crosby M, Edwards T, Juliano R. US academic drug discovery. Nat. Rev. Drug Discov. 10(6), 409–410 (2011).
    • 17 Tralau-Stewart C, Low CMR, Marlin N. UK academic drug discovery. Nat. Rev. Drug Discov. 13(1), 15–16 (2014).
    • 18 Shanks E, Ketteler R, Ebner D. Academic drug discovery within the United Kingdom: a reassessment. Nat. Rev. Drug Discov. 14(7), 510–513 (2015).
    • 19 Kogej T, Blomberg N, Greasley P et al. Big pharma screening collections: more of the same or unique libraries? The AstraZeneca–Bayer Pharma AG case. Drug Discov. Today 18(19–20), 1014–1024 (2013).
    • 20 Besnard J, Jones PS, Hopkins AL, Pannifer AD. The Joint European Compound Library: boosting precompetitive research. Drug Discov. Today 20(2), 181–186 (2015).