Editorial

What is going on with my samples? A general approach to parallelism assessment and data interpretation for biomarker ligand-binding assays

    Shawn Ciotti
    Translation Medicine, Biogen Idec, Inc., MA 02142, USA

    Shobha Purushothama
    Translation Medicine, Biogen Idec, Inc., MA 02142, USA

    Soma Ray (author for correspondence)
    Translation Medicine, Biogen Idec, Inc., 14 Cambridge Center, Cambridge, MA 02142, USA

    Published Online: https://doi.org/10.4155/bio.13.174

    Method development for biomarkers presents many potential challenges, including the absence of suitable reference material, of suitable analytical reagents and of matrix free from endogenous analyte. Collectively, these challenges can directly affect the quality of the bioanalytical data. Understanding exactly what is being measured and its biological relevance is critical in determining the limitations of the data generated using these assays [1].

    As most biomarker assays are developed to be fit-for-purpose, there are no consistent industry practices and, hence, practices can vary widely, from stringent quantitative PK assays to assays with minimal precision and accuracy. For this reason, data from many biomarker assays may very well be considered exploratory and relative to pretreatment levels.

    Parallelism assessment versus dilutional linearity

    One key parameter in biomarker method development is the assessment of parallelism. Proper parallelism assessment for biomarker assays during development and validation enables appropriate data interpretation. Also, while performing parallelism assessment, one could gather clues regarding other key assay parameters such as selectivity, sensitivity and assay minimum required dilution (MRD). The next questions that come to mind include: is parallelism assessed or established? What can the parallelism experiment tell us about the biomarker of interest and how we use the assay? Can parallelism assessment be used as a different method for determining assay selectivity?

    Due to some of the challenges alluded to above, the parallelism experiment is one of the most critical assessments in biomarker assay development, as it serves several purposes: measuring the endogenous level of the biomarker; assessing the proportionality between the endogenous form of the biomarker and the reference material in buffer; and identifying the biological sensitivity [1]. Hence, given that there is no true reference material and often a substitute matrix is used for the calibrators, parallelism as a parameter is in practice ‘assessed’ rather than ‘established’.

    In principle, a parallelism assessment is akin to a dilutional linearity experiment. The major difference between dilutional linearity and parallelism is that dilutional linearity employs control samples with a known quantity of analyte spiked into analyte-free matrix, while parallelism is performed by serial dilution of incurred samples. For biomarker assays, the parallelism experiment enables an assessment of the performance of the assay with respect to endogenous analyte upon dilution. In an ideal situation, samples assessed in the parallelism experiment have high enough concentrations to allow measurement of multiple dilutions that span the assay range, but this is often not possible when endogenous levels are low.

    Parallelism & its application

    Parallelism assessment, MRD determination and assay sensitivity for the endogenous analyte go hand-in-hand, and are typically evaluated in the same experiment. When we perform the parallelism/MRD experiment, ten normal and, if available, ten disease samples are diluted in assay buffer to at least six dilutions in total. One way to then perform parallelism assessment would be to compare the back-calculated concentrations with the concentration measured at the lowest dilution (or the neat sample, if tested), which itself is used as the ‘nominal’ for that sample. One can then calculate the percentage of nominal for the recovered concentrations at the different dilutions. A deviation from expected assay performance (e.g., >20% deviation from nominal) indicates a loss of parallelism. For the parallelism assessment, only samples yielding results in range should be used. For example, if three out of 20 samples were below the limit of quantitation, the parallelism assessment will only include the 17 detectable samples. Also, the overall trend of all of the tested samples should be noted, for example, whether recovery appears to increase with increasing dilution. If none of the starting dilutions are in assay range (i.e., all are below the limit of quantitation) or if no evidence of parallelism is present among any dilutions tested, then the MRD has probably not been fully established and it will have to be readdressed.
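
    The percentage-of-nominal calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, the data layout (dilution factor paired with the in-well concentration read off the curve), the quantitation limits and the example readings are all hypothetical, and the 20% tolerance is only the example figure quoted in the text.

```python
from typing import List, Optional, Tuple

Reading = Tuple[float, float]  # (dilution factor, in-well concentration)


def percent_nominal(
    readings: List[Reading], lloq: float, uloq: float
) -> Optional[List[Tuple[float, float]]]:
    """Return (dilution factor, % of nominal) pairs for one sample.

    The back-calculated concentration at the lowest dilution is used as
    the 'nominal'. Assumption: a sample whose lowest dilution reads
    outside [lloq, uloq] has no usable nominal and is excluded (None).
    """
    if not readings:
        return None
    ordered = sorted(readings)  # lowest dilution factor first
    first_factor, first_conc = ordered[0]
    if not (lloq <= first_conc <= uloq):
        return None
    nominal = first_factor * first_conc  # back-calculated concentration
    result = []
    for factor, conc in ordered:
        if lloq <= conc <= uloq:  # only in-range readings are assessed
            result.append((factor, 100.0 * factor * conc / nominal))
    return result


def is_parallel(
    recoveries: List[Tuple[float, float]], tolerance_pct: float = 20.0
) -> bool:
    """True when every assessed dilution is within tolerance of nominal."""
    return all(abs(pct - 100.0) <= tolerance_pct for _, pct in recoveries)


# Hypothetical readings for one sample, for illustration only.
sample = [(2, 50.0), (4, 26.0), (8, 12.0), (16, 4.0)]
recoveries = percent_nominal(sample, lloq=1.0, uloq=100.0)
for factor, pct in recoveries:
    print(f"1:{factor}  {pct:.0f}% of nominal")
print("parallel within 20%:", is_parallel(recoveries))
```

    Printing the per-dilution values rather than only a pass/fail flag keeps the overall trend visible, for instance recovery drifting in one direction as dilution increases.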

    This practice comes with its own caveats. How many samples are needed to get a good understanding of parallelism? Which criterion is fit for the given purpose: ±20%, ±30% or something even less stringent? The onus is on us as bioanalytical scientists to use good scientific judgment based on our understanding of the biomarker biology and the decisions the assay data will enable; in other words, whether the data are intended to be exploratory or to support safety/efficacy end-points.

    Sometimes the lucky development scientist has access to high-quality reagents for a highly abundant endogenous protein biomarker. In one such example, a well-known and well-characterized biomarker could be detected using a commercially available kit. However, as the exact MRD was not known in our matrix of interest, a parallelism experiment was performed using ten disease matrices (samples) at several dilutions between 1:30 and 1:480. The data showed good parallelism for the first three dilutions in eight out of ten samples, indicating lack of matrix effects. Therefore, we were able to conclude the MRD for this assay was 1:30 and that no interfering components in the matrix were present. Applying acceptance criteria for the QCs to the parallelism samples, we could also determine the lowest acceptable level of detectable analyte in the sample (biological sensitivity). Therefore, in one experiment, four parameters (MRD, parallelism, selectivity and sensitivity) were assessed.
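
    A summary of the form 'good parallelism for the first three dilutions in eight out of ten samples' can be tallied once percentage-of-nominal values are available for each sample. The sketch below shows one way to do this; the function name, the input layout, the 20% tolerance and the three example samples are hypothetical and are not data from the experiment described above.

```python
from typing import Dict, Tuple


def pass_counts_by_dilution(
    recoveries: Dict[str, Dict[int, float]],
    tolerance_pct: float = 20.0,
) -> Dict[int, Tuple[int, int]]:
    """Map dilution factor -> (samples within tolerance, samples assessed).

    recoveries: sample ID -> {dilution factor: % of nominal}. A dilution
    that was out of assay range for a sample is simply absent from that
    sample's dictionary, so it is not counted as assessed.
    """
    counts: Dict[int, Tuple[int, int]] = {}
    for per_dilution in recoveries.values():
        for factor, pct in per_dilution.items():
            n_pass, n_total = counts.get(factor, (0, 0))
            within = abs(pct - 100.0) <= tolerance_pct
            counts[factor] = (n_pass + int(within), n_total + 1)
    return dict(sorted(counts.items()))


# Hypothetical % of nominal values, for illustration only. The first
# dilution is the nominal, so it is 100% by construction.
example = {
    "S1": {30: 100.0, 60: 95.0, 120: 108.0, 240: 70.0},
    "S2": {30: 100.0, 60: 110.0, 120: 90.0, 240: 85.0},
    "S3": {30: 100.0, 60: 130.0, 120: 140.0},  # 1:240 was out of range
}
for factor, (n_pass, n_total) in pass_counts_by_dilution(example).items():
    print(f"1:{factor}  {n_pass}/{n_total} samples within tolerance")
```

    How many samples must pass at a given dilution before it is accepted as the MRD is left to the analyst, in keeping with the fit-for-purpose judgment discussed above.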

    But what happens if the samples tested measure near the LLOD of the assay? This happens quite frequently; in these cases, a true parallelism assessment is not possible. As a last resort, one may consider using the dilutional linearity assessment with spiked recombinant or purified protein that is being used for the preparation of the standard curve. However, as this is not a true parallelism assessment, it is not representative of the endogenous analyte and the results may be uninformative or misleading. In this case, one could also consider an in-study parallelism assessment if the biomarker is anticipated to increase in concentration upon treatment.

    One example of this type of challenge was the development of an assay for a low-abundance protein that is known to aggregate. During a parallelism experiment with ten diseased cerebrospinal fluid samples, it was noted that only three of the ten samples (not surprisingly!) produced detectable levels of the analyte. Of the three samples, none demonstrated parallelism when diluted past the assay’s MRD. Additionally, given the tendency of the protein to form aggregates and the lack of good quality assay reagents (including a suitable reference standard), spiking of the purified reference control into samples was attempted as a last resort to approximate parallelism by using dilutional linearity. This resulted in wild fluctuations of percentage recoveries. For this assay, parallelism was not achieved. However, based on our understanding of the biology of this biomarker and given that the biomarker data were to be used for trend analysis only, the assay was qualified for testing clinical samples.

    Based on the assessment of parallelism as described in the above examples, biomarker assays can be classified as either relative quantitative or qualitative. A relative quantitative method, as defined by Lee et al., is one that uses calibrators to calculate the values for unknown samples [2]. The quantification is considered relative because the reference standard is either not well characterized, not available in a pure form, or not fully representative of the endogenous biomarker [2], as illustrated by our first example. However, when the assay read-out does not have a continuous proportionality relationship to the amount of analyte in a sample, or the data are categorical in nature, as discussed in the second example, the assay should be considered purely qualitative [2].

    Why does parallelism matter?

    The examples discussed have highlighted the multiple uses of parallelism assessment: MRD determination, sensitivity and selectivity. Arguably, if the parallelism assessment performed with multiple individual matrix samples has proven successful, then selectivity has been effectively demonstrated. In such cases, the rationale for performing another selectivity assessment by spiking a non-endogenous recombinant or purified protein to assess matrix interference is unclear, and such an assessment would not be reflective of the true sample.

    In conclusion, we believe that parallelism evaluations are invaluable when developing biomarker assays and here we have shared some of the practices we have employed. However, since every biomarker is different and each assay development comes with its own challenges, we are still in the process of learning. Collectively sharing our experiences of parallelism assessment will enhance our understanding of biomarker method development.

    Acknowledgements

    The authors would like to thank L Stevenson and L Amaravadi for scientific discussions and review of the manuscript.

    Financial & competing interests disclosure

    The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.

    No writing assistance was utilized in the production of this manuscript.

    References

    1. Valentin MA, Ma S, Zhao A, Legay F, Avrameas A. Validation of immunoassay for protein biomarkers: bioanalytical study plan implementation to support pre-clinical and clinical studies. J. Pharm. Biomed. Anal. 55(5), 869–877 (2011).
    2. Lee JW, Devanarayan V, Barrett YC et al. Fit-for-purpose method development and validation for successful biomarker measurement. Pharm. Res. 23(2), 312–328 (2006).