About the Author(s)


Verena Gounden
Department of Clinical Biochemistry, Galway University Hospital, Newcastle Road, Galway, Ireland

Nareshni Moodley
Department of Chemical Pathology, National Health Laboratory Service, Inkosi Albert Luthuli Central Hospital, Durban, South Africa

Discipline of Chemical Pathology, School of Laboratory Medicine and Medical Sciences, Faculty of Laboratory Medicine, University of KwaZulu-Natal, Durban, South Africa

Citation


Gounden V, Moodley N. What’s in a number: Falling on the sword of cut-off points and reference limits? J Coll Med S Afr. 2025;3(1), a148. https://doi.org/10.4102/jcmsa.v3i1.148

Opinion Paper

What’s in a number: Falling on the sword of cut-off points and reference limits?

Verena Gounden, Nareshni Moodley

Received: 21 Oct. 2024; Accepted: 20 May 2025; Published: 30 June 2025

Copyright: © 2025. The Author(s). Licensee: AOSIS.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

The marked increase in laboratory test volumes and costs internationally emphasises the need for demand management. One way that this can be implemented is by reducing unnecessary repeat testing through the provision of appropriate decision cut-off points (clinical decision limits [CDLs]) or reference intervals (RIs), with subsequent correct interpretation of laboratory results. The derivation of RIs and CDLs is fraught with technical and biological challenges. It is difficult to conduct the labour-intensive, costly, long, and complex studies required, which need healthy volunteers who represent the different age groups, genders, races, and alternate states of health (e.g. pregnancy) within the population. It is also inappropriate to apply RIs or cut-off points from other populations, which is often what occurs when manufacturer-expected values are used. Lack of standardisation of international guidelines for CDLs and analytical methods poses a further problem. The effect of analytical and biological variation is also essential to consider when interpreting results. Together, these factors make ideal RIs and CDLs difficult to attain and implement despite the critical need for them.

Keywords: reference intervals; clinical decision limits; interpretation; laboratory; biological variation.

Introduction

The central basis of interpretation of quantitative laboratory testing, particularly in biochemistry, which contributes the majority of testing requests, is the comparison of the patient’s results to a clinical decision limit (CDL) or reference interval (RI). All laboratory users need to have context as to how these intervals and decision limits are derived and the uncertainty around the number generated by a laboratory test.

Reference intervals and clinical decision limits

In order for the clinician to adequately interpret a patient’s result, it is essential for the laboratory to provide a defined range of values – the RI for comparison or a cut-off point value (CDL) associated with disease or the pathological state. Appropriate interpretation of the laboratory result would facilitate the safe and effective management of the patient or individual.

The RI usually comprises the central 95% of the test results obtained from a presumed healthy population. This implies that 5% of healthy people (1 in 20) will have test results outside the RI, with 2.5% (1 in 40) falling beyond each limit. There is significant complexity in the process of deriving an RI. The difficult and expensive process of determining adequate RIs means that most individual laboratories are unable to derive their own RIs. RIs should be created by utilising a statistically sufficient group (a minimum of 120) of healthy reference subjects. The Clinical and Laboratory Standards Institute (CLSI) guideline does acknowledge that ‘Health is a relative condition lacking a universal definition. Defining what is considered healthy becomes the initial problem in any study …’. In addition, some of the selected participants might have subclinical disease. Furthermore, participant recruitment and obtaining informed consent may be costly, time-consuming, and challenging. Studies need to cover different age groups (e.g., paediatric and geriatric patients), and RIs are also needed for rare sample types (e.g., cerebrospinal fluid), timed collections, dynamic function testing and serial analysis. Once participants have been selected and recruited, the appropriate statistical technique must be chosen. The choice of technique (for example, parametric, transformed parametric, or nonparametric) can affect the RIs generated. Data are often multimodal or asymmetrically distributed and require partitioning of test subjects by sex, age, race, and other factors, which adds to the complexity of the process.1
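As an illustration only, the nonparametric approach mentioned above can be sketched in a few lines of Python. The function name, the simulated values and the use of the default interpolation in Python's `statistics.quantiles` are assumptions made for this sketch; they are not taken from the CLSI guideline, and a real study would also need outlier handling, partitioning and confidence intervals for the limits.

```python
import random
import statistics

MIN_REFERENCE_SUBJECTS = 120  # minimum group size quoted in the text


def nonparametric_reference_interval(values):
    """Return the 2.5th and 97.5th percentiles of the reference values."""
    if len(values) < MIN_REFERENCE_SUBJECTS:
        raise ValueError(
            f"need at least {MIN_REFERENCE_SUBJECTS} reference subjects, "
            f"got {len(values)}"
        )
    # n=40 splits the data into 40 equal-probability groups, so the first
    # and last cut points are the 2.5th and 97.5th percentiles.
    cut_points = statistics.quantiles(values, n=40)
    return cut_points[0], cut_points[-1]


# Simulated (not real) results for 400 hypothetical healthy subjects.
rng = random.Random(1)
simulated = [rng.gauss(5.0, 0.5) for _ in range(400)]
lower, upper = nonparametric_reference_interval(simulated)
outside = sum(1 for v in simulated if v < lower or v > upper)
print(f"RI: {lower:.2f} to {upper:.2f}; {outside} of {len(simulated)} outside")
```

By construction, roughly 1 in 20 of the simulated 'healthy' values falls outside the interval, which is the point made in the paragraph above.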

Many medical laboratories utilise RIs largely derived from American and European populations that may not be appropriate for the local population served by the lab.2 This may be acceptable for some tests but not for others. For example, serum vitamin D levels vary between individuals, and factors such as geography and race can play a more significant role.3 In these instances, decision limits based on clinical outcomes would be useful.

Advances in computing technology have allowed for the derivation of RIs using indirect methods via databases of laboratory results. However, this is still a complex procedure requiring adequate data mining and statistical knowledge. An advantage is that the data are readily available, with resultant time and cost savings. Studies utilising this approach were able to report clinically relevant and useful RIs. These used various complex filters to eliminate results from ‘unhealthy’ subjects. Data for the derivation of the RI would include results from hospitalised patients, patients from outpatient departments and the general population attending their primary care provider. The majority of these studies used complex statistical algorithms for derivation of the final RIs. Current guidelines do not recommend these methods as an initial approach for establishing RIs, as there is a possibility that a significant proportion of the data might not actually originate from truly healthy individuals.1

Reference interval data for regions in Africa were analysed in a systematic review by Price et al.4 in 2022, in which they searched PubMed for RIs from Africa published since 2010, focussing on clinical analytic chemistry, haematology and immunological parameters. Data from adults, adolescents, children, pregnant women, and the elderly were included, with exclusion of manuscripts reporting data from persons with conditions that may not classify them as healthy. Of the 179 identified manuscripts, 80 were included in this review, covering 20 countries, with the largest number of studies in Ethiopia (n = 23, 29%). Most studies considered healthy, non-pregnant adults (n = 55, 69%). Nine (11%) studies included pregnant women, 13 (16%) included adolescents and 22 (28%) included children. There is a scarcity of published, thoroughly conducted RI data available from sub-Saharan Africa, with insufficient regional heterogeneity (almost one third of studies were from Ethiopia). In many studies, the presence of bias limited the appropriateness of use of the derived RIs. Price et al. highlighted issues identified with the studies reviewed. These included the majority of studies under-representing groups in the local population (e.g., the elderly) or using inconsistent definitions of health. Moreover, almost half of the studies did not cite the CLSI guidelines as their reference point for the method of RI derivation. The parameters measured were varied, with different analytical platforms utilised. Almost all studies agreed that regionally appropriate RIs are important, but many did not contain strong, reliable evidence for this.4

Unlike RIs, cut-off points or CDLs are associated with a risk of specific adverse outcomes. Commonly used examples of CDLs include those for lipid parameters, glucose, haemoglobin A1c (HbA1c), and cardiac markers such as troponin, which are used to determine the risk of disease, to diagnose or to treat. These decision limits may be based on findings of clinical studies or expert consensus.5 The seemingly simple plasma glucose CDLs for the diagnosis of diabetes mellitus (DM) are perhaps a textbook example of the complexities and caveats around deriving CDLs. In 1979, the American National Diabetes Data Group recommended one set of criteria for the diagnosis of DM. These were based on the results of three relatively small studies in which cohorts of individuals without diabetic retinopathy were subjected to oral glucose tolerance tests and then followed for a period of 3–8 years to determine which patients developed retinopathy. The World Health Organization (WHO) adopted slightly modified criteria in 1980. In 1997, the American Diabetes Association (ADA), via an expert panel, revisited these criteria and decided to lower the fasting plasma glucose (FPG) cut-off from ≥ 7.8 mmol/L to ≥ 7.0 mmol/L. At the time, the decision was made to retain ≥ 11.1 mmol/L as the 2-h oral glucose tolerance test (OGTT) value to prevent disruption, as many large epidemiological studies had used this value to define diabetes. Several studies have also demonstrated that these existing cut-offs for FPG and 2-h OGTT glucose (OGTTG) are not equivalent in the diagnosis of diabetes. The same patient may at the same time have an elevated 2-h OGTTG while having a ‘normal’ FPG. Furthermore, the ADA (< 5.6 mmol/L) and WHO (< 6.1 mmol/L) have different decision limits for what is considered a normal fasting glucose.6 Throw into the mix HbA1c as a diagnostic test for DM, with its method-dependent issues with haemoglobin variants and its susceptibility to common disorders such as iron deficiency, and the waters can seem even muddier. There is no safe, biologically viable level of glycaemia at which microvascular changes will never occur.7
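The discordance between these glucose cut-offs can be made concrete with a short Python sketch. It uses only the thresholds quoted above; the function name, dictionary keys and the example patient are hypothetical, and the code is an illustration of how the limits disagree, not a diagnostic tool.

```python
# Thresholds in mmol/L, as quoted in the text above.
FPG_DIABETES = 7.0           # fasting plasma glucose cut-off after the 1997 revision
OGTT_2H_DIABETES = 11.1      # 2-h OGTT glucose cut-off retained in 1997
NORMAL_FPG_UPPER_ADA = 5.6   # ADA: normal fasting glucose is below this value
NORMAL_FPG_UPPER_WHO = 6.1   # WHO: normal fasting glucose is below this value


def classify(fpg, ogtt_2h):
    """Apply each decision limit independently to one set of results."""
    return {
        "diabetes_by_fpg": fpg >= FPG_DIABETES,
        "diabetes_by_ogtt": ogtt_2h >= OGTT_2H_DIABETES,
        "normal_fpg_ada": fpg < NORMAL_FPG_UPPER_ADA,
        "normal_fpg_who": fpg < NORMAL_FPG_UPPER_WHO,
    }


# Hypothetical patient: FPG 5.8 mmol/L, 2-h OGTT glucose 11.4 mmol/L.
for criterion, met in classify(5.8, 11.4).items():
    print(f"{criterion}: {met}")
```

For this hypothetical patient the 2-h OGTT criterion is met while the FPG criterion is not, and the fasting value counts as normal under the WHO limit but not under the ADA limit.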

The 99th percentile upper reference limit of cardiac troponin has been utilised for over 20 years as an assay-specific threshold for the diagnosis of myocardial infarction; however, there is no standardisation in the calculation and reporting of the 99th percentile. The thresholds defined can be affected by assay imprecision and, more significantly, by the ‘healthy’ population used to define those 99th percentile limits.8 These examples highlight that disease states are a continuum with health – with a proverbial line in the sand often being drawn at the point where the existing evidence shows a significant risk.

Clinical decision limits should ideally be derived from evidence-based studies; however, many that are currently in routine use in laboratories may be historical and based on expert or consensus opinion.5 Clinical studies may determine specific CDLs for a test to predict or diagnose disease.9 That being said, laboratorians review both RIs and CDLs on an ongoing basis. This is a process of quality improvement to update these values based on the latest evidence-based literature and/or recommended guidelines.

Both RIs and CDLs are also dependent on the analytical method used to derive them and may not be interchangeable across different methodologies or analytical platforms. Hence, these need to be verified by the laboratory for their analytical method. Since the early 2000s, there have been substantial moves to harmonise routine biochemistry assays, with one of the expectations being that this would also allow for common RIs and CDLs. While progress has been made in this area, the lack of assay harmonisation remains a considerable obstacle.5

Biological variation and analytical variation

Biological variation

Another important concept for understanding the factors that influence the number generated by a test is biological variation (BV). Biological variation refers to variability in the concentration or activity of the substance being measured (measurand) around a homeostatic set point. This variability arises from a host of innate physiological factors within an individual and may display predictable daily, monthly, or seasonal biological rhythms. Different individuals may have different set points.10,11

Analytical variation

The analytical variation (AV) represents the imprecision of the test assay arising from differences in testing methods and equipment. It is important to acknowledge that there can be variations even in the same sample run by the same instrument at close time intervals, as all analytical techniques have inherent random variation.12 This imprecision is expressed as the coefficient of variation of the analyser (CVa). The CVa is a metric that labs should be monitoring, as guidelines for most assays specify the desirable, optimal, and minimal allowable CVa.
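A coefficient of variation is the standard deviation expressed as a percentage of the mean. The Python sketch below shows that arithmetic for repeated measurements of the same sample; the function name and the replicate values are hypothetical, and in practice laboratories typically estimate CVa from quality control material over a longer period.

```python
import statistics


def coefficient_of_variation(replicates):
    """Return the CV (%) of repeated measurements of the same sample."""
    mean = statistics.mean(replicates)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # Sample standard deviation (n - 1 denominator) as a percentage of the mean.
    return statistics.stdev(replicates) / mean * 100


# Hypothetical replicate results (mmol/L) from one control sample.
replicates = [4.9, 5.0, 5.1, 5.0, 4.8, 5.2]
print(f"CVa = {coefficient_of_variation(replicates):.1f}%")
```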

The uncertainty around the true value of a laboratory result is a combination of the BV within that individual and the CVa of the test method within that specific laboratory. This combination of BV and AV is important to quantify so that one is able to assess whether a change in a patient’s results over time represents true pathology or just ‘background noise’. All ISO 15189-accredited medical laboratories are required to determine the uncertainty associated with all their quantitative assays and have these available to clinicians and laboratory users at their request.13 It is likely that the vast majority of clinicians are not aware of this measure to assess the uncertainty of a test result, or that they can request this information from their local laboratory.

Another way of assessing if a change in a patient’s result represents a true change in condition or response to management is to determine the reference change value. Laboratories should be able to provide this information to the clinical team and online calculators are available (however, the lab would still need to provide the CVa information).14
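As a minimal sketch, the commonly used form of the reference change value combines the analytical CV (CVa) and the within-subject biological CV (written here as CVi) as RCV = √2 × Z × √(CVa² + CVi²), with Z = 1.96 for a two-sided 95% probability. The function names and the CV values in the example are hypothetical and are not taken from the article or from any specific assay.

```python
import math

Z_95_TWO_SIDED = 1.96  # Z-score for a two-sided 95% probability


def reference_change_value(cv_a, cv_i, z=Z_95_TWO_SIDED):
    """Return the RCV (%) from analytical and within-subject CVs (both in %)."""
    combined_cv = math.sqrt(cv_a**2 + cv_i**2)
    # The factor sqrt(2) accounts for variation in both of the two results.
    return math.sqrt(2) * z * combined_cv


def exceeds_rcv(previous, current, cv_a, cv_i):
    """True if the percentage change between two results exceeds the RCV."""
    percent_change = abs(current - previous) / previous * 100
    return percent_change > reference_change_value(cv_a, cv_i)


# Hypothetical assay: CVa = 3%, CVi = 4%.
print(f"RCV = {reference_change_value(3, 4):.1f}%")
print("100 -> 110 exceeds RCV:", exceeds_rcv(100, 110, 3, 4))
print("100 -> 120 exceeds RCV:", exceeds_rcv(100, 120, 3, 4))
```

With these assumed CVs, a 10% change between serial results is within the expected 'background noise', whereas a 20% change is not.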

Ideally, every individual would have a ‘biological passport’ with personalised RIs (pRI) that have been calculated using the individual’s previous test results obtained in a steady-state situation. These would be continually updated as new results become available and be reported with each result. Currently, this is not practical or possible in most laboratories because of a lack of adequate lab and hospital information systems and electronic health records. The pRI is dependent on the accumulation of an adequate number of results (depending on the model utilised) in the relatively well individual.15 This means that the individual would need adequate access to healthcare facilities and laboratory services, especially for wellness testing/screening, which is not always possible, particularly in resource-limited settings.

The boundaries between health and pathology are grey zones, while clinical medicine often demands clearer delineation. There is no one true value – a single laboratory result represents a range of possible values. The laboratory’s imperative is to keep that range as limited as possible to ensure the clinical utility of the test. Clinicians are also required to demonstrate some level of flexibility in the interpretation of test results, in addition to reviewing them against the prevalence of the disease in question and the history and clinical features of the individual. Much work is still required to overcome the complexities surrounding the establishment of RIs and CDLs.

Acknowledgements

Competing interests

The authors declare that they have no financial or personal relationships that may have inappropriately influenced them in writing this article. The author, N.M., serves as an editorial board member of this journal, Journal of the Colleges of Medicine of South Africa. The peer review process for this submission was handled independently, and the authors had no involvement in the editorial decision-making process for this manuscript. The author, N.M., has no other competing interests to declare.

Authors’ contributions

V.G. conceptualised the article, and both V.G. and N.M. wrote the first draft, as well as subsequent reviews.

Ethical considerations

This article followed all ethical standards for research and has no direct impact on patients’ outcomes and will not affect their treatment.

Funding information

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sectors.

Data availability

Data sharing is not applicable to this article as no new data were created or analysed in this study.

Disclaimer

The views and opinions expressed in this article are those of the authors and are the product of professional research. They do not necessarily reflect the official policy or position of any affiliated institution, funder, agency, or that of the publisher. The authors are responsible for this article’s results, findings, and content.

References

  1. Katayev A, Balciza C, Seccombe DW. Establishing reference intervals for clinical laboratory test results: is there a better way? Am J Clin Pathol. 2010;133(2):180–186. https://doi.org/10.1309/AJCPN5BMTSF1CDYP
  2. Schmidt BM, Tameris M, Geldenhuys H, et al. Comparison of haematology and biochemistry parameters in healthy South African infants with laboratory reference intervals. Trop Med Int Health. 2018;23(1):63–68. https://doi.org/10.1111/tmi.13009
  3. Lips P. Vitamin D status and nutrition in Europe and Asia. J Steroid Biochem Mol Biol. 2007;103(3–5):620–625. https://doi.org/10.1016/j.jsbmb.2006.12.076
  4. Price MA, Fast PE, Mshai M, et al. Region-specific laboratory reference intervals are important: A systematic review of the data from Africa. PLoS Glob Public Health. 2022;2(11):e0000783. https://doi.org/10.1371/journal.pgph.0000783
  5. Ozarda Y, Sikaris K, Streichert T, Macri J, IFCC Committee on Reference intervals and Decision Limits (C-RIDL). Distinguishing reference intervals and clinical decision limits – A review by the IFCC committee on reference intervals and decision limits. Crit Rev Clin Lab Sci. 2018;55(6):420–431. https://doi.org/10.1080/10408363.2018.1482256
  6. Parappil A, Doi SA, Al-Shoumer KA. Diagnostic criteria for diabetes revisited: Making use of combined criteria. BMC Endocr Disord. 2002;2:1. https://doi.org/10.1186/1472-6823-2-1
  7. Emanuelsson F, Marott S, Tybjærg-Hansen A, Nordestgaard BG, Benn M. Impact of glucose level on micro- and macrovascular disease in the general population: A Mendelian randomization study. Diabetes Care. 2020;43:894–902. https://doi.org/10.2337/dc19-1850
  8. Sandoval Y, Apple FS, Saenger AK, O’Collinson P, Wu AHB, Jaffe AS. The 99th percentile upper-reference limit of cardiac troponin and the diagnosis of acute myocardial infarction. Clin Chem. 2020;66(9):1167–1180. https://doi.org/10.1093/clinchem/hvaa158
  9. Hyohdoh Y, Hatakeyama Y, Okuhara Y. A simple method to identify real-world clinical decision intervals of laboratory tests from clinical data. Inform Med Unlocked. 2021;23:100512. https://doi.org/10.1016/j.imu.2021.100512
  10. Flatland B, Baral RM, Freeman KP. Current and emerging concepts in biological and analytical variation applied in clinical practice. J Vet Intern Med. 2021;34(6):2691–2700. https://doi.org/10.1111/jvim.15929
  11. Badrick T. Biological variation. Understanding why it is so important. Pract Lab Med. 2021;4:e00199. https://doi.org/10.1016/j.plabm.2020.e00199
  12. Pradhan S, Gautam K, Pant V. Variation in laboratory reports: Causes other than laboratory error. JNMA J Nepal Med Assoc. 2022;60(246):222–224. https://doi.org/10.31729/jnma.6022
  13. ISO 15189:2022 medical laboratories requirements for quality and competence; 2022 [homepage on the Internet]. [cited 2024 Sep 30]. Available from: www.iso.org
  14. McCormack JP, Holmes DT. Practice pointer: Your results may vary: the imprecision of medical measurements. Br Med J. 2020;368:m149. https://doi.org/10.1136/bmj.m149
  15. Coskun A, Sandberg S, Unsal I, Fulya YG, et al. Personalized reference intervals – Statistical approaches and considerations. Clin Chem Lab Med. 2022;60(4):629–635. https://doi.org/10.1515/cclm-2021-1066

