Editorial
April 7, 2024

Late-Stage Cancer End Points to Speed Cancer Screening Clinical Trials—Not So Fast

Author Affiliations
  • 1DELFI Diagnostics Inc, Baltimore, Maryland
JAMA. Published online April 7, 2024. doi:10.1001/jama.2024.5821

In this issue of JAMA, Feng et al studied whether late-stage cancer (ie, stage III or stage IV cancer), rather than cancer-specific mortality, was an acceptable alternative end point in clinical trials of cancer screening.1 The authors analyzed 41 clinical trials conducted in Europe, North America, and Asia, combining the data overall and according to cancer type. They evaluated the association between incidence of stage III-IV cancer and cancer-specific mortality in and across the selected studies.

To understand the importance of the study by Feng et al, consider the accepted rationale for cancer screening. Cancer is commonly fatal, and outcomes are better for patients diagnosed with early-stage cancer than for those diagnosed with late-stage cancer. Therefore, detecting cancer at an early stage, rather than a late stage, should presumably save lives.

Feng et al addressed the proposition that if cancer screening increases rates of early cancer detection, then rates of late-stage cancer detection will be reduced, resulting in reduced death rates from cancer. To date, the evidence regarding this proposition has been mixed. Randomized clinical trials in both colorectal cancer and lung cancer have demonstrated that screening is associated with an increase in early-stage cancer diagnosis, a reduction in late-stage diagnosis, and lower cancer mortality rates.2,3 In contrast, screening for ovarian cancer is associated with an increase in early-stage diagnoses and reduction in late-stage diagnoses, but not with lower rates of cancer mortality.4

A new generation of blood-based tests that can detect cancer (termed liquid biopsies) are now both available to clinicians and undergoing evaluation in clinical trials. Some of these screening tests focus on cancers for which screening is currently recommended and are available only to people eligible for that screening.5,6 Others detect a large number of cancers and are marketed to the general population.7 Screening tests that could detect multiple cancers could conceivably lower mortality for some cancers for which screening is not currently recommended. However, there is no evidence that these blood tests that aim to detect multiple cancers lower cancer-related mortality.

GRAIL, LLC has initiated 2 clinical trials to examine the effect of its multiple cancer screening test on patient outcomes. First, a randomized clinical trial of 140 000 participants conducted with the UK’s National Health Service will assess whether a blood test for multiple cancers reduces the primary outcome of the number of stage III and IV cancer diagnoses.8 The second clinical trial, the Galleri-Medicare study, will include 50 000 Medicare beneficiaries who will undergo testing with the multiple cancer screening test. The press release for the study lists the outcome measure of interest as “reduction in diagnosed stage IV cancers.” The ClinicalTrials.gov listing for the study has been redacted.9,10 Thus, it appears that neither study has a primary end point of cancer mortality.

Randomized clinical trials of cancer screening rely on the “gold standard” outcome of cancer-specific mortality because other end points, such as survival duration after diagnosis, are biased by lead time and overdiagnosis, which can appear to improve outcomes but do not actually alter mortality rates. Other end points, such as life expectancy and patient quality of life, are also important and are collected or modeled as part of screening assessments.

Late-stage cancer has not been validated as an acceptable alternative outcome to cancer-related mortality. The potential advantage of late-stage cancer as an end point is that it can shorten the duration of a clinical trial evaluating cancer screening tests because late-stage cancer occurs before death from cancer.

In the analyses by Feng et al, strong positive correlations between late-stage cancer end points and cancer-specific mortality end points were observed only in studies of lung cancer screening (Pearson ρ = 0.92 [95% CI, 0.72-0.98]), and poor correlations were found across studies of colorectal cancer (Pearson ρ = 0.39 [95% CI, 0.27-0.80]) and prostate cancer (Pearson ρ = −0.69 [95% CI, −0.99 to 0.81]). Although the correlation statistic appeared strongly positive across studies of ovarian cancer (Pearson ρ = 0.92 [95% CI, 0.51-1.00]), the finding is less convincing than for lung cancer because it relies on only 4 data points and has a wider CI.
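The point about only 4 data points can be illustrated with a short sketch. It uses the Fisher z-transform, a standard approximation for the CI of a Pearson correlation. This is an assumption made for illustration: the editorial does not state how Feng et al computed their intervals, and the sketch is not expected to reproduce the reported values. The correlation of 0.90 and the sample sizes are hypothetical.

```python
import math

def pearson_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for a Pearson correlation via the Fisher z-transform.

    Assumption: this is a textbook approximation, not necessarily the
    method used in the meta-analysis discussed above.
    """
    if n <= 3:
        raise ValueError("the Fisher interval needs more than 3 data points")
    z = math.atanh(r)                # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # standard error shrinks as n grows
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical: the same observed correlation with different numbers of trials
for n in (4, 10, 40):
    lo, hi = pearson_ci(0.90, n)
    print(f"n={n:>2}: r=0.90, approximate 95% CI {lo:.2f} to {hi:.2f}")
```

Under this approximation, the interval narrows as the number of data points grows, which is why a high correlation estimated from very few trials carries less weight.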

Feng et al1 addressed 3 important questions regarding the relation of an alternative end point, the frequency of late-stage cancer diagnosis, to the accepted end point of cancer-related mortality.

First, if the alternative end point (ie, late-stage cancer) can be substituted for the accepted end point (ie, cancer-related mortality), then results based on the alternative end point should rarely disagree with results using the accepted end point. There are at least 2 problems that could arise: the alternative could find a benefit that was not there when the accepted end point was examined or the alternative could fail to find a benefit that was present when the accepted end point was examined. These mismatched results resemble type I and type II errors in clinical studies, which generally have tolerance thresholds of 5% and 10% to 20%, respectively. Feng et al reported that findings based on the alternative end point regularly produced incorrect results. In 62% of the comparisons, the alternative end point of late-stage cancer incorrectly predicted that screening was beneficial when the accepted end point of cancer-related mortality showed that it was not. In 44% of the comparisons, the alternative end point failed to identify a benefit that was apparent in analyses of the accepted end point. The authors showed these findings were similar at statistical cutoffs of significance of α = .05 and α = .10.
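The 2 kinds of mismatch can be made concrete with a minimal sketch. The data are hypothetical, and the choice of denominators (comparisons with no mortality benefit for the first rate, comparisons with a mortality benefit for the second) is an assumption; the editorial does not spell out how Feng et al defined theirs.

```python
def mismatch_rates(comparisons):
    """Return (false_benefit_rate, missed_benefit_rate).

    comparisons: iterable of (surrogate_benefit, mortality_benefit) booleans,
    one pair per trial comparison. Denominators are an assumption, see lead-in.
    """
    pairs = list(comparisons)
    no_mortality_benefit = [s for s, m in pairs if not m]
    mortality_benefit = [s for s, m in pairs if m]
    # Type I-like: surrogate signals benefit where mortality shows none
    false_benefit = (sum(no_mortality_benefit) / len(no_mortality_benefit)
                     if no_mortality_benefit else float("nan"))
    # Type II-like: surrogate misses a benefit that mortality shows
    missed_benefit = (sum(not s for s in mortality_benefit) / len(mortality_benefit)
                      if mortality_benefit else float("nan"))
    return false_benefit, missed_benefit

# Hypothetical comparisons: (late-stage end point shows benefit, mortality shows benefit)
hypothetical = [
    (True, False), (True, False), (True, False), (False, False), (False, False),
    (True, True), (True, True), (True, True), (False, True),
]
false_benefit, missed_benefit = mismatch_rates(hypothetical)
print(f"false benefit: {false_benefit:.0%}, missed benefit: {missed_benefit:.0%}")
```

Rates of this kind can then be set against the conventional tolerances of 5% for type I error and 10% to 20% for type II error.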

Second, studies of cancer screening tests also typically quantify the benefit of screening in a specific population. Because most people screened do not have the disease they were screened for, a typical measure of population benefit is the number needed to screen to prevent a death.11 The calculation requires an accurate measure of the mortality benefit (ie, the accepted end point) associated with screening. Although changes in the frequency of the alternative end point need not be exactly interchangeable with those of the accepted end point, the relationship must be consistent. Otherwise, one cannot accurately predict the magnitude of mortality benefit from the magnitude of the reduction in advanced cancer diagnoses. However, Feng et al reported strong statistical evidence that the 2 end points did not reliably correspond with each other (P = .004). Therefore, studies that rely on the alternative end point will not produce data sufficient for basic assessments of the risk-benefit tradeoff of the screening approach.
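As a rough illustration of why the mortality end point is needed, the number needed to screen is conventionally the reciprocal of the absolute reduction in cancer-specific mortality between unscreened and screened groups. The sketch below assumes that definition; the function name and the mortality figures are hypothetical and are not taken from any trial.

```python
def number_needed_to_screen(mortality_unscreened, mortality_screened):
    """Number needed to screen = 1 / absolute risk reduction in mortality.

    Both arguments are cumulative cancer-specific mortality risks
    (proportions) over the same follow-up period.
    """
    arr = mortality_unscreened - mortality_screened
    if arr <= 0:
        raise ValueError("no mortality reduction: number needed to screen is undefined")
    return 1.0 / arr

# Hypothetical risks: 5 vs 4 cancer deaths per 1000 people over follow-up
nns = number_needed_to_screen(0.005, 0.004)
print(f"Screen about {nns:.0f} people to prevent 1 cancer death")
```

The calculation has no input for late-stage diagnoses. A trial that measures only the alternative end point would need a reliable conversion from stage shift to mortality reduction before this quantity could be estimated.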

Third, the work by Feng et al assessed the correlation between the 2 end points across different cancer types in their analyses. A consistent correlation would be required to assume that the relation between cancer-specific mortality and late-stage cancer diagnoses is similar for the many other cancers that multiple cancer tests aim to detect. Feng et al evaluated whether the relation between the alternative and accepted end points was sufficiently consistent across cancers (including lung cancer, colorectal cancer, and breast cancer) that one could reasonably assume that the relation between the 2 end points would be similar for other cancers, such as gastric cancer, esophageal cancer, sarcoma, and lymphoma, for which evidence is lacking that screening will reduce mortality. However, Feng et al reported meaningful differences in the correspondence between the 2 end points for different cancers (ie, statistically significant heterogeneity; P = .02). Therefore, one cannot assume that the alternative end point could be substituted for the accepted end point for the other cancers that these tests may detect.8 This finding is not surprising, because cancer staging categories are neither developed nor evaluated with an expectation that they are similar across cancer types.

The study by Feng et al is a meta-analysis of multiple randomized clinical trials, which is considered a high level of objective evidence, that did not require assumptions typical of modeling studies.12 However, the analyses did not delineate why these 2 end points yield different results for different cancers. One reason might be that the 2 outcomes reflect different phenomena. The outcome of shifts in stage at diagnosis reflects diagnostic performance of the screening test. In contrast, reductions in cancer-specific mortality reflect the effects of treatment for the cancer. Changes in characteristics of screening tests or the efficacy of cancer treatments could change the relation between the 2 end points. The work by Feng et al showed that cancer-related mortality remains the most appropriate end point for clinical evaluation of the new blood-based tests that aim to detect many cancers for which there is no evidence that screening is beneficial. Studies might take a little longer, but will yield a more reliable answer.

Article Information

Corresponding Author: Peter B. Bach, MD, MAPP, 2809 Boston St, Baltimore, MD 21224 (peter.bach@delfidiagnostics.com).

Published Online: April 7, 2024. doi:10.1001/jama.2024.5821

Correction: This article was corrected on May 1, 2024, to correct an error in which the minus sign for the Pearson coefficient for prostate cancer was omitted.

Conflict of Interest Disclosures: Dr Bach reported stock ownership in and being company executive of DELFI Diagnostics, a for-profit company that is developing blood-based cancer screening tests, including a test for lung cancer screening, and having a patent for PCT/US2023/022104 and PCT/US2023/034705 pending.

References
1. Feng X, Zahed H, Onwuka J, et al. Cancer stage compared with mortality as end points in randomized clinical trials of cancer screening: a systematic review and meta-analysis. JAMA. Published online April 7, 2024. doi:
2. Bretthauer M, Løberg M, Wieszczy P, et al; NordICC Study Group. Effect of colonoscopy screening on risks of colorectal cancer and related death. N Engl J Med. 2022;387(17):1547-1556. Published online 2022. doi:
3. Aberle DR, Adams AM, Berg CD, et al; National Lung Screening Trial Research Team. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med. 2011;365(5):395-409. doi:
4. Menon U, Gentry-Maharaj A, Burnell M, et al. Ovarian cancer population screening and mortality after long-term follow-up in the UK Collaborative Trial of Ovarian Cancer Screening (UKCTOCS): a randomised controlled trial. Lancet. 2021;397(10290):2182-2193. doi:
5. Chung DC, Gray DM II, Singh H, et al. A cell-free DNA blood-based test for colorectal cancer screening. N Engl J Med. 2024;390(11):973-983. doi:
6. Mathios D, Johansen JS, Cristiano S, et al. Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat Commun. 2021;12(1):5060. doi:
7. Klein EA, Richards D, Cohn A, et al. Clinical validation of a targeted methylation-based multi-cancer early detection test using an independent validation set. Ann Oncol. 2021;32(9):1167-1177. doi:
8. Does screening with the Galleri test in the NHS reduce the likelihood of a late-stage cancer diagnosis in an asymptomatic population? a randomised clinical trial (NHS-Galleri). ClinicalTrials.gov. Accessed March 12, 2024.
9. GRAIL to initiate REACH study to evaluate clinical impact of Galleri Multi-Cancer Early Detection (MCED) test among the Medicare population. GRAIL. November 20, 2023. Accessed March 12, 2024.
10. [Trial of device that is not approved or cleared by the U.S. FDA]. ClinicalTrials.gov. Accessed March 12, 2024.
11. Rembold CM. Number needed to screen: development of a statistic for disease screening. BMJ. 1998;317(7154):307-312. doi:
12. Berkman ND, Lohr KN, Ansari MT, et al. Grading the strength of a body of evidence when assessing health care interventions: an EPC update. J Clin Epidemiol. 2015;68(11):1312-1324. doi: