Survival Analysis

Yuwei Bao, March 28, 2024

Punchline

For time-to-event data:

  1. Kaplan–Meier (KM) curves: visualize survival probability over time for each group
  2. Log-rank test: test whether the survival curves differ significantly between groups
  3. Cox proportional hazards (PH) model: estimate the hazard ratio (HR)
    • Check the PH assumption: Schoenfeld residuals plot + log-log survival plot
    • If violated: report Restricted Mean Survival Time (RMST) instead
  4. HR < 1: the treatment group has a lower hazard of the event than the control group, i.e., better survival when the event is death or progression

Definition

Survival analysis refers to a class of statistical methods for analyzing time-to-event data, where the outcome is the time from a defined origin (e.g., randomization) to the occurrence of a specified event (e.g., death, disease progression).

Key characteristics of survival data:

  • The outcome is time until an event, not just whether the event occurs
  • Data are often censored, meaning the exact event time is not observed for all individuals

Kaplan–Meier Estimation and Log-Rank Test

Kaplan–Meier Estimator

The Kaplan–Meier (KM) method is a nonparametric estimator of the survival function:

\[ S(t) = P(T > t) \]

  • It produces a step function that decreases at observed event times
  • It accounts for right-censored data, allowing inclusion of patients with incomplete follow-up
  • The KM curve provides an estimate of the probability of remaining event-free over time

This is the standard method for visualizing survival outcomes in clinical trials.
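As a minimal sketch of how the step function is built, the product-limit form of the estimator is \( \hat{S}(t) = \prod_{t_i \le t} (1 - d_i / n_i) \), where \( d_i \) is the number of events at time \( t_i \) and \( n_i \) is the number still at risk just before \( t_i \). The function name `kaplan_meier` and the toy data below are illustrative assumptions, not taken from any of the papers.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event was observed, 0 if right-censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)              # every subject leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d > 0:
            surv *= 1.0 - d / at_risk   # product-limit step at an event time
            curve.append((t, surv))
        at_risk -= exits[t]             # drop events and censored subjects after time t
    return curve

# Hypothetical toy data (months); 0 marks a censored subject.
times = [2, 3, 3, 5, 6, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 0, 0]
for t, s in kaplan_meier(times, events):
    print(f"t={t:>2}  S(t)={s:.3f}")
```

Censored subjects never produce a step, but they shrink the risk set for later event times, which is how incomplete follow-up still contributes information.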


Censoring

Right censoring, the type considered here, occurs when the exact event time is unknown but is known to exceed the last observed follow-up time.

Common types:

  • Administrative censoring: study ends before the event occurs
  • Loss to follow-up: patient exits the study early

A key assumption:

Censoring is non-informative, meaning censored individuals have the same future risk as those remaining under observation.
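In practice, each subject is reduced to a pair (follow-up time, event indicator). The sketch below shows one way to derive that pair from dates; the cutoff date `STUDY_END`, the helper `to_time_event`, and its argument names are hypothetical and chosen only for illustration.

```python
from datetime import date

STUDY_END = date(2024, 1, 1)   # hypothetical administrative cutoff

def to_time_event(start, event_date=None, last_contact=None):
    """Return (follow-up days, event indicator) for one subject."""
    if event_date is not None and event_date <= STUDY_END:
        return (event_date - start).days, 1      # event observed before the cutoff
    # No event observed: censor at last contact (loss to follow-up)
    # or at the cutoff (administrative censoring), whichever is earlier.
    censor_date = min(last_contact or STUDY_END, STUDY_END)
    return (censor_date - start).days, 0

print(to_time_event(date(2023, 1, 1), event_date=date(2023, 6, 30)))    # event observed
print(to_time_event(date(2023, 3, 1)))                                  # administrative censoring
print(to_time_event(date(2023, 2, 1), last_contact=date(2023, 5, 2)))   # loss to follow-up
```

The indicator only records whether the event was seen; whether the censoring is non-informative is a design question that the data encoding cannot answer.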


Log-Rank Test

The log-rank test is a nonparametric hypothesis test used to compare survival distributions between groups.

  • Null hypothesis: survival functions are identical across groups
  • It compares observed vs expected events over time
  • Most powerful when the proportional hazards assumption holds
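The observed-versus-expected comparison can be sketched for two groups as follows. At each event time, the expected number of events in group A is the total number of events multiplied by group A's share of the risk set; the statistic is compared with a chi-square distribution with 1 degree of freedom. This is an unstratified, unweighted version written for illustration (the trials summarized later use stratified tests), and the function name and toy data are assumptions.

```python
import math

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e == 1})

    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e == 1)     # events at t, both groups
        d_a = sum(1 for u, e, g in data if u == t and e == 1 and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    # Assumes variance > 0, i.e. both groups are at risk at some event time.
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))   # chi-square upper tail, 1 df
    return chi2, p_value

# Hypothetical toy data (months); 0 marks a censored subject.
chi2, p = logrank_test([1, 2, 4, 5, 7], [1, 1, 1, 0, 1],
                       [3, 6, 8, 9, 10], [1, 0, 1, 1, 0])
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")
```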

Cox Proportional Hazards Model

The Cox model is a semi-parametric regression model for time-to-event data:

\[ h(t \mid X) = h_0(t)\exp(\beta X) \]

  • \( h(t \mid X) \): hazard function (instantaneous event rate at time \( t \))
  • \( h_0(t) \): baseline hazard (unspecified)
  • \( \beta \): regression coefficients
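Because the baseline hazard is left unspecified, \( \beta \) is estimated from the partial likelihood, which only compares the subject who fails at each event time with everyone still at risk at that time. The following is a minimal sketch for a single covariate using Newton-Raphson with Breslow handling of ties; the function name, the tie-handling choice, and the toy data are assumptions made for illustration, and real analyses would use an established survival library.

```python
import math

def cox_fit_single(times, events, x, n_iter=25):
    """Fit a one-covariate Cox model by Newton-Raphson on the partial
    likelihood (Breslow ties). Returns (beta, standard error)."""
    beta = 0.0
    info = 0.0
    subjects = list(zip(times, events, x))
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for t_i, e_i, x_i in subjects:
            if e_i != 1:
                continue                      # censored subjects only appear in risk sets
            # Risk set: everyone still under observation at t_i
            risk = [(xj, math.exp(beta * xj)) for tj, _, xj in subjects if tj >= t_i]
            s0 = sum(w for _, w in risk)
            s1 = sum(xj * w for xj, w in risk)
            s2 = sum(xj * xj * w for xj, w in risk)
            score += x_i - s1 / s0                    # observed minus weighted mean covariate
            info += s2 / s0 - (s1 / s0) ** 2          # weighted variance of the covariate
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return beta, 1.0 / math.sqrt(info)

# Hypothetical toy data: x = 1 for treatment, 0 for control; all events observed.
times = [1, 2, 3, 4, 6, 2.5, 5, 7, 8, 9]
events = [1] * 10
x = [0] * 5 + [1] * 5
beta, se = cox_fit_single(times, events, x)
print(f"beta = {beta:.3f}, HR = {math.exp(beta):.3f}, SE = {se:.3f}")
```

Note that \( h_0(t) \) never appears in the code: it cancels within each risk set, which is what makes the model semi-parametric.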

Key concepts

  • Hazard: instantaneous risk of experiencing the event at time \( t \), given survival up to \( t \)
  • Hazard ratio (HR): relative hazard between groups
  • Risk set: individuals still at risk of the event just prior to time \( t \)
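To connect the hazard ratio to the model coefficient, consider a single binary covariate with the assumed coding \( X = 1 \) for treatment and \( X = 0 \) for control. The baseline hazard cancels in the ratio:

```latex
\[
\mathrm{HR}
  = \frac{h(t \mid X = 1)}{h(t \mid X = 0)}
  = \frac{h_0(t)\exp(\beta \cdot 1)}{h_0(t)\exp(\beta \cdot 0)}
  = \exp(\beta)
\]
```

So \( \beta < 0 \) corresponds to HR < 1, and the ratio does not depend on \( t \) as long as the model holds.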

Assumption

The Cox model assumes proportional hazards, meaning the hazard ratio between groups is constant over time.
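One consequence of this assumption is what the log-log survival plot checks: under proportional hazards, \( S_1(t) = S_0(t)^{\mathrm{HR}} \), so \( \log(-\log S_1(t)) - \log(-\log S_0(t)) = \log(\mathrm{HR}) \) at every \( t \) and the two transformed curves are parallel. The sketch below verifies this on hypothetical exponential curves constructed to satisfy the assumption exactly; with real KM estimates the gap would be noisy, and a drifting gap suggests a violation.

```python
import math

def cloglog(surv):
    """Complementary log-log transform, log(-log S), for 0 < S < 1."""
    return math.log(-math.log(surv))

# Hypothetical survival curves that satisfy PH exactly: S1(t) = S0(t) ** HR
HR = 0.5
grid = [1, 2, 4, 8, 16]
s0 = [math.exp(-0.1 * t) for t in grid]   # control arm (exponential, assumed)
s1 = [s ** HR for s in s0]                # treatment arm

gaps = [cloglog(b) - cloglog(a) for a, b in zip(s0, s1)]
for t, g in zip(grid, gaps):
    print(f"t={t:>2}  vertical gap={g:.4f}  log(HR)={math.log(HR):.4f}")
```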


Clinical Context (Example: Oncology)

In oncology, survival outcomes are often analyzed using endpoints such as:

  • Overall Survival (OS): time to death from any cause
  • Progression-Free Survival (PFS): time to disease progression or death

Clinical covariates (e.g., components of the TNM staging system) are often included in models:

  • T: size and extent of the primary tumor
  • N: regional lymph node involvement
  • M: distant metastasis

These factors may be used as prognostic variables in survival models.


Learn from the papers

Reading Guide (Survival Analysis Papers)

Study & Endpoint

  1. What is the primary endpoint? How is it defined (event, time origin)?
  2. What are the secondary endpoints?

Population

  3. What analysis population is used (ITT, per-protocol, subgroup)?

Censoring & Follow-up

  4. What are the censoring rules?
  5. What is the follow-up duration (median or range)?

Methods

  6. What statistical methods are used (KM, log-rank, Cox)?
  7. Is the analysis stratified? If yes, by what factors?

Results Interpretation

  8. What is the hazard ratio (HR) and 95% CI?
  9. How do the Kaplan–Meier curves behave (separation, crossing, convergence)?
  10. Does the proportional hazards assumption appear reasonable?

Subgroup & Robustness

  11. Are subgroup effects consistent?
  12. Are interaction tests performed?

Clinical Interpretation

  13. Is the effect clinically meaningful?
  14. Any key limitations or biases?

Paper Summary 1

Reference: Pembrolizumab versus Chemotherapy for PD-L1–Positive Non–Small-Cell Lung Cancer [1]

| Category | Item | Summary |
| --- | --- | --- |
| Study | Disease | Advanced NSCLC |
| | Population | PD-L1 ≥ 50% |
| | Design | Randomized controlled trial |
| | Comparison | Pembrolizumab vs Chemotherapy |
| Endpoints | Primary | Progression-Free Survival (PFS) |
| | Secondary | Overall Survival (OS), Objective Response Rate (ORR), Safety |
| Censoring | Rules | Censored if alive without progression at last follow-up or lost to follow-up |
| | Assumption | Assumes non-informative censoring |
| Methods | Survival estimation | Kaplan–Meier |
| | Group comparison | Stratified log-rank test |
| | Effect estimation | Cox proportional hazards model |
| Results | HR (PFS) | 0.50 (95% CI: 0.37–0.68) |
| | P-value | < 0.001 |
| | Interpretation | The pembrolizumab group had about a 50% lower hazard of progression or death |
| KM Curve | Pattern | Early and sustained separation |
| | PH assumption | Reasonable (no crossing) |
| | Conclusion | Pembrolizumab consistently superior |
| Subgroup | Consistency | Generally consistent across subgroups |
| | Limitation | Some wide CIs; no clear effect modification |
| Clinical | Interpretation | Strong, clinically meaningful benefit |
| | Signal | Clean survival signal (early + sustained separation) |
| Notes | Limitations | Subgroups exploratory; PH not formally tested; OS may be immature |
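The "about 50% lower" reading follows from \( 1 - \mathrm{HR} \). A reported HR and 95% CI can also be used to back out the standard error of \( \log(\mathrm{HR}) \), since the interval is symmetric on the log scale. The sketch below does this for the PFS result in the table above; the function name is hypothetical, and the resulting Wald p-value is a normal approximation that need not match the stratified log-rank p-value reported in the paper.

```python
import math

def hr_summary(hr, ci_low, ci_high, z_crit=1.96):
    """Back out log-scale quantities from a reported HR and 95% CI."""
    log_hr = math.log(hr)
    # CI is log(HR) +/- z_crit * SE, so its log-scale width is 2 * z_crit * SE
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail
    return {
        "hazard_reduction_pct": 100 * (1 - hr),
        "se_log_hr": se,
        "z": z,
        "p_two_sided": p_two_sided,
    }

summary = hr_summary(0.50, 0.37, 0.68)   # PFS values reported in the table above
for name, value in summary.items():
    print(f"{name}: {value:.4g}")
```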

Paper Summary 2

Reference: Trastuzumab Deruxtecan in Previously Treated HER2-Low Advanced Breast Cancer [2]

| Category | Item | Summary |
| --- | --- | --- |
| Study | Disease | Unresectable or metastatic HER2-low breast cancer |
| | Population | Patients with HER2-low disease, defined as IHC 1+ or IHC 2+/ISH-negative, previously treated with 1 or 2 lines of chemotherapy in the metastatic setting; 494 patients had hormone receptor (HR)-positive disease and 63 had HR-negative disease |
| | Design | Phase 3, randomized, open-label trial |
| | Comparison | Trastuzumab deruxtecan vs physician’s choice of chemotherapy |
| Endpoints | Primary | Progression-free survival (PFS) by blinded independent central review in the HR-positive cohort |
| | Secondary | Overall survival (OS) in the HR-positive cohort; PFS in all patients; OS in all patients; objective response; safety |
| Censoring | Rules | For PFS, patients without documented progression or death were censored at the last adequate tumor assessment; sensitivity analyses also examined choices such as not censoring at new anticancer therapy, handling progression after missed assessments, and alternative censoring for randomized-but-untreated patients |
| | Assumption | Assumes non-informative censoring |
| Methods | Survival estimation | Kaplan–Meier |
| | Group comparison | Stratified log-rank test |
| | Effect estimation | Stratified Cox proportional-hazards model |
| | Stratification factors | HER2 IHC status (1+ vs 2+/ISH-negative), number of prior chemotherapy lines in metastatic disease (1 vs 2), and HR/CDK4/6 status |
| Results | HR (PFS, primary endpoint) | 0.51 (95% CI: 0.40–0.64) in the HR-positive cohort |
| | P-value | < 0.0001 |
| | Interpretation | The trastuzumab deruxtecan group had about a 49% lower hazard of progression or death than the chemotherapy group in the HR-positive cohort |
| | Median PFS | 10.1 months vs 5.4 months in the HR-positive cohort |
| | Key OS result | HR for OS in the HR-positive cohort: 0.64 (95% CI: 0.48–0.86; P = 0.0028) |
| KM Curve | Pattern | Early and sustained separation favoring trastuzumab deruxtecan |
| | PH assumption | Appears reasonable from the reported KM curves; no major crossing emphasized in the main report |
| | Conclusion | Trastuzumab deruxtecan consistently outperformed chemotherapy on PFS and OS |
| Subgroup | Consistency | Benefit was generally consistent across prespecified subgroups |
| | Limitation | The HR-negative subgroup was small, so those results are exploratory and less precise |
| Clinical | Interpretation | Strong, clinically meaningful improvement in both PFS and OS |
| | Signal | Clear efficacy signal with improvement in both the primary endpoint and key secondary survival endpoints |
| Notes | Limitations | Open-label design; HR-negative subgroup underpowered for firm conclusions; PH assumption was not formally highlighted in the main paper; HRQoL was not powered for definitive conclusions |

Paper Summary 3

Reference: Osimertinib in Untreated EGFR-Mutated Advanced Non–Small-Cell Lung Cancer [3]

| Category | Item | Summary |
| --- | --- | --- |
| Study | Disease | Advanced (locally advanced or metastatic) NSCLC |
| | Population | Treatment-naïve patients with EGFR-mutated NSCLC (exon 19 deletion or L858R) |
| | Design | Phase 3, randomized, double-blind controlled trial |
| | Comparison | Osimertinib vs standard EGFR-TKI (gefitinib or erlotinib) |
| Endpoints | Primary | Progression-Free Survival (PFS) (investigator-assessed) |
| | Secondary | Overall Survival (OS), Objective Response Rate (ORR), Duration of Response, Safety |
| Censoring | Rules | Patients without progression or death were censored at the date of last tumor assessment; censoring also applied for patients starting new anticancer therapy before progression |
| | Assumption | Assumes non-informative censoring (implicit) |
| Methods | Survival estimation | Kaplan–Meier |
| | Group comparison | Stratified log-rank test (by mutation type [exon 19 vs L858R] and race [Asian vs non-Asian]) |
| | Effect estimation | Stratified Cox proportional hazards model |
| Results | HR (PFS) | 0.46 (95% CI: 0.37–0.57) |
| | P-value | < 0.001 |
| | Interpretation | The osimertinib group had about a 54% lower hazard of progression or death than the standard EGFR-TKI group in the full analysis set |
| | Median PFS | 18.9 months vs 10.2 months |
| KM Curve | Pattern | Early and sustained separation favoring osimertinib |
| | PH assumption | Reasonable (no major crossing observed) |
| | Conclusion | Osimertinib consistently outperformed standard EGFR-TKIs in delaying progression |
| Subgroup | Consistency | Treatment benefit consistent across major subgroups (mutation type, race, CNS metastases) |
| | Limitation | Some subgroups have wider CIs; subgroup analyses are exploratory |
| Clinical | Interpretation | Strong and clinically meaningful improvement in PFS |
| | Signal | Large magnitude benefit with durable separation of survival curves |
| Notes | Limitations | OS immature at initial publication; crossover and subsequent therapies may confound OS; PH assumption not formally tested |

Paper Summary 4

Reference (non-proportional hazards example): Nivolumab versus Docetaxel in Advanced Nonsquamous Non–Small-Cell Lung Cancer [4]

| Category | Item | Summary |
| --- | --- | --- |
| Study | Disease | Advanced nonsquamous NSCLC |
| | Population | Patients with advanced NSCLC who had disease progression during or after platinum-based chemotherapy |
| | Design | Phase 3, randomized, open-label trial |
| | Comparison | Nivolumab vs Docetaxel |
| Endpoints | Primary | Overall Survival (OS) |
| | Secondary | Objective Response Rate (ORR), Progression-Free Survival (PFS), Safety |
| Censoring | Rules | Patients alive at last follow-up were censored at last known alive date; for PFS, patients without progression or death were censored at last tumor assessment |
| | Assumption | Assumes non-informative censoring (implicit) |
| Methods | Survival estimation | Kaplan–Meier |
| | Group comparison | Stratified log-rank test |
| | Effect estimation | Stratified Cox proportional hazards model |
| | Stratification factors | PD-L1 expression level, prior maintenance therapy |
| Results | HR (OS, primary endpoint) | 0.73 (95% CI: 0.59–0.89) |
| | P-value | 0.002 |
| | Interpretation | Nivolumab reduced the hazard of death by ~27% compared to docetaxel |
| | Median OS | 12.2 months vs 9.4 months |
| KM Curve | Pattern | Delayed separation: curves overlap early, then diverge |
| | PH assumption | Likely violated (non-proportional hazards suggested by delayed effect) |
| | Conclusion | Nivolumab shows survival benefit despite delayed treatment effect |
| Subgroup | Consistency | Greater benefit observed in patients with higher PD-L1 expression |
| | Limitation | Some subgroups have wide CIs; exploratory interpretation |
| Clinical | Interpretation | Clinically meaningful OS benefit with improved tolerability vs chemotherapy |
| | Signal | Delayed but durable survival benefit characteristic of immunotherapy |
| Notes | Limitations | Evidence of non-proportional hazards; HR represents an average effect over time; alternative methods (e.g., RMST) not used; open-label design |
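For reference, RMST is the area under the survival curve up to a prespecified truncation time \( \tau \), and the between-arm difference in RMST has a direct interpretation (average event-free time gained by \( \tau \)) that does not rely on proportional hazards. A minimal sketch on step curves follows; the function name and the two curves are hypothetical and are not data from this trial.

```python
def rmst(curve, tau):
    """Area under a step survival curve from 0 to tau.

    curve : [(t, S(t))] sorted by t, with S = 1 before the first step
    tau   : truncation time (should be chosen in advance)
    """
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in curve:
        if t >= tau:
            break
        area += prev_s * (t - prev_t)   # rectangle up to the next step
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)   # final rectangle up to tau

# Hypothetical step curves with delayed separation (months)
control = [(2, 0.8), (4, 0.6), (8, 0.4), (12, 0.2)]
treatment = [(2, 0.8), (4, 0.6), (8, 0.55), (12, 0.5)]
diff = rmst(treatment, 12) - rmst(control, 12)
print(f"RMST difference at 12 months: {diff:.2f} months")
```

In this toy example the arms are identical for the first 8 months, so all of the RMST difference comes from the later period, which is the pattern a single HR tends to average away.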

Paper Summary 5

Reference: Pembrolizumab plus Chemotherapy in Metastatic Non–Small-Cell Lung Cancer [5]

| Category | Item | Summary |
| --- | --- | --- |
| Study | Disease | Metastatic nonsquamous NSCLC |
| | Population | Previously untreated patients with metastatic nonsquamous NSCLC, without EGFR or ALK alterations |
| | Design | Phase 3, randomized, double-blind, placebo-controlled trial |
| | Comparison | Pembrolizumab + chemotherapy vs placebo + chemotherapy |
| Endpoints | Primary | Overall Survival (OS) and Progression-Free Survival (PFS) |
| | Secondary | Objective Response Rate (ORR), Duration of Response, Safety |
| Censoring | Rules | Patients without event were censored at last known alive date (OS) or last tumor assessment (PFS); censoring applied for patients without progression or death at cutoff |
| | Assumption | Assumes non-informative censoring (implicit) |
| Methods | Survival estimation | Kaplan–Meier |
| | Group comparison | Stratified log-rank test |
| | Effect estimation | Stratified Cox proportional hazards model |
| | Stratification factors | PD-L1 tumor proportion score (<1% vs 1–49% vs ≥50%), choice of chemotherapy (cisplatin vs carboplatin) |
| Results | HR (OS, primary endpoint) | 0.49 (95% CI: 0.38–0.64) |
| | P-value | < 0.001 |
| | Interpretation | Pembrolizumab plus chemotherapy reduced the hazard of death by ~51% compared to chemotherapy alone in the overall population |
| | Median OS | Not reached vs 11.3 months (at initial analysis) |
| | HR (PFS) | 0.52 (95% CI: 0.43–0.64) |
| KM Curve | Pattern | Early separation with sustained benefit; slight early overlap possible but no crossing |
| | PH assumption | Generally reasonable; no strong evidence of violation |
| | Conclusion | Combination therapy consistently improved survival outcomes |
| Subgroup | Consistency | Benefit observed across PD-L1 subgroups (including <1%) |
| | Limitation | Magnitude of effect varies by PD-L1 expression; subgroup analyses exploratory |
| Clinical | Interpretation | Strong and clinically meaningful improvement in both OS and PFS |
| | Signal | Robust benefit across populations, including those with low PD-L1 expression |
| Notes | Limitations | Early OS data immature (median not reached); subgroup analyses exploratory; PH assumption not formally tested |
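A "not reached" median has a simple mechanical meaning: the median is read off the KM curve as the first time the estimate falls to 0.5 or below, and if the curve is still above 0.5 at the data cutoff there is no such time. A minimal sketch, with a hypothetical function name and made-up step curves rather than the trial's data:

```python
def median_survival(curve):
    """Smallest time at which the KM estimate is at or below 0.5.

    curve : [(t, S(t))] sorted by t. Returns None when the curve never
    reaches 0.5, which papers report as "not reached".
    """
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical step curves (months)
immature = [(3, 0.9), (6, 0.8), (9, 0.7), (12, 0.65)]
mature = [(3, 0.85), (6, 0.7), (9, 0.55), (11, 0.48)]
print(median_survival(immature))   # None: median not reached
print(median_survival(mature))     # 11
```

This is why an immature median says more about follow-up length than about the size of the benefit, and why the HR is the main summary at an early analysis.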

5-Paper Survival Analysis Comparison Sheet

| Paper # | Trial | Disease / Setting | Comparison | Primary Endpoint(s) | Main Survival Result | KM Pattern | PH Assumption | Key Survival Lesson |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | reck2016pembrolizumab [1] | Advanced NSCLC, PD-L1 ≥50%, first-line | Pembrolizumab vs Chemotherapy | PFS | HR (PFS) = 0.50 (95% CI: 0.37–0.68), p < 0.001 | Early and sustained separation | Reasonable | Textbook Kaplan–Meier + Cox example; clean PH case |
| 2 | modi2022trastuzumab [2] | Unresectable/metastatic HER2-low breast cancer | Trastuzumab deruxtecan vs physician’s choice chemotherapy | PFS in HR-positive cohort | HR (PFS, HR-positive cohort) = 0.51 (95% CI: 0.40–0.64), p < 0.0001 | Early and sustained separation | Reasonable | Strong example of defining endpoint and analysis population precisely |
| 3 | soria2017osimertinib [3] | EGFR-mutated advanced NSCLC, first-line | Osimertinib vs gefitinib/erlotinib | PFS | HR (PFS, FAS) = 0.46 (95% CI: 0.37–0.57), p < 0.001 | Early and durable separation | Reasonable | Very clean targeted-therapy survival result; strong Cox interpretation |
| 4 | borghaei2015nivolumab [4] | Advanced nonsquamous NSCLC after platinum chemotherapy | Nivolumab vs docetaxel | OS | HR (OS) = 0.73 (95% CI: 0.59–0.89), p = 0.002 | Delayed separation; early overlap | Likely violated | Example where HR is an average over time and may hide delayed immunotherapy effect |
| 5 | gandhi2018pembrolizumab [5] | Metastatic nonsquamous NSCLC, untreated, no EGFR/ALK alteration | Pembrolizumab + chemotherapy vs placebo + chemotherapy | OS and PFS | HR (OS) = 0.49 (95% CI: 0.38–0.64), p < 0.001; HR (PFS) = 0.52 (95% CI: 0.43–0.64) | Early separation with sustained benefit; no major crossing | Generally reasonable | Example of subgroup heterogeneity without obvious PH violation |

Main Takeaways Across the 5 Papers

| Theme | What I Learned |
| --- | --- |
| Canonical workflow | Most papers follow: Kaplan–Meier curves + log-rank test + Cox model + subgroup forest plot |
| Need to specify endpoint and population | Hazard ratios must be tied to a specific endpoint and analysis set (e.g., PFS in FAS, or PFS in HR-positive cohort) |
| PH can hold cleanly | reck2016pembrolizumab [1], modi2022trastuzumab [2], and soria2017osimertinib [3] are examples where Cox HR is straightforward to interpret |
| HR can be imperfect | borghaei2015nivolumab [4] shows delayed effect and likely non-proportional hazards, so HR is only an average summary |
| Subgroup heterogeneity vs non-PH | gandhi2018pembrolizumab [5] shows subgroup differences (PD-L1), which do not necessarily imply PH violation |
| Regulatory-style reporting | Survival results are typically reported using HR, 95% CI, p-value, median survival, and KM plots, even when assumptions are imperfect |

  1. https://www.nejm.org/doi/full/10.1056/NEJMoa1606774

  2. https://www.nejm.org/doi/full/10.1056/NEJMoa2203690

  3. https://www.nejm.org/doi/full/10.1056/NEJMoa1713137

  4. https://www.nejm.org/doi/full/10.1056/NEJMoa1507643

  5. https://www.nejm.org/doi/full/10.1056/NEJMoa1801005