---
title: "Advanced Survival Models"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Advanced Survival Models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## Introduction

`singleEventSurvival()` supports non-parametric, semi-parametric, and parametric survival estimators through a common interface. This vignette shows how to compare those models once a survival dataset has already been prepared from Eunomia.

## Example Input

All examples below assume you already created a survival dataset with the internal `addCohortSurvival()` helper and included age and gender columns.

```{r setup}
library(OdysseusSurvivalModule)

survivalData <- data.frame(
  subject_id = 1:8,
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0),
  age_years = c(44, 51, 67, 39, 73, 58, 62, 47),
  gender = c("Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male")
)
```

## Model Choices

Supported values for `model` are:

- `"km"`
- `"cox"`
- `"weibull"`
- `"exponential"`
- `"lognormal"`
- `"loglogistic"`

All of them return the same high-level structure: a named list with `data` and `summary` per stratum, plus `overall`.

## Cox Model

```{r cox-model}
coxFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "cox",
  covariates = c("age_years")
)

coxFit[["overall"]]$summary
head(coxFit[["overall"]]$data)
```

The Cox path uses covariates when fitting the model, but the returned object is still survival-oriented. It does not expose regression coefficients or hazard ratios.

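If hazard ratios are needed, one option is to fit the Cox model directly with the `survival` package. The sketch below is a workaround outside the `singleEventSurvival()` interface, not something this package provides. It assumes that `status = 1` marks an event and `0` marks censoring, and it repeats the toy data under the hypothetical name `toyData` so the chunk runs on its own.

```{r hazard-ratios}
library(survival)

# Toy data repeated from the setup chunk so this chunk is self-contained.
toyData <- data.frame(
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0),
  age_years = c(44, 51, 67, 39, 73, 58, 62, 47)
)

# Assumption: status == 1 is an event, status == 0 is censored.
directCox <- coxph(Surv(time, status) ~ age_years, data = toyData)

# Hazard ratio per one-year increase in age, with a 95% confidence interval.
hazardRatios <- cbind(
  hazardRatio = exp(coef(directCox)),
  exp(confint(directCox))
)
hazardRatios
```

With only eight subjects the interval will be very wide; the point here is the extraction pattern, not the estimate.
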
## Parametric Models

```{r parametric-models}
weibullFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "weibull",
  covariates = c("age_years")
)

lognormalFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "lognormal",
  covariates = c("age_years")
)

weibullFit[["overall"]]$summary
lognormalFit[["overall"]]$summary
```

Parametric models also return a `data` table with estimated survival, hazard, and cumulative hazard evaluated on the observed event-time grid.

## Comparing Models

One practical way to compare models is to extract the same summary fields from each fit.

```{r model-comparison}
modelNames <- c("km", "cox", "weibull", "lognormal")

fits <- lapply(modelNames, function(modelName) {
  singleEventSurvival(
    survivalData = survivalData,
    timeScale = "days",
    model = modelName,
    covariates = if (modelName == "km") NULL else c("age_years")
  )
})
names(fits) <- modelNames

comparison <- data.frame(
  model = names(fits),
  medianSurvival = vapply(fits, function(x) x[["overall"]]$summary$medianSurvival, numeric(1)),
  meanSurvival = vapply(fits, function(x) x[["overall"]]$summary$meanSurvival, numeric(1)),
  stringsAsFactors = FALSE
)

comparison
```

## Stratified Fitting

`strata` accepts `"gender"` and `"age_group"`. When both are supplied, the package fits them separately, not as joint interaction strata.

```{r stratified-models}
stratifiedFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "weibull",
  covariates = c("age_years"),
  strata = c("gender", "age_group"),
  ageBreaks = list(c(18, 49), c(50, 64), c(65, Inf))
)

names(stratifiedFit)
stratifiedFit[["gender=Female"]]$summary
stratifiedFit[["age_group=65+"]]$summary
stratifiedFit$logrank_test_gender
stratifiedFit$logrank_test_age_group
```

## Working with Returned Curves

Each fitted entry can be plotted from its `data` component.

```{r curve-plot}
curveData <- weibullFit[["overall"]]$data

plot(
  curveData$time,
  curveData$survival,
  type = "l",
  xlab = "Time (days)",
  ylab = "Survival probability",
  main = "Weibull survival curve"
)
```

## When to Use Which Model

- Use `"km"` for descriptive, assumption-light summaries.
- Use `"cox"` when covariates matter but you still want a survival-curve summary.
- Use parametric models when you want a fully specified survival shape and smoother predicted curves.

## Summary

The advanced usage pattern is mostly about choosing the right `model` value and then extracting comparable summaries from the returned list structure.

This vignette covered:

1. **Cox Model** - Fitting with covariates and reading the survival-oriented output
2. **Parametric Models** - Weibull, exponential, lognormal, loglogistic
3. **Model Comparison** - Extracting the same summary fields from each fit
4. **Stratification** - Gender and age-group strata, fitted separately
5. **Visualization** - Plotting survival curves from the `data` component

For getting started, see the "Getting Started" vignette.