--- title: "Getting Started with seroreconstruct" output: rmarkdown::html_vignette vignette: > %\VignetteIndexEntry{Getting Started with seroreconstruct} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.width = 7, fig.height = 5 ) ``` ## Overview `seroreconstruct` is an R package for Bayesian inference of influenza virus infection status from serial antibody measurements. It relaxes the traditional 4-fold rise rule by jointly modeling antibody boosting after infection, waning over time, and measurement error — estimating individual-level infection probabilities from hemagglutination inhibition (HAI) titer data. The statistical framework is described in [Tsang et al. (2022) *Nature Communications* 13:1557](https://doi.org/10.1038/s41467-022-29310-8). ## Input data The package requires two data frames: 1. **`inputdata`** — one row per individual with columns: - `age_group`: integer (0 = children, 1 = adults, 2 = older adults) - `start_time`, `end_time`: follow-up period (integer days) - `time1`, `time2`, `time3`: serum collection dates (integer days) - `HAI_titer_1`, `HAI_titer_2`, `HAI_titer3`: HAI titers at each collection 2. **`inputILI`** — daily influenza-like illness activity, with rows corresponding to consecutive days. The row indices must cover the range of `start_time` to `end_time` in `inputdata`. ```{r load-data} library(seroreconstruct) data("inputdata") data("flu_activity") head(inputdata) dim(inputdata) ``` ## Fitting the model The main function `sero_reconstruct()` runs a Bayesian MCMC to estimate infection probabilities and antibody dynamics parameters. For real analyses, use at least 100,000 iterations with appropriate burn-in and thinning. ```{r fit-model, eval = FALSE} fit <- sero_reconstruct(inputdata, flu_activity, n_iteration = 200000, burnin = 100000, thinning = 10) ``` For this vignette, we use a short run for illustration: ```{r fit-short} fit <- sero_reconstruct(inputdata, flu_activity, n_iteration = 2000, burnin = 1000, thinning = 1) fit ``` ## Viewing results The `summary()` method extracts key estimates with 95% credible intervals: ```{r summary} summary(fit) ``` The summary table includes: - **Random error (%)** and **Two-fold error (%)**: measurement error parameters - **Boosting**: fold-increase in antibody titer after infection - **Waning**: fold-decrease in antibody titer per year post-infection - **Infection probability**: overall and stratified by baseline HAI titer ## Visualization ### MCMC diagnostics Check convergence with trace plots and posterior density plots: ```{r diagnostics, fig.height = 8} plot_diagnostics(fit, params = c("random_error", "twofold_error")) ``` ### Antibody trajectories Visualize posterior trajectories for individual participants. Red lines show trajectories where infection occurred; blue lines show trajectories without infection. ```{r trajectory, fig.height = 4} plot_trajectory(fit, id = 1) ``` ### Boosting and waning ```{r boosting-waning, fig.height = 4, fig.width = 9} oldpar <- par(mfrow = c(1, 2)) plot_boosting(fit) plot_waning(fit) par(oldpar) ``` ### Infection probabilities Forest plot of posterior infection probabilities with 95% credible intervals: ```{r infection-prob, fig.height = 3} plot_infection_prob(fit) ``` ## Tables Extract parameter estimates and individual-level results as data frames: ```{r tables} # Model parameter estimates table_parameters(fit) # Per-individual infection probabilities head(table_infections(fit)) ``` ## Subgroup analysis Compare infection rates across groups by fitting independent MCMCs: ```{r group-by, eval = FALSE} fit_by_age <- sero_reconstruct(inputdata, flu_activity, n_iteration = 200000, burnin = 100000, thinning = 10, group_by = ~age_group) summary(fit_by_age) ``` ## Joint model with shared parameters When comparing groups, measurement error is a lab assay property — it is shared across all groups. You can additionally share boosting/waning if the groups are expected to have similar antibody dynamics: ```{r shared, eval = FALSE} # Share all parameters except infection probability fit_joint <- sero_reconstruct(inputdata, flu_activity, n_iteration = 200000, burnin = 100000, thinning = 10, group_by = ~age_group, shared = c("error", "boosting_waning")) summary(fit_joint) ``` ## Simulation Generate synthetic data for power analysis or validation: ```{r simulate} data("para1") data("para2") sim <- simulate_data(inputdata, flu_activity, para1, para2) names(sim) ``` The simulated data can be passed directly to `sero_reconstruct()` for simulation-recovery studies. ## Citation To cite seroreconstruct in publications: > Tsang TK, Perera RAPM, Fang VJ, Wong JY, Shiu EY, So HC, Ip DKM, > Malik Peiris JS, Leung GM, Cowling BJ, Cauchemez S. (2022). > Reconstructing antibody dynamics to estimate the risk of influenza virus > infection. *Nature Communications* 13:1557. > doi: [10.1038/s41467-022-29310-8](https://doi.org/10.1038/s41467-022-29310-8)