Abstract

Statistical models that allow for departures from strong distributional assumptions on the outcome and accommodate correlated data structures are essential in many applied regression settings. Our technical note presents the R package tramME that implements the mixed-effects extension of the linear transformation models. The model is appealing because it directly parameterizes the (conditional) distribution function and estimates the necessary transformation of the outcome in a data-driven way. As a result, transformation models represent a general and flexible approach to regression modeling of discrete and continuous outcomes. The package tramME builds on existing implementations of transformation models (the mlt and tram packages) as well as the Laplace approximation and automatic differentiation (using the TMB package) to perform fast and efficient likelihood-based estimation and inference in mixed-effects transformation models. The resulting framework can be readily applied to a wide range of regression problems with grouped data structures. Two examples are presented, which demonstrate how the model can be used for modeling correlated outcomes without strict distributional assumptions: 1) A mixed-effects continuous outcome logistic regression for longitudinal data with a bounded response. 2) A flexible parametric proportional hazards model for time-to-event data from a multi-center trial.

Keywords

correlated outcomes, mixed-effects models, R package development, regression, transformation models

Mixed-effects transformation models

The tramME package (Tamási and Hothorn, 2021)1 extends the linear transformation model framework to include random effects. The resulting model is suitable to estimate flexible regression models with correlated outcomes. Formally, mixed-effects transformation models parameterize the conditional distribution function:

\[\begin{align} {\mathbb{P}}\left(Y\leq y \mid {\mathbf{x}}, {\mathbf{u}}, {\boldsymbol{\gamma}}\right) &= F\left(h(y; {\boldsymbol{\vartheta}}) - {\mathbf{x}}^{\top}{\boldsymbol{\beta}}- {\mathbf{u}}^{\top}{\boldsymbol{\gamma}}\right) \qquad {\boldsymbol{\gamma}}\sim {\mathcal{N}}_{q}({\mathbf{0}}, {\boldsymbol{\Sigma}}{}), \end{align}\]

The observations are assumed to be conditionally (on the vector of random effects) independent. The random effects are assumed to be normally distributed on the transformation scale. Because we directly parameterize the conditional CDF of the outcome, likelihood contributions of randomly censored (left, right, interval) and truncated observations can also be easily expressed.

For further details on transformation models see the Supplementary information section. The technical details on the estimation and inference in mixed-effects transformation models are given in the package vignette2.

The tramME package

Main functions

The inverse link function (\(F()\)) and the baseline transformation function (\(h()\)) are specified by calling one of the main model functions of tramME.

Function Name \(F()\) \(h(y;{\boldsymbol{\vartheta}})\)
LmME() Mixed-effects normal linear regression Standard Gaussian Linear basis
BoxCoxME() Non-normal linear / Box-Cox-type / Continuous probit mixed-effects regression Standard Gaussian Bernstein basis
ColrME() Mixed-effects continuous outcome logistic regression Standard logistic Bernstein basis
CoxphME() Mixed-effects parametric Cox / flexible proportional hazards regression Minimum extreme value Bernstein basis
SurvregME() Mixed-effects parametric survival model Multiple options Multiple options
PolrME() Mixed-effects regression models for discrete ordinal outcomes Multiple options Discrete basis
LehmannME() Mixed-effects Lehmann-alternative linear regression Maximum extreme value Bernstein basis

Formula notation

The response variable, strata, the fixed effects and random effects structures are specified through the model formula, which combines the notations used in the packages tram and lme4.

y | s ~ fe + (re1 | g1) + (re2 | g2) + ...
  • y: response variable,
  • s: stratification variable(s),
  • fe: fixed effects terms,
  • (re1 | g1), (re2 | g2): random effects terms (crossed, nested etc., same as in lme4),
  • g1, g2: grouping factors.

Methods

methods(class = "tramME")
##  [1] anova          coef           coef<-         confint        fitmod        
##  [6] logLik         lpterms        model.frame    model.matrix   plot          
## [11] predict        print          ranef          residuals      simulate      
## [16] summary        trafo          VarCorr        varcov         varcov<-      
## [21] variable.names vcov          
## see '?methods' for accessing help and source code

Implementation details

tramME uses the package mlt (Hothorn, 2020)3 to evaluate the basis transformations and to set up the necessary inputs for the estimation.

The TMB package (Template Model Builder, Kristensen, 2016)4 is used to approximate the integral necessary for evaluating the likelihood function (using Laplace approximation) and to calculate its gradients using automatic differentiation. As a result, traME provides a very efficient implementation of maximum likelihood estimation and inference in the class of mixed-effects transformation models.

The planned future extensions of the package include penalized smooth terms and improved post-estimation features (e.g. marginalized predictions).

Examples

The following example analyses are intended to demonstrate two potential applications of the tramME package and showcase its most important features. They are by no means complete analyses. For further examples and complete references, see the package vignette5.

Bounded continuous outcomes

The neck_pain dataset of the ordinalCont package consists of observations from a randomized, placebo-controlled trial of low-level laser therapy for chronic neck pain (reported in Chow et al., 20066; see also?ordinalCont::neck_pain). The variables in the dataset:

  • id: patient ID number (90 patients)
  • vas: Perceived level of neck pain recorded by the patients on a visual analog scale. (continuous, normalized to (0, 1))
  • laser: Treatment indicator for laser therapy (1: active, 2: placebo)
  • time: Measurement time (1: baseline, 2: after 7 weeks, 3: after 12 weeks)
The rajectories of pain levels measured on a visual analog scale (VAS) in the active treatment and placebo-controlled groups.

The rajectories of pain levels measured on a visual analog scale (VAS) in the active treatment and placebo-controlled groups.

The outcome is bounded and the conditional normality assumption of a normal linear mixed model is not likely to hold. For this reason, we estimate the effect of the laser treatment over time by specifying the following continuous outcome logistic regression model: \[\begin{align*} {\mathbb{P}}\left(\tt{pain} \leq y \mid \tt{laser}, \tt{time}, \alpha_{i}\right) &= \text{expit}\left(h(y) + \beta_{\text{Active}} + \beta_{\text{7w}} + \beta_{\text{12w}} + \beta_{\text{7w,Active}} + \beta_{\text{12w, Active}} + \alpha_{i} \right) \qquad \alpha_{i} \sim {\mathcal{N}}(0,\tau^{2}), \end{align*}\] i.e. \(F()\) is set to the ‘expit’ function, \(h()\) is approximated with a flexible smooth function.

neck_tr <- ColrME(vas ~ laser * time + (1 | id), data = neck_pain, order = 8,
                  bounds = c(0, 1), support = c(0, 1))

We explicitly set the boundaries of the outcome with bounds and support. With order=8, we specify the order of polynomials in Bernstein form used to approximate the baseline transformation.

summary(neck_tr)
## 
## Mixed-effects Continuous Outcome Logistic Regression Model
## 
##  Formula: vas ~ laser * time + (1 | id)
## 
##  Fitted to dataset neck_pain  
## 
##  Fixed effects parameters:
##  =========================
## 
##              Estimate Std. Error z value  Pr(>|z|)    
## laser1       -2.34330    0.70339 -3.3314  0.000864 ***
## time2        -0.65379    0.37832 -1.7282  0.083959 .  
## time3        -0.20559    0.36957 -0.5563  0.578020    
## laser1:time2  4.94961    0.62640  7.9016 2.753e-15 ***
## laser1:time3  3.75360    0.57691  6.5064 7.695e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Random effects:
##  ===============
## 
## Grouping factor: id (90 levels)
## Standard deviation:
## (Intercept) 
##      2.7206 
## 
## 
##  Log-likelihood: 86.36643 (df = 15)

The estimated coefficients of this model are interpreted on the log-odds scale. The results indicate some evidence for baseline imbalance and significant treatment benefits at follow-up (lase1:time2 and laser1:time3 interaction terms).

To compare the marginal distributions of the treatment and control groups, we have to average over the distribution of the random effects. The results below are obtained with the predict method of tramME and the adaptive quadrature method implemented by stats::integrate. For comparison with the fitted (marginal) distribution functions in the various groups, we also plot the empirical cumulative distribution functions (shown by the step functions).

Comparison of marginal distributions, calculated from a mixed-effects continuous outcome logistic regression model, in the treatment (Active) and control (Placebo) groups at baseline and the two follow-up times. The step functions represent the empirical cumulative distribution functions of the specific groups.

Comparison of marginal distributions, calculated from a mixed-effects continuous outcome logistic regression model, in the treatment (Active) and control (Placebo) groups at baseline and the two follow-up times. The step functions represent the empirical cumulative distribution functions of the specific groups.

Time-to-event outcomes

Because the mixed-effects transformation model framework is able to accommodate non-normal data, and (randomly) censored outcomes are allowed in tramME, we can use the package to analyze grouped time-to-event outcomes.

As an example, we will use the eortc dataset of the coxme package, which contains simulated data that replicates the structure of the outcome of a breast cancer trial by European Organization for Research and Treatment of Cancer (see ?coxme::eortc for full details and reference). The variables in the dataset:

  • y: survival time
  • uncens: censoring status (0: right censored, 1: dead)
  • center: enrolling center (37 centers)
  • trt: treatment arm (0: control, 1: treatment)

We fit a fully parametric version of the Cox proportional hazards model with random intercepts for the centers, and center-level variability in treatment effects. \[\begin{align*} {\mathbb{P}}\left(Y\leq y\mid \tt{trt}, a_{i}, b_{i}\right) &= 1 - \exp\left[-\exp\left\{h(y) + a_{i} + (\beta_{\tt{trt}} + b_{i})\tt{trt} \right\}\right] \qquad a_{i} \sim {\mathcal{N}}(0, \tau_{1}^{2}), \quad b_{i} \sim {\mathcal{N}}(0, \tau_{2}^{2}), \end{align*}\] with \(\tt{trt}\) as the treatment indicator. In this specification, the inverse link function is set to the CDF of the minimum extreme value distribution, and the baseline transformation function, which is now the log-cumulative baseline hazard, is approximated with polynomials in the Bernstein form.

data("eortc", package = "coxme")
eortc$trt <- factor(eortc$trt, levels = c(0, 1))
eortc_cp <- CoxphME(Surv(y, uncens) ~ trt + (1 | center/trt), data = eortc, log_first = TRUE, order = 10)

The treatment effect is interpreted in this model on the log-hazard scale.

summary(eortc_cp)
## 
## Mixed-effects Parametric Cox Regression Model
## 
##  Formula: Surv(y, uncens) ~ trt + (1 | center/trt)
## 
##  Fitted to dataset eortc  
## 
##  Fixed effects parameters:
##  =========================
## 
##      Estimate Std. Error z value  Pr(>|z|)    
## trt1 0.753458   0.085189  8.8446 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
##  Random effects:
##  ===============
## 
## Grouping factor: trt:center (74 levels)
## Standard deviation:
## (Intercept) 
##     0.20843 
## 
## Grouping factor: center (37 levels)
## Standard deviation:
## (Intercept) 
##     0.25431 
## 
## 
##  Log-likelihood: -13027.2 (df = 14)

The 95% confidence interval of the hazard ratio of the treatment and control groups is

exp(confint(eortc_cp, parm = "trt1", estimate = TRUE))
##          lwr      upr      est
## trt1 1.79767 2.510357 2.124334

To check the plausibility of the proportional hazards assumption of the treatment effects, we estimate a stratified model, where the treatment effect is allowed to change over time.

eortc_cp2 <- CoxphME(Surv(y, uncens) | trt ~ 1 + (1 | center/trt), data = eortc, log_first = TRUE, order = 10)

The separate cumulative hazard functions of the treatment and control groups are parallel, which indicates that the treatment effect is constant over time.

Comparison of baseline transformation functions in treatment and control groups.

Comparison of baseline transformation functions in treatment and control groups.

Supplementary information

In this section we demonstrate the intuition behind transformation models. For a more technical and detailed exposition of the topic, see Hothorn et al. (2018)7.

Transformation models (unconditional case)

We first discuss the unconditional case using the geyser dataset (see ?TH.data::geyser).

We are interested in modeling the distribution function of an (at least ordered) outcome as

\[ {\mathbb{P}}(Y \leq y) = F(h(y)), \]

where \(F()\) (inverse link function) is a known and fixed function, and \(h()\) is a monotonic increasing function that is estimated from the data.

Intuitively, we map the outcome of interest to a scale where it is ‘well-behaved’, and we estimate the function for this from the data.

Linear transformation models (conditional case)

To include covariates we assume a linear structure on the transformed scale:

\[ {\mathbb{P}}(Y \leq y \mid {\mathbf{X}}={\mathbf{x}}) = F(h(y) - {\mathbf{x}}^{\top}{\boldsymbol{\beta}}). \]

  • The the choice inverse link function (F) determines the interpretation of the coefficients (e.g., standard logistic CDF – log-odds scale; minimum extreme value CDF – log-hazard scale).

  • The (baseline) transformation function is chosen based on the type of the outcome variable and assumptions we make in the model (e.g., step function for discrete outcome; linear, log-linear, or general smooth function for continuous outcomes etc.). \(h()\) has to be an increasing function.

  • Various combinations of \(h()\) and \(F()\) result in various parametric regression models.

This model can be further extended to include strata-specific transformation functions:

\[ {\mathbb{P}}(Y \leq y \mid S = s, {\mathbf{X}}={\mathbf{x}}) = F(h(y \mid s) - {\mathbf{x}}^{\top}{\boldsymbol{\beta}}), \]

where \(s\) is a grouping factor.

Availability of supporting source code and requirements

  • Project name: tramME: Transformation Models with Mixed Effects
  • Project home page: https://CRAN.R-project.org/package=tramME
  • Operating system(s): Platform independent
  • Programming language: R
  • Other requirements: R 3.6.0 or higher, mlt 1.1-0 or higher, tram 0.3-2 or higher, TMB 1.7.15 or higher
  • License: GNU GPL

Data availability

The datasets used in this technical note are available from the R packages

  • ordinalCont
  • coxme
  • TH.data

Declarations

The results presented in this technical note are part of a manuscript submitted for publication8.

Funding

This work was supported by Swiss National Science Foundation, grant number 200021_184603.





  1. Tamási, Bálint, Torsten Hothorn: “tramME: Mixed-Effects Transformation Models Using Template Model Builder.” Manuscript submitted for publication (under revision) to The R Journal (2021) url:https://cran.r-project.org/web/packages/tramME/vignettes/tramME.pdf

  2. vignette("tramME", package = "tramME")

  3. Hothorn, Torsten. “Most Likely Transformations: The mlt Package.” Journal of Statistical Software 92, no. 1 (2020): 1–68. doi:10.18637/jss.v092.i01.

  4. Kristensen, Kasper, Anders Nielsen, Casper W. Berg, Hans Skaug, and Bradley M. Bell. “TMB : Automatic Differentiation and Laplace Approximation.” Journal of Statistical Software 70, no. 5 (2016). doi:10.18637/jss.v070.i05.

  5. vignette("tramME", package = "tramME")

  6. Chow, Roberta T., Gillian Z. Heller, and Les Barnsley. “The Effect of 300 MW, 830 Nm Laser on Chronic Neck Pain: A Double-Blind, Randomized, Placebo-Controlled Study.” Pain 124, no. 1–2 (2006): 201–10. doi:10.1016/j.pain.2006.05.018.

  7. Hothorn, Torsten, Lisa Möst, and Peter Bühlmann. “Most Likely Transformations.” Scandinavian Journal of Statistics 45, no. 1 (2018): 110–34. doi:10.1111/sjos.12291.

  8. Tamási, Bálint, Torsten Hothorn: “tramME: Mixed-Effects Transformation Models Using Template Model Builder.” Manuscript submitted for publication (under revision) to The R Journal (2021) url:https://cran.r-project.org/web/packages/tramME/vignettes/tramME.pdf