Title: | Causal Analysis of Observational Time-to-Event Data |
Version: | 0.0.4.5 |
Description: | Implements target trial emulation methods to apply randomized clinical trial design and analysis in an observational setting. Using marginal structural models, it can estimate intention-to-treat and per-protocol effects in emulated trials using electronic health records. A description and application of the method can be found in Danaei et al (2013) <doi:10.1177/0962280211403603>. |
License: | Apache License (≥ 2) |
URL: | https://causal-lda.github.io/TrialEmulation/, https://github.com/Causal-LDA/TrialEmulation |
BugReports: | https://github.com/Causal-LDA/TrialEmulation/issues |
Depends: | R (≥ 4.1.0) |
Imports: | broom (≥ 0.7.10), checkmate, data.table (≥ 1.9.8), DBI, duckdb, formula.tools, lifecycle, lmtest, methods, mvtnorm, parglm, Rcpp, sandwich |
Suggests: | knitr, parsnip, rmarkdown, rpart, testthat (≥ 3.0.0), withr |
LinkingTo: | Rcpp |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
Language: | en-GB |
LazyData: | true |
LazyDataCompression: | xz |
RoxygenNote: | 7.3.2.9000 |
Collate: | 'RcppExports.R' 'calculate_weights.R' 'data.R' 'data_extension.R' 'data_manipulation.R' 'data_preparation.R' 'data_simulation.R' 'data_utils.R' 'expand_trials.R' 'generics.R' 'initiators.R' 'lr_utils.R' 'te_model_fitter.R' 'te_outcome_model.R' 'te_datastore.R' 'te_weights.R' 'te_data.R' 'te_expansion.R' 'trial_sequence.R' 'modelling.R' 'package.R' 'predict.R' 'robust.R' 'sampling.R' 'te_datastore_csv.R' 'te_datastore_duckdb.R' 'te_parsnip.R' 'te_stats_glm_logit.R' 'utils.R' 'weighting.R' |
NeedsCompilation: | yes |
Packaged: | 2025-06-13 07:54:25 UTC; root |
Author: | Isaac Gravestock |
Maintainer: | Isaac Gravestock <isaac.gravestock@roche.com> |
Repository: | CRAN |
Date/Publication: | 2025-06-13 10:10:08 UTC |
TrialEmulation
Package
Description
TrialEmulation
facilitates preprocessing and analysing observational data
an emulated a randomised trial.
Author(s)
Maintainer: Isaac Gravestock isaac.gravestock@roche.com (ORCID)
Authors:
Li Su li.su@mrc-bsu.cam.ac.uk
Roonak Rezvani roonak.r74@gmail.com (ORCID) (Original package author)
Julia Moesch julia.moesch@roche.com
Other contributors:
Medical Research Council (MRC) [funder]
F. Hoffmann-La Roche AG [copyright holder, funder]
See Also
Useful links:
Report bugs at https://github.com/Causal-LDA/TrialEmulation/issues
Calculate and transform predictions
Description
Calculate and transform predictions
Usage
calculate_predictions(
newdata,
model,
treatment_values,
pred_fun,
coefs_mat,
matrix_n_col
)
Arguments
newdata |
New data to predict outcome |
model |
GLM object |
treatment_values |
Named vector of value to insert into |
pred_fun |
Function to transform prediction matrix |
coefs_mat |
Matrix of coefficients corresponding to |
matrix_n_col |
Expected number of column after prediction. |
Value
A matrix with transformed predicted values. Number of columns corresponds
to the number of rows of coefs_mat
Calculate Inverse Probability of Censoring Weights
Description
Usage
calculate_weights(object, ...)
## S4 method for signature 'trial_sequence_ITT'
calculate_weights(object, quiet = FALSE)
## S4 method for signature 'trial_sequence_AT'
calculate_weights(object, quiet = FALSE)
## S4 method for signature 'trial_sequence_PP'
calculate_weights(object, quiet = FALSE)
Arguments
object |
A trial_sequence object |
... |
Other arguments used by methods. |
quiet |
Prints model summaries is |
Value
A trial_sequence object with updated censor_weights
and/or switch_weights
slots
Examples
save_dir <- file.path(tempdir(), "switch_models")
ts <- trial_sequence("PP") |>
set_data(
data = data_censored,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible"
) |>
set_switch_weight_model(
numerator = ~ age + x1 + x3,
denominator = ~age,
model_fitter = stats_glm_logit(save_path = save_dir)
) |>
calculate_weights()
Case-control sampling of expanded data for the sequence of emulated trials
Description
Usage
case_control_sampling_trials(
data_prep,
p_control = NULL,
subset_condition,
sort = FALSE
)
Arguments
data_prep |
Result from |
p_control |
Control sampling probability for selecting potential controls at each follow-up time of each trial. |
subset_condition |
Expression used to |
sort |
Sort data before applying case-control sampling to make sure that the resulting data are identical when
sampling from the expanded data created with |
Details
Perform case-control sampling of expanded data to create a data set of reduced size and calculate sampling weights
to be used in trial_msm()
.
Value
A data.frame
or a split()
data.frame
if length(p_control) > 1
. An additional column sample_weight
containing the sample weights will be added to the result. These can be included in the models fit with
trial_msm()
.
Examples
# If necessary reduce the number of threads for data.table
data.table::setDTthreads(2)
data("te_data_ex")
samples <- case_control_sampling_trials(te_data_ex, p_control = 0.01)
Censoring Function in C++
Description
Artificial censoring C++ function
Arguments
sw_data |
A dataframe with the columns needed in censoring process |
Check Data used for Prediction
Description
Check Data used for Prediction
Usage
check_newdata(newdata, model, predict_times)
Arguments
newdata |
new data to predict, or missing. |
model |
glm model object. |
predict_times |
times to predict to add to resulting newdata. |
Value
A newdata
data.frame
Example of longitudinal data for sequential trial emulation containing censoring
Description
This data contains data from 89 patients followed for up to 19 periods.
Usage
data_censored
Format
A data frame with 725 rows and 12 variables:
- id
patient identifier
- period
time period
- treatment
indicator for receiving treatment in this period, 1=treatment, 0=non-treatment
- x1
A time-varying categorical variable relating to treatment and the outcome
- x2
A time-varying numeric variable relating to treatment and the outcome
- x3
A fixed categorical variable relating to treatment and the outcome
- x4
A fixed categorical variable relating to treatment and the outcome
- age
patient age in years
- age_s
patient age
- outcome
indicator for outcome in this period, 1=event occurred, 0=no event
- censored
indicator for patient being censored in this period, 1=censored, 0=not censored
- eligible
indicator for eligibility for trial start in this period, 1=yes, 0=no
Data Manipulation Function
Description
This function takes the data and all the variables and does the extension preprocessing
Usage
data_manipulation(data, use_censor = TRUE)
Arguments
data |
|
use_censor |
apply censoring due to treatment switch? |
Prepare data for the sequence of emulated target trials
Description
Usage
data_preparation(
data,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible",
model_var = NULL,
outcome_cov = ~1,
estimand_type = c("ITT", "PP", "As-Treated"),
switch_n_cov = ~1,
switch_d_cov = ~1,
first_period = NA,
last_period = NA,
use_censor_weights = FALSE,
cense = NA,
pool_cense = c("none", "both", "numerator"),
cense_d_cov = ~1,
cense_n_cov = ~1,
eligible_wts_0 = NA,
eligible_wts_1 = NA,
where_var = NULL,
data_dir,
save_weight_models = FALSE,
glm_function = "glm",
chunk_size = 500,
separate_files = FALSE,
quiet = FALSE,
...
)
Arguments
data |
A |
id |
Name of the variable for identifiers of the individuals. Default is ‘id’. |
period |
Name of the variable for the visit/period. Default is ‘period’. |
treatment |
Name of the variable for the treatment indicator at that visit/period. Default is ‘treatment’. |
outcome |
Name of the variable for the indicator of the outcome event at that visit/period. Default is ‘outcome’. |
eligible |
Name of the variable for the indicator of eligibility for the target trial at that visit/period. Default is ‘eligible’. |
model_var |
Treatment variables to be included in the marginal structural model for the emulated trials.
|
outcome_cov |
A RHS formula with baseline covariates to be adjusted for in the marginal structural model for the
emulated trials. Note that if a time-varying covariate is specified in |
estimand_type |
Specify the estimand for the causal analyses in the sequence of emulated trials. |
switch_n_cov |
A RHS formula to specify the logistic models for estimating the numerator terms of the inverse
probability of treatment weights. A derived variable named |
switch_d_cov |
A RHS formula to specify the logistic models for estimating the denominator terms of the inverse probability of treatment weights. |
first_period |
First time period to be set as trial baseline to start expanding the data. |
last_period |
Last time period to be set as trial baseline to start expanding the data. |
use_censor_weights |
Require the inverse probability of censoring weights. If |
cense |
Variable name for the censoring indicator. Required if |
pool_cense |
Fit pooled or separate censoring models for those treated and those untreated at the immediately
previous visit. Pooling can be specified for the models for the numerator and denominator terms of the inverse
probability of censoring weights. One of |
cense_d_cov |
A RHS formula to specify the logistic models for estimating the denominator terms of the inverse probability of censoring weights. |
cense_n_cov |
A RHS formula to specify the logistic models for estimating the numerator terms of the inverse probability of censoring weights. |
eligible_wts_0 |
See definition for |
eligible_wts_1 |
Exclude some observations when fitting the models for the inverse probability of treatment
weights. For example, if it is assumed that an individual will stay on treatment for at least 2 visits, the first 2
visits after treatment initiation by definition have a probability of staying on the treatment of 1.0 and should
thus be excluded from the weight models for those who are on treatment at the immediately previous visit. Users can
define a variable that indicates that these 2 observations are ineligible for the weight model for those who are on
treatment at the immediately previous visit and add the variable name in the argument |
where_var |
Specify the variable names that will be used to define subgroup conditions when fitting the marginal
structural model for a subgroup of individuals. Need to specify jointly with the argument |
data_dir |
Directory to save model objects when |
save_weight_models |
Save model objects for estimating the weights in |
glm_function |
Specify which glm function to use for the marginal structural model from the |
chunk_size |
Number of individuals whose data to be processed in one chunk when |
separate_files |
Save expanded data in separate CSV files for each trial. |
quiet |
Suppress the printing of progress messages and summaries of the fitted models. |
... |
Additional arguments passed to |
Details
This function expands observational data in the person-time format (i.e., the ‘long’ format) to emulate a sequence of target trials and also estimates the inverse probability of treatment and censoring weights as required.
The arguments chunk_size
and separate_files
allow for processing of large datasets that would not fit in
memory once expanded. When separate_files = TRUE
, the input data are processed in chunks of individuals and saved
into separate files for each emulated trial. These separate files can be sampled by case-control sampling to create
a reduced dataset for the modelling.
Value
An object of class TE_data_prep
, which can either be sampled from (case_control_sampling_trials) or
directly used in a model (trial_msm). It contains the elements
- data
the expanded dataset for all emulated trials. If
separate_files = FALSE
, it is adata.table
; ifseparate_files = TRUE
, it is a character vector with the file path of the expanded data as CSV files.- min_period
index for the first trial in the expanded data
- max_period
index for the last trial in the expanded data
- N
the total number of observations in the expanded data
- data_template
a zero-row
data.frame
with the columns and attributes of the expanded data- switch_models
a list of summaries of the models fitted for inverse probability of treatment weights, if
estimand_type
is"PP"
or"As-Treated"
- censor_models
a list of summaries of the models fitted for inverse probability of censoring weights, if
use_censor_weights=TRUE
- args
a list contain the parameters used to prepare the data and fit the weight models
Expand Function
Description
This function performs the data expansion for a given dataset
Usage
expand(
sw_data,
outcomeCov_var,
where_var,
use_censor,
maxperiod,
minperiod,
keeplist
)
Arguments
sw_data |
datatable to expand |
outcomeCov_var |
A list of individual baseline variables used in final model |
where_var |
Variables used in where conditions used in subsetting the data used in final analysis (where_case), the variables not included in the final model |
use_censor |
Use censoring for per-protocol analysis - censor person-times once a person-trial stops taking the initial treatment value |
maxperiod |
Maximum period |
minperiod |
Minimum period |
keeplist |
A list contains names of variables used in final model |
Expand trials
Description
Usage
expand_trials(object)
Arguments
object |
A trial_sequence object |
Value
The trial_sequence object
with a data set containing the full sequence of target trials. The data is
stored according to the options set with set_expansion_options()
and especially the save_to_*
function.
Check Expand Flag After Treatment Switch
Description
Check if patients have switched treatment in eligible trials
and set expand = 0
.
Usage
expand_until_switch(s, n)
Arguments
s |
numeric vector where |
n |
length of s |
Value
Vector of indicator values up until first switch.
Fit the marginal structural model for the sequence of emulated trials
Description
Usage
fit_msm(
object,
weight_cols = c("weight", "sample_weight"),
modify_weights = NULL
)
## S4 method for signature 'trial_sequence'
fit_msm(
object,
weight_cols = c("weight", "sample_weight"),
modify_weights = NULL
)
Arguments
object |
A |
weight_cols |
character vector of column names in expanded outcome dataset, ie |
modify_weights |
a function to transform the weights (or Before the outcome marginal structural model can be fit, the outcome model must be specified with
The model is fit based on the |
Value
A modified trial_sequence
object with updated outcome_model
slot.
Examples
trial_seq_object <- trial_sequence("ITT") |>
set_data(data_censored) |>
set_outcome_model(
adjustment_terms = ~age_s,
followup_time_terms = ~ stats::poly(followup_time, degree = 2)
) |>
set_expansion_options(output = save_to_datatable(), chunk_size = 500) |>
expand_trials() |>
load_expanded_data()
fit_msm(trial_seq_object)
# Using modify_weights functions ----
# returns a function that truncates weights to limits
limit_weight <- function(lower_limit, upper_limit) {
function(w) {
w[w > upper_limit] <- upper_limit
w[w < lower_limit] <- lower_limit
w
}
}
# calculate 1st and 99th percentile limits and truncate
p99_weight <- function(w) {
p99 <- quantile(w, prob = c(0.01, 0.99), type = 1)
limit_weight(p99[1], p99[2])(w)
}
# set all weights to 1
all_ones <- function(w) {
rep(1, length(w))
}
fit_msm(trial_seq_object, modify_weights = limit_weight(0.01, 4))
fit_msm(trial_seq_object, modify_weights = p99_weight)
Method for fitting outcome models
Description
Method for fitting outcome models
Usage
fit_outcome_model(object, data, formula, weights = NULL)
Arguments
object |
A |
data |
|
formula |
|
weights |
|
Value
An object of class te_outcome_fitted
Method for fitting weight models
Description
Method for fitting weight models
Usage
fit_weights_model(object, data, formula, label)
Arguments
object |
The object determining which method should be used, containing any slots containing user defined parameters. |
data |
|
formula |
|
label |
A short string describing the model. |
Value
An object of class te_weights_fitted
Examples
fitter <- stats_glm_logit(tempdir())
data(data_censored)
# Not usually called directly by a user
fitted <- fit_weights_model(
object = fitter,
data = data_censored,
formula = 1 - censored ~ x1 + age_s + treatment,
label = "Example model for censoring"
)
fitted
unlink(fitted@summary$save_path$path)
A wrapper function to perform data preparation and model fitting in a sequence of emulated target trials
Description
Usage
initiators(
data,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible",
outcome_cov = ~1,
estimand_type = c("ITT", "PP", "As-Treated"),
model_var = NULL,
switch_n_cov = ~1,
switch_d_cov = ~1,
first_period = NA,
last_period = NA,
first_followup = NA,
last_followup = NA,
use_censor_weights = FALSE,
save_weight_models = FALSE,
analysis_weights = c("asis", "unweighted", "p99", "weight_limits"),
weight_limits = c(0, Inf),
cense = NA,
pool_cense = c("none", "both", "numerator"),
cense_d_cov = ~1,
cense_n_cov = ~1,
include_followup_time = ~followup_time + I(followup_time^2),
include_trial_period = ~trial_period + I(trial_period^2),
eligible_wts_0 = NA,
eligible_wts_1 = NA,
where_var = NULL,
where_case = NA,
data_dir,
glm_function = "glm",
quiet = FALSE,
...
)
Arguments
data |
A |
id |
Name of the variable for identifiers of the individuals. Default is ‘id’. |
period |
Name of the variable for the visit/period. Default is ‘period’. |
treatment |
Name of the variable for the treatment indicator at that visit/period. Default is ‘treatment’. |
outcome |
Name of the variable for the indicator of the outcome event at that visit/period. Default is ‘outcome’. |
eligible |
Name of the variable for the indicator of eligibility for the target trial at that visit/period. Default is ‘eligible’. |
outcome_cov |
A RHS formula with baseline covariates to be adjusted for in the marginal structural model for the
emulated trials. Note that if a time-varying covariate is specified in |
estimand_type |
Specify the estimand for the causal analyses in the sequence of emulated trials. |
model_var |
Treatment variables to be included in the marginal structural model for the emulated trials.
|
switch_n_cov |
A RHS formula to specify the logistic models for estimating the numerator terms of the inverse
probability of treatment weights. A derived variable named |
switch_d_cov |
A RHS formula to specify the logistic models for estimating the denominator terms of the inverse probability of treatment weights. |
first_period |
First time period to be set as trial baseline to start expanding the data. |
last_period |
Last time period to be set as trial baseline to start expanding the data. |
first_followup |
First follow-up time/visit in the trials to be included in the marginal structural model for the outcome event. |
last_followup |
Last follow-up time/visit in the trials to be included in the marginal structural model for the outcome event. |
use_censor_weights |
Require the inverse probability of censoring weights. If |
save_weight_models |
Save model objects for estimating the weights in |
analysis_weights |
Choose which type of weights to be used for fitting the marginal structural model for the outcome event.
|
weight_limits |
Lower and upper limits to truncate weights, given as |
cense |
Variable name for the censoring indicator. Required if |
pool_cense |
Fit pooled or separate censoring models for those treated and those untreated at the immediately
previous visit. Pooling can be specified for the models for the numerator and denominator terms of the inverse
probability of censoring weights. One of |
cense_d_cov |
A RHS formula to specify the logistic models for estimating the denominator terms of the inverse probability of censoring weights. |
cense_n_cov |
A RHS formula to specify the logistic models for estimating the numerator terms of the inverse probability of censoring weights. |
include_followup_time |
The model to include the follow up time/visit of the trial ( |
include_trial_period |
The model to include the trial period ( |
eligible_wts_0 |
See definition for |
eligible_wts_1 |
Exclude some observations when fitting the models for the inverse probability of treatment
weights. For example, if it is assumed that an individual will stay on treatment for at least 2 visits, the first 2
visits after treatment initiation by definition have a probability of staying on the treatment of 1.0 and should
thus be excluded from the weight models for those who are on treatment at the immediately previous visit. Users can
define a variable that indicates that these 2 observations are ineligible for the weight model for those who are on
treatment at the immediately previous visit and add the variable name in the argument |
where_var |
Specify the variable names that will be used to define subgroup conditions when fitting the marginal
structural model for a subgroup of individuals. Need to specify jointly with the argument |
where_case |
Define conditions using variables specified in |
data_dir |
Directory to save model objects in. |
glm_function |
Specify which glm function to use for the marginal structural model from the |
quiet |
Suppress the printing of progress messages and summaries of the fitted models. |
... |
Additional arguments passed to |
Details
An all-in-one analysis using a sequence of emulated target trials. This provides a simplified interface to the main
functions data_preparation()
and trial_msm()
.
Value
Returns the result of trial_msm()
from the expanded data. An object of class TE_msm
containing
- model
a
glm
object- robust
a list containing a summary table of estimated regression coefficients and the robust covariance matrix
Internal Methods
Description
Various S4 methods which are not directly for use by users.
Usage
## S4 method for signature 'trial_sequence,data.frame'
set_data(
object,
data,
censor_at_switch,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible"
)
## S4 method for signature 'trial_sequence'
set_expansion_options(
object,
output,
chunk_size = 0,
first_period = 0,
last_period = Inf,
censor_at_switch
)
## S4 method for signature 'trial_sequence'
expand_trials(object)
Arguments
object |
A trial_sequence object |
data |
A |
id |
Name of the variable for identifiers of the individuals. Default is <U+2018>id<U+2019>. |
period |
Name of the variable for the visit/period. Default is <U+2018>period<U+2019>. |
treatment |
Name of the variable for the treatment indicator at that visit/period. Default is <U+2018>treatment<U+2019>. |
outcome |
Name of the variable for the indicator of the outcome event at that visit/period. Default is <U+2018>outcome<U+2019>. |
eligible |
Name of the variable for the indicator of eligibility for the target trial at that visit/period. Default is <U+2018>eligible<U+2019>. |
IPW Data Accessor and Setter
Description
Usage
ipw_data(object)
ipw_data(object) <- value
## S4 method for signature 'trial_sequence'
ipw_data(object)
## S4 replacement method for signature 'trial_sequence'
ipw_data(object) <- value
Arguments
object |
|
value |
|
Details
Generic function to access and update the data used for inverse probability weighting.
The setter method ipw_data(object) <- value
does not perform the same checks and manipulations
as set_data()
. To completely replace the data please use set_data()
. This ipw_data<-
method allows
small changes such as adding a new column.
Value
The data from the @data
slot of object
used for inverse probability weighting.
Examples
ts <- trial_sequence("ITT")
ts <- set_data(ts, data_censored)
ipw_data(ts)
data.table::set(ipw_data(ts), j = "dummy", value = TRUE)
# or with the setter method:
new_data <- ipw_data(ts)
new_data$x2sq <- new_data$x2^2
ipw_data(ts) <- new_data
Method to read, subset and sample expanded data
Description
Usage
load_expanded_data(
object,
p_control = NULL,
period = NULL,
subset_condition = NULL,
seed = NULL
)
## S4 method for signature 'trial_sequence'
load_expanded_data(
object,
p_control = NULL,
period = NULL,
subset_condition = NULL,
seed = NULL
)
Arguments
object |
An object of class trial_sequence. |
p_control |
Probability of selecting a control, |
period |
An integerish vector of non-zero length to select trial period(s) or |
subset_condition |
A string or The operators Note: Make sure numeric vectors written as |
seed |
An integer seed or Note: The same seed will return a different result depending on the class of the te_datastore object contained in the trial_sequence object. |
Details
This method is used on trial_sequence objects to read, subset and sample expanded data.
Value
An updated trial_sequence object, the data is stored in slot @outcome_data
as a te_outcome_data object.
Examples
# create a trial_sequence-class object
trial_itt_dir <- file.path(tempdir(), "trial_itt")
dir.create(trial_itt_dir)
trial_itt <- trial_sequence(estimand = "ITT") |>
set_data(data = data_censored) |>
set_outcome_model(adjustment_terms = ~ x1 + x2)
trial_itt_csv <- set_expansion_options(
trial_itt,
output = save_to_csv(file.path(trial_itt_dir, "trial_csvs")),
chunk_size = 500
) |>
expand_trials()
# load_expanded_data default behaviour returns all trial_periods and doesn't sample
load_expanded_data(trial_itt_csv)
# load_expanded_data can subset the data before sampling
load_expanded_data(
trial_itt_csv,
p_control = 0.2,
period = 1:20,
subset_condition = "followup_time %in% 1:20 & x2 < 1",
)
# delete after use
unlink(trial_itt_dir, recursive = TRUE)
Outcome Data Accessor and Setter
Description
Usage
outcome_data(object)
outcome_data(object) <- value
## S4 method for signature 'trial_sequence'
outcome_data(object)
## S4 replacement method for signature 'trial_sequence'
outcome_data(object) <- value
Arguments
object |
|
value |
|
Details
Generic function to outcome data
Value
The object
with updated outcome data
Examples
ts <- trial_sequence("ITT")
new_data <- data.table::data.table(vignette_switch_data[1:200, ])
new_data$weight <- 1
outcome_data(ts) <- new_data
Fit outcome models using parsnip
models
Description
Usage
parsnip_model(model_spec, save_path)
Arguments
model_spec |
A |
save_path |
Directory to save models. Set to |
Details
Specify that the models should be fit using a classification model specified with the parsnip
package.
Warning: This functionality is experimental and not recommended for use in analyses.
sqrt{n}
-consistency estimation and valid inference of the parameters in marginal structural models for
emulated trials generally require that the weights for treatment switching and censoring be estimated at parametric
rates, which is generally not possible when using data-adaptive estimation of high-dimensional regressions.
Therefore, we only recommend using stats_glm_logit()
.
Value
An object of class te_parsnip_model
inheriting from te_model_fitter which is used for
dispatching methods for the fitting models.
See Also
Other model_fitter:
stats_glm_logit()
,
te_model_fitter-class
Examples
## Not run:
if (
requireNamespace("parsnip", quietly = TRUE) &&
requireNamespace("rpart", quietly = TRUE)
) {
# Use a decision tree model fitted with the rpart package
parsnip_model(
model_spec = parsnip::decision_tree(tree_depth = 30) |>
set_mode("classification") |>
set_engine("rpart"),
save_path = tempdir()
)
}
## End(Not run)
Predict marginal cumulative incidences with confidence intervals for a target trial population
Description
This function predicts the marginal cumulative incidences when a target trial population receives either the
treatment or non-treatment at baseline (for an intention-to-treat analysis) or either sustained treatment or
sustained non-treatment (for a per-protocol analysis). The difference between these cumulative incidences is the
estimated causal effect of treatment. Currently, the
predict
function only provides marginal intention-to-treat and
per-protocol effects, therefore it is only valid when estimand_type = "ITT"
or estimand_type = "PP"
.
Usage
predict(object, ...)
## S4 method for signature 'trial_sequence_ITT'
predict(
object,
newdata,
predict_times,
conf_int = TRUE,
samples = 100,
type = c("cum_inc", "survival")
)
## S4 method for signature 'trial_sequence_PP'
predict(
object,
newdata,
predict_times,
conf_int = TRUE,
samples = 100,
type = c("cum_inc", "survival")
)
## S3 method for class 'TE_msm'
predict(
object,
newdata,
predict_times,
conf_int = TRUE,
samples = 100,
type = c("cum_inc", "survival"),
...
)
Arguments
object |
Object from |
... |
Further arguments passed to or from other methods. |
newdata |
Baseline trial data that characterise the target trial population that marginal cumulative incidences
or survival probabilities are predicted for. |
predict_times |
Specify the follow-up visits/times where the marginal cumulative incidences or survival probabilities are predicted. |
conf_int |
Construct the point-wise 95-percent confidence intervals of cumulative incidences for the target trial population under treatment and non-treatment and their differences by simulating the parameters in the marginal structural model from a multivariate normal distribution with the mean equal to the marginal structural model parameter estimates and the variance equal to the estimated robust covariance matrix. |
samples |
Number of samples used to construct the simulation-based confidence intervals. |
type |
Specify cumulative incidences or survival probabilities to be predicted. Either cumulative incidence
( |
Value
A list of three data frames containing the cumulative incidences for each of the assigned treatment options (treatment and non-treatment) and the difference between them.
Examples
# Prediction for initiators() or trial_msm() objects -----
# If necessary set the number of `data.table` threads
data.table::setDTthreads(2)
data("te_model_ex")
predicted_ci <- predict(te_model_ex, predict_times = 0:30, samples = 10)
# Plot the cumulative incidence curves under treatment and non-treatment
plot(predicted_ci[[1]]$followup_time, predicted_ci[[1]]$cum_inc,
type = "l",
xlab = "Follow-up Time", ylab = "Cumulative Incidence",
ylim = c(0, 0.7)
)
lines(predicted_ci[[1]]$followup_time, predicted_ci[[1]]$`2.5%`, lty = 2)
lines(predicted_ci[[1]]$followup_time, predicted_ci[[1]]$`97.5%`, lty = 2)
lines(predicted_ci[[2]]$followup_time, predicted_ci[[2]]$cum_inc, type = "l", col = 2)
lines(predicted_ci[[2]]$followup_time, predicted_ci[[2]]$`2.5%`, lty = 2, col = 2)
lines(predicted_ci[[2]]$followup_time, predicted_ci[[2]]$`97.5%`, lty = 2, col = 2)
legend("topleft", title = "Assigned Treatment", legend = c("0", "1"), col = 1:2, lty = 1)
# Plot the difference in cumulative incidence over follow up
plot(predicted_ci[[3]]$followup_time, predicted_ci[[3]]$cum_inc_diff,
type = "l",
xlab = "Follow-up Time", ylab = "Difference in Cumulative Incidence",
ylim = c(0.0, 0.5)
)
lines(predicted_ci[[3]]$followup_time, predicted_ci[[3]]$`2.5%`, lty = 2)
lines(predicted_ci[[3]]$followup_time, predicted_ci[[3]]$`97.5%`, lty = 2)
Print a weight summary object
Description
Usage
## S3 method for class 'TE_weight_summary'
print(x, full = TRUE, ...)
Arguments
x |
print TE_weight_summary object. |
full |
Print full or short summary. |
... |
Arguments passed to print.data.frame. |
Value
No return value, only for printing.
Method to read expanded data
Description
This method is used on te_datastore objects to read selected data and return one data.table
.
Usage
read_expanded_data(object, period = NULL, subset_condition = NULL)
## S4 method for signature 'te_datastore_datatable'
read_expanded_data(object, period = NULL, subset_condition = NULL)
Arguments
object |
An object of class te_datastore. |
period |
An integerish vector of non-zero length to select trial period(s) or |
subset_condition |
A string of length 1 or |
Value
A data.frame
of class data.table
.
Examples
# create a te_datastore_csv object and save some data
temp_dir <- tempfile("csv_dir_")
dir.create(temp_dir)
datastore <- save_to_csv(temp_dir)
data(vignette_switch_data)
expanded_csv_data <- save_expanded_data(datastore, vignette_switch_data[1:200, ])
# read expanded data
read_expanded_data(expanded_csv_data)
# delete after use
unlink(temp_dir, recursive = TRUE)
Robust Variance Calculation
Description
This function performs the calculation of robust standard errors based on
variances estimated using sandwich::vcovCL
.
Usage
robust_calculation(model, data_id)
Arguments
model |
The logistic regression model object. |
data_id |
Values of id column of the data (ie |
Value
A list with elements summary
, a table with the model summary using the
robust variance estimates, and matrix
, the sandwich
covariance matrix.
Internal method to sample expanded data
Description
Internal method to sample expanded data
Usage
sample_expanded_data(
object,
p_control,
period = NULL,
subset_condition = NULL,
seed
)
## S4 method for signature 'te_datastore'
sample_expanded_data(
object,
p_control,
period = NULL,
subset_condition = NULL,
seed
)
Arguments
object |
An object of class te_datastore. |
p_control |
Probability of selecting a control. |
period |
An integerish vector of non-zero length to select trial period(s) or |
subset_condition |
A string or |
seed |
An integer seed or |
Value
A data.frame
of class data.table
.
Examples
# Data object normally created by [expand_trials]
datastore <- new("te_datastore_datatable", data = te_data_ex$data, N = 50139L)
sample_expanded_data(datastore, period = 260:275, p_control = 0.2, seed = 123)
Method to save expanded data
Description
This method is used internally by expand_trials to save the data to the "datastore" defined in set_expansion_options.
Usage
save_expanded_data(object, data)
## S4 method for signature 'te_datastore_datatable'
save_expanded_data(object, data)
Arguments
object |
An object of class te_datastore or a child class. |
data |
A data frame containing the expanded trial data. The columns |
Value
An updated object
with the data stored. Notably object@N
should be increased
Examples
temp_dir <- tempfile("csv_dir_")
dir.create(temp_dir)
datastore <- save_to_csv(temp_dir)
data(vignette_switch_data)
save_expanded_data(datastore, vignette_switch_data[1:200, ])
# delete after use
unlink(temp_dir, recursive = TRUE)
Save expanded data as CSV
Description
Usage
save_to_csv(path)
Arguments
path |
Directory to save CSV files in. Must be empty. |
Value
A te_datastore_csv object.
See Also
Other save_to:
save_to_datatable()
,
save_to_duckdb()
,
set_expansion_options()
Examples
csv_dir <- file.path(tempdir(), "expanded_trials_csv")
dir.create(csv_dir)
csv_datastore <- save_to_csv(path = csv_dir)
trial_to_expand <- trial_sequence("ITT") |>
set_data(data = data_censored) |>
set_expansion_options(output = csv_datastore, chunk_size = 500)
# Delete directory after use
unlink(csv_dir)
Save expanded data as a data.table
Description
Usage
save_to_datatable()
See Also
Other save_to:
save_to_csv()
,
save_to_duckdb()
,
set_expansion_options()
Examples
trial_to_expand <- trial_sequence("ITT") |>
set_data(data = data_censored) |>
set_expansion_options(output = save_to_datatable(), chunk_size = 500)
Save expanded data to DuckDB
Description
Usage
save_to_duckdb(path)
Arguments
path |
Directory to save |
Value
A te_datastore_duckdb object.
See Also
Other save_to:
save_to_csv()
,
save_to_datatable()
,
set_expansion_options()
Examples
if (require(duckdb)) {
duckdb_dir <- file.path(tempdir(), "expanded_trials_duckdb")
trial_to_expand <- trial_sequence("ITT") |>
set_data(data = data_censored) |>
set_expansion_options(output = save_to_duckdb(path = duckdb_dir), chunk_size = 500)
# Delete directory after use
unlink(duckdb_dir)
}
Select Data Columns
Description
Select the required columns from the data and rename
Usage
select_data_cols(data, args)
Arguments
data |
A |
args |
List of arguments from |
Value
A data.table with the required columns
Set censoring weight model
Description
Usage
set_censor_weight_model(
object,
censor_event,
numerator,
denominator,
pool_models = NULL,
model_fitter
)
## S4 method for signature 'trial_sequence'
set_censor_weight_model(
object,
censor_event,
numerator,
denominator,
pool_models = c("none", "both", "numerator"),
model_fitter = stats_glm_logit()
)
## S4 method for signature 'trial_sequence_PP'
set_censor_weight_model(
object,
censor_event,
numerator,
denominator,
pool_models = "none",
model_fitter = stats_glm_logit()
)
## S4 method for signature 'trial_sequence_ITT'
set_censor_weight_model(
object,
censor_event,
numerator,
denominator,
pool_models = "numerator",
model_fitter = stats_glm_logit()
)
## S4 method for signature 'trial_sequence_AT'
set_censor_weight_model(
object,
censor_event,
numerator,
denominator,
pool_models = "none",
model_fitter = stats_glm_logit()
)
Arguments
object |
trial_sequence. |
censor_event |
string. Name of column containing censoring indicator. |
numerator |
A RHS formula to specify the logistic models for estimating the numerator terms of the inverse probability of censoring weights. |
denominator |
A RHS formula to specify the logistic models for estimating the denominator terms of the inverse probability of censoring weights. |
pool_models |
Fit pooled or separate censoring models for those treated and those untreated at the immediately previous visit. Pooling can be specified for the models for the numerator and denominator terms of the inverse probability of censoring weights. One of "none", "numerator", or "both" (default is "none" except when estimand = "ITT" then default is "numerator"). |
model_fitter |
An object of class |
Value
object
is returned with @censor_weights
set
Examples
trial_sequence("ITT") |>
set_data(data = data_censored) |>
set_censor_weight_model(
censor_event = "censored",
numerator = ~ age_s + x1 + x3,
denominator = ~ x3 + x4,
pool_models = "both",
model_fitter = stats_glm_logit(save_path = tempdir())
)
Set the trial data
Description
Usage
set_data(object, data, ...)
## S4 method for signature 'trial_sequence_ITT,data.frame'
set_data(
object,
data,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible"
)
## S4 method for signature 'trial_sequence_AT,data.frame'
set_data(
object,
data,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible"
)
## S4 method for signature 'trial_sequence_PP,data.frame'
set_data(
object,
data,
id = "id",
period = "period",
treatment = "treatment",
outcome = "outcome",
eligible = "eligible"
)
Arguments
object |
A trial_sequence object |
data |
A |
... |
Other arguments used by methods internally. |
id |
Name of the variable for identifiers of the individuals. Default is <U+2018>id<U+2019>. |
period |
Name of the variable for the visit/period. Default is <U+2018>period<U+2019>. |
treatment |
Name of the variable for the treatment indicator at that visit/period. Default is <U+2018>treatment<U+2019>. |
outcome |
Name of the variable for the indicator of the outcome event at that visit/period. Default is <U+2018>outcome<U+2019>. |
eligible |
Name of the variable for the indicator of eligibility for the target trial at that visit/period. Default is <U+2018>eligible<U+2019>. |
Value
An updated trial_sequence object with data
Examples
data(trial_example)
trial_sequence("ITT") |>
set_data(
data = trial_example,
id = "id",
period = "period",
eligible = "eligible",
treatment = "treatment"
)
Set expansion options
Description
Usage
set_expansion_options(object, ...)
## S4 method for signature 'trial_sequence_ITT'
set_expansion_options(
object,
output,
chunk_size,
first_period = 0,
last_period = Inf
)
## S4 method for signature 'trial_sequence_PP'
set_expansion_options(
object,
output,
chunk_size,
first_period = 0,
last_period = Inf
)
## S4 method for signature 'trial_sequence_ITT'
set_expansion_options(
object,
output,
chunk_size,
first_period = 0,
last_period = Inf
)
Arguments
object |
A trial_sequence object |
... |
Arguments used in methods |
output |
A te_datastore object as created by a |
chunk_size |
An integer specifying the number of patients to include in each expansion iteration |
first_period |
An integer specifying the first period to include in the expansion |
last_period |
An integer specifying the last period to include in the expansion |
Value
object
is returned with @expansion
set
See Also
Other save_to:
save_to_csv()
,
save_to_datatable()
,
save_to_duckdb()
Examples
output_dir <- file.path(tempdir(check = TRUE), "expanded_data")
ITT_trial <- trial_sequence("ITT") |>
set_data(data = data_censored) |>
set_expansion_options(output = save_to_csv(output_dir), chunk_size = 500)
# Delete directory
unlink(output_dir, recursive = TRUE)
Specify the outcome model
Description
The time-to-event model for outcome
is specified with this method. Any adjustment terms can be specified.
For ITT and PP estimands the treatment_var
is not specified as it is automatically defined as
assigned_treatment
. Importantly, the modelling of "time" is specified in this model with arguments for
trial start time and follow up time within the trial.
Usage
set_outcome_model(object, ...)
## S4 method for signature 'trial_sequence'
set_outcome_model(
object,
treatment_var = ~0,
adjustment_terms = ~1,
followup_time_terms = ~followup_time + I(followup_time^2),
trial_period_terms = ~trial_period + I(trial_period^2),
model_fitter = stats_glm_logit(save_path = NA)
)
## S4 method for signature 'trial_sequence_ITT'
set_outcome_model(
object,
adjustment_terms = ~1,
followup_time_terms = ~followup_time + I(followup_time^2),
trial_period_terms = ~trial_period + I(trial_period^2),
model_fitter = stats_glm_logit(save_path = NA)
)
## S4 method for signature 'trial_sequence_PP'
set_outcome_model(
object,
adjustment_terms = ~1,
followup_time_terms = ~followup_time + I(followup_time^2),
trial_period_terms = ~trial_period + I(trial_period^2),
model_fitter = stats_glm_logit(save_path = NA)
)
## S4 method for signature 'trial_sequence_AT'
set_outcome_model(
object,
treatment_var = "dose",
adjustment_terms = ~1,
followup_time_terms = ~followup_time + I(followup_time^2),
trial_period_terms = ~trial_period + I(trial_period^2),
model_fitter = stats_glm_logit(save_path = NA)
)
Arguments
object |
A trial_sequence object |
... |
Parameters used by methods |
treatment_var |
The treatment term, only used for "as treated" estimands. PP and ITT are fixed to use "assigned_treatment". |
adjustment_terms |
Formula terms for any covariates to adjust the outcome model. |
followup_time_terms |
Formula terms for |
trial_period_terms |
Formula terms for |
model_fitter |
A |
Value
A modified object
with the outcome_model
slot set
Examples
trial_sequence("ITT") |>
set_data(data_censored) |>
set_outcome_model(
adjustment_terms = ~age_s,
followup_time_terms = ~ stats::poly(followup_time, degree = 2)
)
Set switching weight model
Description
Usage
set_switch_weight_model(object, numerator, denominator, model_fitter, ...)
## S4 method for signature 'trial_sequence'
set_switch_weight_model(
object,
numerator,
denominator,
model_fitter,
eligible_wts_0 = NULL,
eligible_wts_1 = NULL
)
## S4 method for signature 'trial_sequence_ITT'
set_switch_weight_model(object, numerator, denominator, model_fitter)
Arguments
object |
A trial_sequence object. |
numerator |
Right hand side formula for the numerator model |
denominator |
Right hand side formula for the denominator model |
model_fitter |
A te_model_fitter object, such as stats_glm_logit |
... |
Other arguments used by methods. |
eligible_wts_0 |
Name of column containing indicator (0/1) for observation to be excluded/included in weight model. |
eligible_wts_1 |
Exclude some observations when fitting the models for the inverse probability of treatment
weights. For example, if it is assumed that an individual will stay on treatment for at least 2 visits, the first 2
visits after treatment initiation by definition have a probability of staying on the treatment of 1.0 and should
thus be excluded from the weight models for those who are on treatment at the immediately previous visit. Users can
define a variable that indicates that these 2 observations are ineligible for the weight model for those who are on
treatment at the immediately previous visit and add the variable name in the argument |
Value
object
is returned with @switch_weights
set
Examples
trial_sequence("PP") |>
set_data(data = data_censored) |>
set_switch_weight_model(
numerator = ~ age_s + x1 + x3,
denominator = ~ x3 + x4,
model_fitter = stats_glm_logit(tempdir())
)
Show Weight Model Summaries
Description
Usage
show_weight_models(object)
Arguments
object |
A trial_sequence object after fitting weight models with |
Value
Prints summaries of the censoring models
Fit outcome models using stats::glm
Description
Usage
stats_glm_logit(save_path)
Arguments
save_path |
Directory to save models. Set to |
Details
Specify that the pooled logistic regression outcome models should be fit using stats::glm with family = binomial(link = "logit")
.
Outcome models additional calculate robust variance estimates using sandwich::vcovCL
.
Value
An object of class te_stats_glm_logit
inheriting from te_model_fitter which is used for
dispatching methods for the fitting models.
See Also
Other model_fitter:
parsnip_model()
,
te_model_fitter-class
Examples
stats_glm_logit(save_path = tempdir())
Summary methods
Description
Print summaries of data and model objects produced by
TrialEmulation
.
Usage
## S3 method for class 'TE_data_prep'
summary(object, ...)
## S3 method for class 'TE_data_prep_sep'
summary(object, ...)
## S3 method for class 'TE_data_prep_dt'
summary(object, ...)
## S3 method for class 'TE_msm'
summary(object, ...)
## S3 method for class 'TE_robust'
summary(object, ...)
Arguments
object |
Object to print summary |
... |
Additional arguments passed to print methods. |
Value
No value, displays summaries of object.
TrialEmulation Data Class
Description
TrialEmulation Data Class
Slots
data
A
data.table
object with columns "id", "period", "treatment", "outcome", "eligible"
Example of a prepared data object
Description
A small example object from data_preparation used in examples. It is created with the following code:
Usage
te_data_ex
Format
An object of class TE_data_prep_dt
(inherits from TE_data_prep
) of length 6.
Details
dat <- trial_example[trial_example$id < 200, ] te_data_ex <- data_preparation( data = dat, outcome_cov = c("nvarA", "catvarA"), first_period = 260, last_period = 280 )
See Also
te_datastore
Description
This is the parent class for classes which define how the expanded trial data should be stored.
To define a new storage type, a new class should be defined which inherits from te_datastore
. In addition, methods
save_expanded_data and read_expanded_data
need to be defined for the new class.
Value
A 'te_datastore' object
Slots
N
The number of observations in this data. Initially 0.
te_datastore_csv, functions and methods
Description
This method is used internally by expand_trials to save the data to the "datastore" defined in set_expansion_options.
Usage
## S4 method for signature 'te_datastore_csv'
save_expanded_data(object, data)
## S4 method for signature 'te_datastore_csv'
read_expanded_data(object, period = NULL, subset_condition = NULL)
## S4 method for signature 'te_datastore_csv'
sample_expanded_data(
object,
p_control,
period = NULL,
subset_condition = NULL,
seed
)
Arguments
object |
An object of class te_datastore or a child class. |
data |
A data frame containing the expanded trial data. The columns |
period |
An integerish vector of non-zero length to select trial period(s) or |
subset_condition |
A string of length 1 or |
Value
A 'te_datastore_csv' object.
Slots
path
path to csv files.
files
names of csv files.
template
data.frame template.
Examples
temp_dir <- tempfile("csv_dir_")
dir.create(temp_dir)
datastore <- save_to_csv(temp_dir)
data(vignette_switch_data)
save_expanded_data(datastore, vignette_switch_data[1:200, ])
# delete after use
unlink(temp_dir, recursive = TRUE)
te_datastore_duckdb, functions and methods
Description
This method is used internally by expand_trials to save the data to the "datastore" defined in set_expansion_options.
Usage
## S4 method for signature 'te_datastore_duckdb'
save_expanded_data(object, data)
## S4 method for signature 'te_datastore_duckdb'
read_expanded_data(object, period = NULL, subset_condition = NULL)
## S4 method for signature 'te_datastore_duckdb'
sample_expanded_data(
object,
p_control,
period = NULL,
subset_condition = NULL,
seed
)
Arguments
object |
An object of class te_datastore or a child class. |
data |
A data frame containing the expanded trial data. The columns |
period |
An integerish vector of non-zero length to select trial period(s) or |
subset_condition |
A string of length 1 or |
p_control |
Probability of selecting a control. |
seed |
An integer seed or |
Value
A 'te_datastore_duckdb' object.
Slots
path
Path to the duckdb file containing the data.
table
.
con
S4 object of class duckdb_connection.
Examples
temp_dir <- tempfile("csv_dir_")
dir.create(temp_dir)
datastore <- save_to_csv(temp_dir)
data(vignette_switch_data)
save_expanded_data(datastore, vignette_switch_data[1:200, ])
# delete after use
unlink(temp_dir, recursive = TRUE)
Example of a fitted marginal structural model object
Description
A small example object from trial_msm used in examples. It is created with the following code:
Usage
te_model_ex
Format
An object of class TE_msm
of length 3.
Details
te_model_ex <- trial_msm( data = data_subset, outcome_cov = c("catvarA", "nvarA"), last_followup = 40, model_var = "assigned_treatment", include_followup_time = ~followup_time, include_trial_period = ~trial_period, use_sample_weights = FALSE, quiet = TRUE, glm_function = "glm" )
See Also
Outcome Model Fitter Class
Description
This is a virtual class which other outcome model fitter classes should inherit from. Objects of these class exist to define how the outcome models are fit. They are used for the dispatch of the internal methods fit_outcome_model, fit_weights_model and predict.
See Also
Other model_fitter:
parsnip_model()
,
stats_glm_logit()
TrialEmulation Outcome Data Class
Description
TrialEmulation Outcome Data Class
Slots
data
A
data.table
object with columns "id", "period",n_rows
Number of rows
n_ids
Number of IDs
periods
Vector of periods "treatment", "outcome", "eligible"
Fitted Outcome Model Object
Description
Fitted Outcome Model Object
Slots
model
list containing fitted model objects.
summary
list of data.frames. Tidy model summaries a la
broom()
andglance()
Fitted Outcome Model Object
Description
Fitted Outcome Model Object
Slots
formula
formula
object for the model fittingadjustment_vars
character. Adjustment variables
treatment_var
Variable used for treatment
stabilised_weights_terms
formula. Adjustment terms from numerator models of stabilised weights. These must be included in the outcome model.
adjustment_terms
formula. User specified terms to include in the outcome model
treatment_terms
formula. Estimand defined treatment term
followup_time_terms
formula. Terms to model follow up time within an emulated trial
trial_period_terms
formula. Terms to model start time ("trial_period") of an emulated trial
model_fitter
Model fitter object
fitted
list. Saves the model objects
Fit Models using parsnip
Description
The classes and (internal) methods defined for using parsnip to fit the weight models.
Usage
## S4 method for signature 'te_parsnip_model'
fit_weights_model(object, data, formula, label)
Arguments
object |
The object determining which method should be used, containing any slots containing user defined parameters. |
data |
|
formula |
|
label |
A short string describing the model. |
Functions
-
fit_weights_model(te_parsnip_model)
: Fit the weight models object via calculate_weights ontrial_sequence
Slots
model_spec
A model specification defined with the
parsnip
package.
See Also
Other model_fitter_classes:
te_stats_glm_logit-class
Fit Models using logistic stats::glm
Description
The classes and (internal) methods defined for using stats::glm to fit the logistic regression models.
Usage
## S4 method for signature 'te_stats_glm_logit'
fit_weights_model(object, data, formula, label)
## S4 method for signature 'te_stats_glm_logit'
fit_outcome_model(object, data, formula, weights = NULL)
## S4 method for signature 'te_stats_glm_logit_outcome_fitted'
predict(
object,
newdata,
predict_times,
conf_int = TRUE,
samples = 100,
type = c("cum_inc", "survival")
)
Arguments
object |
Object to dispatch method on |
data |
|
formula |
|
label |
A short string describing the model. |
weights |
|
newdata |
Baseline trial data that characterise the target trial population that marginal cumulative incidences
or survival probabilities are predicted for. |
predict_times |
Specify the follow-up visits/times where the marginal cumulative incidences or survival probabilities are predicted. |
conf_int |
Construct the point-wise 95-percent confidence intervals of cumulative incidences for the target trial population under treatment and non-treatment and their differences by simulating the parameters in the marginal structural model from a multivariate normal distribution with the mean equal to the marginal structural model parameter estimates and the variance equal to the estimated robust covariance matrix. |
samples |
Number of samples used to construct the simulation-based confidence intervals. |
type |
Specify cumulative incidences or survival probabilities to be predicted. Either cumulative incidence
( |
Functions
-
fit_weights_model(te_stats_glm_logit)
: Fit the weight models object via calculate_weights ontrial_sequence
-
fit_outcome_model(te_stats_glm_logit)
: Fit the outcome model object via fit_msm ontrial_sequence
-
predict(te_stats_glm_logit_outcome_fitted)
: Predict from the fitted model object via predict ontrial_sequence
See Also
Other model_fitter_classes:
te_parsnip_model-class
Example of longitudinal data for sequential trial emulation
Description
A dataset containing the treatment, outcomes and other attributes of 503 patients for sequential trial emulation.
See vignette("Getting-Started")
.
Usage
trial_example
Format
A data frame with 48400 rows and 11 variables:
- id
patient identifier
- eligible
eligible for trial start in this period, 1=yes, 0=no
- period
time period
- outcome
indicator for outcome in this period, 1=event occurred, 0=no event
- treatment
indicator for receiving treatment in this period, 1=treatment, 0=no treatment
- catvarA
A categorical variable relating to treatment and the outcome
- catvarB
A categorical variable relating to treatment and the outcome
- catvarC
A categorical variable relating to treatment and the outcome
- nvarA
A numerical variable relating to treatment and the outcome
- nvarB
A numerical variable relating to treatment and the outcome
- nvarC
A numerical variable relating to treatment and the outcome
Fit the marginal structural model for the sequence of emulated trials
Description
Usage
trial_msm(
data,
outcome_cov = ~1,
estimand_type = c("ITT", "PP", "As-Treated"),
model_var = NULL,
first_followup = NA,
last_followup = NA,
analysis_weights = c("asis", "unweighted", "p99", "weight_limits"),
weight_limits = c(0, Inf),
include_followup_time = ~followup_time + I(followup_time^2),
include_trial_period = ~trial_period + I(trial_period^2),
where_case = NA,
glm_function = c("glm", "parglm"),
use_sample_weights = TRUE,
quiet = FALSE,
...
)
Arguments
data |
A |
outcome_cov |
A RHS formula with baseline covariates to be adjusted for in the marginal structural model for the
emulated trials. Note that if a time-varying covariate is specified in |
estimand_type |
Specify the estimand for the causal analyses in the sequence of emulated trials. |
model_var |
Treatment variables to be included in the marginal structural model for the emulated trials.
|
first_followup |
First follow-up time/visit in the trials to be included in the marginal structural model for the outcome event. |
last_followup |
Last follow-up time/visit in the trials to be included in the marginal structural model for the outcome event. |
analysis_weights |
Choose which type of weights to be used for fitting the marginal structural model for the outcome event.
|
weight_limits |
Lower and upper limits to truncate weights, given as |
include_followup_time |
The model to include the follow up time/visit of the trial ( |
include_trial_period |
The model to include the trial period ( |
where_case |
Define conditions using variables specified in |
glm_function |
Specify which glm function to use for the marginal structural model from the |
use_sample_weights |
Use case-control sampling weights in addition to inverse probability weights for treatment
and censoring. |
quiet |
Suppress the printing of progress messages and summaries of the fitted models. |
... |
Additional arguments passed to |
Details
Apply a weighted pooled logistic regression to fit the marginal structural model for the sequence of emulated trials and calculates the robust covariance matrix of parameter using the sandwich estimator.
The model formula is constructed by combining the arguments outcome_cov
, model_var
,
include_followup_time
, and include_trial_period
.
Value
Object of class TE_msm
containing
- model
a
glm
object- robust
a list containing a summary table of estimated regression coefficients and the robust covariance matrix
- args
a list contain the parameters used to prepare and fit the model
Create a sequence of emulated target trials object
Description
Usage
trial_sequence(estimand, ...)
Arguments
estimand |
The name of the estimand for this analysis, either one of |
... |
Other parameters used when creating object |
Value
An estimand specific trial sequence object
Examples
trial_sequence("ITT")
Trial Sequence class
Description
Trial Sequence class
Slots
data
te_data.
estimand
character. Descriptive name of estimand.
expansion
te_expansion
outcome_model
te_outcome_model.
outcome_data
te_outcome_data.
censor_weight
te_weight. Object to define weighting to account for informative censoring
censor_weight
te_weight. Object to define weighting to account for informative censoring due to treatment switching
Example of expanded longitudinal data for sequential trial emulation
Description
This is the expanded dataset created in the vignette("Getting-Started")
known as switch_data
.
Usage
vignette_switch_data
Format
A data frame with 1939053 rows and 7 variables:
- id
patient identifier
- trial_period
trial start time period
- followup_time
follow up time within trial
- outcome
indicator for outcome in this period, 1=event occurred, 0=no event
- treatment
indicator for receiving treatment in this period, 1=treatment, 0=non-treatment
- assigned_treatment
indicator for assigned treatment at baseline of the trial, 1=treatment, 0=non-treatment
- weight
weights for use with model fitting
- catvarA
A categorical variable relating to treatment and the outcome
- catvarB
A categorical variable relating to treatment and the outcome
- catvarC
A categorical variable relating to treatment and the outcome
- nvarA
A numerical variable relating to treatment and the outcome
- nvarB
A numerical variable relating to treatment and the outcome
- nvarC
A numerical variable relating to treatment and the outcome
Data used in weight model fitting
Description
Usage
weight_model_data_indices(
object,
type = c("switch", "censor"),
model,
set_col = NULL
)
Arguments
object |
A trial_sequence object |
type |
Select a censoring or switching model |
model |
The model name |
set_col |
A character string to specifying a new column to contain indicators for observations used in fitting this model. |
Value
If set_col
is not specified a logical data.table
column is returned. Otherwise
Examples
trial_pp <- trial_sequence("PP") |>
set_data(data_censored) |>
set_switch_weight_model(
numerator = ~age,
denominator = ~ age + x1 + x3,
model_fitter = stats_glm_logit(tempdir())
) |>
calculate_weights()
ipw_data(trial_pp)
show_weight_models(trial_pp)
# get logical column for own processing
i <- weight_model_data_indices(trial_pp, "switch", "d0")
# set column in data
weight_model_data_indices(trial_pp, "switch", "d0", set_col = "sw_d0")
weight_model_data_indices(trial_pp, "switch", "d1", set_col = "sw_d1")
ipw_data(trial_pp)