Type: | Package |
Title: | Rank-Based Test to Evaluate a Surrogate Marker |
Version: | 2.0 |
Description: | Uses a novel rank-based nonparametric approach to evaluate a surrogate marker in a small sample size setting. Details are described in Parast et al (2024) <doi:10.1093/biomtc/ujad035> and Hughes A et al (2025) <doi:10.48550/arXiv.2502.03030>. A tutorial for this package can be found at https://www.laylaparast.com/surrogaterank and a Shiny App implementing the package can be found at https://parastlab.shinyapps.io/SurrogateRankApp/. |
License: | GPL-2 | GPL-3 [expanded from: GPL] |
Encoding: | UTF-8 |
Imports: | stats,dplyr,ggplot2,pbmcapply |
Suggests: | roxygen2 |
RoxygenNote: | 7.3.2 |
Config/testthat/edition: | 3 |
NeedsCompilation: | no |
Packaged: | 2025-05-20 13:25:46 UTC; parastlm |
Author: | Layla Parast [aut, cre], Arthur Hughes [aut] |
Maintainer: | Layla Parast <parast@austin.utexas.edu> |
Depends: | R (≥ 3.5.0) |
Repository: | CRAN |
Date/Publication: | 2025-05-20 13:40:02 UTC |
Calculates the rank-based test statistic for Y and S and the difference, delta
Description
Calculates the rank-based test statistic for Y and the rank-based test statistic for S and the difference, delta, along with corresponding standard error estimates
Usage
delta.calculate(full.data = NULL, yone = NULL, yzero = NULL, sone = NULL, szero = NULL)
Arguments
full.data |
either full.data or yone, yzero, sone, szero must be supplied; if full data is supplied it must be in the following format: one observation per row, Y is in the first column, S is in the second column, treatment group (0 or 1) is in the third column. |
yone |
primary outcome, Y, in group 1 |
yzero |
primary outcome, Y, in group 0 |
sone |
surrogate marker, S, in group 1 |
szero |
surrogate marker, S, in group 0 |
Value
u.y |
rank-based test statistic for Y |
u.s |
rank-based test statistic for S |
delta |
difference, u.y-u.s |
sd.u.y |
standard error estimate of u.y |
sd.u.s |
standard error estimate of u.s |
sd.delta |
standard error estimate of delta |
Author(s)
Layla Parast
Examples
data(example.data)
delta.calculate(yone = example.data$y1, yzero = example.data$y0, sone = example.data$s1,
szero = example.data$s0)
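To make these quantities concrete, the following base-R sketch computes a probability-scale rank statistic for Y and for S and takes their difference. It assumes the statistic is the Mann-Whitney estimate of P(treated > untreated) plus half the probability of a tie. `rank.stat` and the toy vectors are hypothetical and not part of the package, and the standard errors returned by delta.calculate() are not reproduced here.

```r
# Hypothetical helper (not a package function): proportion of
# treated/untreated pairs in which the treated value is larger,
# counting ties as one half. This form is an assumption.
rank.stat <- function(xone, xzero) {
  greater <- outer(xone, xzero, ">")
  ties <- outer(xone, xzero, "==")
  mean(greater + 0.5 * ties)
}

# Toy values, made up for illustration only.
yone <- c(5.1, 6.3, 7.0, 8.2)
yzero <- c(4.0, 5.5, 6.1, 4.8)
sone <- c(2.0, 3.5, 3.9, 4.4)
szero <- c(1.5, 3.6, 2.2, 2.9)

u.y <- rank.stat(yone, yzero)  # statistic for the primary outcome
u.s <- rank.stat(sone, szero)  # statistic for the surrogate
delta <- u.y - u.s             # difference in treatment effects

print(c(u.y = u.y, u.s = u.s, delta = delta))
```

On these toy values u.y is 0.875, u.s is 0.75, and delta is 0.125. Multiplying u.y by the number of pairs (16) gives the W statistic reported by stats::wilcox.test(yone, yzero).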
Calculates the rank-based test statistic for Y and S and the difference, delta, accommodating paired data and allowing for a two-sided test
Description
This function calculates the difference in treatment effects on a univariate marker and on a continuous primary response. It extends the delta.calculate() function to the case where samples may be paired instead of independent, and where a two-sided test is desired.
Usage
delta.calculate.extension(yone, yzero, sone, szero, paired = FALSE)
Arguments
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group
with dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group
with dimension |
paired |
logical flag giving if the data is independent or paired. If
|
Details
This function estimates the difference (delta) between two rank-based statistics (e.g., Wilcoxon statistics or paired ranks) for a primary outcome and a surrogate, under either an independent or paired design.
Value
A list with the following elements:
- u.y: Rank-based test statistic for the primary outcome
- u.s: Rank-based test statistic for the surrogate
- delta.estimate: Estimated difference between outcome and surrogate statistics
- sd.u.y: Standard deviation of the outcome statistic
- sd.u.s: Standard deviation of the surrogate statistic
- sd.delta: Standard error of the delta estimate
Author(s)
Arthur Hughes, Layla Parast
Examples
# Load data
data("example.data")
yone <- example.data$y1
yzero <- example.data$y0
sone <- example.data$s1
szero <- example.data$s0
delta.calculate.extension.result <- delta.calculate.extension(
yone, yzero, sone, szero,
paired = TRUE
)
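Because the returned list carries both delta.estimate and sd.delta, a two-sided interval can be formed from them. The sketch below uses a normal approximation, which is an assumption made for illustration and not necessarily how the package computes its own intervals. `delta.interval` is a hypothetical helper, and the numeric inputs in the first call are made up.

```r
# Hypothetical helper: two-sided normal-approximation interval for delta.
delta.interval <- function(estimate, sd, alpha = 0.05) {
  z <- stats::qnorm(1 - alpha / 2)
  c(lower = estimate - z * sd, upper = estimate + z * sd)
}

# Made-up values standing in for delta.estimate and sd.delta.
ci <- delta.interval(estimate = 0.10, sd = 0.05)
print(round(ci, 3))

# With the package installed, the same helper can be applied to a real result.
if (requireNamespace("SurrogateRank", quietly = TRUE)) {
  e <- new.env()
  data("example.data", package = "SurrogateRank", envir = e)
  d <- e$example.data
  res <- SurrogateRank::delta.calculate.extension(
    d$y1, d$y0, d$s1, d$s0,
    paired = TRUE
  )
  print(delta.interval(res$delta.estimate, res$sd.delta))
}
```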
Estimated power to detect a valid surrogate
Description
Calculates the estimated power to detect a valid surrogate given a total sample size and specified alternative
Usage
est.power(n.total, rho = 0.8, u.y.alt, delta.alt, power.want.s = 0.7)
Arguments
n.total |
total sample size in study |
rho |
rank correlation between Y and S in group 0, default is 0.8 |
u.y.alt |
specified alternative for u.y |
delta.alt |
specified alternative for delta |
power.want.s |
desired power for u.s, default is 0.7 |
Value
estimated power
Author(s)
Layla Parast
Examples
est.power(n.total = 50, rho = 0.8, u.y.alt=0.9, delta.alt = 0.1)
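When planning a study it can help to see how estimated power changes with the total sample size. The sketch below wraps any power function that takes an `n.total` argument. `power.curve` and `toy.power` are hypothetical names, and `toy.power` is a simple one-sided z-test stand-in used only so the block runs without the package; it is not the formula used by est.power().

```r
# Hypothetical helper: evaluate a power function over a grid of sample sizes.
power.curve <- function(n.grid, power.fun, ...) {
  sapply(n.grid, function(n) power.fun(n.total = n, ...))
}

# Stand-in power function (NOT the package's calculation): one-sided z-test
# at level 0.05 for a standardized effect with equal group sizes.
toy.power <- function(n.total, effect) {
  stats::pnorm(effect * sqrt(n.total) / 2 - stats::qnorm(0.95))
}

n.grid <- c(20, 50, 100)
toy.curve <- power.curve(n.grid, toy.power, effect = 0.5)
print(data.frame(n.total = n.grid, power = round(toy.curve, 3)))

# With the package installed, est.power() can be passed in directly.
if (requireNamespace("SurrogateRank", quietly = TRUE)) {
  pkg.curve <- power.curve(
    n.grid, SurrogateRank::est.power,
    rho = 0.8, u.y.alt = 0.9, delta.alt = 0.1
  )
  print(data.frame(n.total = n.grid, power = pkg.curve))
}
```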
Example data
Description
Example data used to illustrate the functions
Usage
data("example.data")
Format
A list with 4 elements representing 25 observations from a treatment group (group 1) and 25 observations from a control group (group 0):
y1
the primary outcome, Y, in group 1
y0
the primary outcome, Y, in group 0
s1
the surrogate marker, S, in group 1
s0
the surrogate marker, S, in group 0
Examples
data(example.data)
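The four-element list layout can be reshaped into the three-column full.data format accepted by delta.calculate() and test.surrogate() (Y first, S second, treatment group third). The sketch below uses a small made-up list with the same element names as example.data; the column names `y`, `s`, and `treatment` are hypothetical, since the documentation specifies only the column order.

```r
# Made-up stand-in with the same element names as example.data.
toy.data <- list(
  y1 = c(5.1, 6.3, 7.0),
  y0 = c(4.0, 5.5, 6.1),
  s1 = c(2.0, 3.5, 3.9),
  s0 = c(1.5, 3.6, 2.2)
)

# One observation per row: Y, then S, then treatment group (1 or 0).
full.data <- data.frame(
  y = c(toy.data$y1, toy.data$y0),
  s = c(toy.data$s1, toy.data$s0),
  treatment = rep(c(1, 0), times = c(length(toy.data$y1), length(toy.data$y0)))
)

print(full.data)
```

A data frame built this way could then be passed as `full.data` in place of the four separate vectors.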
Example data for the high-dimensional functions
Description
A simulated high-dimensional dataset for demonstrating the RISE methodology implemented in this package. The data contain a primary response and 1000 surrogate candidates for 25 treated and 25 untreated individuals, where 10% of the surrogate candidates are "valid".
Usage
data("example.data.highdim")
Format
A list containing:
- y1
primary response in treated
- y0
primary response in untreated
- s1
1000 surrogate candidates in treated
- s0
1000 surrogate candidates in untreated
- hyp
for each surrogate, "null false" if the surrogate is valid (note that this is from simulated data and is used to demonstrate the method; this would be unknown in practice)
Source
Simulated for package examples.
Examples
data("example.data.highdim")
Performs the evaluation stage of RISE: Two-Stage Rank-Based Identification of High-Dimensional Surrogate Markers
Description
A set of high-dimensional surrogate candidates are evaluated jointly. Strength of surrogacy is assessed through a rank-based measure of the similarity in treatment effects on a candidate surrogate and the primary response.
Usage
rise.evaluate(
yone,
yzero,
sone,
szero,
alpha = 0.05,
power.want.s = NULL,
epsilon = NULL,
u.y.hyp = NULL,
p.correction = "BH",
n.cores = 1,
alternative = "less",
paired = FALSE,
return.all.evaluate = TRUE,
return.plot.evaluate = TRUE,
evaluate.weights = TRUE,
screening.weights = NULL,
markers = NULL
)
Arguments
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group
with dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group
with dimension |
alpha |
significance level for determining surrogate candidates. Default is 0.05. |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based
on the surrogate candidate. Either this or |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate
validity. Either this or |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see |
n.cores |
numeric giving the number of cores to commit to parallel computation
in order to improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of
|
paired |
logical flag giving if the data is independent or paired. If
|
return.all.evaluate |
logical flag. If |
return.plot.evaluate |
logical flag. If |
evaluate.weights |
logical flag. If |
screening.weights |
dataframe with columns |
markers |
a vector of marker names (column names of szero and sone) to evaluate. If not given, will default to evaluating all markers in the dataframes. |
Value
A list with:
- individual.metrics: If return.all.evaluate = TRUE, a dataframe of evaluation results for each significant marker.
- gamma.s: A list with elements gamma.s.one and gamma.s.zero, giving the combined surrogate marker in the treated and untreated groups, respectively.
- gamma.s.evaluate: A dataframe giving the evaluation of gamma.s.
- gamma.s.plot: A ggplot2 plot showing gamma.s against the primary response on the rank-scale.
Author(s)
Arthur Hughes
Examples
# Load high-dimensional example data
data("example.data.highdim")
yone <- example.data.highdim$y1
yzero <- example.data.highdim$y0
sone <- example.data.highdim$s1
szero <- example.data.highdim$s0
rise.evaluate.result <- rise.evaluate(yone, yzero, sone, szero, power.want.s = 0.8)
Perform the screening stage of RISE: Two-Stage Rank-Based Identification of High-Dimensional Surrogate Markers
Description
A set of high-dimensional surrogate candidates are screened one-by-one to identify strong candidates. Strength of surrogacy is assessed through a rank-based measure of the similarity in treatment effects on a candidate surrogate and the primary response. P-values corresponding to hypothesis testing on this measure are corrected for the high number of statistical tests performed.
Usage
rise.screen(
yone,
yzero,
sone,
szero,
alpha = 0.05,
power.want.s = NULL,
epsilon = NULL,
u.y.hyp = NULL,
p.correction = "BH",
n.cores = 1,
alternative = "less",
paired = FALSE,
return.all.screen = TRUE
)
Arguments
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group with dimension
|
szero |
matrix or dataframe of surrogate candidates in the untreated group with dimension
|
alpha |
significance level for determining surrogate candidates. Default is 0.05. |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based on the
surrogate candidate. Either this or |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate validity. Either
this or |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see |
n.cores |
numeric giving the number of cores to commit to parallel computation in order to
improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of
|
paired |
logical flag giving if the data is independent or paired. If |
return.all.screen |
logical flag. If |
Value
a list with elements
- screening.metrics: dataframe of screening results (for each candidate marker: delta, CI, sd, epsilon, p-values).
- significant.markers: character vector of markers with p_adjusted < alpha.
- screening.weights: dataframe giving marker names and the inverse absolute value of the associated deltas.
Author(s)
Arthur Hughes
Examples
# Load high-dimensional example data
data("example.data.highdim")
yone <- example.data.highdim$y1
yzero <- example.data.highdim$y0
sone <- example.data.highdim$s1
szero <- example.data.highdim$s0
rise.screen.result <- rise.screen(yone, yzero, sone, szero, power.want.s = 0.8)
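The multiplicity correction and the screening weights described above can be sketched with base R. The block below applies the Benjamini-Hochberg adjustment from stats::p.adjust() (matching the default p.correction = "BH"), keeps markers whose adjusted p-value is below alpha, and computes weights as the inverse absolute delta. All p-values, deltas, marker names, and the column names `marker` and `weight` are made up for illustration; this is not the package's internal code.

```r
# Made-up raw p-values and delta estimates for five hypothetical markers.
p.raw <- c(m1 = 0.001, m2 = 0.008, m3 = 0.039, m4 = 0.041, m5 = 0.200)
delta.hat <- c(m1 = 0.02, m2 = -0.05, m3 = 0.10, m4 = 0.20, m5 = 0.40)
alpha <- 0.05

# Benjamini-Hochberg adjustment for the number of tests performed.
p.adjusted <- stats::p.adjust(p.raw, method = "BH")

# Markers that pass screening at level alpha.
significant.markers <- names(p.adjusted)[p.adjusted < alpha]

# Weights as the inverse absolute value of each delta
# (column names are hypothetical).
screening.weights <- data.frame(
  marker = names(delta.hat),
  weight = 1 / abs(delta.hat)
)

print(p.adjusted)
print(significant.markers)
print(screening.weights)
```

In this toy input m3 and m4 have raw p-values below 0.05 but adjusted p-values of 0.05125, so only m1 and m2 are retained.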
Tests whether the surrogate is valid
Description
Calculates the rank-based test statistic for Y and the rank-based test statistic for S and the difference, delta, along with corresponding standard error estimates, then tests whether the surrogate is valid
Usage
test.surrogate(full.data = NULL, yone = NULL, yzero = NULL, sone = NULL,
szero = NULL, epsilon = NULL, power.want.s = 0.7, u.y.hyp = NULL)
Arguments
full.data |
either full.data or yone, yzero, sone, szero must be supplied; if full data is supplied it must be in the following format: one observation per row, Y is in the first column, S is in the second column, treatment group (0 or 1) is in the third column. |
yone |
primary outcome, Y, in group 1 |
yzero |
primary outcome, Y, in group 0 |
sone |
surrogate marker, S, in group 1 |
szero |
surrogate marker, S, in group 0 |
epsilon |
threshold to use for delta, default calculates epsilon as a function of desired power for S |
power.want.s |
desired power for S, default is 0.7 |
u.y.hyp |
hypothesized value of u.y used in the calculation of epsilon, default uses estimated value of u.y |
Value
u.y |
rank-based test statistic for Y |
u.s |
rank-based test statistic for S |
delta |
difference, u.y-u.s |
sd.u.y |
standard error estimate of u.y |
sd.u.s |
standard error estimate of u.s |
sd.delta |
standard error estimate of delta |
ci.delta |
1-sided confidence interval for delta |
epsilon.used |
the epsilon value used for the test |
is.surrogate |
logical, TRUE if test indicates S is a good surrogate, FALSE otherwise |
Author(s)
Layla Parast
Examples
data(example.data)
test.surrogate(yone = example.data$y1, yzero = example.data$y0, sone = example.data$s1,
szero = example.data$s0)
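The relationship between delta, its one-sided confidence interval, and epsilon can be illustrated with a short sketch. It assumes a non-inferiority style rule in which the surrogate is declared valid when the one-sided upper confidence bound for delta, formed with a normal approximation, falls below epsilon. `is.valid.surrogate` is a hypothetical helper and the inputs are made up; the package's own calculation may differ in detail.

```r
# Hypothetical helper: assumed decision rule, upper bound of delta < epsilon.
is.valid.surrogate <- function(delta, sd.delta, epsilon, alpha = 0.05) {
  upper <- delta + stats::qnorm(1 - alpha) * sd.delta
  list(ci.upper = upper, is.surrogate = upper < epsilon)
}

# Made-up inputs: the upper bound is about 0.289.
check.loose <- is.valid.surrogate(delta = 0.125, sd.delta = 0.10, epsilon = 0.35)
check.strict <- is.valid.surrogate(delta = 0.125, sd.delta = 0.10, epsilon = 0.20)

print(check.loose$is.surrogate)   # bound is below 0.35
print(check.strict$is.surrogate)  # bound is above 0.20
```

A smaller epsilon demands closer agreement between the treatment effects on Y and S, so the same estimate can pass with a lenient threshold and fail with a strict one.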
Tests whether the surrogate is valid, extended to the paired, two-sided test setting
Description
Calculates the rank-based test statistic for Y and the rank-based test statistic for S and the difference, delta, along with corresponding standard error estimates, then tests whether the surrogate is valid. This extends the test.surrogate() function to the case where samples may be paired instead of independent, and where a two-sided test is desired.
Usage
test.surrogate.extension(
yone,
yzero,
sone,
szero,
alpha = 0.05,
power.want.s = NULL,
epsilon = NULL,
u.y.hyp = NULL,
alternative = "less",
paired = FALSE
)
Arguments
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group
with dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group
with dimension |
alpha |
significance level for determining surrogate candidates. Default is 0.05. |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based
on the surrogate candidate. Either this or |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate
validity. Either this or |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
alternative |
character giving the alternative hypothesis type. One of
|
paired |
logical flag giving if the data is independent or paired. If
|
Value
A list containing:
- u.y: Estimated rank-based treatment effect on the outcome.
- u.s: Estimated rank-based treatment effect on the surrogate.
- delta.estimate: Estimated difference in treatment effects: u.y - u.s.
- sd.u.y: Standard deviation of u.y.
- sd.u.s: Standard deviation of u.s.
- sd.delta: Standard deviation of delta.estimate.
- ci.delta: One-sided confidence interval upper bound for delta.estimate.
- p.delta: p-value for validity of trial-level surrogacy.
- epsilon.used: Non-inferiority threshold used in the test.
- is.surrogate: TRUE if the surrogate passes the test, else FALSE.
Author(s)
Arthur Hughes, Layla Parast
Examples
# Load data
data("example.data")
yone <- example.data$y1
yzero <- example.data$y0
sone <- example.data$s1
szero <- example.data$s0
test.surrogate.extension.result <- test.surrogate.extension(
yone, yzero, sone, szero,
power.want.s = 0.8, paired = TRUE, alternative = "two.sided"
)
Performs RISE: Two-Stage Rank-Based Identification of High-Dimensional Surrogate Markers
Description
RISE (Rank-Based Identification of High-Dimensional Surrogate Markers) is a two-stage method to identify and evaluate high-dimensional surrogate candidates of a continuous response.
In the first stage (called screening), the high-dimensional candidates are screened one-by-one to identify strong candidates. Strength of surrogacy is assessed through a rank-based measure of the similarity in treatment effects on a candidate surrogate and the primary response. P-values corresponding to hypothesis testing on this measure are corrected for the high number of statistical tests performed.
In the second stage (called evaluation), candidates with an adjusted p-value below a given significance level are evaluated by combining them into a single synthetic marker. The surrogacy of this marker is then assessed with the univariate test as described before.
To avoid overfitting, the two stages are performed on separate data.
Usage
test.surrogate.rise(
yone,
yzero,
sone,
szero,
alpha = 0.05,
power.want.s = NULL,
epsilon = NULL,
u.y.hyp = NULL,
p.correction = "BH",
n.cores = 1,
alternative = "less",
paired = FALSE,
screen.proportion = 0.66,
return.all.screen = TRUE,
return.all.evaluate = TRUE,
return.plot.evaluate = TRUE,
evaluate.weights = TRUE
)
Arguments
yone |
numeric vector of primary response values in the treated group. |
yzero |
numeric vector of primary response values in the untreated group. |
sone |
matrix or dataframe of surrogate candidates in the treated group with
dimension |
szero |
matrix or dataframe of surrogate candidates in the untreated group with
dimension |
alpha |
significance level for determining surrogate candidates. Default is 0.05. |
power.want.s |
numeric in (0,1) - power desired for a test of treatment effect based on
the surrogate candidate. Either this or |
epsilon |
numeric in (0,1) - non-inferiority margin for determining surrogate
validity. Either this or |
u.y.hyp |
hypothesised value of the treatment effect on the primary response on the probability scale. If not given, it will be estimated based on the observations. |
p.correction |
character. Method for p-value adjustment (see |
n.cores |
numeric giving the number of cores to commit to parallel computation in
order to improve computational time through the |
alternative |
character giving the alternative hypothesis type. One of
|
paired |
logical flag giving if the data is independent or paired. If
|
screen.proportion |
numeric in (0,1) - proportion of data to be used for the screening stage. The default is 0.66. |
return.all.screen |
logical flag. If |
return.all.evaluate |
logical flag. If |
return.plot.evaluate |
logical flag. If |
evaluate.weights |
logical flag. If |
Value
a list with
- screening.results: a list with
  - screening.metrics: dataframe of screening results (for each candidate marker: delta, CI, sd, epsilon, p-values).
  - significant_markers: character vector of markers with p_adjusted < alpha.
- evaluate.results: a list with
  - individual.metrics: if return.all.evaluate = TRUE, a dataframe of evaluation results for each significant marker.
  - gamma.s: a list with elements gamma.s.one and gamma.s.zero, giving the combined surrogate marker in the treated and untreated groups, respectively.
  - gamma.s.evaluate: a dataframe giving the evaluation of gamma.s.
  - gamma.s.plot: a ggplot2 plot showing gamma.s against the primary response on the rank-scale.
Author(s)
Arthur Hughes
Examples
# Load high-dimensional example data
data("example.data.highdim")
yone <- example.data.highdim$y1
yzero <- example.data.highdim$y0
sone <- example.data.highdim$s1
szero <- example.data.highdim$s0
rise.result <- test.surrogate.rise(yone, yzero, sone, szero, power.want.s = 0.8)
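The description above notes that the two stages are performed on separate data, with screen.proportion controlling the share used for screening. A minimal sketch of such a split is shown below. It assumes a random split within each treatment group with the screening size rounded down; the package's actual splitting scheme is not documented here and may differ. All object names are hypothetical.

```r
set.seed(1)

n.one <- 25               # treated individuals
n.zero <- 25              # untreated individuals
screen.proportion <- 0.66

# Randomly assign individuals in each group to the screening stage.
screen.one <- sample(n.one, floor(screen.proportion * n.one))
screen.zero <- sample(n.zero, floor(screen.proportion * n.zero))

# Everyone else is held out for the evaluation stage.
evaluate.one <- setdiff(seq_len(n.one), screen.one)
evaluate.zero <- setdiff(seq_len(n.zero), screen.zero)

# Made-up treated outcomes, subset by stage.
yone <- rnorm(n.one)
yone.screen <- yone[screen.one]
yone.evaluate <- yone[evaluate.one]

print(c(screening = length(screen.one), evaluation = length(evaluate.one)))
```

With 25 individuals per group and a proportion of 0.66, this rule places 16 per group in screening and 9 per group in evaluation, with no individual used in both stages.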