Help for package ordinalgmifs

Version:	1.0.8
Date:	2023-05-01
Title:	Ordinal Regression for High-Dimensional Data
Depends:	R (≥ 4.2.0), survival
Description:	Provides a function for fitting cumulative link, adjacent category, forward and backward continuation ratio, and stereotype ordinal response models when the number of parameters exceeds the sample size, using the generalized monotone incremental forward stagewise method.
License:	GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Imports:	methods
BuildResaveData:	best
SystemRequirements:	C++11
NeedsCompilation:	yes
BuildVignettes:	TRUE
LazyData:	true
Packaged:	2023-05-04 12:01:04 UTC; archer.43
Author:	Kellie J. Archer [aut, cre], Jiayi Hou [aut], Qing Zhou [aut], Kyle Ferber [aut], John G. Layne [com, ctr], Amanda Gentry [rev]
Maintainer:	Kellie J. Archer <archer.43@osu.edu>
Repository:	CRAN
Date/Publication:	2023-05-05 07:30:06 UTC

Ordinal Response Regression for High-Dimensional Data

Description

This package provides a function, ordinalgmifs, for fitting cumulative link, adjacent category, forward and backward continuation ratio, and stereotype ordinal response models when the number of parameters exceeds the sample size, using the generalized monotone incremental forward stagewise method.

Details

Package:	ordinalgmifs
Version:	1.0.8
Date:	2023-05-01
Title:	Ordinal Regression for High-Dimensional Data
Authors@R:	c(person(c("Kellie", "J."), "Archer", email = "archer.43@osu.edu", role = c("aut", "cre"), comment = c(ORCID = "0000-0003-1555-5781")), person("Jiayi", "Hou", role = "aut"), person("Qing", "Zhou", role = "aut"), person("Kyle","Ferber", role = "aut"), person(c("John", "G."), "Layne", role = c("com","ctr")), person("Amanda", "Gentry", role = "rev") )
Depends:	R (>= 4.2.0), survival
Description:	Provides a function for fitting cumulative link, adjacent category, forward and backward continuation ratio, and stereotype ordinal response models when the number of parameters exceeds the sample size, using the generalized monotone incremental forward stagewise method.
License:	GPL (>= 2)
Imports:	methods
BuildResaveData:	best
SystemRequirements:	C++11
NeedsCompilation:	yes
BuildVignettes:	TRUE
LazyData:	true
Author:	Kellie J. Archer [aut, cre] (<https://orcid.org/0000-0003-1555-5781>), Jiayi Hou [aut], Qing Zhou [aut], Kyle Ferber [aut], John G. Layne [com, ctr], Amanda Gentry [rev]
Maintainer:	Kellie J. Archer <archer.43@osu.edu>

Index of help topics:

coef.ordinalgmifs       Extract Model Coefficients
eyedisease              Eye Disease Risk Factors
hccframe                Liver Cancer Methylation Data
ordinalgmifs            Ordinal Generalized Monotone Incremental
                        Forward Stagewise Regression
ordinalgmifs-package    Ordinal Response Regression for
                        High-Dimensional Data
plot.ordinalgmifs       Plot Solution Path for Ordinal GMIFS Fitted
                        Model.
predict.ordinalgmifs    Predicted Probabilities and Class for Ordinal
                        GMIFS Fit.
print.ordinalgmifs      Print the Contents of an Ordinal GMIFS Fitted
                        Object.
summary.ordinalgmifs    Summarize an Ordinal GMIFS Object.

This package contains generic methods (coef, plot, predict, print, summary) that can be invoked for an object fitted using ordinalgmifs.

Author(s)

Kellie J. Archer, Jiayi Hou, Qing Zhou, Kyle Ferber, John G. Layne, Amanda Gentry

Maintainer: Kellie J. Archer <archer.43@osu.edu>

References

Hastie T., Taylor J., Tibshirani R., and Walther G. (2007) Forward stagewise regression and the monotone lasso. Electronic Journal of Statistics, 1, 1-29.

Extract Model Coefficients

Description

coef.ordinalgmifs is the coef method for class ordinalgmifs; it extracts the model coefficients from a model object fitted using ordinalgmifs.

Usage

## S3 method for class 'ordinalgmifs'
coef(object, model.select = "AIC", ...)

Arguments

object

an ordinalgmifs object.

model.select

when x is specified, any model along the solution path can be selected. The default is model.select="AIC", which extracts the coefficients from the model having the lowest AIC. Other options are model.select="BIC" or a numeric step number along the solution path.

...

other arguments.

Value

Coefficients extracted from the model object.
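
As a usage sketch, the calls below refit the small hccframe model from the ordinalgmifs examples and extract coefficients at different points on the solution path. The object names hcc.fit and bic.step are illustrative, and the block is guarded so that it only runs when the package is installed.

```r
# Usage sketch: coefficients at different steps of the solution path.
if (requireNamespace("ordinalgmifs", quietly = TRUE)) {
  library(ordinalgmifs)
  data(hccframe)
  # Same small model as in the ordinalgmifs examples: one unpenalized term
  # and two penalized CpG sites.
  hcc.fit <- ordinalgmifs(group ~ MPO_E302_R,
                          x = c("DDIT3_P1313_R", "HDAC9_P137_R"),
                          data = hccframe, epsilon = 0.01)
  print(coef(hcc.fit))                        # default: lowest-AIC step
  print(coef(hcc.fit, model.select = "BIC"))  # lowest-BIC step
  # A numeric step, here the lowest-BIC step located by hand.
  bic.step <- which.min(hcc.fit$BIC)
  print(coef(hcc.fit, model.select = bic.step))
}
```

The last two calls are expected to select the same step, since the numeric value is simply the position of the smallest BIC.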

Author(s)

Kellie J. Archer

References

Hastie T., Taylor J., Tibshirani R., and Walther G. (2007) Forward stagewise regression and the monotone lasso. Electronic Journal of Statistics, 1, 1-29.

Eye Disease Risk Factors

Description

Eye Disease Risk Factors data from Section 9.1 of Agresti's Analysis of Ordinal Categorical Data. The primary data are from the Wisconsin Epidemiologic Study of Diabetic Retinopathy. The primary outcome is severity of retinopathy, which was measured in the left and right eye of every subject.

Usage

data(eyedisease)

Format

A data frame with 720 observations on the following 19 variables.

rme: right eye macular oedema (absent = 0, present = 1)
lme: left eye macular oedema (absent = 0, present = 1)
rre: right eye refraction index
lre: left eye refraction index
riop: right eye intraocular pressure
liop: left eye intraocular pressure
age: age
diab: duration of diabetes (in years)
gh: glycosylated haemoglobin level
sbp: systolic blood pressure
dbp: diastolic blood pressure
bmi: body mass index
pr: pulse rate?
sex: gender (male=1, female=2)
prot: proteinuria (absent = 0, present = 1)
dose: a numeric vector
rerl: right eye severity of retinopathy, an ordered factor with levels None < Mild < Moderate < Proliferative
lerl: left eye severity of retinopathy, an ordered factor with levels None < Mild < Moderate < Proliferative
id: subject identifier

References

Klein R., Klein B.E.K., Moss S.E., Davis M.D., and DeMets D.L. (1984) The Wisconsin Epidemiologic Study of Diabetic Retinopathy II. Prevalence and risk of diabetic retinopathy when age at diagnosis is less than 30 years. Archives of Ophthalmology, 102, 520-526.

Williamson J. and Kim K. (1996) A global odds ratio regression model for bivariate ordered categorical data from ophthalmologic studies. Statistics in Medicine, 15, 1507-1518.

Agresti A. (2010) Analysis of Ordinal Categorical Data, Second Edition. Wiley, Hoboken, NJ.

Examples

data(eyedisease)

Liver Cancer Methylation Data

Description

These data are a subset of subjects and CpG sites reported in the original paper, where liver samples were assayed using the Illumina GoldenGate Methylation BeadArray Cancer Panel I. Technical replicate samples were removed to ensure all samples were independent. The matched cirrhotic samples from subjects with hepatocellular carcinoma (HCC, labeled Tumor) were also excluded. Therefore, methylation levels in liver tissue are provided for independent subjects whose liver was Normal (N=20), cirrhotic but not having HCC (N=16, Cirrhosis non-HCC), and HCC (N=20, Tumor).

Usage

data(hccframe)

Format

A data frame with 56 observations on the following 46 variables.

group: an ordered factor with levels Normal < Cirrhosis non-HCC < Tumor
CDKN2B_seq_50_S294_F: a numeric vector representing a CpG site proportion methylation for CDKN2B
DDIT3_P1313_R: a numeric vector representing a CpG site proportion methylation for DDIT3
ERN1_P809_R: a numeric vector representing a CpG site proportion methylation for ERN1
GML_E144_F: a numeric vector representing a CpG site proportion methylation for GML
HDAC9_P137_R: a numeric vector representing a CpG site proportion methylation for HDAC9
HLA.DPA1_P205_R: a numeric vector representing a CpG site proportion methylation for HLA.DPA1
HOXB2_P488_R: a numeric vector representing a CpG site proportion methylation for HOXB2
IL16_P226_F: a numeric vector representing a CpG site proportion methylation for IL16
IL16_P93_R: a numeric vector representing a CpG site proportion methylation for IL16
IL8_P83_F: a numeric vector representing a CpG site proportion methylation for IL8
MPO_E302_R: a numeric vector representing a CpG site proportion methylation for MPO
MPO_P883_R: a numeric vector representing a CpG site proportion methylation for MPO
PADI4_P1158_R: a numeric vector representing a CpG site proportion methylation for PADI4
SOX17_P287_R: a numeric vector representing a CpG site proportion methylation for SOX17
TJP2_P518_F: a numeric vector representing a CpG site proportion methylation for TJP2
WRN_E57_F: a numeric vector representing a CpG site proportion methylation for WRN
CRIP1_P874_R: a numeric vector representing a CpG site proportion methylation for CRIP1
SLC22A3_P634_F: a numeric vector representing a CpG site proportion methylation for SLC22A3
CCNA1_P216_F: a numeric vector representing a CpG site proportion methylation for CCNA1
SEPT9_P374_F: a numeric vector representing a CpG site proportion methylation for SEPT9
ITGA2_E120_F: a numeric vector representing a CpG site proportion methylation for ITGA2
ITGA6_P718_R: a numeric vector representing a CpG site proportion methylation for ITGA6
HGF_P1293_R: a numeric vector representing a CpG site proportion methylation for HGF
DLG3_E340_F: a numeric vector representing a CpG site proportion methylation for DLG3
APP_E8_F: a numeric vector representing a CpG site proportion methylation for APP
SFTPB_P689_R: a numeric vector representing a CpG site proportion methylation for SFTPB
PENK_P447_R: a numeric vector representing a CpG site proportion methylation for PENK
COMT_E401_F: a numeric vector representing a CpG site proportion methylation for COMT
NOTCH1_E452_R: a numeric vector representing a CpG site proportion methylation for NOTCH1
EPHA8_P456_R: a numeric vector representing a CpG site proportion methylation for EPHA8
WT1_P853_F: a numeric vector representing a CpG site proportion methylation for WT1
KLK10_P268_R: a numeric vector representing a CpG site proportion methylation for KLK10
PCDH1_P264_F: a numeric vector representing a CpG site proportion methylation for PCDH1
TDGF1_P428_R: a numeric vector representing a CpG site proportion methylation for TDGF1
EFNB3_P442_R: a numeric vector representing a CpG site proportion methylation for EFNB3
MMP19_P306_F: a numeric vector representing a CpG site proportion methylation for MMP19
FGFR2_P460_R: a numeric vector representing a CpG site proportion methylation for FGFR2
RAF1_P330_F: a numeric vector representing a CpG site proportion methylation for RAF1
BMPR2_E435_F: a numeric vector representing a CpG site proportion methylation for BMPR2
GRB10_P496_R: a numeric vector representing a CpG site proportion methylation for GRB10
CTSH_P238_F: a numeric vector representing a CpG site proportion methylation for CTSH
SLC6A8_seq_28_S227_F: a numeric vector representing a CpG site proportion methylation for SLC6A8
PLXDC1_P236_F: a numeric vector representing a CpG site proportion methylation for PLXDC1
TFE3_P421_F: a numeric vector representing a CpG site proportion methylation for TFE3
TSG101_P139_R: a numeric vector representing a CpG site proportion methylation for TSG101

Source

The full dataset is available as GSE18081 from Gene Expression Omnibus at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18081

References

Archer KJ, Mas VR, Maluf DG, Fisher RA. High-throughput assessment of CpG site methylation for distinguishing between HCV-cirrhosis and HCV-associated hepatocellular carcinoma. Molecular Genetics and Genomics, 283(4): 341-349, 2010.

Examples

data(hccframe)

Ordinal Generalized Monotone Incremental Forward Stagewise Regression

Description

This function can fit a cumulative link, adjacent category, forward and backward continuation ratio, and stereotype ordinal response model when the number of parameters exceeds the sample size, using the generalized monotone incremental forward stagewise method.

Usage

ordinalgmifs(formula, data, x = NULL, subset, epsilon = 0.001, tol = 1e-05, 
	scale = TRUE, probability.model = "Cumulative", link = "logit", 
	verbose=FALSE, assumption=NULL, ...)

Arguments

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The left side of the formula is the ordinal outcome while the variables on the right side of the formula are the covariates that are not included in the penalization process. Note that if all variables in the model are to be penalized, an intercept only model formula should be specified.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model.

x

an optional matrix of predictors that are to be penalized in the model fitting process.

subset

an optional vector specifying a subset of observations to be used in the fitting process.

epsilon

small incremental amount used to update a coefficient at a given step.

tol

the iterative process stops when the difference between successive log-likelihoods is less than this specified level of tolerance.

scale

logical, if TRUE the penalized predictors are centered and scaled.

probability.model

the type of ordinal response model to be fit. Can be "Cumulative", "AdjCategory", "ForwardCR", "BackwardCR", or "Stereotype"

link

the link function used. Allowable links for "Cumulative", "ForwardCR", and "BackwardCR" are "logit", "probit", and "cloglog". For an "AdjCategory" model only a "loge" link is allowed; for a "Stereotype" model only a "logit" link is allowed.

verbose

logical, if TRUE the step number is printed to the console (default is FALSE).

assumption

integer, only use with probability.model = "ForwardCR" and link = "cloglog" to denote the assumption to use for discrete censored survival modeling. If assumption = 1, assume the observation was censored at the end of the discrete time interval in which the censoring occurred; if assumption = 2, assume the observation was censored at the beginning of the interval in which censoring occurred; if assumption = 3, assume constant hazard rate within the interval in which the censoring occurred; if no censoring occurs, do not specify a value for assumption.

...

additional arguments
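
To make the probability.model and link arguments concrete, the following base R sketch computes category probabilities for one observation under a cumulative link model. The helper names inv_link and cumulative_probs are hypothetical, and the parameterization P(Y <= j) = G(alpha_j + eta) is an assumed sign convention chosen for illustration; neither is taken from the package source.

```r
# Inverse link functions G for the "logit", "probit", and "cloglog" links.
inv_link <- function(eta, link = c("logit", "probit", "cloglog")) {
  link <- match.arg(link)
  switch(link,
         logit   = plogis(eta),
         probit  = pnorm(eta),
         cloglog = 1 - exp(-exp(eta)))
}

# Category probabilities for a K-category cumulative link model.
# alpha: K - 1 increasing thresholds; eta: linear predictor (a scalar).
# Assumes P(Y <= j) = G(alpha_j + eta), a convention used only in this sketch.
cumulative_probs <- function(alpha, eta, link = "logit") {
  cum <- c(inv_link(alpha + eta, link), 1)  # P(Y <= 1), ..., P(Y <= K) = 1
  diff(c(0, cum))                           # P(Y = j) by differencing
}

p <- cumulative_probs(alpha = c(-1, 0.5), eta = 0.3, link = "logit")
print(p)
print(sum(p))  # the K = 3 category probabilities sum to 1
```

Swapping link = "probit" or link = "cloglog" changes only the function G applied to the thresholds plus linear predictor; the differencing step is the same.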

Details

A model is specified as response~terms, x=penalized.terms, where response is the ordinal response vector, terms is the series of variables in the model that are not to be penalized, and x is a matrix of variables that are to be penalized. For example, terms may include the variables age and gender while x includes hundreds to thousands of features from a high-throughput genomic experiment. If no baseline demographic, clinical, or other subject-level variables are available or needed in terms (that is, all variables are to be penalized), then the model is specified as response~1, x=penalized.terms.
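
The two forms of model specification can be sketched as follows, using the hccframe data and the same way of passing x as in the Examples for this function. The object names fit.pen and fit.mix are illustrative, and the block only runs when the package is installed.

```r
# Usage sketch: fully penalized versus mixed model specification.
if (requireNamespace("ordinalgmifs", quietly = TRUE)) {
  library(ordinalgmifs)
  data(hccframe)
  # All predictors penalized: intercept-only formula, penalized terms via x.
  fit.pen <- ordinalgmifs(group ~ 1,
                          x = c("DDIT3_P1313_R", "HDAC9_P137_R"),
                          data = hccframe, epsilon = 0.01)
  # One unpenalized term (MPO_E302_R) plus the same penalized terms.
  fit.mix <- ordinalgmifs(group ~ MPO_E302_R,
                          x = c("DDIT3_P1313_R", "HDAC9_P137_R"),
                          data = hccframe, epsilon = 0.01)
  print(names(fit.pen))
  print(names(fit.mix))
}
```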

Value

AIC

a vector of AIC values for each step (if x is specified).

BIC

a vector of BIC values for each step (if x is specified).

alpha

the ordinal threshold estimates for the fitted model.

theta

the coefficient estimates for the unpenalized variables (if terms are specified on the right hand side of the model formula).

beta

the coefficient estimates for the penalized variables (if x is specified in the model).

phi

the scaling coefficient estimates (if a "Stereotype" logit model is fit).

logLik

a vector of log-likelihood values for each step (if terms are specified on the right hand side of the model formula).

link

the link function used in the model fit.

model.select

the step at which the minimum AIC was observed (if terms are specified on the right hand side of the model formula).

probability.model

the model fit.

scale

logical indicating whether penalized variables were centered and scaled.

w

the unpenalized variables in the model (if any).

x

the penalized variables in the model (if any).

y

the ordinal response.

Author(s)

Kellie J. Archer, Jiayi Hou, Qing Zhou, Kyle Ferber, John G. Layne, Amanda Gentry

References

Hastie T., Taylor J., Tibshirani R., and Walther G. (2007) Forward stagewise regression and the monotone lasso. Electronic Journal of Statistics, 1, 1-29.

Examples

data(hccframe)
# To minimize processing time, MPO_E302_R is forced into the model as an
# unpenalized covariate, only two CpG sites (DDIT3_P1313_R and HDAC9_P137_R)
# are included as penalized covariates, and epsilon is set to 0.01
hcc.fit <- ordinalgmifs(group ~ MPO_E302_R, x = c("DDIT3_P1313_R", "HDAC9_P137_R"), 
	data = hccframe, epsilon = 0.01)
coef(hcc.fit)
summary(hcc.fit)
phat <- predict(hcc.fit)
head(phat$predicted)
table(phat$class, hccframe$group)

Functions Called by ordinalgmifs Functions, Not by the User

Description

These functions are called by other ordinalgmifs functions and are not intended to be directly called by the user.

Details

The du.adjcat, du.bcr, du.cum, du.fcr, and du.stereo functions calculate the derivatives at the current step for the adjacent category, backward CR, cumulative link, forward CR, and stereotype logit models, respectively, and are used to identify which penalized parameter is updated. The fn.acat, fn.bcr, fn.cum, fn.fcr, and fn.stereo functions are the log-likelihood functions for the adjacent category, backward CR, cumulative link, forward CR, and stereotype logit models, respectively, and are used to estimate the thresholds and non-penalized subset parameters (if included) at each step of the algorithm. The G function returns the probability for the indicated link function. The gradient function returns the gradient of the log-likelihood for the cumulative link models and is used for the cumulative link constrained optimization.
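
The interplay between a derivative step and a likelihood step can be illustrated with a toy example. The sketch below is not the package's du.* or fn.* code: it uses a binary logistic model as the simplest stand-in for an ordinal model, the function name gmifs_logistic and the max.steps cap are hypothetical, and the stopping rule (stop once the log-likelihood no longer improves by at least tol) is an assumption made for the illustration.

```r
# Toy GMIFS-style solution path for a binary logistic model (base R only).
gmifs_logistic <- function(x, y, epsilon = 0.01, tol = 1e-5, max.steps = 2000) {
  x <- scale(x)                  # center and scale the penalized predictors
  p <- ncol(x)
  xt <- cbind(x, -x)             # expanded set: every update is a positive increment
  bt <- numeric(2 * p)           # non-negative coefficients on the expanded set
  loglik <- function(a, off) {
    eta <- a + off
    sum(y * eta - log1p(exp(eta)))
  }
  path <- matrix(0, nrow = 0, ncol = p)
  ll.old <- -Inf
  for (step in seq_len(max.steps)) {
    off <- drop(xt %*% bt)
    # Likelihood step: re-estimate the unpenalized intercept with the
    # penalized part held fixed.
    a <- optimize(function(a) -loglik(a, off), interval = c(-20, 20),
                  tol = 1e-8)$minimum
    ll <- loglik(a, off)
    path <- rbind(path, bt[1:p] - bt[p + 1:p])
    if (ll - ll.old < tol) break # no further meaningful improvement
    ll.old <- ll
    # Derivative step: negative log-likelihood derivative for each
    # expanded predictor; the most negative one is updated.
    mu <- plogis(a + off)
    u <- -drop(crossprod(xt, y - mu))
    m <- which.min(u)
    bt[m] <- bt[m] + epsilon     # small monotone increment
  }
  list(beta = bt[1:p] - bt[p + 1:p], alpha = a, path = path, logLik = ll)
}

set.seed(1)
n <- 200
x <- matrix(rnorm(n * 3), n, 3)
y <- rbinom(n, 1, plogis(2 * x[, 1]))  # only the first predictor matters
fit <- gmifs_logistic(x, y)
print(round(fit$beta, 2))
print(nrow(fit$path))                  # number of steps recorded on the path
```

Working on the expanded predictor set (x and -x) is what keeps each update a non-negative increment, so a coefficient can shrink only through increments to its mirrored column.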

Value

These functions are called for intermediate results used internally by user-invoked functions.

Author(s)

Kellie J. Archer, archer.43@osu.edu

Plot Solution Path for Ordinal GMIFS Fitted Model.

Description

This function plots the coefficient path, the AIC, the BIC, or the log-likelihood for a fitted ordinalgmifs object.

Usage

## S3 method for class 'ordinalgmifs'
plot(x, type = "trace", xlab=NULL, ylab=NULL, main=NULL, ...)

Arguments

x

an ordinalgmifs object.

type

default is "trace" which plots the coefficient path for the fitted object. Also available are "AIC", "BIC", and "logLik".

xlab

a default x-axis label will be used which can be changed by specifying a user-defined x-axis label.

ylab

a default y-axis label will be used which can be changed by specifying a user-defined y-axis label.

main

a default main title will be used which can be changed by specifying a user-defined main title.

...

other arguments.

Value

No return value, called for side effects
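
A brief usage sketch of the four plot types follows, again refitting the small hccframe model from the ordinalgmifs examples. The axis labels and title are illustrative values, and the block only runs when the package is installed.

```r
# Usage sketch: the four plot types for a fitted ordinalgmifs object.
if (requireNamespace("ordinalgmifs", quietly = TRUE)) {
  library(ordinalgmifs)
  data(hccframe)
  hcc.fit <- ordinalgmifs(group ~ MPO_E302_R,
                          x = c("DDIT3_P1313_R", "HDAC9_P137_R"),
                          data = hccframe, epsilon = 0.01)
  plot(hcc.fit)                  # coefficient path (type = "trace")
  plot(hcc.fit, type = "AIC")    # AIC at each step
  plot(hcc.fit, type = "BIC", main = "BIC along the solution path")
  plot(hcc.fit, type = "logLik", xlab = "Step", ylab = "Log-likelihood")
}
```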

Author(s)

Kellie J. Archer

Predicted Probabilities and Class for Ordinal GMIFS Fit.

Description

This function returns a list that includes the predicted probabilities as well as the predicted class for an ordinalgmifs fitted object.

Usage

## S3 method for class 'ordinalgmifs'
predict(object, neww = NULL, newdata, newx = NULL, model.select = "AIC", ...)

Arguments

object

an ordinalgmifs fitted object.

neww

an optional formula that includes the unpenalized variables to use for predicting the response. If omitted, the training data are used.

newdata

an optional data.frame that minimally includes the unpenalized variables to use for predicting the response. If omitted, the training data are used.

newx

an optional matrix of penalized variables to use for predicting the response. If omitted, the training data are used.

model.select

when x is specified, any model along the solution path can be selected. The default is model.select="AIC", which calculates the predicted values using the coefficients from the model having the lowest AIC. Other options are model.select="BIC" or a numeric step number along the solution path.

...

other arguments.

Value

predicted

a matrix of predicted probabilities from the fitted model.

class

a vector containing the predicted class taken as that class having the largest predicted probability.

...

other arguments.
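
As a usage sketch, the following predicts on the training data at the lowest-BIC step and cross-tabulates predicted against observed class. It reuses the small hccframe model from the ordinalgmifs examples; the object names are illustrative, and the block only runs when the package is installed.

```r
# Usage sketch: predicted probabilities and classes at the lowest-BIC step.
if (requireNamespace("ordinalgmifs", quietly = TRUE)) {
  library(ordinalgmifs)
  data(hccframe)
  hcc.fit <- ordinalgmifs(group ~ MPO_E302_R,
                          x = c("DDIT3_P1313_R", "HDAC9_P137_R"),
                          data = hccframe, epsilon = 0.01)
  # No new data supplied, so the training data are used.
  phat <- predict(hcc.fit, model.select = "BIC")
  print(head(phat$predicted))    # one column of probabilities per class
  print(table(predicted = phat$class, observed = hccframe$group))
}
```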

Author(s)

Kellie J. Archer, Jiayi Hou, Qing Zhou, Kyle Ferber, John G. Layne, Amanda Gentry

Print the Contents of an Ordinal GMIFS Fitted Object.

Description

This function prints the names of the list objects from an ordinalgmifs fitted model.

Usage

## S3 method for class 'ordinalgmifs'
print(x, ...)

Arguments

x

an ordinalgmifs object.

...

other arguments.

Value

returns the object names in the fitted ordinalgmifs object

Note

The contents of an ordinalgmifs fitted object differ depending upon whether x is specified in the ordinalgmifs model (i.e., penalized variables are included in the model fit, hence a solution path is returned) or only terms on the right-hand side of the formula are included (unpenalized variables). In the latter case, we recommend using the VGAM package.

Author(s)

Kellie J. Archer

Summarize an Ordinal GMIFS Object.

Description

summary method for class ordinalgmifs.

Usage

## S3 method for class 'ordinalgmifs'
summary(object, model.select = "AIC", ...)

Arguments

object

an ordinalgmifs object.

model.select

when x is specified, any model along the solution path can be selected. The default is model.select="AIC", which extracts the model having the lowest AIC. Other options are model.select="BIC" or a numeric step number along the solution path.

...

other arguments.

Details

Prints the following items extracted from the fitted ordinalgmifs object: the probability model and link used, and the model parameter estimates. For models that include x, the parameter estimates, AIC, BIC, and log-likelihood are printed for the step indicated by model.select or, if model.select is not supplied, for the step at which the minimum AIC was observed.

Value

Extracts the relevant information from the step in the solution path that attained the minimum AIC (default) or from the user-defined model.select step.

Author(s)

Kellie J. Archer
