Type: | Package |
Title: | Partial Profile Score Feature Selection in High-Dimensional Generalized Linear Interaction Models |
Version: | 0.1.1 |
Date: | 2025-07-04 |
Maintainer: | Zengchao Xu <zengc.xu@aliyun.com> |
Description: | An implementation of the partial profile score feature selection (PPSFS) approach for generalized linear (interaction) models. PPSFS remains highly scalable even for ultra-high-dimensional feature spaces. See the paper by Xu, Luo and Chen (2022, <doi:10.4310/21-SII706>). |
URL: | https://github.com/paradoxical-rhapsody/PPSFS |
BugReports: | https://github.com/paradoxical-rhapsody/PPSFS/issues |
Imports: | Rcpp, brglm2 |
LinkingTo: | Rcpp, RcppArmadillo |
License: | GPL-3 |
Encoding: | UTF-8 |
Language: | en-US |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | yes |
Packaged: | 2025-07-04 01:30:40 UTC; zengchao |
Author: | Zengchao Xu [aut, cre], Shan Luo [aut], Zehua Chen [aut] |
Repository: | CRAN |
Date/Publication: | 2025-07-04 02:30:02 UTC |
PPSFS: Partial Profile Score Feature Selection in High-Dimensional Generalized Linear Interaction Models
Description
An implementation of the partial profile score feature selection (PPSFS) approach for generalized linear (interaction) models. PPSFS remains highly scalable even for ultra-high-dimensional feature spaces. See the paper by Xu, Luo and Chen (2022, doi:10.4310/21-SII706).
Author(s)
Maintainer: Zengchao Xu <zengc.xu@aliyun.com>
Authors:
Shan Luo
Zehua Chen
See Also
Useful links:
Report bugs at https://github.com/paradoxical-rhapsody/PPSFS/issues
Partial Profile Score Feature Selection for GLMs
Description
ppsfs: PPSFS for main effects.
ppsfsi: PPSFS for interaction effects.
Usage
ppsfs(
  x,
  y,
  family,
  keep = NULL,
  I0 = NULL,
  ...,
  ebicFlag = 1,
  maxK = min(NROW(x) - 1, NCOL(x) + length(I0)),
  verbose = FALSE
)

ppsfsi(
  x,
  y,
  family,
  keep = NULL,
  ...,
  ebicFlag = 1,
  maxK = min(NROW(x) - 1, choose(NCOL(x), 2)),
  verbose = FALSE
)
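The default `maxK` values in the two signatures depend only on the dimensions of `x` and, for `ppsfs`, on the length of `I0`. As a small illustration, the sketch below evaluates both defaults in base R for a hypothetical 300 x 150 design. The names `x_demo` and `I0_demo` are made up for this example and are not part of the package.

```r
# Hypothetical design matrix: 300 observations, 150 features.
x_demo <- matrix(0, nrow = 300, ncol = 150)
I0_demo <- NULL  # no interaction index set supplied, so length(I0_demo) is 0

# Default cap used in the ppsfs() signature.
maxK_main <- min(NROW(x_demo) - 1, NCOL(x_demo) + length(I0_demo))

# Default cap used in the ppsfsi() signature: n - 1 or the number of
# pairwise interactions, whichever is smaller.
maxK_inter <- min(NROW(x_demo) - 1, choose(NCOL(x_demo), 2))

print(c(main = maxK_main, interaction = maxK_inter))
```

Here the main-effect cap is bounded by the number of columns, while the interaction cap is bounded by the sample size, because the number of pairwise interactions grows much faster than `n`.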
Arguments
x | Matrix. |
y | Vector. |
family | |
keep | Initial set of features that are included in model fitting. |
I0 | Index set of interaction effects to be identified. |
... | Additional parameters for glm.fit. |
ebicFlag | The procedure stops when the EBIC increases after |
maxK | Maximum number of identified features. |
verbose | Print the procedure path? |
Details
ppsfs(x, y, family = "gaussian") is an implementation of the sequential lasso method proposed by Luo and Chen (2014, doi:10/f6kfr6).
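The stopping rule controlled by `ebicFlag` is based on the extended Bayesian information criterion (EBIC). A minimal sketch of the criterion in its commonly cited form, -2 log L + k log(n) + 2 gamma log choose(p, k), is given below using base R only. The helper `ebic()` and the default `gamma = 1` are assumptions made for illustration; this is not the package's internal computation, which may choose gamma differently.

```r
# EBIC for a fitted GLM with k selected features out of p candidates.
# gamma is an assumed tuning constant, not necessarily the package's choice.
ebic <- function(fit, k, p, gamma = 1) {
  n <- stats::nobs(fit)
  loglik <- as.numeric(stats::logLik(fit))
  -2 * loglik + k * log(n) + 2 * gamma * lchoose(p, k)
}

set.seed(1)
n <- 100
p <- 20
x <- matrix(rnorm(n * p), n)
y <- x[, 1] + rnorm(n)  # only the first feature carries signal

# Compare a one-feature model with a model that adds a second feature.
fit1 <- glm(y ~ x[, 1], family = gaussian())
fit2 <- glm(y ~ x[, c(1, 2)], family = gaussian())
e1 <- ebic(fit1, k = 1, p = p)
e2 <- ebic(fit2, k = 2, p = p)
print(c(one_feature = e1, two_features = e2))
```

A forward procedure of this kind would typically stop adding features once the criterion starts to increase; how many increases are tolerated is what an argument such as `ebicFlag` would govern.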
Value
Index set of identified features.
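Since the return value is an index set, a natural follow-up is to refit a model on the selected columns only. The sketch below assumes the set holds integer column indices of `x`, as for a main-effects fit; the vector `A` here is a hypothetical stand-in written by hand, not actual `ppsfs` output, and only base R is used.

```r
set.seed(2022)
n <- 200
p <- 50
x <- matrix(rnorm(n * p), n)
y <- drop(x[, 1:3] %*% c(1.2, 1.1, 1.4)) + rnorm(n)

# Hypothetical index set standing in for the value returned by ppsfs().
A <- c(1, 2, 3)

# Refit a Gaussian GLM on the selected columns only.
x_sel <- x[, A, drop = FALSE]
fit <- glm(y ~ x_sel, family = gaussian())
print(coef(fit))
```

If the returned set also contains interaction labels, the corresponding product columns would have to be built before refitting; that step is not shown here.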
References
Z. Xu, S. Luo and Z. Chen (2022). Partial profile score feature selection in high-dimensional generalized linear interaction models. Statistics and Its Interface. doi:10.4310/21-SII706
Examples
## ***************************************************
## Identify main-effect features
## ***************************************************
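set.seed(2022)
n <- 300   # observations
p <- 1000  # candidate features (p > n)
x <- matrix(rnorm(n * p), n)
# Only the first three features carry signal.
eta <- drop(x[, 1:3] %*% runif(3, 1.0, 1.5))
y <- eta + rnorm(n, sd = sd(eta) / 5)
# Select main effects; verbose = TRUE prints the procedure path.
print(A <- ppsfs(x, y, "gaussian", verbose = TRUE))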
## ***************************************************
## Identify interaction effects
## ***************************************************
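set.seed(2022)
n <- 300
p <- 150
x <- matrix(rnorm(n * p), n)
# Signal: main effects 1-3 plus the interactions 4:7, 5:8 and 6:9.
eta <- drop(cbind(x[, 1:3], x[, 4:6] * x[, 7:9]) %*% runif(6, 1.0, 1.5))
y <- eta + rnorm(n, sd = sd(eta) / 5)
# Step 1: identify interaction effects.
print(group <- ppsfsi(x, y, "gaussian", verbose = TRUE))
# Step 2: pass the identified interactions to ppsfs() through I0.
print(A <- ppsfs(x, y, "gaussian", I0 = group, verbose = TRUE))
# Same as step 2, but starting from an initial set given in keep.
print(A <- ppsfs(x, y, "gaussian", keep = c(1, "5:8"),
                 I0 = group, verbose = TRUE))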