Help for package perccalc

Title:

Estimate Percentiles from an Ordered Categorical Variable

Version:

1.0.5

Description:

An implementation of two functions that estimate values for percentiles from an ordered categorical variable as described by Reardon (2011, isbn:978-0-87154-372-1). One function estimates percentile differences from two percentiles while the other returns the values for every percentile from 1 to 100.

Depends:

R (≥ 3.4.0)

License:

MIT + file LICENSE

URL:

https://cimentadaj.github.io/perccalc/, https://github.com/cimentadaj/perccalc

Language:

en-US

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.0.1

Imports:

stats, tibble, multcomp

Suggests:

magrittr, spelling, dplyr, knitr, rmarkdown, testthat, ggplot2, MASS, carData, tidyr (≥ 1.0.0), covr

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2019-12-17 17:22:57 UTC; jorge

Author:

Jorge Cimentada [aut, cre]

Maintainer:

Jorge Cimentada <cimentadaj@gmail.com>

Repository:

CRAN

Date/Publication:

2019-12-17 20:10:02 UTC

Calculate percentile differences from an ordered categorical variable and a continuous variable.

Description

Calculate percentile differences from an ordered categorical variable and a continuous variable.
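To make the idea concrete, the following base R sketch estimates a 90/10 gap from an ordered categorical variable. It is a simplified illustration under stated assumptions, not the package's implementation: it assumes that each category can be placed at the midpoint of its cumulative percentile range and that an unweighted cubic polynomial of the category means on those midpoints is an adequate fit. All variable names (`income`, `score`, `midpoint`, `gap`) are hypothetical.

```r
# Simplified illustration only; this is NOT perccalc's implementation.
set.seed(1)
n <- 1000
k <- 10

# Ordered categorical variable with k equally sized categories
income <- factor(rep(paste0("inc", 1:k), each = n / k),
                 levels = paste0("inc", 1:k), ordered = TRUE)

# Scores rise with the category so the gap is clearly positive
score <- as.integer(income) + rnorm(n)

# Share of observations per category and each category's midpoint
# on the cumulative percentile scale (0 to 100)
share <- as.numeric(prop.table(table(income))) * 100
upper <- cumsum(share)
lower <- upper - share
midpoint <- (lower + upper) / 2

# Mean score within each category (returned in level order)
cat_mean <- as.numeric(tapply(score, income, mean))

# Cubic polynomial of category means on percentile midpoints
fit <- lm(cat_mean ~ poly(midpoint, 3, raw = TRUE))

# Predicted scores at the 90th and 10th percentiles, and their difference
pred <- predict(fit, newdata = data.frame(midpoint = c(90, 10)))
gap <- unname(pred[1] - pred[2])
gap
```

Unlike this sketch, perc_diff also reports a standard error for the difference and accepts optional weights.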

Usage

perc_diff(
  data_model,
  categorical_var,
  continuous_var,
  weights = NULL,
  percentiles = c(90, 10)
)

perc_diff_df(
  data_model,
  categorical_var,
  continuous_var,
  weights = NULL,
  percentiles = c(90, 10)
)

Arguments

data_model

A data frame with at least the categorical and continuous variables from which to estimate the percentile differences

categorical_var

The bare unquoted name of the categorical variable. This variable must be an ordered factor; otherwise an error is raised.

continuous_var

The bare unquoted name of the continuous variable from which to estimate the percentiles

weights

The bare unquoted name of the optional weight variable. If not specified, then estimation is done without weights

percentiles

A numeric vector of two numbers specifying which percentiles to subtract

Details

perc_diff silently drops missing observations when calculating the linear combination of coefficients.
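Because the drop is silent, it can help to count incomplete rows before estimating. The base R sketch below is one way to do that; it assumes that "missing" means an NA in the categorical or continuous variable, which is an assumption rather than something the documentation states. The data frame `toy` and the names `keep` and `n_dropped` are hypothetical.

```r
# Hypothetical data with missing values in both variables
toy <- data.frame(
  score = c(1.2, NA, 0.4, 2.1, NA, -0.3),
  type = factor(c("low", "low", "mid", NA, "high", "high"),
                levels = c("low", "mid", "high"), ordered = TRUE)
)

# Rows that are complete on the two variables used in the estimation
# (assumption: these are the rows that would be kept)
keep <- complete.cases(toy[, c("type", "score")])

# Number of rows that would be dropped
n_dropped <- sum(!keep)
n_dropped

# Complete subset, which can be passed on explicitly
toy_complete <- toy[keep, ]
```

Passing `toy_complete` instead of `toy` makes the sample used for estimation explicit.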

Value

perc_diff returns a vector with the percentile difference and its associated standard error. perc_diff_df returns the same but as a data frame.

Examples



set.seed(23131)
N <- 1000
K <- 20

toy_data <- data.frame(id = 1:N,
                       score = rnorm(N, sd = 2),
                       type = rep(paste0("inc", 1:K), each = N/K),
                       wt = 1)


# perc_diff(toy_data, type, score)
# type is not an ordered factor!

toy_data$type <- factor(toy_data$type, levels = unique(toy_data$type), ordered = TRUE)

perc_diff(toy_data, type, score, percentiles = c(90, 10))
perc_diff(toy_data, type, score, percentiles = c(50, 10))

perc_diff(toy_data, type, score, weights = wt, percentiles = c(30, 10))
# Results as data frame
perc_diff_df(toy_data, type, score, weights = wt, percentiles = c(30, 10))
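
The returned difference and standard error can be combined into an approximate confidence interval. The sketch below assumes that the first element of the vector is the difference and the second is the standard error, following the order given in the Value section, and that a normal approximation is reasonable. The interval is an illustration, not something the package computes.

```r
library(perccalc)

set.seed(23131)
N <- 1000
K <- 20

# Same toy setup as above, with type created directly as an ordered factor
toy_data <- data.frame(
  score = rnorm(N, sd = 2),
  type = factor(rep(paste0("inc", 1:K), each = N / K),
                levels = paste0("inc", 1:K), ordered = TRUE)
)

res <- perc_diff(toy_data, type, score, percentiles = c(90, 10))

# Assumption: element 1 is the difference, element 2 its standard error
estimate <- unname(res[1])
std_error <- unname(res[2])

# Approximate 95% interval under a normal approximation
ci <- estimate + c(-1, 1) * qnorm(0.975) * std_error
ci
```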

Calculate a distribution of percentiles from an ordered categorical variable and a continuous variable.

Description

Calculate a distribution of percentiles from an ordered categorical variable and a continuous variable.

Usage

perc_dist(data_model, categorical_var, continuous_var, weights = NULL)

Arguments

data_model

A data frame with at least the categorical and continuous variables from which to estimate the percentiles

categorical_var

The bare unquoted name of the categorical variable. This variable must be an ordered factor; otherwise an error is raised.

continuous_var

The bare unquoted name of the continuous variable from which to estimate the percentiles

weights

The bare unquoted name of the optional weight variable. If not specified, then equal weights are assumed.

Details

perc_dist silently drops missing observations when calculating the linear combination of coefficients.

Value

A data frame with the scores and standard errors for each percentile

Examples


set.seed(23131)
N <- 1000
K <- 20

toy_data <- data.frame(id = 1:N,
                       score = rnorm(N, sd = 2),
                       type = rep(paste0("inc", 1:K), each = N/K),
                       wt = 1)


# perc_dist(toy_data, type, score)
# type is not an ordered factor!

toy_data$type <- factor(toy_data$type, levels = unique(toy_data$type), ordered = TRUE)

perc_dist(toy_data, type, score)
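
A quick way to check the shape of the result is sketched below. It relies only on what is documented: the result is a data frame, and the package description says values are returned for every percentile from 1 to 100, so 100 rows are expected. Column names are not assumed here.

```r
library(perccalc)

set.seed(23131)
N <- 1000
K <- 20

toy_data <- data.frame(
  score = rnorm(N, sd = 2),
  type = factor(rep(paste0("inc", 1:K), each = N / K),
                levels = paste0("inc", 1:K), ordered = TRUE)
)

dist <- perc_dist(toy_data, type, score)

# One row per percentile is expected
nrow(dist)

# First few estimated percentiles with their standard errors
head(dist)
```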

Mathematics test scores of Spain, Germany and Estonia in the PISA 2006 test

Description

A dataset containing the test scores and other household information of students from Spain, Germany and Estonia from the PISA 2006 test.

Usage

pisa_2006

Format

A data frame with 25884 rows and 10 variables:

year: Year of the survey
CNT: Long country names
STIDSTD: Unique student id
father_edu: The father's highest achieved degree in the ISCED scale
household_income: The household's total income in categories
avg_math: The average math test score out of the 5 plausible values in Mathematics

Source

A subset extracted from the PISA2006lite R package, https://github.com/pbiecek/PISA2012lite

Mathematics test scores of Spain, Germany and Estonia in the PISA 2012 test

Description

A dataset containing the test scores and other household information of students from Spain, Germany and Estonia from the PISA 2012 test.

Usage

pisa_2012

Format

A data frame with 35093 rows and 10 variables:

year: Year of the survey
CNT: Long country names
STIDSTD: Unique student id
father_edu: The father's highest achieved degree in the ISCED scale
household_income: The household's total income in categories
avg_math: The average math test score out of the 5 plausible values in Mathematics

Source

A subset extracted from the PISA2012lite R package, https://github.com/pbiecek/PISA2012lite