| Type: | Package | 
| Title: | A few Useful Functions for Statisticians | 
| Version: | 2.4 | 
| Date: | 2025-03-24 | 
| Maintainer: | Hugo Varet <varethugo@gmail.com> | 
| Depends: | survival | 
| Imports: | stats, graphics, WriteXLS (≥ 2.3.0) | 
| Description: | Various useful functions for statisticians: describe data, plot Kaplan-Meier curves with numbers of subjects at risk, compare data sets, display spaghetti-plot, build multi-contingency tables... | 
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] | 
| RoxygenNote: | 7.3.2 | 
| NeedsCompilation: | no | 
| Packaged: | 2025-03-24 10:42:54 UTC; hvaret | 
| Author: | Hugo Varet [aut, cre] | 
| Repository: | CRAN | 
| Date/Publication: | 2025-03-24 11:10:02 UTC | 
A few useful functions for statisticians
Description
Various useful functions for statisticians: describe data, plot Kaplan-Meier curves with numbers of subjects at risk, compare data sets, display spaghetti-plot, build multi-contingency tables...
Author(s)
Hugo Varet
OR and their confidence intervals for logistic regressions
Description
Computes odds ratios and their confidence intervals for logistic regressions
Usage
IC_OR_glm(model, alpha = 0.05)
Arguments
| model | a fitted logistic regression model, e.g. the result of glm(..., family="binomial") as in the example | 
| alpha | type I error, 0.05 by default | 
Value
A matrix with the estimated coefficients of the logistic model, their standard errors, z-values, p-values, odds ratios (OR) and the confidence intervals of the OR
Author(s)
Hugo Varet
Examples
IC_OR_glm(glm(inherit~sex+age,data=cgd,family="binomial"))
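To make the returned columns concrete, here is a minimal sketch of how odds ratios and Wald-type confidence intervals can be derived from a fitted logistic model. It assumes Wald intervals, which may differ from what IC_OR_glm does internally; or_table() is a hypothetical helper name, and the built-in mtcars data stand in for cgd.

```r
# Sketch only: Wald-type OR and CI, assumed (not confirmed) to resemble IC_OR_glm.
# or_table() is a hypothetical helper, not part of the package.
or_table <- function(model, alpha = 0.05) {
  est <- summary(model)$coefficients   # Estimate, Std. Error, z value, Pr(>|z|)
  z <- qnorm(1 - alpha / 2)            # two-sided normal quantile
  beta <- est[, "Estimate"]
  se <- est[, "Std. Error"]
  cbind(est,
        OR = exp(beta),                # odds ratio = exp(coefficient)
        lower = exp(beta - z * se),    # CI bounds computed on the log-odds scale
        upper = exp(beta + z * se))
}

fit <- glm(am ~ wt, data = mtcars, family = "binomial")
res <- or_table(fit)
print(round(res, 3))
```

The interval is built on the coefficient scale and then exponentiated, so it is not symmetric around the odds ratio.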
RR and their confidence intervals for Cox models
Description
Computes risk ratios and their confidence intervals for Cox models
Usage
IC_RR_coxph(model, alpha = 0.05, sided = 2)
Arguments
| model | a fitted Cox model, e.g. the result of coxph() as in the example | 
| alpha | type I error, 0.05 by default | 
| sided | 1 or 2 for one or two-sided | 
Value
A matrix with the estimated coefficients of the Cox model, their standard errors, z-values, p-values, risk ratios (RR) and the confidence intervals of the RR
Author(s)
Hugo Varet
Examples
cgd$time=cgd$tstop-cgd$tstart
IC_RR_coxph(coxph(Surv(time,status)~sex+age,data=cgd),alpha=0.05,sided=1)
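The effect of the sided argument can be illustrated with a small sketch. It assumes that a one-sided interval simply replaces the normal quantile at 1 - alpha/2 by the one at 1 - alpha; how IC_RR_coxph actually handles sided is not stated above. rr_table() is a hypothetical helper, and the lung data from the survival package are used instead of cgd.

```r
library(survival)  # coxph(), Surv() and the lung data set

# Sketch only: rr_table() is hypothetical; the use of `sided` is an assumption.
rr_table <- function(model, alpha = 0.05, sided = 2) {
  est <- summary(model)$coefficients   # coef, exp(coef), se(coef), z, Pr(>|z|)
  z <- qnorm(1 - alpha / sided)        # 1.96 if sided = 2, 1.645 if sided = 1
  beta <- est[, "coef"]
  se <- est[, "se(coef)"]
  cbind(RR = exp(beta),
        lower = exp(beta - z * se),
        upper = exp(beta + z * se))
}

fit <- coxph(Surv(time, status) ~ age + sex, data = lung)
rr2 <- rr_table(fit, sided = 2)
rr1 <- rr_table(fit, sided = 1)
print(round(rr2, 3))
print(round(rr1, 3))
```

With the same alpha, the one-sided bounds are closer to the point estimate because the quantile is smaller.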
Comparing two databases assumed to be identical
Description
Compares two data frames assumed to be identical, prints the differences in the console and also returns the results in a data frame
Usage
compare(d1, d2, id, file.export = NULL)
Arguments
| d1 | first data frame | 
| d2 | second data frame | 
| id | character string, name of the primary key shared by the two data frames | 
| file.export | character string, name of the XLS file exported | 
Value
A data frame containing the differences between the two data frames
Author(s)
Hugo Varet
Examples
N=100
data1=data.frame(id=1:N,a=rnorm(N),
                        b=factor(sample(LETTERS[1:5],N,TRUE)),
                        c=as.character(sample(LETTERS[1:5],N,TRUE)),
                        d=as.Date(32768:(32768+N-1),origin="1900-01-01"))
data1$c=as.character(data1$c)
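data2=data1
# introduce differences for compare() to detect
data2$id[3]=4654                 # modified key
data2$a[30]=NA                   # value replaced by NA
data2$a[31]=45                   # modified value
data2$b=as.character(data2$b)    # factor stored as character
data2$d=as.character(data2$d)    # Date stored as character
data2$e=rnorm(N)                 # column present in data2 only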
compare(data1,data2,"id")
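As an illustration of the idea of a key-based comparison, the following sketch aligns two data frames on their shared key and lists the cells that differ. It is not the implementation of compare(): diff_by_key() is a hypothetical helper, values are compared as character strings, and rows or columns present in only one data frame are ignored.

```r
# Sketch only: diff_by_key() is hypothetical and much simpler than compare().
diff_by_key <- function(d1, d2, id) {
  common_ids <- intersect(d1[[id]], d2[[id]])
  common_vars <- setdiff(intersect(names(d1), names(d2)), id)
  a <- d1[match(common_ids, d1[[id]]), , drop = FALSE]   # rows aligned on the key
  b <- d2[match(common_ids, d2[[id]]), , drop = FALSE]
  out <- lapply(common_vars, function(v) {
    x <- as.character(a[[v]])
    y <- as.character(b[[v]])
    # a cell differs if exactly one side is NA or both are non-NA and unequal
    differs <- (is.na(x) != is.na(y)) | (!is.na(x) & !is.na(y) & x != y)
    data.frame(id = common_ids[differs], variable = rep(v, sum(differs)),
               d1 = x[differs], d2 = y[differs], stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}

d1 <- data.frame(id = 1:4, a = c(1, 2, 3, 4), b = c("u", "v", "w", "x"))
d2 <- d1
d2$a[2] <- NA
d2$b[4] <- "z"
res <- diff_by_key(d1, d2, "id")
print(res)
```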
Convert variables of a data frame to factors
Description
Converts variables of a data frame to factors
Usage
convert_factor(data, vars)
Arguments
| data | the data frame containing the variables named in vars | 
| vars | character vector of covariate names | 
Value
The modified data frame
Author(s)
Hugo Varet
Examples
cgd$steroids
cgd$status
cgd=convert_factor(cgd,c("steroids","status"))
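For readers who want to see the operation in base R, converting a set of columns to factors can be sketched as below. to_factor() is a hypothetical helper written for this illustration; it is assumed, not confirmed, to behave like convert_factor().

```r
# Sketch only: to_factor() is a hypothetical base R equivalent.
to_factor <- function(data, vars) {
  data[vars] <- lapply(data[vars], factor)  # convert each listed column
  data
}

df <- data.frame(x = c(0, 1, 1), y = c(2, 2, 3), z = c(1.5, 2.5, 3.5))
df <- to_factor(df, c("x", "y"))
str(df)
```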
Convert 0s to NA
Description
Converts 0s to NA
Usage
convert_zero_NA(data, vars)
Arguments
| data | the data frame containing the variables named in vars | 
| vars | a character vector of covariates in which 0s are replaced by NA | 
Value
The modified data frame
Author(s)
Hugo Varet
Examples
my.data=data.frame(x=rbinom(20,1,0.5),y=rbinom(20,1,0.5),z=rbinom(20,1,0.5))
my.data=convert_zero_NA(my.data,c("y","z"))
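The replacement itself can be sketched in base R as follows. zero_to_na() is a hypothetical helper used only for illustration; columns not listed in vars are left untouched, which matches the example above where x is not converted.

```r
# Sketch only: zero_to_na() is a hypothetical base R equivalent.
zero_to_na <- function(data, vars) {
  for (v in vars) {
    is_zero <- !is.na(data[[v]]) & data[[v]] == 0  # existing NAs are skipped
    data[[v]][is_zero] <- NA
  }
  data
}

df <- data.frame(x = c(0, 1, 0), y = c(0, 2, NA), z = c(3, 0, 0))
out <- zero_to_na(df, c("y", "z"))
print(out)
```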
Cut a quantitative variable into n equal parts
Description
Cuts a quantitative variable into n equal parts
Usage
cut_quanti(x, n, ...)
Arguments
| x | a numeric vector | 
| n | numeric, the number of parts: 2 to cut according to the median, and so on... | 
| ... | other arguments to be passed in  | 
Value
A factor vector
Author(s)
Hugo Varet
Examples
cut_quanti(cgd$height, 3)
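Since n = 2 corresponds to a split at the median, the cut points are presumably empirical quantiles. Under that assumption, a minimal base R sketch is given below; cut_by_quantiles() is a hypothetical name and the exact options used by cut_quanti() are not known from this page.

```r
# Sketch only: cut_by_quantiles() is hypothetical; quantile-based breaks are assumed.
cut_by_quantiles <- function(x, n, ...) {
  probs <- seq(0, 1, length.out = n + 1)        # 0, 1/n, ..., 1
  breaks <- quantile(x, probs = probs, na.rm = TRUE)
  cut(x, breaks = breaks, include.lowest = TRUE, ...)
}

x <- 1:12
g <- cut_by_quantiles(x, 3)
print(table(g))
```

If x has many tied values, two quantiles can coincide and cut() then stops with an error about non-unique breaks, so heavily tied variables need extra care.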
Making descriptive statistics
Description
Makes descriptive statistics of a data frame according to a group covariate or not, can export the results
Usage
desc(
  data,
  vars,
  group = NULL,
  whole = TRUE,
  vars.labels = vars,
  group.labels = NULL,
  type.quanti = "mean",
  test.quanti = "param",
  test = TRUE,
  noquote = TRUE,
  justify = TRUE,
  digits = 2,
  file.export = NULL,
  language = "english"
)
Arguments
| data | data frame to describe, containing the variables named in vars | 
| vars | vector of character strings of the covariates to describe | 
| group | character string, statistics are computed for each level of this covariate | 
| whole | boolean,  | 
| vars.labels | vector of character strings giving more readable names for the covariates in the output | 
| group.labels | vector of character strings giving more readable column names | 
| type.quanti | character string,  | 
| test.quanti | character string,  | 
| test | boolean,  | 
| noquote | boolean,  | 
| justify | boolean,  | 
| digits | number of digits of the statistics (mean, sd, median, min, max, Q1, Q3, %), p-values always have 3 digits | 
| file.export | character string, name of the XLS file exported | 
| language | character string,  | 
Value
A matrix of the descriptive statistics
Author(s)
Hugo Varet
Examples
cgd$steroids=factor(cgd$steroids)
cgd$status=factor(cgd$status)
desc(cgd,vars=c("center","sex","age","height","weight","steroids","status"),group="treat")
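The defaults type.quanti = "mean" and test.quanti = "param" suggest that quantitative covariates are summarised by a mean and compared with a parametric test. The sketch below shows one such row for a two-level group. It is an assumption-laden illustration, not the code of desc(): describe_by_group() is a hypothetical helper, the "mean (sd)" layout is a guess, and a Welch t-test is used as the parametric test.

```r
# Sketch only: describe_by_group() is hypothetical; layout and test are assumptions.
describe_by_group <- function(data, var, group, digits = 2) {
  x <- data[[var]]
  g <- factor(data[[group]])
  # "mean (sd)" per level of the grouping covariate
  stats <- sapply(split(x, g), function(v) {
    paste0(round(mean(v, na.rm = TRUE), digits), " (",
           round(sd(v, na.rm = TRUE), digits), ")")
  })
  p <- t.test(x ~ g)$p.value            # Welch t-test; needs exactly two groups
  c(stats, p.value = format(round(p, 3), nsmall = 3))
}

res <- describe_by_group(mtcars, "mpg", "am")
print(noquote(res))
```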
Plot a histogram with a boxplot below
Description
Plots a histogram with a boxplot below
Usage
hist_boxplot(
  x,
  freq = TRUE,
  density = FALSE,
  main = NULL,
  xlab = NULL,
  ymax = NULL,
  col.hist = "lightblue",
  col.boxplot = "lightblue",
  ...
)
Arguments
| x | a numeric vector | 
| freq | boolean,  | 
| density | boolean,  | 
| main | character string, main title of the histogram | 
| xlab | character string, label of the x axis | 
| ymax | numeric value, maximum of the y axis | 
| col.hist | color of the histogram | 
| col.boxplot | color of the boxplot | 
| ... | other arguments to be passed in  | 
Value
None
Author(s)
Hugo Varet
Examples
par(mfrow=c(1,2))
hist_boxplot(rnorm(100),col.hist="lightblue",col.boxplot="red",freq=TRUE)
hist_boxplot(rnorm(100),col.hist="lightblue",col.boxplot="red",freq=FALSE,density=TRUE)
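The stacked layout can be approximated with base graphics as sketched below. This is not the implementation of hist_boxplot(): hist_with_box() is a hypothetical helper that uses layout() to put a horizontal boxplot under the histogram and aligns the two panels by reusing the histogram breaks as axis limits.

```r
# Sketch only: hist_with_box() is hypothetical and uses layout() for the two panels.
hist_with_box <- function(x, col = "lightblue", main = "") {
  old_mar <- par("mar")
  on.exit({ layout(1); par(mar = old_mar) })       # restore the device state
  layout(matrix(1:2, nrow = 2), heights = c(3, 1)) # tall panel above, short below
  par(mar = c(0.5, 4, 3, 1))
  h <- hist(x, col = col, main = main, xlab = "", xaxt = "n")
  par(mar = c(4, 4, 0.5, 1))
  # same data range as the histogram so the two panels line up
  boxplot(x, horizontal = TRUE, col = col, ylim = range(h$breaks), xlab = "x")
  invisible(h)
}

set.seed(1)
h <- hist_with_box(rnorm(100), main = "Histogram with boxplot")
```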
Multi cross table
Description
Builds a big cross table between several covariates
Usage
multi.table(data, vars)
Arguments
| data | the data frame containing the variables named in vars | 
| vars | character vector of covariate names | 
Value
A matrix containing all the contingency tables between the covariates
Author(s)
Hugo Varet
Examples
multi.table(cgd,c("treat","sex","inherit"))
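The building blocks of such a table are the pairwise contingency tables between the covariates. The sketch below computes them with base R and returns them as a list, whereas multi.table() assembles them into a single matrix; pairwise_tables() is a hypothetical helper and mtcars replaces cgd.

```r
# Sketch only: pairwise_tables() is hypothetical and returns a list, not one matrix.
pairwise_tables <- function(data, vars) {
  pairs <- combn(vars, 2, simplify = FALSE)        # all unordered pairs of covariates
  out <- lapply(pairs, function(p) {
    table(data[[p[1]]], data[[p[2]]], dnn = p)
  })
  names(out) <- vapply(pairs, paste, character(1), collapse = " x ")
  out
}

tabs <- pairwise_tables(mtcars, c("cyl", "gear", "am"))
print(tabs[["cyl x gear"]])
```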
Kaplan-Meier plot with number of subjects at risk below
Description
Kaplan-Meier plot with number of subjects at risk below
Usage
plot_km(
  formula,
  data,
  test = TRUE,
  xy.pvalue = NULL,
  conf.int = FALSE,
  times.print = NULL,
  nrisk.labels = NULL,
  legend = NULL,
  xlab = NULL,
  ylab = NULL,
  ylim = c(0, 1.02),
  left = 4.5,
  bottom = 5,
  cex.mtext = par("cex"),
  lwd = 2,
  lty = 1,
  col = NULL,
  ...
)
Arguments
| formula | same formula than in  | 
| data | data frame with  | 
| test | boolean,  | 
| xy.pvalue | numeric vector of length 2, coordinates where to display the p-value of the log-rank test | 
| conf.int | boolean,  | 
| times.print | numeric vector, times at which to display the numbers of subjects at risk | 
| nrisk.labels | character vector to modify the levels of  | 
| legend | character string ( | 
| xlab | character string, label of the time axis | 
| ylab | character string, label of the y axis | 
| ylim | numeric vector of length 2, minimum and maximum of the y-axis | 
| left | integer, size of left margin | 
| bottom | integer, number of lines in addition to the table below the graph | 
| cex.mtext | numeric, size of the numbers of subjects at risk | 
| lwd | width of the Kaplan-Meier curve(s) | 
| lty | type of the Kaplan-Meier curve(s) | 
| col | color(s) of the Kaplan-Meier curve(s) | 
| ... | other arguments to be passed in  | 
Value
None
Author(s)
Hugo Varet
Examples
cgd$time=cgd$tstop-cgd$tstart
plot_km(Surv(time,status)~sex,data=cgd,col=c("blue","red"))
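The two quantities that plot_km() adds to a standard Kaplan-Meier plot, the numbers of subjects at risk and the log-rank p-value, can be obtained directly from the survival package. The sketch below shows one way to compute them on the lung data; the choice of times is arbitrary, and placing the numbers under the axis as plot_km() does is left out.

```r
library(survival)  # Surv(), survfit(), survdiff() and the lung data set

fit <- survfit(Surv(time, status) ~ sex, data = lung)
times <- seq(0, 900, by = 300)  # arbitrary display times for this sketch

# Numbers at risk in each stratum at the chosen times (one row per stratum)
s <- summary(fit, times = times, extend = TRUE)
at_risk <- matrix(s$n.risk, nrow = nlevels(s$strata), byrow = TRUE,
                  dimnames = list(levels(s$strata), as.character(times)))

# Log-rank test comparing the strata
lr <- survdiff(Surv(time, status) ~ sex, data = lung)
p_value <- 1 - pchisq(lr$chisq, df = length(lr$n) - 1)

plot(fit, col = c("blue", "red"), lwd = 2, xlab = "Time (days)", ylab = "Survival")
legend("topright", legend = levels(s$strata), col = c("blue", "red"), lwd = 2)
print(at_risk)
print(p_value)
```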
Spaghetti plot and plot of the mean at each time
Description
Spaghetti plot and plot of the mean at each time
Usage
plot_mm(
  formula,
  data,
  col.spag = 1,
  col.mean = 1,
  type = "spaghettis",
  tick.times = TRUE,
  xlab = NULL,
  ylab = NULL,
  main = "",
  lwd.spag = 1,
  lwd.mean = 4,
  ...
)
Arguments
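| formula | formula giving the outcome, the time, the group and the subject identifier (see Examples) | 
| data | data frame containing the variables used in formula | 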
| col.spag | vector of length  | 
| col.mean | vector of length  | 
| type | character string, "spaghettis" by default; the example uses "both" | 
| tick.times | boolean,  | 
| xlab | character string, label of the time axis | 
| ylab | character string, label of the y axis | 
| main | character string, main title | 
| lwd.spag | numeric, width of the spaghetti lines, 1 by default | 
| lwd.mean | numeric, width of the mean lines, 4 by default | 
| ... | Other arguments to be passed in  | 
Value
None
Author(s)
Hugo Varet on Anais Charles-Nelson's idea
Examples
N=10
time=rep(1:4,N)
obs=1.1*time + rep(0:1,each=2*N) + rnorm(4*N)
my.data=data.frame(id=rep(1:N,each=4),time,obs,group=rep(1:2,each=N*2))
par(xaxs="i",yaxs="i")
plot_mm(obs~time+(group|id),my.data,col.spag=my.data$group,
        col.mean=c("blue","red"),type="both",main="Test plot_mm")
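A spaghetti plot with mean curves can also be drawn with base graphics, which shows what the two layers are: one thin line per subject and one thick line per group joining the means at each time. The sketch below rebuilds data of the same shape as the example above; it illustrates the idea and is not the code of plot_mm().

```r
# Sketch only: base graphics version of a spaghetti plot with group means.
set.seed(1)
N <- 10
d <- data.frame(id = rep(1:N, each = 4),
                time = rep(1:4, N),
                group = rep(1:2, each = 2 * N))
d$obs <- 1.1 * d$time + (d$group - 1) + rnorm(4 * N)

# Mean of obs at each time (rows) within each group (columns)
means <- tapply(d$obs, list(d$time, d$group), mean)

plot(range(d$time), range(d$obs), type = "n", xlab = "time", ylab = "obs")
for (i in unique(d$id)) {
  sub <- d[d$id == i, ]
  lines(sub$time, sub$obs, col = sub$group[1])   # one thin line per subject
}
matlines(as.numeric(rownames(means)), means,     # one thick line per group
         lty = 1, lwd = 4, col = c("blue", "red"))
```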
Plot a multi cross table
Description
Plots a multi cross table on a graph
Usage
plot_multi.table(data, vars, main = "")
Arguments
| data | the data frame containing the variables named in vars | 
| vars | character vector of covariate names | 
| main | main title of the plot | 
Value
None
Author(s)
Hugo Varet
Examples
plot_multi.table(cgd,c("treat","sex","inherit"))
Plot points with the corresponding linear regression line
Description
Plots points with the corresponding linear regression line
Usage
plot_reg(x, y, pch = 19, xlab = NULL, ylab = NULL, ...)
Arguments
| x | numeric vector | 
| y | numeric vector | 
| pch | type of points | 
| xlab | character string, label of the x axis,  | 
| ylab | character string, label of the y axis,  | 
| ... | other arguments to be passed in  | 
Value
None
Author(s)
Hugo Varet
Examples
plot_reg(cgd$age, cgd$height, xlab="Age (years)", ylab="Height")
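The same kind of figure can be produced in base R by fitting the line with lm() and overlaying it with abline(). The sketch below uses the built-in cars data and only illustrates the principle; any annotation that plot_reg() may add to the plot is not reproduced.

```r
# Sketch only: scatter plot with the least-squares line, using the cars data.
fit <- lm(dist ~ speed, data = cars)
plot(cars$speed, cars$dist, pch = 19, xlab = "Speed", ylab = "Stopping distance")
abline(fit, col = "red", lwd = 2)   # fitted regression line
print(coef(fit))
```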