
surreal implements the “Residual (Sur)Realism” algorithm
described by Stefanski (2007). This package allows you to generate
datasets that reveal hidden images or messages in their residual plots,
providing a novel approach to understanding and illustrating statistical
concepts.
You can install the development version of surreal from GitHub with:
# install.packages("remotes")
remotes::install_github("coatless-rpkg/surreal")
First, load the package:
library(surreal)
We can take an image with x and y
coordinate positions for pixels and embed it into the residual plot.
As an example, let’s use the built-in R logo dataset:
data("r_logo_image_data", package = "surreal")
plot(r_logo_image_data, pch = 16, main = "Original R Logo Data")
The data is a two-column data frame holding the x and y coordinates of each point:
str(r_logo_image_data)
#> 'data.frame':    2000 obs. of  2 variables:
#>  $ x: int  54 55 56 57 58 59 34 35 36 49 ...
#>  $ y: int  -9 -9 -9 -9 -9 -9 -10 -10 -10 -10 ...
summary(r_logo_image_data)
#>        x                y         
#>  Min.   :  5.00   Min.   :-75.00  
#>  1st Qu.: 32.00   1st Qu.:-57.00  
#>  Median : 57.00   Median :-39.00  
#>  Mean   : 55.29   Mean   :-40.48  
#>  3rd Qu.: 77.00   3rd Qu.:-24.00  
#>  Max.   :100.00   Max.   : -9.00
Now, let’s apply the surreal method:
set.seed(114)
transformed_data <- surreal(r_logo_image_data)
The transformation adds predictors that appear to have no underlying patterns:
pairs(y ~ ., data = transformed_data, main = "Data After Transformation")
Fit a linear model to the transformed data and plot the residuals:
model <- lm(y ~ ., data = transformed_data)
# Use the accessor functions rather than relying on partial matching of `$`
plot(fitted(model), resid(model), pch = 16,
     main = "Residual Plot: Hidden R Logo Revealed")
The residual plot reveals the original R logo, surrounded by a slight border that helps the image recovery.
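To build some intuition for why a residual plot can reproduce a chosen picture, here is a minimal base R sketch. It is not Stefanski's algorithm and not the package's implementation: it only illustrates the underlying least-squares fact that residuals orthogonal to the predictors are recovered exactly. The circle image, the coefficients in `b`, and all variable names are assumptions made for illustration, and unlike `surreal()` this toy makes no attempt to disguise the predictors.

```r
set.seed(42)

# Hypothetical target image: n points on a circle
# (u = horizontal positions, v = vertical positions).
n <- 200
theta <- seq(0, 2 * pi, length.out = n + 1)[-1]
u <- cos(theta)
v <- sin(theta)

# Target residuals: v with its intercept and its linear dependence on u removed,
# so that r is orthogonal to both the constant column and u.
r <- as.numeric(resid(lm(v ~ u)))

# Two noise predictors, with their component along r projected out.
Z <- matrix(rnorm(n * 2), nrow = n)
Z <- Z - outer(r, as.numeric(crossprod(Z, r))) / sum(r^2)

# Third predictor chosen so that u is an exact linear combination of x1, x2, x3.
b <- c(0.5, -0.3, 1)  # arbitrary illustrative coefficients
x3 <- as.numeric((u - Z %*% b[1:2]) / b[3])

toy <- data.frame(y = u + r, x1 = Z[, 1], x2 = Z[, 2], x3 = x3)
fit <- lm(y ~ ., data = toy)

# The fitted values recover u and the residuals recover r, up to rounding error.
max_fitted_error <- max(abs(fitted(fit) - u))
max_resid_error <- max(abs(resid(fit) - r))

plot(fitted(fit), resid(fit), pch = 16, main = "Toy Example: Circle in Residuals")
```

Because `y` is the sum of a part lying in the span of the predictors (`u`) and a part orthogonal to it (`r`), ordinary least squares separates the two exactly, which is what places the picture in the residual plot.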
You can also create datasets with custom hidden images or text. Here’s a quick example using text:
text_data <- surreal_text("R\nis\nawesome!")
model <- lm(y ~ ., data = text_data)
plot(fitted(model), resid(model), pch = 16, main = "Custom Text in Residuals")
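A custom image can be supplied in the same way as the R logo data. The sketch below builds a hypothetical heart-shaped point set in base R and, only if the package is installed, passes it to `surreal()`. It assumes that `surreal()` accepts any data frame with numeric `x` and `y` columns, as it does for `r_logo_image_data` above; the `heart` object and the parametric curve are illustrative choices, not part of the package.

```r
# Hypothetical custom image: 300 points along a heart-shaped parametric curve.
t <- seq(0, 2 * pi, length.out = 300)
heart <- data.frame(
  x = 16 * sin(t)^3,
  y = 13 * cos(t) - 5 * cos(2 * t) - 2 * cos(3 * t) - cos(4 * t)
)

# Run the transformation only when the surreal package is available.
if (requireNamespace("surreal", quietly = TRUE)) {
  set.seed(114)
  heart_data <- surreal::surreal(heart)  # assumes x/y data frame input, as above
  heart_model <- lm(y ~ ., data = heart_data)
  plot(fitted(heart_model), resid(heart_model), pch = 16,
       main = "Custom Image in Residuals")
}
```

The layout of the input mirrors `str(r_logo_image_data)`: one row per point, with the horizontal position in `x` and the vertical position in `y`.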
Stefanski, L. A. (2007). “Residual (Sur)realism”. The American Statistician, 61(2), 163-177. doi:10.1198/000313007X190079
This package builds upon the work of John Staudenmayer, Peter Wolf, and Ulrike Gromping, who initially brought these algorithms to R.