---
title: "Valid Prediction"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Valid Prediction}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This article provides more detail on how **bases** ensures valid prediction,
i.e., prevents any data leakage when new predictions are made.

Every basis function provides the `makepredictcall()` generic, which is called
by `model.frame()` and whose job it is to save any statistics used by the
basis expansion (such as a set of randomly sampled frequencies and phase shifts)
for reuse later.

Basis functions support the `predict()` generic, so that if they are called
outside of a model formula, they can be updated with new data.
Behind the scenes, `predict()` for the various basis functions is just a 
small wrapper around `makepredictcall()`.

To demonstrate these points, we will use the `b_rff()` basis function, which
uses random features.
However, the features are sampled once on construction and then retained for
further use.
First, in the modeling context, we'll fit a model with `b_rff()` in the formula.

```{r setup}
library(bases)
data(mtcars)

m = lm(mpg ~ b_rff(cyl, disp, hp, wt, p = 10), mtcars)
```

Repeated calls to `predict()` will yield the same predictions, even if the
`newdata` argument is not empty.

```{r}
all.equal(predict(m), predict(m, newdata = mtcars))
all.equal(predict(m, newdata = mtcars[5:10, ]), 
          predict(m, newdata = mtcars[5:10, ]))
```

The same is true if `b_rff()` is used outside of a formula.

```{r}
B = with(mtcars, b_rff(cyl, disp, hp, wt, p = 10))

all.equal(B, predict(B))
all.equal(B, predict(B, newdata = mtcars), check.attributes = FALSE)
nrow(predict(B, newdata = mtcars[1:3, ]))
```