Type: | Package |
Title: | Spatial Prediction Tools |
Description: | Methods for spatial predictive modeling, especially for spatial distribution models. This includes algorithms for model fitting and prediction, as well as methods for model evaluation. |
Version: | 0.1-19 |
Date: | 2025-05-06 |
Depends: | R (≥ 3.5.0), methods, terra |
Encoding: | UTF-8 |
Suggests: | disdat, rJava |
Maintainer: | Robert J. Hijmans <r.hijmans@gmail.com> |
License: | GPL (≥ 3) |
LazyLoad: | yes |
URL: | https://rspatial.org/sdm/ |
BugReports: | https://github.com/rspatial/predicts/issues/ |
NeedsCompilation: | no |
Packaged: | 2025-05-06 12:24:09 UTC; rhijm |
Author: | Robert J. Hijmans |
Repository: | CRAN |
Date/Publication: | 2025-05-06 13:10:02 UTC |
Spatial prediction
Description
This package implements functions for spatial predictions methods, especially spatial (species) distribution models, including an R link to the 'maxent' model.
Author(s)
Robert J. Hijmans
MaxEnt
Description
Build a "maxent" (Maximum Entropy) species distribution model (see references below). The function uses environmental data for locations of known presence and for a large number of 'background' locations. Environmental data can be extracted from raster files. The result is a model object that can be used to predict the suitability of other locations, for example, to predict the entire range of a species.
Background points are sampled randomly from the cells that are not NA
in the first predictor variable, unless background points are specified with argument a
.
This function uses the MaxEnt species distribution model software by Phillips, Dudik and Schapire.
Usage
## S4 method for signature 'SpatRaster,SpatVector'
MaxEnt(x, p, a=NULL, removeDuplicates=TRUE, nbg=10000, ...)
## S4 method for signature 'data.frame,numeric'
MaxEnt(x, p, args=NULL, path, silent=FALSE, ...)
## S4 method for signature 'missing,missing'
MaxEnt(x, p, silent=FALSE, ...)
Arguments
x |
Predictors. Either a SpatRaster to extract values from for the locations in |
p |
If If |
a |
Background points. Only used if |
nbg |
Number of background points to use. These are sampled randomly from the cells that are not |
args |
character. Additional argument that can be passed to MaxEnt. See the MaxEnt help for more information. The R MaxEnt function only uses the arguments relevant to model fitting. There is no point in using args='outputformat=raw' when *fitting* the model; but you can use arguments relevant for *prediction* when using the predict function. Some other arguments do not apply at all to the R implementation. An example is 'outputfiletype', because the 'predict' function has its own 'filename' argument for that |
removeDuplicates |
Boolean. If |
path |
character. Optional argument to set where you want the MaxEnt output files to be stored. This allows you to permanently keep these files. If not supplied the MaxEnt files will be stored in a temporary file. These are the files that are shown in a browser when typing the model name or when you use "show(model)" |
silent |
Boolean. If |
... |
Additional arguments |
Value
An object of class 'MaxEnt_model'. Or a 'MaxEnt_model_replicates' object if you use 'replicates=' as part of the args
argument.
If the function is run without any arguments a boolean value is returned (TRUE
if MaxEnt.jar was found).
Author(s)
Steven Phillips and Robert J. Hijmans
References
Steven J. Phillips, Miroslav Dudik, Robert E. Schapire, 2004. A maximum entropy approach to species distribution modeling. Proceedings of the Twenty-First International Conference on Machine Learning. p. 655-662.
Steven J. Phillips, Robert P. Anderson, Robert E. Schapire, 2006. Maximum entropy modeling of species geographic distributions. Ecological Modelling 190:231-259.
Jane Elith, Steven J. Phillips, Trevor Hastie, Miroslav Dudik, Yung En Chee, Colin J. Yates, 2011. A statistical explanation of MaxEnt for ecologists. Diversity and Distributions 17:43-57. doi:10.1111/j.1472-4642.2010.00725.x
See Also
Examples
# test the MaxEnt version
MaxEnt()
# get predictor variables
ff <- list.files("tif$", path=system.file("ex", package="predicts"), full.names=TRUE)
preds <- rast(ff)
plot(preds)
# file with presence points
occurence <- system.file("/ex/bradypus.csv", package="predicts")
occ <- read.csv(occurence)[,-1]
# witholding a 20% sample for testing
fold <- folds(occ, k=5)
occtest <- occ[fold == 1, ]
occtrain <- occ[fold != 1, ]
# fit model
me <- MaxEnt(preds, occtrain)
# see the MaxEnt results in a browser:
me
# use "args"
me2 <- MaxEnt(preds, occtrain, args=c("-J", "-P"))
# plot showing importance of each variable
plot(me)
# predict to entire dataset
r <- predict(me, preds)
# with some options:
r <- predict(me, preds, args=c("outputformat=raw"))
plot(r)
points(occ)
#testing
# background sample
bg <- backgroundSample(preds, 1000)
#simplest way to use 'evaluate'
e1 <- pa_evaluate(me, p=occtest, a=bg, x=preds)
# alternative 1
# extract values
pvtest <- data.frame(extract(preds, occtest))
avtest <- data.frame(extract(preds, bg))
e2 <- pa_evaluate(me, p=pvtest, a=avtest)
# alternative 2
# predict to testing points
testp <- predict(me, pvtest)
head(testp)
testa <- predict(me, avtest)
e3 <- pa_evaluate(p=testp, a=testa)
e3
threshold(e3)
plot(e3, 'ROC')
Root Mean Square Error
Description
Compute the Root Mean Square Error (RMSE)
Usage
RMSE(obs, prd, na.rm=FALSE)
RMSE_null(obs, prd, na.rm=FALSE)
Arguments
obs |
observed values |
prd |
predicted values |
na.rm |
logical. If |
Value
numeric
Class "SDM"
Description
Parent class for a number of models defined in the predicts package. This is a virtual Class, no objects may be direclty created from it.
Random points
Description
Generate random points that can be used to extract background values ("random-absence"). The points are sampled (without replacement) from the cells that are not 'NA
' in raster 'mask
'.
If the coordinate reference system (of mask
) is longitude/latitude, sampling is weighted by the size of the cells. That is, because cells close to the equator are larger than cells closer to the poles, equatorial cells have a higher probability of being selected.
Usage
backgroundSample(mask, n, p, ext=NULL, extf=1.1, excludep=TRUE,
cellnumbers=FALSE, tryf=3, warn=2)
Arguments
mask |
SpatRaster. If the object has cell values, cells with |
n |
integer. Number of points |
p |
Presence points (if provided, random points won't be in the same cells (as defined by mask) |
ext |
SpatExtent. Can be used to restrict sampling to a spatial extent |
extf |
numeric. Multiplyer to adjust the size of extent 'ext'. The default increases of 1.1 increases the extent a little (5% at each side of the extent) |
excludep |
logical. If |
cellnumbers |
logical. If |
tryf |
numeric > 1. Multiplyer used for initial sample size from which the requested sample size is extracted after removing NA points (outside of mask) |
warn |
integer. 2 or higher gives most warnings. 0 or lower gives no warnings if sample size |
Value
matrix with coordinates, or, if cellnumbers=TRUE
, a vector with cell numbers.
bioclimatic variables
Description
Function to create 'bioclimatic variables' from monthly climate data.
Usage
## S4 method for signature 'SpatRaster,SpatRaster,SpatRaster'
bcvars(prec, tmin, tmax, filename="", ...)
## S4 method for signature 'numeric,numeric,numeric'
bcvars(prec, tmin, tmax)
## S4 method for signature 'matrix,matrix,matrix'
bcvars(prec, tmin, tmax)
Arguments
prec |
numeric vector (12 values), matrix (12 columns), or SpatRaster with monthly (12 layers) precipitation data |
tmin |
same as |
tmax |
same as |
filename |
character. Output filename |
... |
additional arguments for writing files as in |
Details
Input data is normally monthly. I.e. there should be 12 values (layers) for each variable, but the function should also work for e.g. weekly data (with some changes in the meaning of the output variables. E.g. #8 would then not be for a quarter (3 months), but for a 3 week period).
Value
Same class as input, but 19 values/variables
bio1 = Mean annual temperature
bio2 = Mean diurnal range (mean of max temp - min temp)
bio3 = Isothermality (bio2/bio7) (* 100)
bio4 = Temperature seasonality (standard deviation *100)
bio5 = Max temperature of warmest month
bio6 = Min temperature of coldest month
bio7 = Temperature annual range (bio5-bio6)
bio8 = Mean temperature of the wettest quarter
bio9 = Mean temperature of driest quarter
bio10 = Mean temperature of warmest quarter
bio11 = Mean temperature of coldest quarter
bio12 = Total (annual) precipitation
bio13 = Precipitation of wettest month
bio14 = Precipitation of driest month
bio15 = Precipitation seasonality (coefficient of variation)
bio16 = Precipitation of wettest quarter
bio17 = Precipitation of driest quarter
bio18 = Precipitation of warmest quarter
Examples
tmin <- c(10,12,14,16,18,20,22,21,19,17,15,12)
tmax <- tmin + 5
prec <- c(0,2,10,30,80,160,80,20,40,60,20,0)
bcvars(prec, tmin, tmax)
tmn <- tmx <- prc <- rast(nrow=1, ncol=1, nlyr=12)
values(tmn) <- t(matrix(c(10,12,14,16,18,20,22,21,19,17,15,12)))
tmx <- tmn + 5
values(prc) <- t(matrix(c(0,2,10,30,80,160,80,20,40,60,20,0)))
b <- bcvars(prc, tmn, tmx)
as.matrix(b)
Model evaluation with a confusion matrix
Description
Get model evaluation statistics from a confusion matrix. This is useful when predicting (multiple) classes.
Usage
cm_evaluate(cmat, stat="overall")
Arguments
cmat |
confusion matrix. Normally created with table (see examples) |
stat |
character. Either "overall" (overall accuracy), "kappa", "class" for user and producer accuracy |
Value
numeric
See Also
Examples
classes <- c("forest", "water", "urban", "agriculture")
set.seed(1)
observed <- sample(classes, 100, replace=TRUE)
predicted <- observed
i <- seq(1,100,2)
predicted[i] <- sample(classes, length(i), replace=TRUE)
conmat <- table(observed, predicted)
conmat
cm_evaluate(conmat, "kappa")
cm_evaluate(conmat, "class")
Divide polygons into equal area parts
Description
stripper
divides polygons into horizontal or vertical strips of a specified relative size.
divider
divides a SpatVector
of polygons into n
compact and approximately equal area parts. The results are not deterministic so you should use set.seed
to be able to reproduce your results. If you get a warning about non-convergence, you can increase the number of iterations used with additional argument iter.max
Usage
divider(x, n, env=NULL, alpha=1, ...)
stripper(x, f=c(1/3, 2/3), vertical=TRUE)
Arguments
x |
SpatVector of polygons |
n |
positive integer. The number of parts requested |
env |
SpatRaster with environmental data |
alpha |
numeric. One or two numbers that act as weights for the x and y coordinates |
... |
additional arguments such as |
f |
numeric vector of fractions. These must be > 0 and < 1, and in ascending order |
vertical |
logical. If |
Value
SpatVector
Author(s)
stripper was derived from a function by Barry Rowlingson
Examples
f <- system.file("ex/lux.shp", package="terra")
v <- aggregate(vect(f))
set.seed(33)
d1 <- divider(v, 10)
plot(d1)
d2 <- divider(v, 100)
boxplot(expanse(d2, "km"))
x <- stripper(v, seq(0.1, 0.9, 0.1))
round(expanse(x,"km"), 1)
plot(x, col=rainbow(12))
Fit a (climate) envelope model and make predictions
Description
The envelope algorithm has been extensively used for species distribution modeling under the name "bioclim model". This is the classic 'climate-envelope-model' that started what was later called species distribution modeling and ecological niche modeling. Although it generally does not perform as good as some other methods (Elith et al. 2006) and is unsuited for predicting climate change effects (Hijmans and Graham, 2006). It may be useful in certain cases, among other reasons because the algorithm is easy to understand and thus useful in teaching species distribution modeling.
The algorithm computes the similarity of a location by comparing the values of environmental variables at any location to a percentile distribution of the values at known locations of occurrence ('training sites'). The closer to the 50th percentile (the median), the more suitable the location is. The tails of the distribution are not distinguished, that is, 10 percentile is treated as equivalent to 90 percentile.
In this R implementation, percentile scores are between 0 and 1, but predicted values larger than 0.5 are subtracted from 1. Then, the minimum percentile score across all the environmental variables is computed (i.e. this is like Liebig's law of the minimum, except that high values can also be limiting factors). The final value is subtracted from 1 and multiplied with 2 so that the results are between 0 and 1. The reason for this transformation is that the results become more like that of other distribution modeling methods and are thus easier to interpret. The value 1 will rarely be observed as it would require a location that has the median value of the training data for all the variables considered. The value 0 is very common as it is assigned to all cells with a value of an environmental variable that is outside the percentile distribution (the range of the training data) for at least one of the variables.
When using the predict function you can choose to ignore one of the tails of the distribution (for example, to make low rainfall a limiting factor, but not high rainfall).
Usage
envelope(x, ...)
Arguments
x |
matrix or data.frame where each column is an environmental variable and each row an occurrence. Alternatively, a SpatRaster where each layer is an environmental variable, in which case you must also provide argument 'p': a SpatVector of the occurrence points, or a data.frame of their spatial coordinates. Only grid cells that overlap with occurrences are used. |
... |
Additional arguments |
Value
An object of class 'envelope_model'
Author(s)
Robert J. Hijmans
References
Nix, H.A., 1986. A biogeographic analysis of Australian elapid snakes. In: Atlas of Elapid Snakes of Australia. (Ed.) R. Longmore, pp. 4-15. Australian Flora and Fauna Series Number 7. Australian Government Publishing Service: Canberra.
Booth, T.H., H.A. Nix, J.R. Busby and M.F. Hutchinson, 2014. BIOCLIM: the first species distribution modelling package, its early applications and relevance to most current MAXENT studies. Diversity and Distributions 20: 1-9
Elith, J., C.H. Graham, R.P. Anderson, M. Dudik, S. Ferrier, A. Guisan, R.J. Hijmans, F. Huettmann, J. Leathwick, A. Lehmann, J. Li, L.G. Lohmann, B. Loiselle, G. Manion, C. Moritz, M. Nakamura, Y. Nakazawa, J. McC. Overton, A.T. Peterson, S. Phillips, K. Richardson, R. Scachetti-Pereira, R. Schapire, J. Soberon, S. Williams, M. Wisz and N. Zimmerman, 2006. Novel methods improve prediction of species' distributions from occurrence data. Ecography 29: 129-151. doi:10.1111/j.2006.0906-7590.04596.x
Hijmans R.J., and C.H. Graham, 2006. Testing the ability of climate envelope models to predict the effect of climate change on species distributions. Global change biology 12: 2272-2281. doi:10.1111/j.1365-2486.2006.01256.x
See Also
Examples
# file with presence points
fsp <- system.file("/ex/bradypus.csv", package="predicts")
occ <- read.csv(fsp)[,-1]
#predictors
f <- system.file("ex/bio.tif", package="predicts")
preds <- rast(f)[[c(1,7,9)]]
v <- extract(preds, occ)
bc <- envelope(v[,-1])
d <- preds[18324:18374]
predict(bc, d)
p1 <- predict(bc, preds)
p2 <- predict(bc, preds, tails=c("both", "low", "high"))
Make folds for k-fold partitioning
Description
k-fold partitioning of a data set for model testing purposes. Each record in a matrix (or similar data structure) is randomly assigned to a group. Group numbers are between 1 and k
. The function assures that each fold has the same size (or as close to that as possible).
Usage
folds(x, k=5, by)
Arguments
x |
a vector, matrix, data.frame, or Spatial object |
k |
number of groups |
by |
Optional argument. A vector or factor with sub-groups (e.g. species). Its length should be the same as the number of records in x |
Value
a vector with group assignments
Author(s)
Robert J. Hijmans
Examples
library(disdat)
train <- disPo("NSW")
## a single species
srsp1 <- subset(train, spid=="nsw01")
folds(srsp1, k = 5)
## all species
k = folds(train, k=5, by=train$spid)
## each group has the same number of records
##(except for adjustments if the number of records
## divided by k is not an integer)
table(k[train$spid=="nsw01"])
hull model
Description
The hull model predicts that a species is present at sites inside the a hull that contains the training points, and is absent outside that circle.
The hull can be "convex", "circle", or "rectangle"
Usage
## S4 method for signature 'SpatVector'
hullModel(p, type="convex", n=1)
## S4 method for signature 'data.frame'
hullModel(p, type="convex", crs="", n=1)
## S4 method for signature 'matrix'
hullModel(p, type="convex", crs="", n=1)
Arguments
p |
point locations (presence). Two column matrix, data.frame or SpatVector |
type |
character. The type of hull. One of "convex", "circle", or "rectangle" |
crs |
character. The coordinate reference system |
n |
positive integer. The number of hulls to make |
Value
hull_model
Examples
r <- rast(system.file("ex/logo.tif", package="terra"))
#presence data
pts <- matrix(c(17, 42, 85, 70, 19, 53, 26, 84, 84, 46, 48, 85, 4,
95, 48, 54, 66, 74, 50, 48, 28, 73, 38, 56, 43, 29, 63, 22, 46, 45,
7, 60, 46, 34, 14, 51, 70, 31, 39, 26), ncol=2)
train <- pts[1:12, ]
test <- pts[13:20, ]
ch <- hullModel(train, crs="+proj=longlat")
predict(ch, test)
plot(r)
plot(ch, border="red", lwd=2, add=TRUE)
points(train, col="red", pch=20, cex=2)
points(test, col="black", pch=20, cex=2)
pr <- predict(ch, r)
plot(pr)
points(test, col="black", pch=20, cex=2)
points(train, col="red", pch=20, cex=2)
# to get the polygons:
p <- geometry(ch)
p
Multivariate environmental similarity surfaces (MESS)
Description
Compute multivariate environmental similarity surfaces (MESS), as described by Elith et al., 2010
Usage
## S4 method for signature 'SpatRaster'
mess(x, v, full=FALSE, filename="", ...)
## S4 method for signature 'data.frame'
mess(x, v, full=FALSE)
Arguments
x |
SpatRaster or data.frame |
v |
matrix or data.frame containing the reference values; each column should correspond to one layer of the SpatRaster object. If |
full |
logical. If |
filename |
character. Output filename (optional) |
... |
additional arguments as for |
Details
v
can be obtained for a set of points using extract
.
Value
SpatRaster (or data.frame) with layers (columns) corresponding to the input layers and an additional layer with the mess values (if full=TRUE
and nlyr(x) > 1
) or a SpatRaster (data.frame) with the MESS values (if full=FALSE
).
Author(s)
Jean-Pierre Rossi, Robert Hijmans, Paulo van Breugel
References
Elith J., M. Kearney M., and S. Phillips, 2010. The art of modelling range-shifting species. Methods in Ecology and Evolution 1:330-342. doi:10.1111/j.2041-210X.2010.00036.x
Examples
set.seed(9)
r <- rast(ncol=10, nrow=10)
r1 <- setValues(r, (1:ncell(r))/10 + rnorm(ncell(r)))
r2 <- setValues(r, (1:ncell(r))/10 + rnorm(ncell(r)))
r3 <- setValues(r, (1:ncell(r))/10 + rnorm(ncell(r)))
s <- c(r1,r2,r3)
names(s) <- c('a', 'b', 'c')
xy <- cbind(rep(c(10,30,50), 3), rep(c(10,30,50), each=3))
refpt <- extract(s, xy)
ms <- mess(s, refpt, full=TRUE)
plot(ms)
## Not run:
filename <- paste0(system.file(package="predicts"), "/ex/bradypus.csv")
bradypus <- read.table(filename, header=TRUE, sep=',')
bradypus <- bradypus[,2:3]
predfile <- paste0(system.file(package="predicts"), "/ex/bio.tif")
predictors <- rast(predfile)
reference_points <- extract(predictors, bradypus, ID=FALSE)
mss <- mess(x=predictors, v=reference_points, full=TRUE)
breaks <- c(-500, -50, -25, -5, 0, 5, 25, 50, 100)
fcol <- colorRampPalette(c("blue", "beige", "red"))
plot(mss[[10]], breaks=breaks, col=fcol(9), plg=list(x="bottomleft"))
## End(Not run)
Presence/absence Model evaluation
Description
Evaluation of models with presence/absence data. Given a vector of presence and a vector of absence values, confusion matrices are computed for a sequence of thresholds, and model evaluation statistics are computed for each confusion matrix / threshold.
Usage
pa_evaluate(p, a, model=NULL, x=NULL, tr, ...)
Arguments
p |
either (1) predictions for presence points ( |
a |
as above for absence or background points |
model |
A fitted model used to make predictions |
x |
SpatRaster used to extract predictor values from |
tr |
Optional. a vector of threshold values to use for computing the confusion matrices |
... |
Additional arguments passed on to |
Value
pa_ModelEvaluation object
details
A pa_ModelEvaluation object has the the following slots
presence
:presence values used
absence
:absence values used
confusion
:confusion matrix for each threshold
stats
:statistics that are not threshold dependent
tr_stats
:statistics that are threshold dependent
thresholds
:optimal thresholds to classify values into presence and absence
stats
has the following values
np
:number of presence points
na
:number of absence points
auc
:Area under the receiver operator (ROC) curve
pauc
:p-value for the AUC (for the Wilcoxon test W statistic
cor
:Correlation coefficient
pcor
:p-value for correlation coefficient
prevalence
:Prevalence
ODP
:Overall diagnostic power
tr_stats
has the following values
tresholds
:vector of thresholds used to compute confusion matrices
CCR
:Correct classification rate
TPR
:True positive rate
TNR
:True negative rate
FPR
:False positive rate
FNR
:False negative rate
PPP
:Positive predictive power
NPP
:Negative predictive power
MCR
:Misclassification rate
OR
:Odds-ratio
kappa
:Cohen's kappa
thresholds
has the following values
max_kappa
:the threshold at which kappa is highest
max_spec_sens
:the threshold at which the sum of the sensitivity (true positive rate) and specificity (true negative rate) is highest
no_omission
:the highest threshold at which there is no omission
prevalence
:modeled prevalence is closest to observed prevalence
equal_sens_spec
:equal sensitivity and specificity
References
Fielding, A.H. and J.F. Bell, 1997. A review of methods for the assessment of prediction errors in conservation presence/absence models. Environmental Conservation 24:38-49
Liu, C., M. White & G. Newell, 2011. Measuring and comparing the accuracy of species distribution models with presence-absence data. Ecography 34: 232-243.
See Also
Examples
set.seed(0)
# p has the predicted values for 50 known cases (locations)
# with presence of the phenomenon (species)
p <- rnorm(50, mean=0.6, sd=0.3)
# a has the predicted values for 50 background locations (or absence)
a <- rnorm(50, mean=0.4, sd=0.4)
e <- pa_evaluate(p=p, a=a)
e
e@stats
plot(e, "ROC")
plot(e, "TPR")
plot(e, "boxplot")
plot(e, "density")
str(e)
Get partial response data
Description
Get partial response data.
Usage
partialResponse(model, data, var=NULL, rng=NULL, nsteps=25, plot=TRUE, nr, nc, ...)
partialResponse2(model, data, var1, var2, var2levels, rng=NULL, nsteps=25, ...)
Arguments
model |
a model object |
data |
data.frame with data for all model variables |
var |
character or positive integer to identify the variable(s) of interest in |
var1 |
character or positive integer to identify the variable of interest in |
var2 |
character. A second variable of interest |
var2levels |
character. The levels of the second variable to consider |
rng |
optional vector of two numbers to set the range or the variable |
nsteps |
positive integer. Number of steps to consider for the variable |
plot |
logical. If |
nc |
positive integer. Optional. The number of columns to divide the plotting device in (when plotting multiple variables) |
nr |
positive integer. Optional. The number of rows to divide the plotting device in (when plotting multiple variables) |
... |
model specific additional arguments passed to |
Value
list (invisible if plot=TRUE
)
Examples
fsp <- system.file("/ex/bradypus.csv", package="predicts")
occ <- read.csv(fsp)[,-1]
f <- system.file("ex/bio.tif", package="predicts")
preds <- rast(f)[[c(1,7,9)]]
v <- extract(preds, occ, ID=FALSE)
bc <- envelope(v)
pr <- partialResponse(bc, data=v, var=c("bio1", "bio12"), nsteps=30)
str(pr)
Plot predictor values
Description
Plot predictor values for occurrence (presence and absence) data in a model object.
Usage
## S4 method for signature 'envelope_model,missing'
plot(x, a = 1, b = 2, p = 0.9, ocol="gray", icol="red", bcol="blue", cex=c(0.6, 0.6), ...)
## S4 method for signature 'MaxEnt_model,ANY'
plot(x, y, ...)
Arguments
x |
model object |
a |
name or position of the variable to plot in the x axis |
b |
name or position of the variable to plot in the y axis |
p |
percentile for coloring the points in the plot and delimiting the envelope rectangle |
ocol |
color of the points outside the envelope |
icol |
color of the points inside the envelope |
bcol |
color of the envelope border |
cex |
size of the points outside and inside the envelope |
y |
not used |
... |
additional arguments. Not used |
Spatial model predictions
Description
Make predictions with models defined in the predicts
package
Usage
## S4 method for signature 'MaxEnt_model'
predict(object, x, ext=NULL, args="", filename="", ...)
## S4 method for signature 'envelope_model'
predict(object, x, tails=NULL, ext=NULL, filename="", ...)
## S4 method for signature 'hull_model'
predict(object, x, ext=NULL, mask=FALSE, filename="", ...)
Arguments
object |
model defined in this package (e.g. "envelope_model" and "maxent_model") |
x |
data to predict to. Either a data.frame or a SpatRaster |
tails |
character. You can use this to ignore the left or right tail of the percentile distribution for a variable. If supplied, tails should be a character vector with a length equal to the number of variables used in the model. Valid values are "both" (the default), "low" and "high". For example, if you have a variable x with an observed distribution between 10 and 20 and you are predicting the bioclim value for a value of 25, the default result would be zero (outside of all observed values); but if you use tail='low', the high (right) tail is ignored and the value returned will be 1. |
args |
Pass *prediction* arguments (options) to the maxent software. See |
ext |
|
mask |
logical. If |
filename |
character. Output filename |
... |
additional arguments for writing files as in |
Value
SpatRaster or vector (if x
is a data.frame).
See Also
predict
function in the "terra" package for spatial predictions with glm, randomForest, etc.
Pair-wise distance sampling
Description
Select pairs of points from two sets (without replacement) that have a similar distance to their nearest point in another set of points.
For each point in "fixed
", a point is selected from "sample
" that has a similar distance (as defined by threshold
) to its nearest point in "reference
" (note that these are likely to be different points in reference
). The select point is either the nearest point nearest=TRUE
, or a randomly select point nearest=FALSE
that is within the threshold distance. If no point within the threshold distance is found in sample
, the point in fixed
is dropped.
Hijmans (2012) proposed this sampling approach to remove 'spatial sorting bias' from evaluation data used in cross-validation of presence-only species distribution models. In that context, fixed
are the testing-presence points, sample
the testing-absence (or testing-background) points, and reference
the training-presence points.
Usage
pwd_sample(fixed, sample, reference, tr=0.33, nearest=TRUE, n=1, lonlat=TRUE, warn=TRUE)
Arguments
fixed |
two column matrix (x, y) or (longitude/latitude) or SpatialPoints object, for point locations for which a pair should be found in |
sample |
as above for point locations from which to sample to make a pair with a point from |
reference |
as above for reference point locations to which distances are computed |
n |
How many pairs do you want for each point in |
tr |
Numeric, normally below 1. The threshold distance for a pair of points (one of |
nearest |
Logical. If |
lonlat |
Logical. Use |
warn |
Logical. If |
Value
A matrix of nrow(fixed) and ncol(n), that indicates, for each point (row) in fixed
which point(s) in sample
it is paired to; or NA
if no suitable pair was available.
References
Hijmans, R.J., 2012. Cross-validation of species distribution models: removing spatial sorting bias and calibration with a null-model. Ecology 93: 679-688
Examples
ref <- matrix(c(-54.5,-38.5, 2.5, -9.5, -45.5, 1.5, 9.5, 4.5, -10.5, -10.5), ncol=2)
fix <- matrix(c(-56.5, -30.5, -6.5, 14.5, -25.5, -48.5, 14.5, -2.5, 14.5,
-11.5, -17.5, -11.5), ncol=2)
r <- rast()
ext(r) <- c(-110, 110, -45, 45)
r[] <- 1
set.seed(0)
sam <- spatSample(r, 50, xy=TRUE, as.points=TRUE)
plot(sam, pch='x')
points(ref, col='red', pch=18, cex=2)
points(fix, col='blue', pch=20, cex=2)
i <- pwd_sample(fix, sam, ref, lonlat=TRUE)
i
sfix <- fix[!is.na(i), ]
ssam <- sam[i[!is.na(i)], ]
ssam
plot(sam, pch='x', cex=0)
points(ssam, pch='x')
points(ref, col='red', pch=18, cex=2)
points(sfix, col='blue', pch=20, cex=2)
# try to get 3 pairs for each point in 'fixed'
pwd_sample(fix, sam, ref, lonlat=TRUE, n=3)
Pycnophylactic interpolation.
Description
Given a SpatVector
of polygons and population data for each polygon, compute a population density estimate based on Tobler's pycnophylactic interpolation algorithm.
Usage
pycnophy(x, v, pop, r = 0.2, converge = 3, verbose=FALSE)
Arguments
x |
SpatRaster to interpolate to |
v |
SpatVector of polygons |
pop |
Either a character (name in |
r |
A relaxation parameter for the iterative step in the pycnophylactic algorithm. Prevents over-compensation in the smoothing step. In practice the default value works well |
converge |
A convergence parameter, informing the decision on when iterative improvements on the smooth surface have converged sufficiently - see details |
verbose |
If |
Details
This method uses an iterative approach, and for each iteration notes the maximum change in a pixel. When this value falls below a certain level (10^(-converge) times the largest initial grid cell value) the iteration stops.
Value
SpatRaster
Note
Pycnophylatic interpolation has the property that the sum of the estimated values associated with all of the pixels in any polygon equals the supplied population for that polygon. A further property is that all pixel values are greater than or equal to zero. The method is generally used to obtain pixel-based population estimates when total populations for a set of irregular polygons (eg. counties) are known.
Author(s)
Chris Brunsdon (adapted for terra objects by Robert Hijmans)
References
Tobler, W.R. (1979) Smooth Pycnophylactic Interpolation for Geographical Regions. Journal of the American Statistical Association, v74(367) pp. 519-530.
Examples
f <- system.file("ex/lux.shp", package="terra")
v <- vect(f)
r <- rast(v, resolution = 0.01)
p <- pycnophy(r, v, "POP", converge=3, verbose=FALSE)
plot(p); lines(v)
Find a threshold
Description
Find a threshold (cut-off) to transform model predictions (probabilities, distances, or similar values) to a binary score (presence or absence).
Usage
## S4 method for signature 'paModelEvaluation'
threshold(x)
Arguments
x |
paModelEvaluation object (see |
Value
data.frame with the following columns:
kappa: the threshold at which kappa is highest ("max kappa")
spec_sens: the threshold at which the sum of the sensitivity (true positive rate) and specificity (true negative rate) is highest
no_omission: the highest threshold at which there is no omission
prevalence: modeled prevalence is closest to observed prevalence
equal_sens_spec: equal sensitivity and specificity
Author(s)
Robert J. Hijmans and Diego Nieto-Lugilde
See Also
Examples
## See ?maxent for an example with real data.
# this is a contrived example:
# p has the predicted values for 50 known cases (locations)
# with presence of the phenomenon (species)
p <- rnorm(50, mean=0.7, sd=0.3)
# b has the predicted values for 50 background locations (or absence)
a <- rnorm(50, mean=0.4, sd=0.4)
e <- pa_evaluate(p=p, a=a)
threshold(e)
Get variable importance
Description
Get variable importance. The importance is expressed as the deterioration of the evaluation statistic. The statistic is computed n
times for model predictions after randomizing a predictor variable and subtracting the statistic for the non-randomized data. The larger the difference, the more important the variable is.
Usage
varImportance(model, y, x, n=10, stat, value="relative", ...)
Arguments
model |
a model object |
y |
the response variable used to fit the |
x |
data.frame with the predictor variables used to fit the |
n |
positive integer. Number of simulations |
stat |
character. For models with a continuous response variable this can be one of "RMSE" (the default), "AUC", or "cor". See |
value |
character specifying how to express the output. One of , "relative" (), "difference" (), "absolute" (no adjustments) |
... |
model specific additional arguments passed to |
Value
named numeric vector
Examples
set.seed(1)
d <- data.frame(y=1:10, x1=runif(10), x2=runif(10))
m <- lm(y~., data=d)
varImportance(m, d[,1], d[,2:3])