Type: | Package |
Title: | ML Estimation for Multivariate Normal Data with Missing Values |
Version: | 0.1-11.2 |
Description: | Finds the Maximum Likelihood (ML) Estimate of the mean vector and variance-covariance matrix for multivariate normal data with missing values. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
URL: | https://github.com/indenkun/mvnmle |
BugReports: | https://github.com/indenkun/mvnmle/issues |
Encoding: | UTF-8 |
LazyData: | true |
NeedsCompilation: | yes |
RoxygenNote: | 7.2.3 |
Imports: | stats |
Packaged: | 2023-02-25 16:07:14 UTC; kobayashi |
Author: | Kevin Gross |
Maintainer: | Mao Kobayashi <kobamao.jp@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2023-02-27 08:32:30 UTC |
Worm Infestations in Apple Crops
Description
The apple data frame provides the number of apples (in 100s) on 18 different apple trees.
For 12 trees, the percentage of apples with worms (x 100) is also given.
Usage
apple
Format
This data frame contains the following columns:
- size
hundreds of apples on the tree.
- worms
percentage (x100) of apples harboring worms.
Source
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Cochran, W. G., and Snedecor, G. W. (1972) Statistical Methods, 6th ed. Ames: Iowa State University Press, ISBN:0813815606.
Examples
library(mvnmle)
data(apple)
mlest(apple)
Create likelihood function for multivariate data with missing values.
Description
getclf returns a function proportional to twice the negative log likelihood function for multivariate normal data with missing values.
This is a private function used in mlest.
Usage
getclf(data, freq)
Arguments
data |
A data frame sorted so that records with identical patterns of missingness are grouped together. |
freq |
An integer vector specifying the number of records in each block of data with identical patterns of missingness. |
Details
The argument of the returned function is the vector of parameters. The parameterization is: mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).
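The parameter layout described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper names unpack_pars and neg2loglik_sketch are hypothetical, the sketch loops over rows rather than over blocks of identical missingness patterns, and it assumes that the matrix being parameterized is the inverse of the upper triangular factor returned by chol(). Additive constants are dropped, so the result is only proportional to twice the negative log likelihood up to a constant.

```r
# Hypothetical helper (not part of mvnmle): unpack a parameter vector laid out
# as (mean, log diagonal of Delta, off-diagonal elements of Delta).
unpack_pars <- function(pars, k) {
  stopifnot(length(pars) == k + k * (k + 1) / 2)
  mu <- pars[seq_len(k)]
  delta <- diag(exp(pars[k + seq_len(k)]), nrow = k)
  # upper.tri() indexes column by column, top to bottom within each column
  delta[upper.tri(delta)] <- pars[-seq_len(2 * k)]
  chol_factor <- solve(delta)  # assumption: Delta is the inverse of chol(Sigma)
  list(mu = mu, sigma = crossprod(chol_factor))
}

# Hypothetical objective: sum over rows of log|S_obs| + d' S_obs^{-1} d,
# where S_obs is the covariance of the observed components of that row.
neg2loglik_sketch <- function(pars, x) {
  x <- as.matrix(x)
  p <- unpack_pars(pars, ncol(x))
  total <- 0
  for (i in seq_len(nrow(x))) {
    obs <- !is.na(x[i, ])
    if (!any(obs)) next  # a fully missing row carries no information
    s <- p$sigma[obs, obs, drop = FALSE]
    d <- x[i, obs] - p$mu[obs]
    log_det <- as.numeric(determinant(s, logarithm = TRUE)$modulus)
    total <- total + log_det + sum(d * solve(s, d))
  }
  total
}

# Univariate check: mean 1, variance 1, observations 0, 1, 2
neg2loglik_sketch(c(1, 0), matrix(c(0, 1, 2)))
```

A function of this shape takes the full parameter vector as its only free argument once the data are fixed, which is the form a general-purpose minimizer such as nlm expects.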
Value
A function proportional to twice the negative log likelihood of the parameters given the data.
References
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
See Also
Obtain starting values for maximum likelihood estimation.
Description
Calculates the starting values to be passed to nlm for minimization of the negative log-likelihood for multivariate normal data with missing values.
This function is private to mlest.
Usage
getstartvals(x, eps = 0.001)
Arguments
x |
Multivariate data, potentially with missing values. |
eps |
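All eigenvalues of the variance-covariance matrix less than eps times the smallest positive eigenvalue are set equal to eps times the smallest positive eigenvalue. |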
Details
Starting values for the mean vector are simply sample means. Starting values for the variance-covariance matrix are derived from the sample variance-covariance matrix, after setting eigenvalues less than eps times the smallest positive eigenvalue equal to eps times the smallest positive eigenvalue to enforce positive definiteness.
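The eigenvalue adjustment can be sketched in base R as follows. The function name start_values_sketch is hypothetical, and the use of pairwise-complete observations for the sample variance-covariance matrix is an assumption made for illustration; the package's own choice is not stated in this section.

```r
# Hypothetical sketch of the starting-value calculation (not the package code).
start_values_sketch <- function(x, eps = 0.001) {
  x <- as.matrix(x)
  mu <- colMeans(x, na.rm = TRUE)              # sample means, ignoring NAs
  s <- cov(x, use = "pairwise.complete.obs")   # assumption: pairwise deletion
  e <- eigen(s, symmetric = TRUE)
  # Floor every eigenvalue at eps times the smallest positive eigenvalue
  floor_val <- eps * min(e$values[e$values > 0])
  vals <- pmax(e$values, floor_val)
  sigma <- e$vectors %*% diag(vals, nrow = length(vals)) %*% t(e$vectors)
  list(mu = mu, sigma = sigma)
}

# Small illustrative data set with missing values
x <- cbind(a = c(1, 2, 3, 4, NA),
           b = c(2, 1, NA, 5, 3),
           c = c(0, 1, 1, NA, 2))
sv <- start_values_sketch(x)
eigen(sv$sigma, symmetric = TRUE)$values  # all strictly positive after flooring
```

A covariance matrix assembled from pairwise-complete observations is not guaranteed to be positive definite, which is why an adjustment of this kind is needed before taking a Cholesky factor.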
Value
A numeric vector, containing the mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor of the adjusted sample variance-covariance matrix, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom).
See Also
Make the upper triangular matrix del from a parameter vector
Description
make.del takes a parameter vector of length k*(k+1)/2 and returns the upper triangular k \times k matrix \Delta.
make.del is a private function intended for use inside mlest.
Usage
make.del(pars)
Arguments
pars |
A length k*(k+1)/2 numeric vector. |
Details
The first k elements of pars are the log of the diagonal elements of \Delta. The next k*(k-1)/2 elements are the elements above the main diagonal of \Delta, ordered by column (left to right), and then by row within column (top to bottom). That is to say, if \Delta_{ij} is the element in the ith row and jth column of \Delta, then the order of the parameters is \Delta_{11}, \Delta_{22}, \ldots, \Delta_{kk}, \Delta_{12}, \Delta_{13}, \Delta_{23}, \Delta_{14}, \ldots, \Delta_{(k-1)k}.
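The ordering above matches the column-major order in which base R indexes the upper triangle of a matrix, so the construction can be sketched in a few lines. The name make_del_sketch is hypothetical and this is an illustration under that reading of the ordering, not the package's own code.

```r
# Hypothetical re-implementation for illustration only.
make_del_sketch <- function(pars) {
  # Recover k from length(pars) = k * (k + 1) / 2
  k <- round((sqrt(8 * length(pars) + 1) - 1) / 2)
  stopifnot(length(pars) == k * (k + 1) / 2)
  # First k elements: logs of the diagonal, so exponentiate them
  delta <- diag(exp(pars[seq_len(k)]), nrow = k)
  # Remaining elements fill the upper triangle column by column
  delta[upper.tri(delta)] <- pars[-seq_len(k)]
  delta
}

# Off-diagonal values are chosen to show their positions:
# 12 lands in row 1, column 2; 13 in row 1, column 3; 23 in row 2, column 3.
make_del_sketch(c(0, 0, 0, 12, 13, 23))
```

Storing the diagonal on the log scale keeps those entries strictly positive for any real-valued parameter vector.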
Value
An upper triangular k \times k matrix.
References
Pinheiro, J. C., and Bates, D. M. (2000) Mixed-effects models in S and S-PLUS. New York: Springer, ISBN:1441903178.
See Also
A multivariate data set with missing values.
Description
The missvals data frame has 13 rows and 5 columns.
These are data from Draper and Smith (1966, ISBN:0471221708), and are included to demonstrate Maximum Likelihood (ML) estimation of mean and variance-covariance parameters of multivariate normal data when some observations are missing.
Usage
missvals
Format
This data frame contains the following columns:
- x1, x2, x3, x4, x5
numeric vectors
Source
Draper, N. R., and Smith, H. (1966) Applied Regression Analysis. New York: Wiley, ISBN:0471221708.
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Rubin, D. B. (1976) Comparing regressions when some predictor variables are missing. Psychometrika 43, 3–10, doi:10.2307/1267523.
Examples
library(mvnmle)
data(missvals)
mlest(missvals, iterlim = 400)
ML Estimation of Multivariate Normal Data
Description
Finds the Maximum Likelihood (ML) Estimates of the mean vector and variance-covariance matrix for multivariate normal data with (potentially) missing values.
Usage
mlest(data, ...)
Arguments
data |
A data frame or matrix containing multivariate normal data. Each row should correspond to an observation, and each column to a component of the multivariate vector. Missing values should be coded by 'NA'. |
... |
Optional arguments to be passed to the nlm optimization routine. |
Details
The estimate of the variance-covariance matrix returned by mlest is necessarily positive semi-definite. Internally, nlm is used to minimize the negative log-likelihood, so optional arguments may be passed to nlm which modify the details of the minimization algorithm, such as iterlim. The likelihood is specified in terms of the inverse of the Cholesky factor of the variance-covariance matrix (see Pinheiro and Bates (2000, ISBN:1441903178)).
mlest cannot handle data matrices with more than 50 variables.
Each variable must also be observed at least once.
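One way to see why this parameterization always yields a valid variance-covariance matrix is the round trip sketched below, written in base R. It assumes the convention that the parameterized matrix is the inverse of the upper triangular factor returned by chol(), with its diagonal stored on the log scale; the helper rebuild_sigma is hypothetical and not part of the package.

```r
# Start from a known positive definite matrix
sigma <- matrix(c(4, 1, 1, 3), nrow = 2)

# Forward map: inverse of the Cholesky factor, diagonal on the log scale
delta <- solve(chol(sigma))
theta <- c(log(diag(delta)), delta[upper.tri(delta)])

# Hypothetical inverse map from unconstrained parameters back to a matrix
rebuild_sigma <- function(theta, k) {
  delta <- diag(exp(theta[seq_len(k)]), nrow = k)
  delta[upper.tri(delta)] <- theta[-seq_len(k)]
  crossprod(solve(delta))  # t(R) %*% R with R the inverse of delta
}

all.equal(rebuild_sigma(theta, 2), sigma)

# Any real-valued vector maps to a matrix with positive eigenvalues,
# so the optimizer can search without constraints.
arbitrary <- rebuild_sigma(c(-1, 2, 5), 2)
eigen(arbitrary, symmetric = TRUE)$values
```

Because the matrix is rebuilt as a cross-product of a triangular factor with a strictly positive diagonal, no constraint handling is required inside the minimizer.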
Value
muhat |
Maximum likelihood estimate (MLE) of the mean vector. |
sigmahat |
MLE of the variance-covariance matrix. |
value |
The objective function that is minimized by nlm. |
gradient |
The gradient of the objective function at the MLE, in the parameterization used internally by the optimization algorithm. This parameterization is: mean vector first, followed by the log of the diagonal elements of the inverse of the Cholesky factor, and then the elements of the inverse of the Cholesky factor above the main diagonal. These off-diagonal elements are ordered by column (left to right), and then by row within column (top to bottom). |
stop.code |
The stop code returned by nlm. |
iterations |
The number of iterations used by nlm. |
References
Little, R. J. A., and Rubin, D. B. (1987) Statistical Analysis with Missing Data. New York: Wiley, ISBN:0471802549.
Pinheiro, J. C., and Bates, D. M. (1996) Unconstrained parametrizations for variance-covariance matrices. Statistics and Computing 6, 289–296, doi:10.1007/BF00140873.
Pinheiro, J. C., and Bates, D. M. (2000) Mixed-effects models in S and S-PLUS. New York: Springer, ISBN:1441903178.
See Also
Examples
library(mvnmle)
data(apple)
mlest(apple)
data(missvals)
mlest(missvals, iterlim = 400)
Sort a multivariate data matrix according to patterns of missingness.
Description
mysort sorts a multivariate data matrix so that records with identical patterns of missingness are adjacent to one another.
mysort is a private function used inside of mlest.
Usage
mysort(x)
Arguments
x |
A multivariate data matrix. Rows correspond to individual records and columns correspond to components of the multivariate vector. |
Value
sorted.data |
A matrix of the same size as x, with rows sorted so that records with identical patterns of missingness are adjacent. |
freq |
An integer vector giving the number of records in each
block of rows with a unique pattern of missingness. The first
element in |
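The grouping described in this entry can be sketched in base R as follows. The function name sort_by_missingness is hypothetical, and the ordering of the blocks (here, lexicographic order of a 0/1 pattern string) is an assumption made for illustration rather than the package's documented behaviour.

```r
# Hypothetical sketch: group rows by their pattern of missing values.
sort_by_missingness <- function(x) {
  x <- as.matrix(x)
  # Encode each row's pattern as a string, e.g. "01" = second column missing
  pattern <- apply(is.na(x), 1, function(r) paste(as.integer(r), collapse = ""))
  ord <- order(pattern)      # ties keep their original relative order
  runs <- rle(pattern[ord])  # run lengths give the size of each block
  list(sorted.data = x[ord, , drop = FALSE], freq = runs$lengths)
}

x <- rbind(c(1, NA), c(2, 3), c(NA, 4), c(5, 6), c(7, NA))
res <- sort_by_missingness(x)
res$sorted.data
res$freq
```

Grouping rows this way means the observed sub-vector of the mean and the observed sub-matrix of the variance-covariance matrix only need to be extracted once per block rather than once per row.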