Type: Package
Title: Score Test Integrated with Empirical Bayes for Association Study
Version: 0.1.1
Author: Wenlong Ren
Maintainer: Wenlong Ren <wenlongren@ntu.edu.cn>
Description: Performs association tests within a linear mixed model framework, using a score test integrated with empirical Bayes, for genome-wide association studies. First, a score test is conducted for each marker under the linear mixed model framework, taking into account genetic relatedness and population structure. Then, all potentially associated markers are selected with a less stringent criterion. Finally, the selected markers are placed into a multi-locus model to identify the true quantitative trait nucleotides (QTNs).
License: GPL-3
Imports: data.table
Encoding: UTF-8
LazyData: true
NeedsCompilation: yes
Depends: R (≥ 3.5.0)
RoxygenNote: 7.1.1
Packaged: 2021-09-15 15:40:45 UTC; ThinkPad
Repository: CRAN
Date/Publication: 2021-09-15 21:10:12 UTC

Preconditioned Conjugate Gradient

Description

Use the preconditioned conjugate gradient method to speed up solving a system of linear equations.

Usage

PCG(G,b,m.marker,sigma.k2,sigma.e2,tol,miter)

Arguments

G

genotype data.

b

column vector.

m.marker

the number of markers.

sigma.k2

polygenic variance.

sigma.e2

variance of residual error.

tol

convergence threshold.

miter

the maximum number of iterations.

Value

x

x is the approximate solution of the linear equations.

Examples

data(geno)
G <- t(geno[,-c(1:4)])
n.sample <- dim(G)[1]
m.marker <- dim(G)[2]
b <- rnorm(n.sample)
sigma.k2 <- 6.0
sigma.e2 <- 10.0
tol <- 5e-4
miter <- 20
PCG(G,b,m.marker,sigma.k2,sigma.e2,tol,miter)
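
The idea behind the method can be illustrated with a minimal sketch of a Jacobi-preconditioned conjugate gradient solver for a symmetric positive-definite system A x = b. The helper name pcg_sketch, the Jacobi (diagonal) preconditioner, and the form A = sigma.k2 * G G' / m.marker + sigma.e2 * I are assumptions made for illustration only; the package's PCG() may build the system and the preconditioner differently.

```r
# Hypothetical helper, not the package's PCG(): Jacobi-preconditioned
# conjugate gradient for a symmetric positive-definite matrix A.
pcg_sketch <- function(A, b, tol = 1e-6, miter = 100) {
  x <- rep(0, length(b))          # initial guess
  r <- b - as.vector(A %*% x)     # initial residual
  Minv <- 1 / diag(A)             # Jacobi preconditioner: inverse of diag(A)
  z <- Minv * r                   # preconditioned residual
  p <- z                          # first search direction
  rz <- sum(r * z)
  for (k in seq_len(miter)) {
    Ap <- as.vector(A %*% p)
    alpha <- rz / sum(p * Ap)     # step length along p
    x <- x + alpha * p
    r <- r - alpha * Ap
    if (sqrt(sum(r^2)) < tol) break   # stop once the residual norm is small
    z <- Minv * r
    rz_new <- sum(r * z)
    p <- z + (rz_new / rz) * p    # next conjugate search direction
    rz <- rz_new
  }
  x
}

# Simulated stand-in data (not the package's geno dataset)
set.seed(1)
n <- 50
m <- 200
G <- matrix(sample(c(-1, 1), n * m, replace = TRUE), n, m)
A <- 6.0 * tcrossprod(G) / m + 10.0 * diag(n)   # assumed form of the system
b <- rnorm(n)
x <- pcg_sketch(A, b, tol = 1e-8, miter = 200)
max(abs(A %*% x - b))   # largest residual entry, should be close to zero
```

Each iteration needs only matrix-vector products, which is why an iterative solver of this kind can be cheaper than inverting an n x n matrix directly.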

Score Test Integrated with Empirical Bayes for Association Study

Description

Performs association tests within a linear mixed model framework, using a score test integrated with empirical Bayes, for genome-wide association studies. First, a score test is conducted for each marker under the linear mixed model framework, taking into account genetic relatedness and population structure. Then, all potentially associated markers are selected with a less stringent criterion. Finally, the selected markers are placed into a multi-locus model to identify the true quantitative trait nucleotides (QTNs).

Usage

ScoreEB(genofile, phenofile, popfile = NULL, trait.num = 1, EMB.tau = 0,
EMB.omega = 0, B.Moment = 20, tol.pcg = 1e-4, iter.pcg = 100, bin = 100,
lod.cutoff = 3.0, seed.num = 10000, dir_out)

Arguments

genofile

Genotype file name, including the path where the file is located, e.g., "D:/Genotype_Example.csv".

phenofile

Phenotype file name, including the path where the file is located, e.g., "D:/Phenotype_Example.csv".

popfile

Population structure file name, including the path where the file is located, e.g., "D:/Population.csv".

trait.num

Number of traits to analyze; traits are computed from the 1st to the trait.num-th.

EMB.tau

EMB.tau and EMB.omega are the two hyperparameters of the empirical Bayes step; both are set to 0 by default.

EMB.omega

As described in EMB.tau.

B.Moment

B.Moment is a parameter used to approximate the trace of an NxN matrix by the method of moments. B.Moment is set to 20 by default.

tol.pcg

tol.pcg and iter.pcg are the tolerance and the maximum number of iterations of the preconditioned conjugate gradient algorithm.

iter.pcg

As described in tol.pcg.

bin

bin defines the range within which the marker with the maximum score is chosen.

lod.cutoff

lod.cutoff is the LOD score threshold used to declare identified QTNs.

seed.num

Seed for the random number generator.

dir_out

Path where the results will be saved, e.g., "D:/Result".
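
To illustrate the role of B.Moment, the sketch below approximates the trace of a square matrix from B random probe vectors (a Hutchinson-type estimator with entries of -1 or 1). The helper name hutchinson_trace and the choice of this particular estimator are assumptions for illustration; the source only states that the trace is obtained approximately by the method of moments, so the package may use a different scheme.

```r
# Hypothetical helper, not a ScoreEB function: randomized trace estimate
# using B probe vectors with independent entries of -1 or 1.
hutchinson_trace <- function(A, B = 20) {
  n <- nrow(A)
  est <- numeric(B)
  for (i in seq_len(B)) {
    z <- sample(c(-1, 1), n, replace = TRUE)   # random probe vector
    est[i] <- sum(z * as.vector(A %*% z))      # quadratic form z' A z
  }
  mean(est)                                    # average over the B probes
}

set.seed(10000)
n <- 100
M <- matrix(rnorm(n * n), n, n)
A <- crossprod(M) / n + diag(n)   # simulated symmetric positive-definite matrix
exact <- sum(diag(A))
approx <- hutchinson_trace(A, B = 20)
c(exact = exact, approx = approx)
```

A larger B lowers the variance of the estimate at the cost of more matrix-vector products.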

Value

result.total

A data frame of identified markers, including "Trait", "Id", "Chr", "Pos", "Score", "Beta", "Lod" and "Pvalue" of markers.

Note

1. genofile and phenofile are required input files, while popfile is an optional input file.
2. After the run, two result files, "ScoreEB.Result.csv" and "ScoreEB.time.csv", are generated and saved in the output folder (tempdir() in the example below).
3. The results file "ScoreEB.Result.csv" has 8 columns: "Trait", "Id", "Chr", "Pos", "Score", "Beta", "Lod" and "Pvalue".
4. The time file "ScoreEB.time.csv" has 3 rows: "User", "System" and "Elapse" time.
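
As background for the "Lod" and "Pvalue" columns and the lod.cutoff argument, the sketch below converts a LOD score to a P value using the standard relation LRT = 2 * ln(10) * LOD and a chi-square reference distribution. The helper name lod_to_pvalue and the choice of 1 degree of freedom are assumptions for illustration; the exact conversion used inside the package is not stated in the source.

```r
# Hypothetical helper, not a ScoreEB function: LOD score to P value,
# assuming a chi-square reference distribution with df degrees of freedom.
lod_to_pvalue <- function(lod, df = 1) {
  lrt <- 2 * log(10) * lod                   # likelihood ratio statistic
  pchisq(lrt, df = df, lower.tail = FALSE)   # upper-tail probability
}

p_at_cutoff <- lod_to_pvalue(3.0)   # P value at the default lod.cutoff
p_at_cutoff
```

Under these assumptions, the default cutoff of LOD = 3 corresponds to a P value on the order of 2e-4.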

Author(s)

Wenlong Ren
Wenlong Ren <wenlongren@ntu.edu.cn>

Examples

genofile <- system.file("extdata", "Genotype_Example.csv", package = "ScoreEB")
phenofile <- system.file("extdata", "Phenotype_Example.csv", package = "ScoreEB")
dir_out <- tempdir()
ScoreEB(genofile, phenofile, popfile = NULL, trait.num = 1, EMB.tau = 0,
        EMB.omega = 0, B.Moment = 20, tol.pcg = 1e-4, iter.pcg = 100, bin = 100,
        lod.cutoff = 3.0, seed.num = 10000, dir_out)

Empirical Bayes for multi-locus selection

Description

Empirical Bayes estimation using the expectation–maximization (EM) algorithm.

Usage

ebayes_EM(x,z,y,EMB.tau,EMB.omega)

Arguments

x

fixed effect vector or matrix.

z

genotype data.

y

phenotype data.

EMB.tau

one of the hyperparameters of the inverse chi-square distribution.

EMB.omega

one of the hyperparameters of the inverse chi-square distribution.

Value

u

The estimated effect values of the markers; their absolute values are used as the basis for further screening.

Examples

data(geno)
data(pheno)
EMB.tau <- 0
EMB.omega <- 0
z <- t(geno[,-c(1:4)])
y <- as.matrix(pheno)
nsample <- dim(z)[1]
x <- as.matrix(rep(1,nsample))
ebayes_EM(x,z,y,EMB.tau,EMB.omega)
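
The screening step mentioned under Value can be sketched as follows: markers are kept when the absolute value of their estimated effect exceeds a cutoff. The vector u here is simulated rather than returned by ebayes_EM(), and the cutoff of 0.1 is a hypothetical value chosen only for this illustration.

```r
# Simulated stand-in for the effect vector u (not real ebayes_EM() output)
set.seed(1)
u <- rnorm(10, sd = 0.01)      # most effects are shrunk close to zero
u[c(3, 7)] <- c(0.8, -0.6)     # two markers with sizeable effects

cutoff <- 0.1                  # hypothetical screening threshold
selected <- which(abs(u) > cutoff)                            # markers kept
ranked <- selected[order(abs(u[selected]), decreasing = TRUE)]  # largest first
ranked
```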

Genotype of example data

Description

Genotype dataset with SNP chromosome, position, etc.

Usage

data(geno)

Details

Dataset input of genotype in ScoreEB function.


Carry out likelihood ratio test

Description

SNPs selected via EM-Bayes are further identified by the likelihood ratio test.

Usage

likelihood(xxn,xxx,yn,bbo)

Arguments

xxn

fixed effect vector or matrix.

xxx

matrix of the SNPs selected by EM-Bayes.

yn

phenotype data.

bbo

effect values of the SNPs estimated by EM-Bayes.

Value

lod

Vector of the LOD (logarithm of odds) scores of the markers.

Examples

data(geno)
data(pheno)
z <- t(geno[,-c(1:4)])
y <- as.matrix(pheno)
n.sample <- dim(z)[1]
m.marker <- dim(z)[2]
x <- as.matrix(rep(1,n.sample))
beta <- rnorm(m.marker)
likelihood(x,z,y,beta)
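
For orientation, a LOD score from a likelihood ratio test can be sketched with ordinary linear models: for each marker, the log-likelihood of the full model is compared with that of the model without the marker, and the difference is divided by ln(10). The helper name lod_sketch is hypothetical, the data are simulated, and the sketch ignores the bbo argument; it is not the package's likelihood() implementation.

```r
# Hypothetical helper, not the package's likelihood(): per-marker LOD score
# from comparing the full model with the model that drops that marker.
lod_sketch <- function(x, z, y) {
  y <- as.vector(y)
  X_full <- cbind(x, z)
  ll_full <- as.numeric(logLik(lm(y ~ 0 + X_full)))    # full-model log-likelihood
  vapply(seq_len(ncol(z)), function(j) {
    X_red <- cbind(x, z[, -j, drop = FALSE])           # design without marker j
    ll_red <- as.numeric(logLik(lm(y ~ 0 + X_red)))
    (ll_full - ll_red) / log(10)                       # natural log to log10 scale
  }, numeric(1))
}

# Simulated data: 3 markers, only the second one has a real effect
set.seed(1)
n <- 100
z <- matrix(sample(c(-1, 1), n * 3, replace = TRUE), n, 3)
x <- matrix(1, n, 1)                                   # intercept as fixed effect
y <- 1 + 1.5 * z[, 2] + rnorm(n)
lod <- lod_sketch(x, z, y)
lod
```

Markers whose LOD exceeds a chosen threshold (such as lod.cutoff in ScoreEB) would then be reported.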

Multivariate normal distribution

Description

Evaluate the multivariate normal density function, which is used to obtain P values.

Usage

multinormal(y,mean,sigma)

Arguments

y

column vector.

mean

arithmetic mean.

sigma

standard deviation.

Value

pdf_value

A vector of values of the multivariate normal density function.

Examples

data(pheno)
y <- pheno
mean <- 2.0
sigma <- 1.5
multinormal(y,mean,sigma)

Phenotype of example data

Description

Phenotype dataset of multiple traits.

Usage

data(pheno)

Details

Dataset input of phenotype in ScoreEB function.