Title: | Bayesian Essentials with R |
---|---|
Description: | Allows the reenactment of the R programs used in the book Bayesian Essentials with R without further programming. Since the R code is also available, the programs can be modified by the user to conduct one's own simulations. Marin J.-M. and Robert C. P. (2014) <doi:10.1007/978-1-4614-8687-9>. |
Authors: | Jean-Michel Marin [aut, cre], Christian P. Robert [aut] |
Maintainer: | Jean-Michel Marin <[email protected]> |
License: | GPL-2 |
Version: | 1.6 |
Built: | 2024-12-29 03:19:01 UTC |
Source: | https://github.com/jmm34/bayess |
This function is associated with Chapter 5 on capture-recapture models. It simulates samples from the non-standard distribution of r1, the number of individuals vanishing between the first and second experiments, as expressed in (5.4) in the book, conditional on r2, the number of individuals vanishing between the second and third experiments.
ardipper(nsimu, n1, c2, c3, r2, q1)
nsimu | number of simulations
n1 | first capture sample size
c2 | number of individuals recaptured during the second experiment
c3 | number of individuals recaptured during the third experiment
r2 | number of individuals vanishing between the second and third experiments
q1 | probability of disappearing from the population
A sample of nsimu integers.
ardipper(10,11,3,1,0,.1)
This function is related to Chapter 6 on dynamic models. It returns the numerical value of the log-likelihood associated with a time series and an AR(p) model, along with the natural coefficients of the AR(p) model if it is defined via the roots lr and lc of the associated lag-polynomial. The function thus uses either the natural parameterisation of the AR(p) model or the parameterisation via the roots of the lag-polynomial 1 - psi_1 B - ... - psi_p B^p = (1 - lambda_1 B) x ... x (1 - lambda_p B), where B denotes the lag operator.
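As a reading aid, here is a minimal sketch (not part of the package) of how the AR(p) coefficients can be recovered from the lag-polynomial roots by expanding the product above; the function name roots2psi and the sign convention psi_k = -c_k are assumptions following the standard AR parameterisation.

# Expand prod_i (1 - lambda_i * B) to get coefficients c_0 = 1, c_1, ..., c_p,
# then read off psi_k = -c_k
roots2psi <- function(lambda) {
  coef <- c(1)
  for (lam in lambda) {
    coef <- c(coef, 0) - lam * c(0, coef)  # multiply the polynomial by (1 - lam * B)
  }
  -Re(coef[-1])                            # real for conjugate complex pairs
}
roots2psi(c(-.1, .5, .2))                  # roots used in the example below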
ARllog(p, dat, pr, pc, lr, lc, mu, sig2, compsi = TRUE, pepsi = c(1, rep(0, p)))
p | order of the AR model
dat | time series modelled by the AR(p) model
pr | number of real roots
pc | number of non-conjugate complex roots
lr | real roots
lc | complex roots, stored as real part for odd indices and imaginary part for even indices
mu | drift coefficient
sig2 | variance of the Gaussian white noise
compsi | boolean variable indicating whether the coefficients of the AR(p) model are to be computed from the roots
pepsi | potential p+1 coefficients of the lag-polynomial
ll | value of the log-likelihood
ps | vector of the coefficients of the lag-polynomial
ARllog(p=3,dat=faithful[,1],pr=3,pc=0,lr=c(-.1,.5,.2),lc=0,mu=0,
  sig2=var(faithful[,1]),compsi=FALSE,pepsi=c(1,rep(.1,3)))
This function is associated with Chapter 6 on dynamic models. It implements a Metropolis–Hastings algorithm on the coefficients of the AR(p) model, based on a simulation of the real and complex roots of the model. It includes jumps between adjacent numbers of real and complex roots, as well as random modifications for a given number of real and complex roots.
ARmh(x, p = 1, W = 10^3)
x | time series to be modelled as an AR(p) model
p | order of the AR(p) model
W | number of iterations
Even though Bayesian Essentials with R does not cover the reversible jump MCMC techniques due to Green (1995), which allow exploring spaces of different dimensions at once, this function relies on a simple form of reversible jump MCMC when moving from one number of complex roots to the next.
psis | matrix of simulated coefficients of the AR(p) model
mus | vector of simulated drift coefficients mu
sigs | vector of simulated variances sigma^2
llik | vector of corresponding likelihood values (useful to check for convergence)
pcomp | vector of simulated numbers of complex roots
Green, P.J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711–732.
data(Eurostoxx50)
x=Eurostoxx50[, 4]
resAR5=ARmh(x=x,p=5,W=50)
plot(resAR5$mus,type="l",col="steelblue4",xlab="Iterations",ylab=expression(mu))
The bank dataset we analyze in the first part
of Chapter 3 comes from Flury and Riedwyl (1988) and
is made of four measurements on 100 genuine Swiss banknotes and 100 counterfeit ones.
The response variable is thus the status of the banknote, where 0 stands
for genuine and 1 stands for counterfeit, while the explanatory factors are bill
measurements.
data(bank)
A data frame with 200 observations on the following 5 variables.
x1
length of the bill (in mm)
x2
width of the left edge (in mm)
x3
width of the right edge (in mm)
x4
bottom margin width (in mm)
y
response variable
Flury, B. and Riedwyl, H. (1988) Multivariate Statistics. A Practical Approach, Chapman and Hall, London-New York.
data(bank)
summary(bank)
This function contains the R code for the implementation of Zellner's g-prior analysis of the regression model as described in Chapter 3. The purpose of BayesReg is dual: first, this R function shows how easily this approach can be automated. Second, it illustrates how to obtain exactly the same type of output as the standard R function summary(lm(y~X)). In particular, it calculates the Bayes factors for variable selection, more precisely for single-variable exclusion.
BayesReg(y, X, g = length(y), betatilde = rep(0, dim(X)[2]), prt = TRUE)
y | response variable
X | matrix of regressors
g | constant g of Zellner's g-prior
betatilde | prior mean on the regression coefficient beta
prt | boolean variable for printing out the standard output
postmeancoeff | posterior mean of the regression coefficients
postsqrtcoeff | posterior standard deviation of the regression coefficients
log10bf | log-Bayes factors against the full model
postmeansigma2 | posterior mean of the variance of the model
postvarsigma2 | posterior variance of the variance of the model
data(faithful)
BayesReg(faithful[,1],faithful[,2])
The caterpillar dataset is extracted from a 1973 study on pine processionary caterpillars. The response variable is the log transform of the number of nests per unit. There are eight potential explanatory variables measured on 33 areas.
data(caterpillar)
A data frame with 33 observations on the following 9 variables.
x1
altitude (in meters)
x2
slope (in degrees)
x3
number of pine trees in the area
x4
height (in meters) of the tree sampled at the center of the area
x5
orientation of the area (from 1 if southbound to 2 otherwise)
x6
height (in meters) of the dominant tree
x7
number of vegetation strata
x8
mix settlement index (from 1 if not mixed to 2 if mixed)
y
logarithmic transform of the average number of nests of caterpillars per tree
This dataset is used in Chapter 3 on linear regression. It assesses the influence of some forest settlement characteristics on the development of caterpillar colonies. It was first published and studied in Tomassone et al. (1993). The response variable is the logarithmic transform of the average number of nests of caterpillars per tree in an area of 500 square meters (which corresponds to the last column in caterpillar). There are eight potential explanatory variables defined on the 33 areas.
Tomassone, R., Dervin, C., and Masson, J.P. (1993) Biometrie: modelisation de phenomenes biologiques. Dunod, Paris.
data(caterpillar)
summary(caterpillar)
The dataset used in Chapter 6 is derived from an image of a license plate, called license and not provided in the package. The actual histogram of the grey levels is concentrated on 256 values because of the poor resolution of the image, so we transformed the original data into datha.txt.
data(datha)
A data frame with 2625 observations on the following variable.
x
Grey levels
datha.txt was produced by the following R code:

> license=jitter(license,10)
> datha=log((license-min(license)+.01)/
+   (max(license)+.01-license))
> write.table(datha,"datha.txt",row.names=FALSE,col.names=FALSE)

where jitter is used to randomize the dataset and avoid repetitions.
data(datha)
datha=as.matrix(datha)
range(datha)
Dnadataset is a base sequence corresponding to a complete HIV (Human Immunodeficiency Virus) genome where A, C, G, and T have been recoded as 1, 2, 3, 4. It is modelled as a hidden Markov chain and is used in Chapter 7.
data(Dnadataset)
A data frame with 9718 rows and two columns, the first one corresponding to the row number and the second one to the base value coded from 1 to 4.
data(Dnadataset)
summary(Dnadataset)
This capture-recapture dataset on the European dipper bird covers 7 years (1981-1987 inclusive) of observations of captures within one of three zones. It is used in Chapter 5.
data(eurodip)
A data frame with 294 observations on the following 7 variables.
t1
non-capture/location on year 1981
t2
non-capture/location on year 1982
t3
non-capture/location on year 1983
t4
non-capture/location on year 1984
t5
non-capture/location on year 1985
t6
non-capture/location on year 1986
t7
non-capture/location on year 1987
The data consist of markings and recaptures of breeding adults each year during the breeding period, from early March to early June. Birds were at least one year old when initially banded. In eurodip, each row of seven digits corresponds to a capture-recapture story for a given dipper, 0 indicating an absence of capture that year and, in the case of a capture, 1, 2, or 3 representing the zone where the dipper is captured. This dataset corresponds to three geographical zones covering 200 square kilometers in eastern France. It was kindly provided to us by J.D. Lebreton.
Lebreton, J.-D., K. P. Burnham, J. Clobert, and D. R. Anderson. (1992) Modeling survival and testing biological hypotheses using marked animals: case studies and recent advances. Ecol. Monogr. 62, 67-118.
data(eurodip)
summary(eurodip)
This dataset is a collection of four time series connected with the stock market. Those are the stock values of the four companies ABN Amro, Aegon, Ahold Kon., and Air Liquide, observed from January 1, 1998, to November 9, 2003.
data(Eurostoxx50)
A data frame with 1486 observations on the following 5 variables.
date
six-digit date
Abn
value of the ABN Amro stock
Aeg
value of the Aegon stock
Aho
value of the Ahold Kon. stock
AL
value of the Air Liquide stock
Those four companies are the first stocks (in alphabetical order) to appear in the financial index Eurostoxx50.
data(Eurostoxx50)
summary(Eurostoxx50)
This function implements a regular Gibbs sampling algorithm on the posterior distribution associated with a mixture of normal distributions, taking advantage of the missing data structure. It then averages the simulations over all permutations of the component indices in order to avoid incomplete label switching and to validate Chib's representation of the evidence. This function reproduces gibbsnorm as its first stage, but it may be much slower because of its second stage.
gibbs(niter, datha, mix)
niter | number of Gibbs iterations
datha | sample vector
mix | list made of k, the number of components, and of starting values for the weights, means, and variances
k | number of components in the mixture (superfluous as it is invariant over the execution of the R code)
mu | matrix of the Gibbs samples of the component means
sig | matrix of the Gibbs samples of the component variances
prog | matrix of the Gibbs samples of the mixture weights
lolik | vector of the observed log-likelihoods along iterations
chibdeno | denominator of Chib's approximation to the evidence (see example below)
Chib, S. (1995) Marginal likelihood from the Gibbs output. J. American Statist. Assoc. 90, 1313-1321.
faithdata=faithful[,1]
mu=rnorm(3,mean=mean(faithdata),sd=sd(faithdata)/10)
sig=1/rgamma(3,shape=10,scale=var(faithdata))
mix=list(k=3,p=rdirichlet(par=rep(1,3)),mu=mu,sig=sig)
resim3=gibbs(100,faithdata,mix)
lulu=order(resim3$lolik)[100]
lnum1=resim3$lolik[lulu]
lnum2=sum(dnorm(resim3$mu[lulu,],mean=mean(faithdata),sd=resim3$sig[lulu,],log=TRUE)+
  dgamma(resim3$sig[lulu,],10,var(faithdata),log=TRUE)-2*log(resim3$sig[lulu,]))+
  sum((rep(0.5,mix$k)-1)*log(resim3$p[lulu,]))+
  lgamma(sum(rep(0.5,mix$k)))-sum(lgamma(rep(0.5,mix$k)))
lchibapprox3=lnum1+lnum2-log(resim3$deno)
This function implements a regular Gibbs sampler associated with Chapter 5 for a two-stage capture-recapture model with open populations, accounting for the possibility that some individuals vanish between two successive capture experiments.
gibbscap1(nsimu, n1, c2, c3, N0 = n1/runif(1), r10, r20)
nsimu | number of simulated values in the sample
n1 | first capture population size
c2 | number of individuals recaptured during the second experiment
c3 | number of individuals recaptured during the third experiment
N0 | starting value for the population size
r10 | starting value for the number of individuals who vanished between the first and second experiments
r20 | starting value for the number of individuals who vanished between the second and third experiments
N | Gibbs sample of the simulated population size
p | Gibbs sample of the probability of capture
q | Gibbs sample of the probability of leaving the population
r1 | Gibbs sample of the number of individuals who vanished between the first and second experiments
r2 | Gibbs sample of the number of individuals who vanished between the second and third experiments
res=gibbscap1(100,32,21,15,200,10,5)
plot(res$p,type="l",col="steelblue3",xlab="iterations",ylab="p")
In the Arnason-Schwarz capture-recapture model (see Chapter 5), individual histories are observed and missing steps can be inferred. For the dataset eurodip, the moves between regions can be reconstituted. This is the first instance of a hidden Markov model met in the book.
gibbscap2(nsimu, z)
nsimu | number of simulation steps in the Gibbs sampler
z | data, capture history of each individual, with 0 denoting a year without capture
p | Gibbs sample of capture probabilities across time
phi | Gibbs sample of survival probabilities across time and locations
psi | Gibbs sample of movement probabilities between locations across time
late | Gibbs averages of completed histories
data(eurodip)
res=gibbscap2(10,eurodip[1:100,])
plot(res$p,type="l",col="steelblue3",xlab="iterations",ylab="p")
This function implements a Gibbs sampler for a toy mixture problem (Chapter 6) with two Gaussian components and only the means unknown, so that likelihood and posterior surfaces can be drawn.
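For intuition, here is a hedged sketch of one sweep of such a sampler, assuming unit component variances and N(0, 100) priors on the two means (assumptions for illustration, not necessarily the package's exact prior):

gibbs_sweep <- function(mu, datha, p) {
  # completion step: allocate each observation to one of the two components
  w1 <- p * dnorm(datha, mu[1])
  w2 <- (1 - p) * dnorm(datha, mu[2])
  z <- rbinom(length(datha), 1, w1 / (w1 + w2))
  # update each mean from its Gaussian full conditional
  for (j in 1:2) {
    sel <- if (j == 1) z == 1 else z == 0
    v <- 1 / (sum(sel) + 1 / 100)      # posterior variance under the N(0,100) prior
    mu[j] <- rnorm(1, v * sum(datha[sel]), sqrt(v))
  }
  mu
}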
gibbsmean(p, datha, niter = 10^4)
p | first component weight
datha | dataset to be modelled as a mixture
niter | number of Gibbs iterations
Sample of (mu1, mu2)'s as a matrix of size niter x 2.
dat=plotmix(plottin=FALSE)$sample
simu=gibbsmean(0.7,dat,niter=100)
plot(simu,pch=19,cex=.5,col="sienna",xlab=expression(mu[1]),ylab=expression(mu[2]))
This function implements the generic Gibbs sampler of Diebolt and Robert (1994)
for producing a sample from the posterior distribution associated
with a univariate mixture of normal components with all
parameters
unknown.
gibbsnorm(niter, dat, mix)
niter | number of iterations in the Gibbs sampler
dat | mixture sample
mix | list containing k, the number of components, and starting values mu and sig
Under conjugate priors on the means (normal distributions), variances (inverse gamma distributions), and weights (Dirichlet distribution), the full conditional distributions given the latent variables are directly available and can be used in a straightforward Gibbs sampler. This function is only the first stage of the function gibbs, but it may be much faster as it avoids the computation of the evidence via Chib's approach.
k | number of components (superfluous)
mu | Gibbs sample of all mean parameters
sig | Gibbs sample of all variance parameters
p | Gibbs sample of all weight parameters
lopost | sequence of log-likelihood values along Gibbs iterations
Chib, S. (1995) Marginal likelihood from the Gibbs output. J. American Statist. Assoc. 90, 1313-1321.
Diebolt, J. and Robert, C.P. (1994) Estimation of finite mixture distributions through Bayesian sampling. J. Royal Statist. Society Series B 56, 363-375.
data(datha)
datha=as.matrix(datha)
mix=list(k=3,mu=mean(datha),sig=var(datha))
res=gibbsnorm(10,datha,mix)
plot(res$p[,1],type="l",col="steelblue3",xlab="iterations",ylab="p")
Under the assumption that the posterior distribution is well-defined, this Metropolis-Hastings algorithm produces a sample from the posterior distribution of the logit model coefficients beta under a flat prior.
hmflatlogit(niter, y, X, scale)
niter | number of iterations
y | binary response variable
X | matrix of covariates with the same number of rows as y
scale | scale of the Metropolis-Hastings random walk
The function produces a sample of beta's as a matrix of size niter x p, where p is the number of covariates.
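The following minimal random-walk sketch (an illustration, not the package source; the name rwmh_flatlogit, the isotropic proposal, and the zero initialisation are assumptions) shows the structure such a sampler can take; under the flat prior the target reduces to the likelihood computed by logitll:

rwmh_flatlogit <- function(niter, y, X, scale) {
  p <- ncol(X)
  beta <- matrix(0, niter, p)
  cur <- rep(0, p)
  curll <- logitll(cur, y, X)              # flat prior: target = likelihood
  for (t in 1:niter) {
    prop <- cur + scale * rnorm(p)         # random-walk proposal
    propll <- logitll(prop, y, X)
    if (log(runif(1)) < propll - curll) {  # Metropolis acceptance
      cur <- prop; curll <- propll
    }
    beta[t, ] <- cur
  }
  beta
}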
data(bank)
bank=as.matrix(bank)
y=bank[,5]
X=bank[,1:4]
flatlogit=hmflatlogit(1000,y,X,1)
par(mfrow=c(1,3),mar=1+c(1.5,1.5,1.5,1.5))
plot(flatlogit[,1],type="l",xlab="Iterations",ylab=expression(beta[1]))
hist(flatlogit[101:1000,1],nclass=50,prob=TRUE,main="",xlab=expression(beta[1]))
acf(flatlogit[101:1000,1],lag=10,main="",ylab="Autocorrelation",ci=FALSE)
This version of hmflatlogit
operates on the log-linear model, assuming
that the posterior associated with the flat prior and the data
is well-defined. The proposal is based on a random walk
Metropolis-Hastings step.
hmflatloglin(niter, y, X, scale)
niter | number of iterations
y | response variable (vector of counts)
X | matrix of covariates with the same number of rows as y
scale | scale of the Metropolis-Hastings random walk
The function produces a sample of beta's as a matrix of size niter x p, where p is the number of covariates.
airqual=na.omit(airquality)
ozone=cut(airqual$Ozone,c(min(airqual$Ozone),median(airqual$Ozone),max(airqual$Ozone)),
  include.lowest=TRUE)
month=as.factor(airqual$Month)
tempe=cut(airqual$Temp,c(min(airqual$Temp),median(airqual$Temp),max(airqual$Temp)),
  include.lowest=TRUE)
counts=table(ozone,tempe,month)
counts=as.vector(counts)
ozo=gl(2,1,20)
temp=gl(2,2,20)
mon=gl(5,4,20)
x1=rep(1,20)
lulu=rep(0,20)
x2=x3=x4=x5=x6=x7=x8=x9=lulu
x2[ozo==2]=x3[temp==2]=x4[mon==2]=x5[mon==3]=x6[mon==4]=1
x7[mon==5]=x8[ozo==2 & temp==2]=x9[ozo==2 & mon==2]=1
x10=x11=x12=x13=x14=x15=x16=lulu
x10[ozo==2 & mon==3]=x11[ozo==2 & mon==4]=x12[ozo==2 & mon==5]=1
x13[temp==2 & mon==2]=x14[temp==2 & mon==3]=x15[temp==2 & mon==4]=1
x16[temp==2 & mon==5]=1
X=cbind(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16)
flatloglin=hmflatloglin(1000,counts,X,0.5)
par(mfrow=c(4,4),mar=1+c(1.5,1.5,1.5,1.5),cex=0.8)
for (i in 1:16) plot(flatloglin[,i],type="l",ylab="",xlab="Iterations")
This random walk Metropolis-Hastings algorithm takes advantage of the
availability of the maximum likelihood estimator (available via the glm
function) to center and scale the random walk in an efficient manner.
hmflatprobit(niter, y, X, scale)
niter | number of iterations
y | binary response variable
X | covariates
scale | scale of the random walk
The function produces a sample of beta's of size niter.
data(bank)
bank=as.matrix(bank)
y=bank[,5]
X=bank[,1:4]
flatprobit=hmflatprobit(1000,y,X,1)
mean(flatprobit[101:1000,1])
This function implements a Metropolis within Gibbs algorithm that produces a sample of the parameters of the hidden Markov model of Chapter 7. It includes a function likej that computes the likelihood of the time series using a forward-backward algorithm.
hmhmm(M = 100, y)
M | number of Gibbs iterations
y | time series to be modelled by a hidden Markov model
The Metropolis-within-Gibbs step involves Dirichlet proposals with a random choice of the scale between 1 and 1e5.
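A hedged sketch of one such proposal step, assuming the target is available through a generic log-likelihood function loglike (a stand-in, not a package function) and using rdirichlet from this package:

# Log-density of a Dirichlet(par) vector, needed for the Metropolis-Hastings ratio
ldirichlet <- function(x, par)
  sum((par - 1) * log(x)) + lgamma(sum(par)) - sum(lgamma(par))

# Propose a probability vector from a Dirichlet centred at the current one,
# with a random scale between 1 and 1e5, then accept or reject
propose_prob <- function(p, loglike) {
  scale <- 10^runif(1, 0, 5)
  prop <- rdirichlet(1, par = scale * p)[1, ]
  lratio <- loglike(prop) - loglike(p) +
    ldirichlet(p, scale * prop) - ldirichlet(prop, scale * p)
  if (log(runif(1)) < lratio) prop else p
}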
BigR | matrix of the iterated values of the model parameters returned by the MCMC algorithm
olike | sequence of the log-likelihoods produced by the MCMC sequence
res=hmhmm(M=500,y=sample(1:4,10,rep=TRUE))
plot(res$olike,type="l",main="log-likelihood",xlab="iterations",ylab="")
This function provides another toy illustration of the capabilities of a tempered random walk Metropolis-Hastings algorithm applied to the posterior distribution associated with a two-component normal mixture with only its means unknown (Chapter 7). It shows how a decrease in the temperature leads to a proper exploration of the target density surface, despite the existence of two well-separated modes.
hmmeantemp(dat, niter, var = 1, alpha = 1)
dat | dataset to be modelled as a mixture
niter | number of iterations
var | variance of the random walk
alpha | temperature, expressed as power of the likelihood
When alpha=1, the function operates (and can be used) as a regular Metropolis-Hastings algorithm.
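A hedged sketch of one tempered random-walk step as described: the likelihood enters the Metropolis ratio raised to the power alpha, flattening the target when alpha < 1. Here loglike is a stand-in for the mixture log-likelihood in the means, not a package function.

temper_step <- function(mu, loglike, var = 1, alpha = 1) {
  prop <- mu + rnorm(2, sd = sqrt(var))                # random-walk proposal on (mu1, mu2)
  if (log(runif(1)) < alpha * (loglike(prop) - loglike(mu))) prop else mu
}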
Sample of (mu1, mu2)'s as a matrix of size niter x 2.
dat=plotmix(plottin=FALSE)$sample
simu=hmmeantemp(dat,1000)
plot(simu,pch=19,cex=.5,col="sienna",xlab=expression(mu[1]),ylab=expression(mu[2]))
This function runs a Metropolis-Hastings algorithm that produces a sample from the posterior distribution of the logit model coefficients beta (Chapter 4), associated with a noninformative prior defined in the book.
hmnoinflogit(niter, y, X, scale)
niter | number of iterations
y | binary response variable
X | matrix of covariates with the same number of rows as y
scale | scale of the random walk
Sample of beta's as a matrix of size niter x p, where p is the number of covariates.
data(bank)
bank=as.matrix(bank)
y=bank[,5]
X=bank[,1:4]
noinflogit=hmnoinflogit(1000,y,X,1)
par(mfrow=c(1,3),mar=1+c(1.5,1.5,1.5,1.5))
plot(noinflogit[,1],type="l",xlab="Iterations",ylab=expression(beta[1]))
hist(noinflogit[101:1000,1],nclass=50,prob=TRUE,main="",xlab=expression(beta[1]))
acf(noinflogit[101:1000,1],lag=10,main="",ylab="Autocorrelation",ci=FALSE)
This function is a version of hmnoinflogit
for the log-linear model, using a non-informative
prior defined in Chapter 4 and a proposal based on a random walk Metropolis-Hastings step.
hmnoinfloglin(niter, y, X, scale)
niter | number of iterations
y | response variable (vector of counts)
X | matrix of covariates with the same number of rows as y
scale | scale of the random walk
The function produces a sample of beta's as a matrix of size niter x p, where p is the number of covariates.
airqual=na.omit(airquality)
ozone=cut(airqual$Ozone,c(min(airqual$Ozone),median(airqual$Ozone),max(airqual$Ozone)),
  include.lowest=TRUE)
month=as.factor(airqual$Month)
tempe=cut(airqual$Temp,c(min(airqual$Temp),median(airqual$Temp),max(airqual$Temp)),
  include.lowest=TRUE)
counts=table(ozone,tempe,month)
counts=as.vector(counts)
ozo=gl(2,1,20)
temp=gl(2,2,20)
mon=gl(5,4,20)
x1=rep(1,20)
lulu=rep(0,20)
x2=x3=x4=x5=x6=x7=x8=x9=lulu
x2[ozo==2]=x3[temp==2]=x4[mon==2]=x5[mon==3]=1
x6[mon==4]=x7[mon==5]=x8[ozo==2 & temp==2]=x9[ozo==2 & mon==2]=1
x10=x11=x12=x13=x14=x15=x16=lulu
x10[ozo==2 & mon==3]=x11[ozo==2 & mon==4]=x12[ozo==2 & mon==5]=x13[temp==2 & mon==2]=1
x14[temp==2 & mon==3]=x15[temp==2 & mon==4]=x16[temp==2 & mon==5]=1
X=cbind(x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16)
noinloglin=hmnoinfloglin(1000,counts,X,0.5)
par(mfrow=c(4,4),mar=1+c(1.5,1.5,1.5,1.5),cex=0.8)
for (i in 1:16) plot(noinloglin[,i],type="l",ylab="",xlab="Iterations")
This function runs a Metropolis-Hastings algorithm that produces a sample from the posterior distribution of the probit model coefficients beta, associated with a noninformative prior defined in Chapter 4.
hmnoinfprobit(niter, y, X, scale)
niter | number of iterations
y | binary response variable
X | matrix of covariates with the same number of rows as y
scale | scale of the random walk
The function produces a sample of beta's as a matrix of size niter x p, where p is the number of covariates.
data(bank)
bank=as.matrix(bank)
y=bank[,5]
X=bank[,1:4]
noinfprobit=hmnoinfprobit(1000,y,X,1)
par(mfrow=c(1,3),mar=1+c(1.5,1.5,1.5,1.5))
plot(noinfprobit[,1],type="l",xlab="Iterations",ylab=expression(beta[1]))
hist(noinfprobit[101:1000,1],nclass=50,prob=TRUE,main="",xlab=expression(beta[1]))
acf(noinfprobit[101:1000,1],lag=10,main="",ylab="Autocorrelation",ci=FALSE)
This is the Metropolis-Hastings version of the original Gibbs algorithm on the Ising model (Chapter 8). Its basic step only proposes changes of values at selected pixels, avoiding the inefficient updates that do not modify the current value of x.
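For intuition, here is a hedged sketch of a single-site move of this kind, using the package's xneig4 neighbour count (the function name flip_step is hypothetical; the 0/1 coding follows the value description below, and the acceptance ratio uses the Ising full conditionals):

flip_step <- function(x, a, b, beta) {
  cur <- x[a, b]
  prop <- 1 - cur                                   # always propose the other value
  dn <- xneig4(x, a, b, prop) - xneig4(x, a, b, cur)
  if (log(runif(1)) < beta * dn) x[a, b] <- prop    # Metropolis acceptance
  x
}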
isinghm(niter, n, m=n, beta)
niter | number of iterations of the algorithm
n | number of rows in the grid
m | number of columns in the grid
beta | Ising parameter
x, a realisation from the Ising distribution, as an n x m matrix of 0's and 1's.
prepa=runif(1,0,2)
prop=isinghm(10,24,24,prepa)
image(1:24,1:24,prop)
This is the original Geman and Geman (1984) Gibbs sampler on the Ising model that gave its name to the method. It simulates an n x m grid from the Ising distribution.
isingibbs(niter, n, m=n, beta)
niter | number of iterations of the algorithm
n | number of rows in the grid
m | number of columns in the grid
beta | Ising parameter
x, a realisation from the Ising distribution, as a matrix of size n x m.
Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6, 721–741.
image(1:20,1:20,isingibbs(10,20,20,beta=0.3))
This dataset depicts the presence of plants (tufted sedges) in a part of a wetland. It is a 25x25 matrix of zeroes and ones, used in Chapter 8.
data(Laichedata)
A data frame corresponding to a 25x25 matrix of zeroes and ones.
data(Laichedata)
image(as.matrix(Laichedata))
Direct computation of the logarithm of the likelihood of a standard logit model (Chapter 4).
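A sketch consistent with the standard logit likelihood (an illustration, not necessarily the package's exact code; the name logitll_sketch is hypothetical):

# log L(beta) = sum_i [ y_i * eta_i - log(1 + exp(eta_i)) ] with eta = X beta
logitll_sketch <- function(beta, y, X) {
  eta <- as.vector(X %*% beta)
  sum(y * eta - log1p(exp(eta)))
}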
logitll(beta, y, X)
beta | coefficient of the logit model
y | vector of binary response variables
X | covariate matrix
Returns the logarithm of the logit likelihood for the data y, covariate matrix X and parameter vector beta.
data(bank)
y=bank[,5]
X=as.matrix(bank[,-5])
logitll(runif(4),y,X)
This function computes the logarithm of the posterior density associated with a logit model and the noninformative prior used in Chapter 4.
logitnoinflpost(beta, y, X)
beta | parameter of the logit model
y | binary response variable
X | covariate matrix
Returns the logarithm of the logit posterior density for the data y, covariate matrix X and parameter beta.
data(bank)
y=bank[,5]
X=as.matrix(bank[,-5])
logitnoinflpost(runif(4),y,X)
This function provides a direct computation of the logarithm of the likelihood of a standard log-linear model, as defined in Chapter 4.
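A sketch of that computation under the Poisson log-linear model (an illustration; the name loglinll_sketch is hypothetical and the package may handle normalising constants differently):

# log L(beta) = sum_i log dPois(y_i; exp(x_i' beta))
loglinll_sketch <- function(beta, y, X) {
  lambda <- exp(as.vector(X %*% beta))
  sum(dpois(y, lambda, log = TRUE))
}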
loglinll(beta, y, X)
beta | coefficient of the log-linear model
y | vector of observed counts
X | covariate matrix
Returns the logarithmic value of the log-linear likelihood for the data y, covariate matrix X and parameter vector beta.
X=matrix(rnorm(20*3),ncol=3)
beta=c(3,-2,1)
y=rpois(20,exp(X%*%beta))
loglinll(beta, y, X)
This function computes the logarithm of the posterior density associated with a log-linear model and the noninformative prior used in Chapter 4.
loglinnoinflpost(beta, y, X)
beta | parameter of the log-linear model
y | response variable (vector of counts)
X | covariate matrix
This function does not test for coherence between the lengths of y, X and beta, hence may return an error message in case of incoherence.
Returns the logarithm of the log-linear posterior density for the data y, covariate matrix X and parameter vector beta.
X=matrix(rnorm(20*3),ncol=3)
beta=c(3,-2,1)
y=rpois(20,exp(X%*%beta))
loglinnoinflpost(beta, y, X)
This function returns the numerical value of the log-likelihood associated with a time series and an MA(p) model in Chapter 7. It either uses the natural parameterisation of the MA(p) model or the parameterisation via the roots of the associated lag-polynomial.
MAllog(p, dat, pr, pc, lr, lc, mu, sig2, compsi=T, pepsi=rep(0,p), eps=rnorm(p))
p | order of the MA model
dat | time series modelled by the MA(p) model
pr | number of real roots in the lag polynomial
pc | number of complex roots in the lag polynomial, necessarily even
lr | real roots
lc | complex roots, stored as real part for odd indices and imaginary part for even indices
mu | drift parameter
sig2 | variance of the Gaussian white noise
compsi | boolean variable indicating whether the coefficients of the MA(p) model are to be computed from the roots
pepsi | potential coefficients of the lag-polynomial
eps | white noise terms
ll | value of the log-likelihood
ps | vector of the coefficients of the lag-polynomial
MAllog(p=3,dat=faithful[,1],pr=3,pc=0,lr=rep(.1,3),lc=0,
  mu=0,sig2=var(faithful[,1]),compsi=FALSE,pepsi=rep(.1,3),eps=rnorm(3))
This function implements a Metropolis–Hastings algorithm on the coefficients of the MA(p) model, involving the simulation of the real and complex roots of the model. The algorithm includes jumps between adjacent numbers of real and complex roots, as well as random modifications for a given number of real and complex roots. It is thus a special case of a reversible jump MCMC algorithm (Green, 1995).
MAmh(x, p = 1, W = 10^3)
x | time series to be modelled as an MA(p) model
p | order of the MA(p) model
W | number of iterations
psis | matrix of simulated coefficients of the MA(p) model
mus | vector of simulated drift coefficients mu
sigs | vector of simulated variances sigma^2
llik | vector of corresponding log-likelihood values (useful to check for convergence)
pcomp | vector of simulated numbers of complex roots
Green, P.J. (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82, 711–732.
data(Eurostoxx50)
x=Eurostoxx50[1:350, 5]
resMA5=MAmh(x=x,p=5,W=50)
plot(resMA5$mus,type="l",col="steelblue4",xlab="Iterations",ylab=expression(mu))
This dataset is a 100x100 pixel satellite image of the lake of Menteith, near Stirling, Scotland. The purpose of analyzing this satellite dataset is to classify all pixels into one of six states in order to detect some homogeneous regions.
data(Menteith)
data frame of a 100 x 100 image with 106 grey levels
data(Menteith)
image(1:100,1:100,as.matrix(Menteith),col=gray(256:1/256),xlab="",ylab="")
This function computes the posterior probabilities of all (for less than 15 covariates) or the most probable (for more than 15 covariates) submodels obtained by eliminating some covariates.
ModChoBayesReg(y, X, g = length(y), betatilde = rep(0, dim(X)[2]), niter = 1e+05, prt = TRUE)
y | response variable
X | covariate matrix
g | constant g of Zellner's g-prior
betatilde | prior expectation of the regression coefficient beta
niter | number of Gibbs iterations in the case there are more than 15 covariates
prt | boolean variable for printing the standard output
When using a conjugate prior for the linear model such as Zellner's g-prior, the marginal likelihood and hence the evidence are available in closed form. If the number of explanatory variables is less than 15, the exact derivation of the posterior probabilities for all submodels can be undertaken: with at most 2^15 = 32768 submodels to consider, the problem remains tractable. When the number of explanatory variables gets larger, a random exploration of the collection of submodels becomes necessary, as explained in the book (Chapter 3). The proposal to change one variable indicator is made at random and acceptance of this move follows from a Metropolis-Hastings step, as in the sketch below.
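A hedged sketch of that random exploration, assuming a user-supplied log marginal likelihood lmarg(gamma) for an inclusion-indicator vector gamma (a stand-in, not a package function):

submodel_mh <- function(lmarg, p, niter) {
  gamma <- rep(TRUE, p)                      # start from the full model
  cur <- lmarg(gamma)
  visits <- matrix(NA, niter, p)
  for (t in 1:niter) {
    j <- sample(p, 1)                        # pick one inclusion indicator at random
    prop <- gamma; prop[j] <- !prop[j]       # flip it
    new <- lmarg(prop)
    if (log(runif(1)) < new - cur) { gamma <- prop; cur <- new }
    visits[t, ] <- gamma
  }
  visits   # visit frequencies estimate the posterior model probabilities
}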
top10models | models with the ten largest posterior probabilities
postprobtop10 | posterior probabilities of those ten most likely models
data(caterpillar)
y=log(caterpillar$y)
X=as.matrix(caterpillar[,1:8])
res2=ModChoBayesReg(y,X)
This dataset is used as "the" normal dataset in Chapter 2. It is linked with the famous Michelson-Morley experiment that opened the way to Einstein's relativity theory in 1887. It corresponds to the more precise experiment of Illingworth in 1927. The data points are measurements of differences in the speeds of two light beams travelling the same distance in two orthogonal directions.
data(normaldata)
A data frame with 64 observations on the following 2 variables.
x1
index of the experiment
x2
averaged fringe displacement in the experiment
The 64 data points in this dataset are associated with session numbers, corresponding to two different times of the day, and they represent the averaged fringe displacement due to orientation taken over ten measurements made by Illingworth, who assumed a normal error model.
data(normaldata)
shift=matrix(normaldata,ncol=2,byrow=TRUE)[,2]
hist(shift[[1]],nclass=10,col="steelblue",prob=TRUE,main="")
This function provides an estimation of the number of dippers by a posterior expectation,
based on a uniform prior and the eurodip
dataset, as described in Chapter 5.
pbino(nplus)
nplus | number of different dippers captured
Returns the vector of posterior probabilities of the population size, whose expectation provides an estimate of the number of dippers in the population (see the example).
data(eurodip)
year81=eurodip[,1]
nplus=sum(year81>0)
sum((1:400)*pbino(nplus))
This function computes the posterior distribution of the population size for a multiple-stage capture-recapture experiment (Chapter 5) under a uniform prior on the range (0,400); the posterior expectation follows as in the example.
pcapture(T, nplus, nc)
T | number of experiments
nplus | total number of captured animals
nc | total number of captures
This analysis is based on the restrictive assumption that all dippers captured in the second year were already present in the population during the first year.
Vector of posterior probabilities of the population size, from which the posterior expectation is derived as in the example.
sum((1:400)*pcapture(2,70,81))
This function computes the posterior distribution of the population size for a two-stage Darroch capture-recapture experiment (Chapter 5) under a uniform prior on the range (0,400); the posterior expectation follows as in the example.
pdarroch(n1, n2, m2)
n1 | size of the first capture experiment
n2 | size of the second capture experiment
m2 | number of recaptured individuals
This model can be seen as a conditional version of the two-stage model when conditioning on both sample sizes n1 and n2.
Vector of posterior probabilities of the population size, from which the posterior expectation is derived as in the example.
for (i in 6:16) print(round(sum(pdarroch(22,43,i)*1:400)))
This function gives an image representation of the log-likelihood surface of a mixture (Chapter 6) of two normal densities with both means mu1 and mu2 unknown. It first generates the random sample associated with the distribution.
plotmix(mu1 = 2.5, mu2 = 0, p = 0.7, n = 500, plottin = TRUE, nl = 50)
mu1 | first mean
mu2 | second mean
p | weight of the first component
n | number of observations
plottin | boolean variable to plot the surface (or not)
nl | number of contours
In this case, the parameters are identifiable: mu1 and mu2 cannot be confused when p is not 0.5. Nonetheless, the log-likelihood surface in this figure often exhibits two modes, one close to the true value of the parameters used to simulate the dataset and one corresponding to a reflected separation of the dataset into two homogeneous groups.
sample | the simulated sample
like | the discretised representation of the log-likelihood surface
resumix=plotmix()
This function produces one simulation of a square numb by numb grid from a Potts distribution with four colours and a four-neighbour structure, relying on niter iterations of a standard Gibbs sampler.
pottsgibbs(niter, numb, beta)
niter | number of Gibbs iterations
numb | size of the square grid
beta | parameter of the Potts model
returns a random realisation from the Potts model
Geman, S. and Geman, D. (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell., 6, 721–741.
ex=pottsgibbs(100,15,.4)
image(ex)
This function returns a simulation of an n by m grid from a Potts distribution with ncol colours and a four-neighbour structure, using a Metropolis-Hastings step that avoids proposing a value identical to the current state of the Markov chain.
pottshm(ncol=2, niter=10^4, n, m=n, beta=0)
ncol | number of colors
niter | number of Metropolis-Hastings iterations
n | number of rows in the image
m | number of columns in the image
beta | parameter of the Potts model
returns a random realisation from the Potts model
ex=pottshm(niter=50,n=15,beta=.4)
hist(ex,prob=TRUE,col="steelblue",main="pottshm()")
This function computes the coverage of the interval (a, b) by a Beta prior distribution with mean c and scale parameter alpha.
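A hedged sketch of that computation, under the assumption (suggested by the solbeta entry below, but not confirmed by this page) that the Beta distribution with mean c is parameterised as Beta(alpha, alpha*(1-c)/c):

probet_sketch <- function(a, b, c, alpha) {
  beta <- alpha * (1 - c) / c            # enforces mean alpha/(alpha+beta) = c
  pbeta(b, alpha, beta) - pbeta(a, alpha, beta)
}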
probet(a, b, c, alpha)
a | lower bound of the prior 95% confidence interval
b | upper bound of the prior 95% confidence interval
c | mean parameter of the prior distribution
alpha | scale parameter of the prior distribution
numerical value between 0 and 1 corresponding to the coverage
probet(.1,.5,.3,2)
This function implements a direct computation of the logarithm of the likelihood of a standard probit model.
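A sketch consistent with the standard probit likelihood (an illustration, not necessarily the package's exact code; the name probitll_sketch is hypothetical):

# log L(beta) = sum_i [ y_i log Phi(eta_i) + (1 - y_i) log Phi(-eta_i) ]
probitll_sketch <- function(beta, y, X) {
  eta <- as.vector(X %*% beta)
  sum(y * pnorm(eta, log.p = TRUE) + (1 - y) * pnorm(-eta, log.p = TRUE))
}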
probitll(beta, y, X)
beta | coefficient of the probit model
y | vector of binary response variables
X | covariate matrix
Returns the logarithm of the probit likelihood for the data y, covariate matrix X and parameter vector beta.
data(bank)
y=bank[,5]
X=as.matrix(bank[,-5])
probitll(runif(4),y,X)
This function computes the logarithm of the posterior density associated with a probit model and the non-informative prior used in Chapter 4.
probitnoinflpost(beta, y, X)
beta | parameter of the probit model
y | binary response variable
X | covariate matrix
Returns the logarithm of the posterior density associated with a probit model for the data y, covariate matrix X and parameter beta.
data(bank)
y=bank[,5]
X=as.matrix(bank[,-5])
probitnoinflpost(runif(4),y,X)
This function simulates a sample from a Dirichlet distribution on the k-dimensional simplex with arbitrary parameters. The simulation is based on a renormalised vector of gamma variates.
rdirichlet(n = 1, par = rep(1, 2))
n | number of simulations
par | parameters of the Dirichlet distribution, whose length determines the value of k
Surprisingly, there is no default Dirichlet distribution generator in the R base packages like MASS or stats. This function can be used in full generality, beyond the examples of the book (Chapter 6).
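A sketch of the gamma-renormalisation construction mentioned above (illustration only; the name rdirichlet_sketch is hypothetical and the package function may differ in details):

rdirichlet_sketch <- function(n = 1, par = rep(1, 2)) {
  k <- length(par)
  # one gamma variate per coordinate, with shape par[j]; rows are then renormalised
  g <- matrix(rgamma(n * k, shape = rep(par, each = n)), nrow = n)
  g / rowSums(g)
}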
returns a matrix of Dirichlet simulations
apply(rdirichlet(10,rep(3,5)),2,mean)
This function addresses the reconstruction of an image distributed from a Potts model, based on a noisy version of this image. The purpose of image segmentation (Chapter 8) is to cluster pixels into homogeneous classes without supervision or preliminary definition of those classes, based only on the spatial coherence of the structure. The underlying algorithm is a hybrid Gibbs sampler.
reconstruct(niter, y)
niter | number of Gibbs iterations
y | blurred image defined as a matrix
Using a Potts model on the true image and uniform priors on the genuine parameters of the model, the hybrid Gibbs sampler generates the image pixels and the other parameters one at a time, the hybrid stage being due to the Potts model parameter, whose update involves a numerical integration via integrate. The code bypasses this extremely time-consuming numerical integration thanks to the vector dali, which stores the values of the integral over a 21-point grid.
beta | MCMC chain for the Potts parameter beta
mu | MCMC chain for the mean parameter of the blurring model
sigma | MCMC chain for the variance parameter of the blurring model
xcum | frequencies of simulated colours at every pixel of the image
## Not run:
data(Menteith)
lm3=as.matrix(Menteith)
# warning, this step is a bit lengthy
titus=reconstruct(20,lm3)
# allocation function
affect=function(u) order(u)[6]
aff=apply(titus$xcum,1,affect)
aff=t(matrix(aff,100,100))
par(mfrow=c(2,1))
image(1:100,1:100,lm3,col=gray(256:1/256),xlab="",ylab="")
image(1:100,1:100,aff,col=gray(6:1/6),xlab="",ylab="")
## End(Not run)
In the capture-recapture experiment of Chapter 5, the prior information
is represented by a prior expectation and prior confidence intervals. This
function derives the corresponding beta
prior distribution by a divide-and-conquer scheme.
solbeta(a, b, c, prec = 10^(-3))
a | lower bound of the prior 95% confidence interval
b | upper bound of the prior 95% confidence interval
c | mean of the prior distribution
prec | maximal precision on the beta coefficient
Since the mean c of the beta distribution is known, there is a single free parameter alpha to determine, since beta = alpha*(1-c)/c. The function solbeta searches for the corresponding value of alpha, starting with a coarse precision and stopping at the requested precision prec.
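A hedged bisection variant of that search (the package proceeds by successive refinements; this sketch assumes the 95% coverage criterion and that coverage of a fixed interval increases with alpha):

solbeta_sketch <- function(a, b, c, prec = 1e-3) {
  cover <- function(alpha) {
    beta <- alpha * (1 - c) / c          # keeps the prior mean equal to c
    pbeta(b, alpha, beta) - pbeta(a, alpha, beta)
  }
  lo <- prec; hi <- 100                  # bisect on alpha
  while (hi - lo > prec) {
    mid <- (lo + hi) / 2
    if (cover(mid) < 0.95) lo <- mid else hi <- mid
  }
  alpha <- (lo + hi) / 2
  c(alpha = alpha, beta = alpha * (1 - c) / c)
}
solbeta_sketch(.1, .5, .3)               # compare with solbeta(.1,.5,.3,10^(-4))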
alpha | first coefficient of the beta distribution
beta | second coefficient of the beta distribution
solbeta(.1,.5,.3,10^(-4))
This function implements a path sampling approximation of the normalising constant of an Ising model with a four neighbour relation.
sumising(niter = 10^3, numb, beta)
niter | number of iterations
numb | size of the square grid for the Ising model
beta | Ising model parameter
Returns the path sampling approximation to the log-normalising constant for the given beta; in the example below, it is computed over a regular sequence of 21 values of beta between 0 and 2.
Z=seq(0,2,length=21)
for (i in 1:21) Z[i]=sumising(5,numb=24,beta=Z[i])
lrcst=approxfun(seq(0,2,length=21),Z)
plot(seq(0,2,length=21),Z,xlab="",ylab="")
curve(lrcst,0,2,add=TRUE)
This function is used in ardipper to determine the bound for the accept-reject algorithm simulating the non-standard conditional distribution of r1.
thresh(k, n1, c2, c3, r2, q1)
k | current proposal for the number of individuals vanishing between the first and second experiments
n1 | first capture population size
c2 | number of individuals recaptured during the second experiment
c3 | number of individuals recaptured during the third experiment
r2 | number of individuals vanishing between the second and third experiments
q1 | probability of disappearing from the population
The closed-form expression of this upper bound is given in Chapter 5 of the book. The function returns the numerical value of the upper bound, to be compared with the uniform random draw of the accept-reject step, as in
## Not run: if (runif(1) < thresh(y,n1,c2,c3,r2,q1))
This is a plain random generator for a normal variate truncated to the interval (a, b), using the inverse cdf qnorm. It may thus be imprecise for extreme values of the bounds.
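A sketch of the inverse-cdf construction just described (an illustration; the package's handling of edge cases may differ, and the name truncnorm_sketch is hypothetical):

truncnorm_sketch <- function(n, mu, tau2, a, b) {
  tau <- sqrt(tau2)
  pa <- pnorm((a - mu) / tau)            # cdf at the lower bound
  pb <- pnorm((b - mu) / tau)            # cdf at the upper bound
  mu + tau * qnorm(runif(n, pa, pb))     # invert a uniform on (pa, pb)
}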
truncnorm(n, mu, tau2, a, b)
n | number of simulated variates
mu | mean of the original normal
tau2 | variance of the original normal
a | lower bound
b | upper bound
A sample of size n of real numbers in (a, b).
x=truncnorm(10^3,1,2,3,4)
hist(x,nclass=123,col="wheat",prob=TRUE)
This is a basic function used in simulation algorithms on the Ising and Potts models. It counts how many of the four neighbours of the pixel (a, b) in the grid x share the colour col of this pixel.
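A sketch of that count with boundary checks (an illustration consistent with the description; the boundary conventions of the package are not documented here, and the name xneig4_sketch is hypothetical):

xneig4_sketch <- function(x, a, b, col) {
  x <- as.matrix(x)
  nb <- 0
  if (a > 1)       nb <- nb + (x[a - 1, b] == col)   # up
  if (a < nrow(x)) nb <- nb + (x[a + 1, b] == col)   # down
  if (b > 1)       nb <- nb + (x[a, b - 1] == col)   # left
  if (b < ncol(x)) nb <- nb + (x[a, b + 1] == col)   # right
  nb
}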
xneig4(x, a, b, col)
x | grid of coloured pixels
a | row index
b | column index
col | current or proposed colour
integer between 0 and 4 giving the number of neighbours with the same colour
data(Laichedata)
xneig4(Laichedata,2,3,1)
xneig4(Laichedata,2,3,0)