联系方式

  • QQ:99515681
  • 邮箱:99515681@qq.com
  • 工作时间:8:00-23:00
  • 微信:codinghelp

您当前位置:首页 >> Algorithm 算法作业Algorithm 算法作业

日期:2019-11-24 07:47

Statistical Programming - Individual Project 2

Question 1

MCMC methods scale up directly to higher dimensions. Suppose we have a

distribution that generates two dimensional vectors y = (y1, y2), such as the

bivariate Gaussian distribution. One possible MCMC algorithm is to make

joint proposals for y1 and y2, i.e.

Write a MCMC sampler to sample from the bivariate Gaussian distribution

with mean vector µ = (1, 1) and covariance matrix:Draw 10,000 samples from this distribution

and plot the density of your samples as either a contour plot, heatmap, or

a 2 dimensional density (the base R function persp can be used for this, and the

ggplot2 libary includes functions for heat maps and contour plots. You could

look up the stat density2d() and scale fill gradient() functions). Include your

plot in your submission file.

Question 2

In the lecture I illustrated MCMC with the example of simulating from a

Student-t distribution using a Gaussian random walk proposal. This question

will expore whether this is a sensible idea.

1. Use MCMC to draw 10,000 samples from the Student-t(20) (i.e. with

v = 20) distribution using random walk Metropolis-Hastings and a N(0, 1)

proposal distribution. If these samples are genuine draws from the correct

Student-t distribution, then the variance of the sampled values should be

close to the theoretical variance of the Student-t(20) distribution (up to

some small sampling error). Verify that this is indeed the case. Include

your sample variance in your submission file.

1

2. Use the same procedue to draw samples from the Student-t(3) distribution.

Again compare the variance of your samples to the theoretical variance

of the Student-t(3) (i.e. with v = 3) distribution. Are they equal? If

not, then give an explanation why. Include your sample variance in your

submission file.

Question 3

This question compares frequentist and Bayesian approaches to parameter estimation

and prediction. It will hopefully give you an insight into the difference

between these two ways of viewing statistical inference.

We will focus on modelling time-between-event data, i.e. the time that

elapses between the occurrence of events like hurricanes or earthquakes. For

t ∈ {1, 2, . . . , n+ 1} let zt denote the time at which the z

th event occurred at, so

that yt = zt+1 − zt are the time between events, also known as the inter-event

times.

The most basic statistical model for this sort of data is to assume that the

event times follow a Poisson process, in which case the inter-event times have a

yt ∼ Exponential(λ) distribution. However this model is often not good for real

world data, since it only has one parameter (λ) and hence the mean and variance

of the distribution cannot be modelled separately. An alternative distribution

that is often used is the Weibull distribution, which has two parameters (k, λ)

and density function:

Frequentist Inference

Download the file eventtimes.csv from the course webpage and load it into R.

This is a vector y = (y1, . . . , y100) containing 100 inter-event times which correspond

to the number of weeks which occurred between rainfall days in a

particualr city (e.g. y1 = 2.146 so there were 2.146 weeks between the time of

the first rainfall, and the second). This data is to be modelled with a Weibull

distribution. You are told that the value of λ is λ = 2, and hence you only need

to estimate k.

1. Derive the log-likelihood function of the Weibull distribution with n observations

as a function of k, and implement this in R.

2. Use optim() to find the maximum likelihood estimate of k for the eventtimes.csv

data. Note that we can use optim() to maximise rather than

minimise a function by including the argument control=list(fnscale= -1)

as an argument to this function. Your function call should hence look

something like this:

2

optim(initval, fn, lambda=lambda, y=y, control=list(fnscale = -1))

3. Suppose that it rained today. Compute the probability of there being

more than 1.5 weeks until the next time that rain occurs, assuming that

k is equal to its maximum likelihood estimate (i.e. if ˜y is the number of

weeks until the next rainfall, then compute p(˜y > 1.5|k = ˆk, λ = 10) where

ˆk is the MLE.

Bayesian Inference

In the previous section, we predicted the future by assuming that k was equal to

its maximum likelihood estimate. However our prediction will be misleadingly

confident since it ignore the uncertainty in the estimate of k – the true value of

k will not be exactly equal to the MLE! Bootstrapping can help this to some

degree, since it incorporates uncertainty about k. Another approach is to use a

Bayesian approach where we work with the whole posterior distribution for k.

Again, we suppose that λ = 2 is known.

4. We will use a Gamma(1, 1) prior for k. Write an R function to implement

the resulting log posterior distribution (i.e. the log of the Weibull

likelihood multiplied by a Gamma(1,1) distribution).

When we want to make prediction aobut future event times, the Bayesian

approach is to average over the posterior. In other words given observed data y

(which here consists of the 100 inter-event times), then if ˜y is a future inter-event

time, then we would predict its value using:

p(˜y|y) = Z

p(˜y|k)p(k|y)dk

Since we do not have a conjugate prior, it will not be possible to solve this

analytically. The usual approach is to instead sample values k

(1), k(2), . . . , k(M)

from the posterior p(k|y) and use these to approximate the integral:

p(˜y|y) = Z

So for example for some value z we can write:

Note how this compares to the frequentist approach above: the frequentist predicts

by using p(˜y|

ˆk) where ˆk is the maximum likelihood estimate, whereas the

Bayesian predicts by averaging p(˜y|k

(i)

) over values sampled from the posterior.

3

5. Use MCMC to to draw 10, 000 samples from the posterior distribution

p(k|y) where y is the above data. Based on your samples, report the

posterior mean, median, and standard deviation (note: these should be

close to the MLE!).

6. Using your samples from above, estimate the Bayesian probability that

it will be more than 1.5 weeks until the next time that it rains. (i.e.

p(˜y > 1.5|y)). Quantify the level of uncertainty in this prediction by also

reporting its standard deviation (i.e. the standard deviation of p(˜y >1.5|θi) across the θi

samples.

Laplace Approxmation

An alternative to MCMC which is sometimes used in Bayesian statistics is the

Laplace Approximation. This is often not very good in terms of performance,

but it is quick and simple to implement.

The Laplace approximation is based on the fact that as the number of observations

increases, the posterior distribution will asymptotically converge to

a Normal distribution (Why? Because we know the likelihood function will

asymptotically be Normal, and as the number of data poitns increase, the prior

becomes less and less relevant so the posterior converges to the likelihood). As

such, we can sometimes approximate the posterior distribution by a Normal

distribution.

Specifically, we approximate the posterior by a N(ˆk, σ2k) distribution, whereˆk is the value of k which maximises the posterior, and σ2k

is 1 divided by the

negative of the second derivative of the log-posterior evaluated at ˆk, i.e.σ2k = −1∂2

∂k2 log p(k = ˆk|y)

7. Sample 10,000 values from the Laplace approximation to the Weibull distribution

above, with the Gamma prior (i.e. compute the mean and standard

deviation of the above Normal distribution). Compute the mean,

median and standard deviation of the posterior. Hint: when you are using

optim() to find the posterior maximum, you can add the argument hessian=TRUE

and R will return the second derivative of the log-posterior

evaluated at its maximum value, which is the quantity you need in the

formula for σ2k

above – you do not need to calculate this manually!

8. On the same plot, plot a kernel density estimate (or histogram) of the

10,000 samples you got above using MCMC, and the 10,000 samples you

got using the Laplace approximation (use a differrent colour for each).

Include this plot in your submission file.

4


版权所有:留学生编程辅导网 2020 All Rights Reserved 联系方式:QQ:99515681 微信:codinghelp 电子信箱:99515681@qq.com
免责声明:本站部分内容从网络整理而来,只供参考!如有版权问题可联系本站删除。 站长地图

python代写
微信客服:codinghelp