Midterm Project
Names:
Student IDs:
1 Problem Description
Read the article "Maximum Likelihood Algorithms for Generalized Linear Mixed Models" (McCulloch, 1997) and try to understand the basic concept of the generalized linear mixed model (GLMM). Section 3.1 of the paper describes a Monte Carlo EM (MCEM) method for deriving maximum likelihood estimates (MLEs). Please use software of your choice (Matlab, R, C++, etc.) to perform the following simulation study and answer the related questions.
1.1 Model and Notations
In this project, we consider a clustering problem. Suppose we observe $n$ subjects, each of which is a binary process, i.e. the response $Y_{ij} = 0$ or $1$, $i = 1, \ldots, n$, $j = 1, \ldots, T$. Here $n$ is the number of subjects and $T$ is the length of observation. In general, $T$ might vary across subjects, and the time points may also differ. In this project, however, we simply assume that all subjects share a common time length and common time points. We also assume that these subjects belong to two clusters. For each cluster, the conditional expectation of the
response variable is
$$
\begin{aligned}
P_{ij} &\equiv E(Y_{ij} \mid U_i = 1, X_{1,ij}, Z_{1,i}) = g^{-1}(\beta_1 X_{1,ij} + Z_{1,i}), \\
P_{ij} &\equiv E(Y_{ij} \mid U_i = 2, X_{2,ij}, Z_{2,i}) = g^{-1}(\beta_2 X_{2,ij} + Z_{2,i}),
\end{aligned}
\tag{1}
$$
where $U_i$ is the cluster membership, and $X_{c,ij}$ and $Z_{c,i}$ ($c = 1, 2$) are the fixed and random effects, respectively. The inverse link function $g^{-1}(x) = \frac{\exp(x)}{1 + \exp(x)}$ is given. In a typical clustering problem, $U$ is usually unknown, and hence we treat $U$ as another random effect.
For the random effects, we assume that $Z_{c,i} \sim N(0, \sigma_c^2)$ and $P(U = 1) = \pi_1$ (so that $\pi_2 = 1 - \pi_1$). The parameter to be estimated is then $\theta = \{\beta_1, \beta_2, \sigma_1, \sigma_2, \pi_1\}$. Treating the random effects as
missing data, one can write the complete-data likelihood function as
$$
L(\theta) = \prod_{i=1}^{n} \prod_{c=1}^{2} \left[ \pi_c \, f_c(Z_{c,i}) \prod_{j=1}^{T} f_c(Y_{ij} \mid Z_{c,i}) \right]^{\omega_{ic}},
$$
where $f_c(Z_{c,i})$ is the density function of the $N(0, \sigma_c^2)$ distribution, $f_c(Y_{ij} \mid Z_{c,i}) = P_{ij}^{Y_{ij}} (1 - P_{ij})^{1 - Y_{ij}}$ is the Bernoulli density, and $\omega_{ic}$ is the dummy variable of $U_i$, i.e.
$$
\omega_{ic} =
\begin{cases}
1 & \text{if subject } i \text{ belongs to cluster } c, \\
0 & \text{otherwise.}
\end{cases}
$$
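To make the notation concrete, here is a minimal Python/NumPy sketch of the complete-data log-likelihood above. It is only an illustration under our own array layout, not the required solution; the name `loglik_complete` is hypothetical, and $X$ is stored cluster-specifically with shape $(2, n, T)$ as in equation (1).

```python
import numpy as np
from scipy.stats import norm

def loglik_complete(Y, X, Z, omega, beta, sigma, pi):
    """Complete-data log-likelihood of the two-cluster logistic GLMM.

    Y     : (n, T) binary responses Y_ij
    X     : (2, n, T) fixed effects X_{c,ij}
    Z     : (2, n) random effects Z_{c,i}
    omega : (n, 2) cluster-membership dummies omega_{ic}
    beta, sigma, pi : length-2 arrays of (beta_c, sigma_c, pi_c)
    """
    total = 0.0
    for c in range(2):
        eta = beta[c] * X[c] + Z[c][:, None]          # linear predictor
        # Bernoulli log-density in a numerically stable form:
        # Y*log(P) + (1-Y)*log(1-P) = Y*eta - log(1 + exp(eta))
        ll_i = (Y * eta - np.logaddexp(0.0, eta)).sum(axis=1)
        ll_i += norm.logpdf(Z[c], scale=sigma[c]) + np.log(pi[c])
        total += (omega[:, c] * ll_i).sum()
    return total
```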
1.2 Simulation Setup and Requirement
Generate 100 simulations. In each simulation, set $n = 100$ and $T = 10$. The true parameter values are $\beta_1 = \beta_2 = 1$, $\pi_1 = 0.6$, $\sigma_1 = 2$, and $\sigma_2 = 10$.
Before you start, use $N(0, 1)$ to generate the fixed effects $X$, and use the same $X$ for all 100 simulations. Please follow the paper mentioned above and use MCEM to evaluate the log-likelihood function. In the E-step, run $K = 500$ iterations of Gibbs sampling incorporating a Metropolis-Hastings step, and drop the first 100 draws as burn-in.
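As a starting point, the sketch below (again Python/NumPy, and only one of several valid schemes rather than the exact algorithm in McCulloch, 1997) generates one dataset under the true values above and runs a Metropolis-Hastings-within-Gibbs E-step. It assumes a single $X \sim N(0,1)$ shared by both clusters, and it uses the prior $N(0, \sigma_c^2)$ as the MH proposal so that the acceptance ratio reduces to a ratio of Bernoulli likelihoods; all function names are our own.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 100, 10
beta_true = np.array([1.0, 1.0])
sigma_true = np.array([2.0, 10.0])
pi_true = np.array([0.6, 0.4])

X = rng.standard_normal((n, T))   # fixed effects, reused for all 100 simulations

def bern_loglik(beta_c, z, Y):
    """log f(Y_i | U_i = c, Z_{c,i} = z_i) for all subjects i (stable form)."""
    eta = beta_c * X + z[:, None]
    return (Y * eta - np.logaddexp(0.0, eta)).sum(axis=1)

def simulate(rng):
    """Draw one dataset from the true model."""
    U = 1 + (rng.random(n) >= pi_true[0]).astype(int)   # labels in {1, 2}
    Z = rng.standard_normal(n) * sigma_true[U - 1]      # Z_{c,i} ~ N(0, sigma_c^2)
    P = 1.0 / (1.0 + np.exp(-(beta_true[U - 1][:, None] * X + Z[:, None])))
    return (rng.random((n, T)) < P).astype(int)

def e_step_draws(Y, beta, sigma, pi, K=500, burn=100, rng=rng):
    """MH-within-Gibbs draws of the latent (U, Z) given Y and the current theta."""
    U = rng.integers(1, 3, size=n)          # initial memberships
    Z = np.zeros((2, n))                    # chains for Z_{1,i} and Z_{2,i}
    draws = []
    for k in range(K):
        # (a) Update Z_{c,i}: propose from the prior N(0, sigma_c^2). For members
        #     of cluster c, accept with the Bernoulli likelihood ratio; for
        #     non-members the full conditional *is* the prior, so always accept.
        for c in range(2):
            prop = rng.standard_normal(n) * sigma[c]
            member = (U == c + 1)
            log_acc = bern_loglik(beta[c], prop, Y) - bern_loglik(beta[c], Z[c], Y)
            accept = member & (np.log(rng.random(n)) < log_acc)
            Z[c] = np.where(accept | ~member, prop, Z[c])
        # (b) Update U_i: P(U_i = c | Y, Z) is proportional to pi_c f(Y_i | Z_{c,i}).
        logw = np.stack([np.log(pi[c]) + bern_loglik(beta[c], Z[c], Y)
                         for c in range(2)], axis=1)
        w = np.exp(logw - logw.max(axis=1, keepdims=True))
        p1 = w[:, 0] / w.sum(axis=1)
        U = 1 + (rng.random(n) >= p1).astype(int)
        if k >= burn:
            draws.append((U.copy(), Z.copy()))
    return draws                            # K - burn = 400 retained draws
```

The retained draws can then be plugged into the complete-data log-likelihood and averaged to form the Monte Carlo objective that the M-step maximizes.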
1.3 Your Report
(1) Please write down the form of the Monte Carlo average of the log-likelihood (which you are going to evaluate).
(2) Please write down the details of the EM algorithm you use for this simulation, especially the Metropolis-Hastings steps.
(3) What are your initial values? What is your convergence rule?
(4) How can you accelerate your EM algorithm? What improvement can you observe?
(5) Try different numbers of simulations: 200, 300, ..., 1000, and plot the corresponding MSE.
(6) Write your report in either Chinese or English. Please attach your code to the report.