MATLAB MEX Function Reference
Kalman Filtering Example 2:
Estimating an SSM Using the EM Algorithm

This example estimates the normal SSM of the mink-muskrat data using the EM algorithm. The mink and muskrat series are detrended; refer to Harvey (1991) for details of this data set. Because the EM algorithm uses both filtering and smoothing, this example shows how to use the KALCVF and KALCVS functions together to analyze the data. Consider the bivariate SSM:

y_t & = & H z_t + \epsilon_t \\
z_t & = & F z_{t-1} + \eta_t
where H is the 2 \times 2 identity matrix, the observation noise \epsilon_t has a time-invariant covariance matrix R, and the covariance matrix V of the transition noise \eta_t is also assumed to be time invariant. The initial state z_0 has mean \mu and covariance \Sigma. For estimation, the \Sigma matrix is fixed as
\Sigma = \left[ \begin{array}{cc} 0.1 & 0.0 \\ 0.0 & 0.1 \end{array} \right]
while the mean vector \mu is updated by the smoothing procedure such that \hat{\mu} = z_{0|T}. Note that this estimation requires an extra smoothing step, since the usual fixed-interval smoother produces z_{t|T} only for t = 1, \ldots, T and does not produce z_{0|T}.
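This extra step is one additional backward recursion of the smoother, from t = 1 to t = 0. A minimal sketch (J0, z0T, and P0T are illustrative names), assuming pred and vpred come from KALCVF and sm and vsm from KALCVS, laid out as in the full program below:

% one extra backward step of the fixed-interval smoother, t = 1 -> t = 0;
% pred/vpred are from kalcvf, sm/vsm are from kalcvs (see the program below)
J0  = sigma*F'/vpred(:,:,1);                       % smoothing gain P*(0)
z0T = myu + J0*(sm(:,1) - pred(:,1));              % z(0|T)
P0T = sigma + J0*(vsm(:,:,1) - vpred(:,:,1))*J0';  % P(0|T)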

The EM algorithm maximizes the expected log-likelihood function given the current parameter estimates. In practice, the log-likelihood function of the normal SSM is evaluated while the parameters are updated using the M-step of the EM maximization

F^{i+1} & = & S_t(1) [S_{t-1}(0)]^{-1} \\
V^{i+1} & = & \frac{1}{T} ( S_t(0) - S_t(1) [S_{t-1}(0)]^{-1} S_t(1)' ) \\
R^{i+1} & = & \frac{1}{T} \sum_{t=1}^T [ (y_t - H z_{t|T})(y_t - H z_{t|T})' + H P_{t|T} H' ] \\
\mu^{i+1} & = & z_{0|T}
where the index i represents the current iteration number, and

S_t(0) & = & \sum_{t=1}^T ( P_{t|T} + z_{t|T} z_{t|T}' ) \\
S_{t-1}(0) & = & \sum_{t=1}^T ( P_{t-1|T} + z_{t-1|T} z_{t-1|T}' ) \\
S_t(1) & = & \sum_{t=1}^T ( P_{t,t-1|T} + z_{t|T} z_{t-1|T}' )
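Given these sums, the M-step reduces to a few matrix operations. A minimal sketch (St0, St1, St00, covsum, and T are illustrative names for S_t(0), S_t(1), S_{t-1}(0), the accumulated residual covariance, and the sample size; the full program below uses st0, st1, st00, and cov):

% M-step updates from the accumulated smoothed moments (illustrative names)
F_new  = St1/St00;               % F^{i+1}  = S_t(1)*inv(S_{t-1}(0))
V_new  = (St0 - F_new*St1')/T;   % V^{i+1}  = (S_t(0) - F*S_t(1)')/T
R_new  = covsum/T;               % R^{i+1}  = covsum/T
mu_new = z0T;                    % mu^{i+1} = z(0|T) from the extra smoothing step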
It is necessary to compute the value of P_{t,t-1|T} recursively such that

P_{t-1,t-2|T} = P_{t-1|t-1} P_{t-2}^{*'} + P_{t-1}^{*} ( P_{t,t-1|T} - F P_{t-1|t-1} ) P_{t-2}^{*'}

where P_t^{*} = P_{t|t} F' P_{t+1|t}^{-1} and the initial value P_{T,T-1|T} is derived using the formula

P_{T,T-1|T} = [ I - P_{T|T-1} H' (H P_{T|T-1} H' + R)^{-1} H ] F P_{T-1|T-1}
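In code, this initialization uses the last-period Kalman gain. A minimal sketch (PT_T1, PT1_T1, KT, and PTT1_T are illustrative names), taking P(T|T-1) from the one-step prediction covariances and P(T-1|T-1) from the filtered covariances (vpred and vfilt in the program below):

% initial lag-one covariance P(T,T-1|T) from the Kalman gain at t = T
PT_T1  = vpred(:,:,t);                     % P(T|T-1)
PT1_T1 = vfilt(:,:,t-1);                   % P(T-1|T-1)
KT     = PT_T1*H'/(H*PT_T1*H' + Rt);       % Kalman gain K_T
PTT1_T = (eye(Nz) - KT*H)*F*PT1_T1;        % P(T,T-1|T)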
Note that the initial value of the state vector is updated for each iteration:

z_{1|0} & = & F^i \mu^i \\
P_{1|0} & = & F^i \Sigma F^{i'} + V^i
The objective function value is computed as -2\ell from the log-likelihood value returned by the KALCVF function. The log-likelihood function is written

\ell = -\frac{1}{2} \sum_{t=1}^T \log(|C_t|) - \frac{1}{2} \sum_{t=1}^T (y_t - H z_{t|t-1})' C_t^{-1} (y_t - H z_{t|t-1})

where C_t = H P_{t|t-1} H' + R.
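The same value can also be accumulated directly from the one-step predictions as a check on the KALCVF output. A minimal sketch, assuming pred and vpred come from a KALCVF call with lead 0 as in the program below (the constant (N_y/2) log 2\pi per observation is omitted, matching the formula above):

% direct evaluation of the log likelihood from one-step predictions
% (pred/vpred from kalcvf with lead 0); the 2*pi constant is omitted
ell = 0;
for tt = 1:t
   Ct  = H*vpred(:,:,tt)*H' + Rt;       % C_t = H*P(t|t-1)*H' + R
   e   = y(:,tt) - H*pred(:,tt);        % one-step prediction error
   ell = ell - 0.5*log(det(Ct)) - 0.5*(e'/Ct)*e;
end
% -2*ell is the objective function value tracked by the EM iterations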

The iteration history is shown in Output 1. As shown in Output 2, the eigenvalues of F lie inside the unit circle, which indicates that the SSM is stationary; however, the muskrat series (Y1) is reported to be difference stationary. The estimated parameters are almost identical to the VAR(1) estimates; refer to Harvey (1991, p. 469). Finally, multistep forecasts of y_t are computed using the KALCVF function.

The predicted values of the state vector z_t and their standard errors are shown in Output 3.
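Because H is the identity matrix here, the forecasts of y_t coincide with the state forecasts; in general they would be H z_{T+n|T} with MSE H P_{T+n|T} H' + R. A minimal sketch (yhat and se_y are illustrative names), assuming pred and vpred come from the KALCVF call with lead 15 at the end of the program:

% observation forecasts and standard errors from the state forecasts
% (pred/vpred from kalcvf with lead 15; H = eye here, so these match Output 3)
nlead = 15;
se_y  = zeros(Ny, nlead);
yhat  = H*pred(:, t+1:t+nlead);              % E[y(T+n) | T], n = 1..nlead
for n = 1:nlead
   Cy = H*vpred(:,:,t+n)*H' + Rt;            % forecast MSE of y(T+n)
   se_y(:,n) = sqrt(diag(Cy));
end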


% The mink-muskrat data set
y = [  0.10609     0.16794
      -0.16852     0.06242
      -0.23700    -0.13344
      -0.18022    -0.50616
       0.18094    -0.37943
       0.65983    -0.40132
       0.65235     0.08789
       0.21594     0.23877
      -0.11515     0.40043
      -0.00067     0.37758
      -0.00387     0.55735
      -0.25202     0.34444
      -0.65011    -0.02749
      -0.53646    -0.41519
      -0.08462     0.02591
      -0.05640    -0.11348
       0.26630     0.20544
       0.03641     0.16331
      -0.26030    -0.01498
      -0.03995     0.09657
       0.33612     0.31096
      -0.11672     0.30681
      -0.69775    -0.69351
      -0.07569    -0.56212
       0.36149    -0.36799
       0.42341    -0.24725
       0.26721     0.04478
      -0.00363     0.21637
       0.08333     0.30188
      -0.22480     0.29493
      -0.13728     0.35463
      -0.12698     0.05490
      -0.18770    -0.52573
       0.34741    -0.49541
       0.54947    -0.26250
       0.57423    -0.21936
       0.57493    -0.12012
       0.28188     0.63556
      -0.58438     0.27067
      -0.50236     0.10386
      -0.60766     0.36748
      -1.04784    -0.33493
      -0.68857    -0.46525
      -0.11450    -0.63648
       0.22005    -0.26335
       0.36533     0.07017
      -0.00151    -0.04977
       0.03740    -0.02411
       0.22438     0.30790
      -0.16196     0.41050
      -0.12862     0.34929
       0.08448    -0.14995
       0.17945    -0.03320
       0.37502     0.02953
       0.95727     0.24090
       0.86188     0.41096
       0.39464     0.24157
       0.53794     0.29385
       0.13054     0.39336
      -0.39138    -0.00323
      -1.23825    -0.56953
      -0.66286    -0.72363  ]';
% the series are already detrended (mean adjusted)
t = size(y,2);
Ny = size(y,1);
Nz = Ny;
F = eye(Nz);
H = eye(Ny);
% observation noise variance is diagonal
Rt = 1e-5*eye(Ny);
% transition noise variance
Vt = .1*eye(Nz);
a = zeros(Nz,1);
b = zeros(Ny,1);
myu = zeros(Nz,1);
sigma = .1*eye(Nz);
itermat = [];
for iter=1:100
   % construct big cov matrix
   var = [Vt zeros(Nz,Ny); zeros(Ny,Nz) Rt];

   % initial values are changed
   z0 = F*myu;
   vz0 = F*sigma*F'+Vt;

   % filtering to get one-step prediction and filtered value
   [logl, pred, vpred, filt, vfilt] = kalcvf(y, 0, a, F, b, H, var, z0, vz0);
   % form -2*log L from the average log likelihood returned by kalcvf
   % (the constant 2*pi term is dropped)
   logl = (-2*logl-Ny*log(2*pi))*t;

   % smoothing using one-step prediction values
   [sm, vsm] = kalcvs(y, a, F, b, H, var, pred, vpred);

   % store old parameters and function values
   myu0 = myu;
   F0 = F;
   Vt0 = Vt;
   Rt0 = Rt;
   logl0 = logl;
   itermat = [itermat; iter logl0 F0(:)' myu0'];

   % obtain P*(0) to get P(0|T) and z(0|T),
   % which the usual smoothing procedure does not produce;
   % see Harvey (1991, p. 154) or Shumway (1988, p. 177)
   jt1 = sigma*F'/vpred(:,:,1);                         % P*(0)
   p_t_0  = sigma+jt1*(vsm(:,:,1)-vpred(:,:,1))*jt1';   % P(0|T)
   z_t_0  = myu+jt1*(sm(:,1)-pred(:,1));                % z(0|T)
   p_t1_t = vpred(:,:,t);                               % P(T|T-1)
   p_t1_t1 = vfilt(:,:,t-1);                            % P(T-1|T-1)
   Kt = p_t1_t*H'/(H*p_t1_t*H'+Rt);                     % Kalman gain at t = T

   % obtain the initial lag-one covariance P(T,T-1|T); see Shumway (1988, p. 180)
   p_t_ii1 = (eye(Nz)-Kt*H)*F*p_t1_t1;
   st0 = vsm(:,:,t)+sm(:,t)*sm(:,t)';                   % accumulator for S_t(0)
   st1 = p_t_ii1+sm(:,t)*sm(:,t-1)';                    % accumulator for S_t(1)
   st00 = p_t_0+z_t_0*z_t_0';                           % accumulator for S_{t-1}(0)
   cov = (y(:,t)-H*sm(:,t))*(y(:,t)-H*sm(:,t))'+H*vsm(:,:,t)*H';   % for R update
   % backward recursion for the lag-one covariances P(t-1,t-2|T)
   for i=t:-1:2
      p_i1_i1 = vfilt(:,:,i-1);              % P(i-1|i-1)
      p_i_i1  = vpred(:,:,i);                % P(i|i-1)
      jt1 = p_i1_i1*F'/p_i_i1;               % P*(i-1)
      p_i1_i2 = vpred(:,:,i-1);              % P(i-1|i-2)
      if (i>2)
         p_i2_i2 = vfilt(:,:,i-2);           % P(i-2|i-2)
      else
         p_i2_i2 = sigma;                    % P(0|0) = Sigma
      end
      jt2 = p_i2_i2*F'/p_i1_i2;              % P*(i-2)
      p_t_i1i2 = p_i1_i1*jt2'+jt1*(p_t_ii1-F*p_i1_i1)*jt2';   % P(i-1,i-2|T)
      p_t_ii1 = p_t_i1i2;
      temp = vsm(:,:,i-1);
      sm1 = sm(:,i-1);
      st0 = st0+(temp+sm1*sm1');
      if (i>2)
         st1 = st1+(p_t_ii1+sm1*sm(:,i-2)');
      else
         st1 = st1+(p_t_ii1+sm1*z_t_0');
      end
      st00 = st00+(temp+sm1*sm1');
      cov = cov+H*temp*H'+(y(:,i-1)-H*sm1)*(y(:,i-1)-H*sm1)';
   end

   % M-step: update the parameters
   myu = z_t_0;            % mu = z(0|T)
   F = st1/st00;           % F = S_t(1)*inv(S_{t-1}(0))
   Vt = (st0-F*st1')/t;    % V = (S_t(0) - F*S_t(1)')/T
   Rt = cov/t;             % R = accumulated sum / T

   % check convergence
   if norm((myu-myu0)./(myu0+1e-6),inf) < 1e-2 && ...
      norm((F-F0)./(F0+1e-6),inf) < 1e-2 && ...
      norm((Vt-Vt0)./(Vt0+1e-6),inf) < 1e-2 && ...
      norm((Rt-Rt0)./(Rt0+1e-6),inf) < 1e-2 && ...
      abs((logl-logl0)/(logl0+1e-6)) < 1e-3
      break
   end
end

fprintf('\n   SSM Estimation using EM Algorithm\n');
fprintf('\n   Iter  -2*log L      F11       F21       F12       F22      MYU11     MYU22\n');
fprintf('%6d   % .3f   % .4f   % .4f   % .4f   % .4f   % .4f   % .4f\n',itermat');
% eigenvalues of the estimated transition matrix
% (ev avoids shadowing the built-in function eval)
ev = eig(F0);
fprintf('\n   Real         Imag         MOD\n');
ev = [real(ev) imag(ev) abs(ev)];
fprintf('% .7f   % .7f   % .7f\n',ev');
var = [Vt zeros(Nz,Ny); zeros(Ny,Nz) Rt];

% initial values are changed
z0  = F*myu;
vz0 = F*sigma*F'+Vt;
itermat=[];

% multistep prediction
[logl, pred, vpred] = kalcvf(y, 15, a, F, b, H, var, z0, vz0);
for i=1:15
   itermat = [itermat; i pred(:,t+i)' sqrt(diag(vpred(:,:,t+i)))'];
end
fprintf('\n  n-Step    Z1_T_n       Z2_T_n        SE_Z1        SE_Z2\n');
fprintf('%6d    % .7f   % .7f   % .7f   % .7f\n',itermat');
fprintf('\n');

Output 1: Iteration History
       SSM Estimation using EM Algorithm
    
       Iter  -2*log L      F11       F21       F12       F22      MYU11     MYU22
         1    4399.284    1.0000    0.0000    0.0000    1.0000    0.0000    0.0000
         2   -835.932    0.7936    0.3260   -0.6458    0.5128    0.0354    0.0560
         3   -841.806    0.7961    0.3256   -0.6516    0.5146    0.1159    0.0899
         4   -846.091    0.7959    0.3256   -0.6522    0.5143    0.1882    0.1023
         5   -849.036    0.7951    0.3255   -0.6527    0.5140    0.2472    0.1031
         6   -850.807    0.7940    0.3254   -0.6529    0.5139    0.2925    0.0985
         7   -851.748    0.7929    0.3253   -0.6530    0.5138    0.3257    0.0920
         8   -852.185    0.7920    0.3252   -0.6529    0.5137    0.3491    0.0854
         9   -852.352    0.7913    0.3251   -0.6528    0.5137    0.3650    0.0795
        10   -852.391    0.7908    0.3251   -0.6526    0.5137    0.3754    0.0748
        11   -852.381    0.7904    0.3251   -0.6525    0.5137    0.3819    0.0711
        12   -852.359    0.7902    0.3251   -0.6524    0.5137    0.3858    0.0684
        13   -852.339    0.7900    0.3251   -0.6524    0.5137    0.3880    0.0666
        14   -852.325    0.7900    0.3251   -0.6523    0.5137    0.3891    0.0653
        15   -852.317    0.7899    0.3251   -0.6523    0.5137    0.3895    0.0644
    

Output 2: Eigenvalues of F Matrix

       Real         Imag         MOD
     0.6518034    0.4393286    0.7860390
     0.6518034   -0.4393286    0.7860390
    

Output 3: Multistep Prediction

      n-Step    Z1_T_n       Z2_T_n        SE_Z1        SE_Z2
         1    -0.0515574   -0.5872360    0.2447541    0.2370773
         2     0.3423320   -0.3184214    0.3145872    0.2908073
         3     0.4781147   -0.0522676    0.3669531    0.3105250
         4     0.4117554    0.1286008    0.4015952    0.3217928
         5     0.2413572    0.1999359    0.4188527    0.3316971
         6     0.0602279    0.1811784    0.4258845    0.3392027
         7    -0.0706097    0.1126519    0.4296745    0.3433074
         8    -0.1292577    0.0349109    0.4330097    0.3450385
         9    -0.1248728   -0.0240923    0.4357101    0.3459133
        10    -0.0829210   -0.0529761    0.4372902    0.3466615
        11    -0.0309424   -0.0541737    0.4379789    0.3472920
        12     0.0108965   -0.0378889    0.4383025    0.3476799
        13     0.0333222   -0.0159205    0.4385639    0.3478544
        14     0.0367061    0.0026559    0.4387925    0.3479324
        15     0.0272616    0.0132986    0.4389413    0.3479917
    

See also

KALCVF performs covariance filtering and prediction

KALCVS performs fixed-interval smoothing

Getting Started with State Space Models

Kalman Filtering Example 1: Likelihood Function Evaluation

References

[1]  Harvey, A. C., Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge: Cambridge University Press, 1991.

[2]  Shumway, R. H., Applied Statistical Time Series Analysis, Englewood Cliffs, NJ: Prentice-Hall, 1988.
