Forecasting Methods
ARIMA Models
State Space Models (SSM)
Kalman Filter
Estimation & Optimization
100

In a moving average, observations within the window get this weight, while observations outside the window get this weight.

What is equal weight (1/k) inside the window and zero weight outside?
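
A minimal sketch of the weighting scheme described above, assuming a window of the k most recent observations. The function name and the sample numbers are made up for illustration.

```python
def moving_average_weights(n_obs, k):
    """Weight placed on each of n_obs observations (oldest first) by a k-period moving average."""
    return [1.0 / k if i >= n_obs - k else 0.0 for i in range(n_obs)]


series = [10.0, 12.0, 11.0, 13.0, 14.0]  # illustrative data only
weights = moving_average_weights(len(series), k=3)

# The moving average is the weighted sum: only the last k points contribute.
forecast = sum(w * y for w, y in zip(weights, series))
```

The two oldest observations receive weight zero, and the three inside the window each receive 1/3, so the weights sum to one.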

100

A time series has this property when its mean, variance, and autocorrelation structure remain constant over time.

What is stationarity?

100

SSM has two main equations: this one links observed data to hidden states, and this one describes how states evolve over time.

What are the observation equation and transition equation (or state equation)?

100

This factor determines how much you trust the new observation versus your model's prediction, acting as a weight between 0 and 1.

What is the Kalman Gain (Factor)?

100

This is the probability (or probability density) of observing the actual data given specific parameter values, answering "How likely is this data if the parameters were θ?"

What is likelihood?

200

This smoothing method assigns decaying weights to all past observations, with the most recent getting weight α and none ever receiving exactly zero weight.

What is exponential smoothing?
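
The decaying weights can be made explicit with a short sketch. It assumes the usual recursion level = α·y + (1 − α)·level, under which the observation j steps back gets weight α(1 − α)^j. The function names, the starting level, and the data are illustrative assumptions.

```python
def ses_weights(alpha, n_obs):
    """Weights on the last n_obs observations, newest first: alpha * (1 - alpha)**j."""
    return [alpha * (1 - alpha) ** j for j in range(n_obs)]


def simple_exponential_smoothing(series, alpha, initial_level):
    """Run the smoothing recursion over the series and return the final level."""
    level = initial_level
    for y in series:
        level = alpha * y + (1 - alpha) * level
    return level


alpha = 0.3
series = [10.0, 12.0, 11.0, 13.0]  # illustrative data only
weights = ses_weights(alpha, len(series))

recursive_level = simple_exponential_smoothing(series, alpha, initial_level=10.0)
# Same result written as an explicit weighted sum, plus the leftover weight on the initial level.
explicit_level = (
    sum(w * y for w, y in zip(weights, reversed(series)))
    + (1 - alpha) ** len(series) * 10.0
)
```

For 0 < α < 1 every weight is strictly positive and each is smaller than the one before it, which is the property the clue describes.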

200

You apply this transformation—taking Y_t minus Y_{t-1}—to remove trends and restore stationarity, and the number of times you do this becomes the "d" in ARIMA(p, d, q).

What is differencing (or first differencing)?
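
As a small sketch of why differencing removes a trend: a series with a linear trend has constant first differences, so d = 1 is enough. The helper name and the toy series are assumptions for illustration.

```python
def difference(series, d=1):
    """Apply Y_t - Y_{t-1} to the series d times."""
    for _ in range(d):
        series = [curr - prev for prev, curr in zip(series, series[1:])]
    return series


trend_series = [2.0 * t + 5.0 for t in range(6)]  # 5, 7, 9, 11, 13, 15
first_diff = difference(trend_series, d=1)        # [2.0, 2.0, 2.0, 2.0, 2.0]
second_diff = difference(trend_series, d=2)       # [0.0, 0.0, 0.0, 0.0]
```

Each pass shortens the series by one observation, which is worth remembering when d is larger than 1.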

200

These are the two main differences between ARIMA and SSM: one concerns whether the variables are observed or unobserved, and the other whether the parameters are constant or time-varying.

What is: ARIMA uses observed variables and constant parameters, while SSM uses unobserved variables and time-varying parameters?

200

The time update does this to the state estimate, while the information update does this using the new observation.

What is: projects it forward using the transition equation (time update), and corrects it using the new observation (information update)?

200

We optimize this function instead of likelihood itself because it converts products to sums, preventing numerical underflow.

What is the log-likelihood function?
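
The underflow problem is easy to reproduce. The sketch below assumes independent normal observations and uses made-up data; multiplying 2000 densities collapses to zero in double precision, while the sum of log-densities stays finite.

```python
import math


def normal_pdf(y, mu, sigma):
    """Density of N(mu, sigma^2) at y."""
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def likelihood(data, mu, sigma):
    """Product of densities: prone to underflow for long series."""
    product = 1.0
    for y in data:
        product *= normal_pdf(y, mu, sigma)
    return product


def log_likelihood(data, mu, sigma):
    """Sum of log-densities: the same information on a numerically safe scale."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - 0.5 * ((y - mu) / sigma) ** 2
        for y in data
    )


data = [0.1 * (i % 7) for i in range(2000)]  # illustrative data only
raw = likelihood(data, mu=0.3, sigma=1.0)        # underflows to 0.0
logged = log_likelihood(data, mu=0.3, sigma=1.0) # large negative, but finite
```

Because the logarithm is monotone, the parameter values that maximize the log-likelihood also maximize the likelihood.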

300

The Holt-Winters αβ-filter has these two update equations rather than a single level equation.

What are: (1) Level update: L_t = L_{t-1} + b_{t-1} + α·e_t, and (2) Trend update: b_t = b_{t-1} + β·e_t?
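
The two updates can be sketched as a loop in which e_t is the one-step-ahead forecast error, e_t = Y_t − (L_{t-1} + b_{t-1}). That definition of e_t, the function name, and the starting values are assumptions for illustration.

```python
def alpha_beta_filter(series, alpha, beta, level0, trend0):
    """Error-correction form: both level and trend are nudged by the same forecast error."""
    level, trend = level0, trend0
    for y in series:
        error = y - (level + trend)            # e_t: observation minus one-step forecast
        level = level + trend + alpha * error  # L_t = L_{t-1} + b_{t-1} + alpha * e_t
        trend = trend + beta * error           # b_t = b_{t-1} + beta * e_t
    return level, trend


# One step by hand: forecast 8 + 1 = 9, error 10 - 9 = 1.
one_step = alpha_beta_filter([10.0], alpha=0.5, beta=0.1, level0=8.0, trend0=1.0)

# A perfectly linear series with matching starting values produces zero errors.
linear = alpha_beta_filter([3.0, 5.0, 7.0, 9.0], alpha=0.5, beta=0.1, level0=1.0, trend0=2.0)
```

Note that the level update uses the previous trend b_{t-1}, so the level must be updated before the trend is overwritten.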

300

The model Y_t = φ₁Y_{t-1} + φ₂Y_{t-2} + ε_t + θε_{t-1} is identified as this ARIMA(p, d, q) specification.

What is ARIMA(2, 0, 1)?

300

In the local level model, both Z_t and T_t equal this value, making the observed value equal to level plus noise.

What is 1 (scalar)?

300

High Kalman Gain (close to 1) means this about observation and prediction reliability, while low Kalman Gain means the opposite.

What is: observation is reliable and prediction is uncertain (high K), observation is noisy and prediction is reliable (low K)?
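
A minimal sketch of how the gain behaves, assuming the local level model (Z_t = T_t = 1) with constant noise variances. The names obs_var and state_var, the starting values, and the data are illustrative assumptions, not notation from the source.

```python
def local_level_filter(observations, obs_var, state_var, a0, p0):
    """Scalar Kalman filter for the local level model; returns final mean, variance, and gains."""
    a, p = a0, p0
    gains = []
    for y in observations:
        # Time update: project the state forward (T = 1) and add state noise.
        a_pred = a
        p_pred = p + state_var
        # Information update: weigh the new observation against the prediction.
        k = p_pred / (p_pred + obs_var)  # Kalman gain, between 0 and 1
        a = a_pred + k * (y - a_pred)
        p = (1 - k) * p_pred
        gains.append(k)
    return a, p, gains


obs = [5.0, 6.0]  # illustrative data only
# Precise observations, uncertain state: gain close to 1, estimate follows the data.
mean_hi, _, gains_hi = local_level_filter(obs, obs_var=0.01, state_var=1.0, a0=0.0, p0=1.0)
# Noisy observations, stable state: gain close to 0, estimate stays near the prediction.
mean_lo, _, gains_lo = local_level_filter(obs, obs_var=100.0, state_var=0.01, a0=0.0, p0=1.0)
```

In the first run the filtered level ends up very close to the last observation, while in the second it barely moves from its starting value of 0.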

300

MLE algorithms can converge to local maxima instead of global maxima, fail to converge, or converge slowly if you don't have these.

What are good starting values?

400

The full Holt-Winters method with α, β, and γ parameters produces this type of pattern, following the trend but oscillating around it.

What is a cyclic trend (or seasonal pattern)?

400

PACF cuts off after lag p to identify this component, while ACF cuts off after lag q to identify this component.

What are the AR(p) component (PACF) and the MA(q) component (ACF)?

400

For the local level plus trend model, the state vector contains these two elements, and the transition matrix T_t has this specific structure.

What are [L_t, b_t]ᵀ (level and trend), and T_t = [[1, 1], [0, 1]]?
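
To see what that transition matrix does, the sketch below applies it once to a state vector, ignoring the noise terms. The helper name and the numbers are assumptions for illustration.

```python
def apply_transition(T, state):
    """Matrix-vector product T @ state, written out with plain lists."""
    return [sum(T[i][j] * state[j] for j in range(len(state))) for i in range(len(T))]


T = [[1.0, 1.0],
     [0.0, 1.0]]

state = [10.0, 2.0]                    # [level, trend]
next_state = apply_transition(T, state)  # [12.0, 2.0]
```

The first row adds the trend to the level, and the second row carries the trend forward unchanged.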

400

The notation t|(t-1) means this type of estimate, while t|t means this type, and (t-h)|t where 0<h<t means this type.

What are: predicted/prior estimate, filtered/updated estimate, and smoothed estimate?

400

Simulated Annealing is this type of method, while BFGS is this type of method.

What are derivative-free (Simulated Annealing) and derivative-based (BFGS)?

500

This practice combines predictions from multiple models rather than selecting one "best" model, and research shows it often outperforms even the best individual model.

What is forecast combination (or ensemble forecast)?

500

This is the forecasting equation for ARIMA(1, 1, 0): ŷ_{t+1} equals this expression in terms of Y_t, Y_{t-1}, and φ₁.

What is ŷ_{t+1} = (1 + φ₁)Y_t - φ₁Y_{t-1}?
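
The expression follows from treating ARIMA(1, 1, 0) as an AR(1) on the first differences: Y_{t+1} − Y_t = φ₁(Y_t − Y_{t-1}) + ε_{t+1}. A short sketch checks that both ways of writing the forecast agree; the function names and input values are illustrative assumptions.

```python
def forecast_from_differences(y_t, y_prev, phi1):
    """Forecast the next difference with AR(1), then add it back to the last level."""
    return y_t + phi1 * (y_t - y_prev)


def forecast_expanded(y_t, y_prev, phi1):
    """Same forecast after collecting terms: (1 + phi1) * Y_t - phi1 * Y_{t-1}."""
    return (1 + phi1) * y_t - phi1 * y_prev


via_diff = forecast_from_differences(105.0, 100.0, phi1=0.5)  # 107.5
via_expanded = forecast_expanded(105.0, 100.0, phi1=0.5)      # 107.5
```

With φ₁ = 0.5, half of the most recent change of 5 is expected to carry over into the next period.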

500

For a linear regression model WITHOUT time-varying parameters in SSM, you set this matrix Q_t to this special value to ensure parameters never change.

What is Q_t = 0 (zero matrix)?

500

In Bayesian terms, this relationship links the posterior to the likelihood and the prior.

What is Posterior ∝ Likelihood × Prior?
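
For a normal prior and a normal likelihood, this relationship reproduces the Kalman information update. The sketch below assumes a scalar state observed directly with known noise variance; all names and numbers are illustrative.

```python
def bayes_normal_update(prior_mean, prior_var, y, obs_var):
    """Posterior from Posterior ∝ Likelihood × Prior: precisions add, means are precision-weighted."""
    post_precision = 1.0 / prior_var + 1.0 / obs_var
    post_mean = (prior_mean / prior_var + y / obs_var) / post_precision
    return post_mean, 1.0 / post_precision


def kalman_update(pred_mean, pred_var, y, obs_var):
    """The same update written with a Kalman gain."""
    k = pred_var / (pred_var + obs_var)
    return pred_mean + k * (y - pred_mean), (1 - k) * pred_var


bayes_mean, bayes_var = bayes_normal_update(10.0, 4.0, y=12.0, obs_var=1.0)
kalman_mean, kalman_var = kalman_update(10.0, 4.0, y=12.0, obs_var=1.0)
```

Both routes give a mean of about 11.6 and a variance of about 0.8: the prediction plays the role of the prior and the filtered estimate plays the role of the posterior.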

500

In the context of State Space Models, this method requires iterative maximization and may have convergence problems.

What is Maximum Likelihood Estimation (MLE)?