Title

Mini-course on Missing Data

Missing data is a pervasive issue in statistics and data analysis, arising in diverse contexts: survey
respondents may skip sensitive questions (e.g., income), sensors may fail, or latent variables (such as
random effects in mixed models or cluster membership in unsupervised learning) may be inherently
unobserved. Even in causal inference, unobserved potential outcomes can be framed as missing
data. The presentation focuses on statistical inference, that is, on how to obtain valid estimates of
variances, confidence intervals, and p-values, rather than on prediction alone, where the goal is typically
to minimize a loss function.
 

Three missing-data mechanisms are typically distinguished: Missing Completely At Random (MCAR),
Missing At Random (MAR), and Missing Not At Random (MNAR). The choice of method for handling missing data depends on the missing-data mechanism, the structure of the data, and the analytical goal.
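As a rough illustration of the three mechanisms (not taken from the course material), the sketch below simulates a fully observed variable x and a partially missing variable y. The variable names, the coefficients, and the logistic missingness models are all assumptions chosen for the example. The probability that y is missing is constant under MCAR, depends only on the observed x under MAR, and depends on y itself under MNAR.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data: x is always observed, y may be missing.
# By construction, the true mean of y is 0.
x = rng.normal(size=n)
y = 0.8 * x + rng.normal(scale=0.6, size=n)


def sigmoid(z):
    """Logistic function, used here as an assumed missingness model."""
    return 1.0 / (1.0 + np.exp(-z))


# Probability that y is missing under each mechanism.
missing_prob = {
    "MCAR": np.full(n, 0.3),          # unrelated to x and y
    "MAR": sigmoid(-1.0 + 1.5 * x),   # depends only on the observed x
    "MNAR": sigmoid(-1.0 + 1.5 * y),  # depends on the unobserved y itself
}

# Mean of y among the cases where y is observed.
observed_means = {}
for name, p in missing_prob.items():
    missing = rng.random(n) < p
    observed_means[name] = y[~missing].mean()

for name, m in observed_means.items():
    print(f"{name}: mean of observed y = {m:.3f}")
```

In this setup the observed-case mean of y stays close to the true value under MCAR but is shifted downward under MAR and MNAR. Under MAR the shift could in principle be corrected using x, whereas under MNAR the observed data alone do not identify it.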

We discuss simple methods, such as complete case analysis and
mean imputation, along with their advantages and disadvantages. We then introduce likelihood-based
approaches and multiple imputation, which are widely recommended, at least in the statistical
community, for their robustness and flexibility. An overview of useful software packages is also
planned.
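To make the trade-off between the two simple methods concrete, here is a minimal sketch that is not part of the course material. The data are simulated, and the distribution, sample size, and 40% missingness rate are assumptions. Under MCAR, complete case analysis gives a reasonable variance estimate from fewer observations, while mean imputation leaves the mean unchanged but shrinks the variance and hence the naive standard error.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Hypothetical complete variable with true variance 100.
y = rng.normal(loc=50.0, scale=10.0, size=n)

# Delete about 40% of the values completely at random (MCAR).
missing = rng.random(n) < 0.4
y_obs = np.where(missing, np.nan, y)

# Complete case analysis: use only the observed values.
cc = y_obs[~np.isnan(y_obs)]
cc_var = cc.var(ddof=1)
cc_se = np.sqrt(cc_var / cc.size)

# Mean imputation: replace every missing value by the observed mean.
y_imp = np.where(np.isnan(y_obs), cc.mean(), y_obs)
imp_var = y_imp.var(ddof=1)
imp_se = np.sqrt(imp_var / y_imp.size)  # naive SE treating imputed values as real

print(f"complete cases:  var = {cc_var:.1f}, SE of mean = {cc_se:.3f}")
print(f"mean imputation: var = {imp_var:.1f}, SE of mean = {imp_se:.3f}")
```

The imputed data set looks complete, but its variance is understated by roughly the fraction of missing values. The resulting standard error is too small, and confidence intervals based on it would be too narrow.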

An important conclusion will be that all methods require careful application to ensure valid
inference.

----------------------------------- 

There will be a coffee break from 11:00 to 11:30.

Date and Venue

Start Date
Venue
FC1 029
End Date

Speaker

Christian Heumann

Speaker's Institution

Ludwig-Maximilian University of Munich