# Lesson 12: Repeated Measures Data
## v3 Expanded Podcast Transcript

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today we're working through Lesson 12, Repeated Measures Data. This is the final lesson of the course, and the final lesson of the whole three-course sequence, and in a lot of ways it's the most practically useful one for anyone who's going to do their own longitudinal analysis.

**Sarah:** Why is that?

**Kiffer:** Because repeated measures, where you measure the same person over and over again across time, is probably the most common form of clustering in health research. Clinical trials measure blood pressure at baseline, then at month one, three, six, and twelve. Cohort studies follow people for decades and re-measure their biomarkers, their lifestyle, their cognitive function. Daily symptom diaries collect dozens of observations per person. Once you start looking, repeated measures are everywhere.

**Sarah:** And this lesson is the one that ties together the random-effects machinery from Lessons 10 and 11 with the special structure that arises when the cluster is a person observed across time.

**Kiffer:** Right. Lessons 9 through 11 built up the general framework for clustered data. Identifying clustering, quantifying its impact, modeling continuous outcomes with linear mixed models, modeling discrete outcomes with generalized linear mixed models, and the marginal alternative through generalized estimating equations. Lesson 12 zooms in on a specific kind of cluster, the same subject measured repeatedly over time, and shows that this kind of cluster has its own special structure.

**Sarah:** What makes it different from generic clustering?

**Kiffer:** Time. The temporal ordering of measurements carries information that generic clustering doesn't have. If I tell you that observations one, four, and seven come from the same person, the random-intercept model from Lesson 10 treats those three observations as exchangeable. Their order is irrelevant. But in repeated measures, the order matters enormously. Observations that are close in time are more strongly correlated than observations that are far apart. That's autocorrelation, and it's the central feature that distinguishes repeated measures from generic clustering.

**Sarah:** Let's set up the conceptual scaffolding. In a repeated measures dataset, the cluster is the person. The observations within the cluster are the multiple time points. And the within-person observations are correlated because they all come from the same body, the same physiology, the same set of habits and exposures.

**Kiffer:** And there are at least three things going on that produce that within-person correlation. First, stable individual traits. Some people just have higher blood pressure than others, and that baseline difference shows up in every measurement. Second, slowly evolving health states. A person's blood pressure today is similar to their blood pressure yesterday because their physiology hasn't changed much. Third, measurement context. Same clinic, same cuff, same time of day, same observer. All of that produces correlation.

**Sarah:** Three sections to work through today. First, modeling longitudinal trajectories with random intercepts and random slopes. Second, the family of covariance structures that are specific to repeated measures, like the autoregressive structure. And third, special considerations like time-varying covariates, the within-person versus between-person distinction, missing data and dropout.

**Kiffer:** Plus a brief preview of dynamic models at the end. Cross-lagged panel and latent change score. These get more attention in advanced courses, but they're worth knowing the names of.

**Sarah:** Before we dive into models, though, the lesson recommends starting with descriptive plots, and I think that's worth flagging. The classic visualization here is the spaghetti plot.

**Kiffer:** Right. A spaghetti plot, sometimes called a profile plot, draws one line per person, connecting that person's measurements across visits. With 200 people and four visits, you end up with 200 squiggly lines on the page. It looks like a plate of spaghetti, hence the name. And what you're looking for is whether the lines are roughly parallel, which suggests random intercepts are sufficient, or whether they fan out, which suggests you need random slopes too. You're also looking for outlier trajectories, sudden jumps, and overall trends.

**Sarah:** And the lesson pairs that with a mean profile plot, which is just the average outcome at each time point, often separated by treatment group. It hides the individual variability but makes the population trend obvious.

**Kiffer:** And then the empirical correlation matrix. Compute the correlation between blood pressure at visit one and visit two across all subjects, then visit one and visit three, all the way out. Lay those out and you have a numerical fingerprint of the within-person dependence structure. That fingerprint is what tells you which covariance structure to start with.
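
As a rough sketch of that fingerprint, the snippet below computes a visit-by-visit correlation matrix from wide-format data in plain Python. The five subjects and their readings are made-up illustrative values, `pearson` and `correlation_matrix` are hypothetical helper names, and the sketch assumes every subject attended every visit.

```python
from math import sqrt

# Hypothetical wide-format data: one row per subject, one column per visit.
bp_by_subject = [
    [120, 122, 125, 127],
    [135, 133, 136, 140],
    [110, 114, 113, 118],
    [142, 145, 141, 150],
    [128, 126, 131, 129],
]

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(rows):
    """Correlate every pair of visits (columns) across subjects (rows)."""
    columns = list(zip(*rows))
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

corr = correlation_matrix(bp_by_subject)
for row in corr:
    print("  ".join(f"{r:5.2f}" for r in row))
```

Reading across the first row of the printed matrix shows how the correlation with baseline changes as the visits get further apart.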

**Sarah:** Good. Let's start with Section 1. Modeling longitudinal trajectories.

**Kiffer:** The basic mixed model for repeated measures is what we call a growth curve model. The phrase comes from developmental research, where it was used to model children's growth in height or weight over time, but it applies anywhere the question is about individual change.

**Sarah:** Walk us through the structure.

**Kiffer:** There are two layers. The first layer is the population-average trajectory. Imagine you took all 500 people in a longitudinal blood pressure study and computed the average systolic blood pressure at each visit. You'd get a curve. Maybe it slopes gently upward over the years of follow-up because the cohort is aging, or it slopes downward because everyone got assigned to a new medication. That average curve is the fixed-effects part of the model.

**Sarah:** And then?

**Kiffer:** And then the second layer is the variation around that average. Some people start high and stay high. Some start low and rise quickly. Some are flat. Each person has their own trajectory, and we model the deviations from the population average using random effects. Random intercepts let baseline levels vary across people. Random slopes for time let the rate of change vary across people.

**Sarah:** So if I had a population-average systolic blood pressure trajectory that rises from 130 millimeters of mercury at baseline to 138 at year five, the random intercept lets one person sit at 145 and another at 118 throughout. And the random slope for time lets one person rise faster than the average and another rise more slowly, or even fall.

**Kiffer:** Exactly. Random intercepts and random slopes for time. That's the heart of a growth curve model. And critically, those random effects are correlated. People who start higher might tend to rise more slowly. Or people with higher baselines might rise even faster. The model estimates that intercept-slope correlation.
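
Written out, a linear growth curve model of this kind might look like the following. The notation is an assumption made for illustration, not necessarily the lesson's own: $y_{ij}$ is the outcome for person $i$ at occasion $j$, $t_{ij}$ is time, $b_{0i}$ and $b_{1i}$ are the person-specific deviations in intercept and slope, and $\tau_{01}$ is the intercept-slope covariance.

```latex
y_{ij} = (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\, t_{ij} + \varepsilon_{ij},
\qquad
\begin{pmatrix} b_{0i} \\ b_{1i} \end{pmatrix}
\sim N\!\left(
  \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix}
\right),
\qquad
\varepsilon_{ij} \sim N(0, \sigma^2)
```

In this notation the intercept-slope correlation is $\tau_{01} / \sqrt{\tau_{00}\,\tau_{11}}$.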

**Sarah:** What does that correlation actually tell you scientifically?

**Kiffer:** It tells you something about heterogeneity in trajectories. A negative intercept-slope correlation says people with low baselines catch up over time, the trajectories converge. A positive one says people who start high pull further ahead, the trajectories diverge. In aging research, that's the difference between cognitive reserves leveling people out versus disparities widening with age.

**Sarah:** Got it. Now what if the average trajectory isn't a straight line?

**Kiffer:** Then you extend the time variable. The simplest extension is a quadratic. You add a term for time squared, and now the average trajectory can curve. It can rise and then plateau. Or fall and then bottom out. A common pattern in clinical trials is a steep initial response to treatment, then a leveling off. That's a curved trajectory, and a quadratic can capture it.

**Sarah:** What if the curvature isn't smooth like a quadratic?

**Kiffer:** Then you reach for splines. A spline is a piecewise polynomial. You break the time axis into segments and fit a smooth curve in each segment, with the constraint that the curves join smoothly at the boundaries. Splines are very flexible and let the trajectory take whatever shape the data demand.

**Sarah:** And piecewise linear models?

**Kiffer:** Piecewise linear is when you have known change points. Maybe an intervention happened at month six. Before month six, the slope is one thing. After month six, it's another. You include a term for time, plus a term for time after the change point, and now you can estimate two slopes. Pre-intervention rate of change and post-intervention rate of change. The difference between those two slopes is the effect of the intervention on the rate of change.
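
A minimal sketch of that two-term coding, assuming a change point at month six and made-up coefficients. The function names are hypothetical. With this coding the coefficient on the second term is the change in slope, so the post-intervention slope is the sum of the two coefficients.

```python
def piecewise_terms(month, change_point=6):
    """Return (time, time elapsed since the change point) for one observation."""
    return month, max(0, month - change_point)

def predicted_mean(month, intercept, slope_pre, slope_change, change_point=6):
    """Fixed-effects prediction from the two-term piecewise coding."""
    t, t_after = piecewise_terms(month, change_point)
    return intercept + slope_pre * t + slope_change * t_after

# Hypothetical coefficients: +0.5 mmHg per month before month six,
# and the slope changes by -1.5 mmHg per month afterward.
b0, b1, b2 = 130.0, 0.5, -1.5

pre_slope = predicted_mean(5, b0, b1, b2) - predicted_mean(4, b0, b1, b2)
post_slope = predicted_mean(9, b0, b1, b2) - predicted_mean(8, b0, b1, b2)
print("pre-intervention slope: ", pre_slope)
print("post-intervention slope:", post_slope)
print("difference in slopes:   ", post_slope - pre_slope)
```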

**Sarah:** Walk us through a worked example so this lands concretely.

**Kiffer:** Suppose you're running a wellness intervention trial. Two hundred patients, randomized to control or intervention, measured at four visits, baseline, six months, twelve months, and eighteen months. Outcome is systolic blood pressure. The fixed effects in your model are visit, treatment arm, the visit-by-arm interaction, age, and sex. The random effects are a random intercept for each person, and a random slope for visit within person.

**Sarah:** And the parameters that matter for the trial question?

**Kiffer:** The headline number is the visit-by-arm interaction. That tells you whether the rate of change in blood pressure differs between the intervention and control groups. If the intervention drops blood pressure by an extra one millimeter of mercury per six-month visit on average, that's the trial's effect. Because the model includes that interaction, the fixed effect of visit alone is the average rate of change in the reference arm, usually control with the standard zero-one coding, not an average across both groups. The random slope variance tells you how much individual people deviate from their group average rate of change.
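
To make the roles of those coefficients concrete, here is a small sketch with made-up fixed-effect estimates. It assumes arm is coded 0 for control and 1 for intervention, codes visit as 0, 1, 2, 3, and leaves out age and sex because they are held fixed in the comparison. Under that coding the visit coefficient is the control-arm slope, and the interaction is the difference between the two arms' slopes.

```python
# Hypothetical fixed-effect estimates in mmHg (not results from any real trial).
coef = {"intercept": 132.0, "visit": -0.5, "arm": 0.3, "visit_x_arm": -1.0}

def mean_sbp(visit, arm):
    """Predicted mean systolic BP; visit index 0 = baseline, 1 = six months, ..."""
    return (coef["intercept"]
            + coef["visit"] * visit
            + coef["arm"] * arm
            + coef["visit_x_arm"] * visit * arm)

control_slope = mean_sbp(1, 0) - mean_sbp(0, 0)
intervention_slope = mean_sbp(1, 1) - mean_sbp(0, 1)

print("control slope per visit:     ", round(control_slope, 2))
print("intervention slope per visit:", round(intervention_slope, 2))
print("difference (the interaction):", round(intervention_slope - control_slope, 2))
```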

**Sarah:** And students often get tripped up here. They want to interpret the coefficient on visit as the effect of time on blood pressure. What's the right framing?

**Kiffer:** The coefficient on visit is the average within-person rate of change. It's not the effect of time in any causal sense. Time isn't a manipulable cause. The coefficient is descriptive. It tells you, on average across people in the cohort, how much blood pressure changed per visit. The interesting question is always whether that rate of change differs between groups, or depends on some predictor. That's what an interaction with time gets you.

**Sarah:** Let's move to Section 2. Covariance structures.

**Kiffer:** This is where repeated measures branches off from generic clustering. Because in repeated measures, the residual errors, the deviations of each observation from the model's prediction, aren't just independent within a person. They have temporal structure. Observations close in time have correlated residuals. Observations far apart have less correlated residuals. We need to model that pattern explicitly.

**Sarah:** Why does the random intercept alone not handle this?

**Kiffer:** A random intercept produces what's called compound symmetry. All pairs of observations within a person have exactly the same correlation, regardless of how far apart in time they sit. That's the same assumption underneath classic repeated measures ANOVA. And it's almost never true for real biological data. The correlation between blood pressure today and blood pressure tomorrow is much higher than the correlation between blood pressure today and blood pressure six months from now. Compound symmetry forces those to be equal, and that's wrong.
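
The equal-correlation property can be seen directly from the variance components. In the sketch below, `tau00` (between-person intercept variance) and `sigma2` (residual variance) are hypothetical names and values. The implied correlation for any pair of visits is `tau00 / (tau00 + sigma2)`, whatever the gap between them.

```python
def random_intercept_corr_matrix(tau00, sigma2, n_visits):
    """Within-person correlation matrix implied by a random-intercept-only model."""
    icc = tau00 / (tau00 + sigma2)  # identical for every pair of visits
    return [[1.0 if i == j else icc for j in range(n_visits)]
            for i in range(n_visits)]

# Hypothetical variance components (mmHg^2): between-person 150, residual 50.
cs = random_intercept_corr_matrix(tau00=150.0, sigma2=50.0, n_visits=4)
for row in cs:
    print(row)
```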

**Sarah:** So we need a structure where correlations decay with time gap. And before we get to the menu, there's an important conceptual point. Random slopes for time and an autoregressive residual structure are two different ways of accounting for within-person correlation, and sometimes they overlap.

**Kiffer:** Yes. A random slope for time produces a particular kind of within-person correlation pattern. When slopes vary across people, the trajectories fan out, so the between-person spread changes over time and the correlation between two observations depends on when each was taken rather than being one constant value. That can mimic autocorrelation. So for some datasets, a random slope alone captures most of the temporal dependence, and you don't need an explicit autoregressive residual on top. For others, the random slope handles the trajectory variation but residual autocorrelation persists, and that's when you add autoregressive errors as well. The choice is empirical. Fit both, compare with the Akaike information criterion.
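
A sketch of the correlation pattern a random intercept plus random slope implies, under assumed variance components. The covariance between observations at times s and t is `tau00 + tau01 * (s + t) + tau11 * s * t`, plus the residual variance on the diagonal. All names and numbers are illustrative.

```python
from math import sqrt

def implied_cov(s, t, tau00, tau01, tau11, sigma2):
    """Covariance of two observations on one person at times s and t."""
    cov = tau00 + tau01 * (s + t) + tau11 * s * t
    if s == t:
        cov += sigma2  # independent residual noise only adds to the variance
    return cov

def implied_corr_matrix(times, tau00, tau01, tau11, sigma2):
    """Turn the implied covariances into a correlation matrix."""
    args = (tau00, tau01, tau11, sigma2)
    var = [implied_cov(t, t, *args) for t in times]
    return [[implied_cov(s, t, *args) / sqrt(var[i] * var[j])
             for j, t in enumerate(times)]
            for i, s in enumerate(times)]

# Hypothetical values: intercept variance 100, slope variance 4,
# no intercept-slope covariance, residual variance 25.
corr = implied_corr_matrix([0, 1, 2, 3], tau00=100.0, tau01=0.0,
                           tau11=4.0, sigma2=25.0)
for row in corr:
    print("  ".join(f"{r:.3f}" for r in row))
```

With these numbers the correlation with baseline falls as the gap grows, even though the residuals themselves are independent.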

**Sarah:** Okay, with that flagged, what are the standard structured options?

**Kiffer:** Let me walk through five of them. First, compound symmetry. All pairs of observations have correlation rho. One parameter. Equivalent to a random intercept. Almost never realistic for true repeated measures.

**Sarah:** Second?

**Kiffer:** Second, autoregressive of order one, AR-1. The correlation between observations decays exponentially with time apart. So if rho is 0.7, then observations one time-step apart have correlation 0.7, two time-steps apart have correlation 0.49, three time-steps apart have correlation 0.34, and so on. One parameter, just rho. AR-1 is the natural default for many biological measurements because the underlying process tends to look something like an exponential decay back toward a stable mean.
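
The arithmetic is easy to reproduce. A minimal sketch, with `ar1_corr` as a hypothetical helper name, that builds the AR-1 correlation matrix for equally spaced visits:

```python
def ar1_corr(rho, n_visits):
    """AR-1 correlation matrix: entry (i, j) is rho raised to the lag |i - j|."""
    return [[rho ** abs(i - j) for j in range(n_visits)]
            for i in range(n_visits)]

matrix = ar1_corr(0.7, 4)
for row in matrix:
    print("  ".join(f"{r:.3f}" for r in row))
```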

**Sarah:** When you say one time-step, do you mean one visit?

**Kiffer:** AR-1 assumes equally spaced time points. Each visit is one time-step. If your visits are at month zero, three, six, and nine, AR-1 treats each consecutive pair as one step. If your visits are at month zero, one, three, and twelve, AR-1 isn't quite right because the gaps are unequal. There's a continuous-time variant called continuous AR-1 that handles uneven spacing by making the correlation depend on the actual time gap rather than just the count of steps.
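
For the unevenly spaced case, a sketch of the continuous-time idea is to raise rho to the actual time gap instead of the number of steps. The helper name and the per-month rho of 0.9 are assumptions for illustration.

```python
def car1_corr(rho_per_month, months):
    """Continuous AR-1: correlation is rho raised to the actual gap in months."""
    return [[rho_per_month ** abs(a - b) for b in months] for a in months]

visit_months = [0, 1, 3, 12]  # the uneven schedule from the example
matrix = car1_corr(0.9, visit_months)
for row in matrix:
    print("  ".join(f"{r:.3f}" for r in row))
```

Consecutive visits no longer share one correlation: the one-month gap, the two-month gap, and the nine-month gap each get their own value from the same single parameter.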

**Sarah:** Good. Third structure?

**Kiffer:** Third, unstructured. Every pair of time points gets its own correlation parameter, no constraints. With four time points, that's six correlation parameters plus four variance parameters. Maximum flexibility. But the cost is the number of parameters grows quickly with the number of time points. With ten visits, you'd be estimating fifty-five covariance parameters. You need a lot of data to identify all of those.
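
The parameter count is simple arithmetic. A small sketch (the function name is hypothetical) that counts one variance per time point plus one correlation per pair of time points:

```python
def unstructured_parameter_count(n_visits):
    """Return (variances, correlations, total) for an unstructured covariance."""
    variances = n_visits
    correlations = n_visits * (n_visits - 1) // 2
    return variances, correlations, variances + correlations

for n in (4, 6, 10):
    print(n, "visits:", unstructured_parameter_count(n))
```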

**Sarah:** When would you use unstructured?

**Kiffer:** When you have few time points, large sample sizes, and either no theoretical reason to impose a structure, or the empirical correlation matrix doesn't fit any of the structured options. Unstructured is the most flexible and least efficient. AR-1, with a single correlation parameter, is among the most parsimonious. The art is in finding a structure that fits the data without burning too many parameters.

**Sarah:** Fourth?

**Kiffer:** Fourth, Toeplitz. Toeplitz says the correlation between observations depends only on the time gap, not on which specific time points are involved. So observations one step apart all have correlation rho one. Two steps apart all have correlation rho two. And so on. With four time points, you'd estimate three correlation parameters. It's more flexible than AR-1, which constrains rho two to equal rho one squared, but more parsimonious than unstructured. Toeplitz is good when correlations decay with time gap but the decay isn't smoothly geometric.
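
A sketch of the Toeplitz pattern, with one freely chosen correlation per lag. The lag values are made up, and `toeplitz_corr` is a hypothetical helper. The last lines illustrate that AR-1 is the special case where each lag correlation is a power of rho.

```python
def toeplitz_corr(lag_corrs):
    """Correlation matrix whose (i, j) entry depends only on the lag |i - j|."""
    by_lag = [1.0] + list(lag_corrs)
    n = len(by_lag)
    return [[by_lag[abs(i - j)] for j in range(n)] for i in range(n)]

# Four time points, so three lag-specific correlations (illustrative values).
toep = toeplitz_corr([0.70, 0.50, 0.40])
for row in toep:
    print(row)

# AR-1 as a constrained Toeplitz: lag k correlation equals rho ** k.
rho = 0.7
ar1_as_toeplitz = toeplitz_corr([rho ** k for k in (1, 2, 3)])
```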

**Sarah:** Fifth?

**Kiffer:** Fifth, independence. No correlation. Equivalent to ignoring the clustering entirely and analyzing the data as if every observation were independent. Wrong almost always for repeated measures, but worth knowing as the baseline against which everything else is compared.

**Sarah:** How do you actually choose between these?

**Kiffer:** Two-step process. First, look at the empirical correlation matrix. Compute the observed correlations between pairs of time points across people, lay them out in a matrix, and stare at the pattern. Are they all roughly equal? Compound symmetry. Do they decay smoothly with time gap? AR-1. Do they decay irregularly? Toeplitz. Is there no obvious pattern? Unstructured. Second step, fit candidate models and compare them with information criteria. Akaike information criterion, AIC, is the standard. Lower AIC means better fit accounting for the number of parameters. For nested structures, like AR-1 nested inside Toeplitz, you can use a likelihood ratio test.

**Sarah:** Let's work through a concrete example.

**Kiffer:** Six monthly blood pressure visits. The empirical correlations look like this. Visits one and two, correlation 0.72. Visits one and three, 0.55. Visits one and four, 0.42. Visits one and five, 0.36. Visits one and six, 0.31. So a clear decay with time gap. You fit AR-1 and you get rho around 0.73. Predicted correlations would be 0.73, 0.53, 0.39, 0.28, 0.21. Reasonable fit, though it decays a little faster than the data at the longest lags. AIC, say, 2,341. Then you fit compound symmetry. It forces all correlations to be equal at the average, around 0.52, and the AIC jumps to 2,398, much worse, because compound symmetry is misspecifying the structure. Then you fit Toeplitz. AIC of 2,338, a slight improvement over AR-1 but at the cost of four extra parameters. A three-point AIC gap is small, and parsimony favors AR-1, so you'd report AR-1.
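
The numbers in this example can be checked with a few lines. The sketch below takes the quoted empirical correlations, rho, and AIC values as given, and only recomputes the AR-1 predictions and the AIC gaps. The variable names are illustrative.

```python
# Empirical correlations with visit one at lags 1 through 5, as quoted.
empirical = [0.72, 0.55, 0.42, 0.36, 0.31]
rho = 0.73

# AR-1 prediction at lag k is rho ** k.
predicted = [round(rho ** lag, 2) for lag in range(1, 6)]
gaps = [round(e - p, 2) for e, p in zip(empirical, predicted)]
print("AR-1 predicted:", predicted)
print("empirical minus predicted:", gaps)

# AIC values quoted in the example; a lower value is preferred.
aic = {"AR-1": 2341, "compound symmetry": 2398, "Toeplitz": 2338}
best = min(aic.values())
delta = {name: value - best for name, value in aic.items()}
print("AIC difference from the best model:", delta)
```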

**Sarah:** And there's an interesting subtlety about combining random effects with covariance structures. Some combinations are redundant.

**Kiffer:** Right. Random intercepts plus compound symmetry errors are redundant. The random intercept already produces compound symmetry. You can't separately identify both. Random intercepts plus AR-1 errors are useful. Together they produce a structure where there's a baseline correlation that doesn't decay to zero, plus an additional autocorrelation that does decay. That captures more realistic biology than either alone. Unstructured errors plus random effects are pointless. The unstructured covariance already captures everything.
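
A sketch of the combined structure, under assumed variance components: with a random intercept of variance `tau00` and AR-1 residuals with variance `sigma2` and parameter `rho`, the correlation at a given lag is `(tau00 + sigma2 * rho ** lag) / (tau00 + sigma2)`. The names and numbers are hypothetical.

```python
def ri_plus_ar1_corr(lag, tau00, sigma2, rho):
    """Within-person correlation at a given lag: random intercept + AR-1 errors."""
    return (tau00 + sigma2 * rho ** lag) / (tau00 + sigma2)

tau00, sigma2, rho = 100.0, 100.0, 0.7
floor = tau00 / (tau00 + sigma2)  # the part that never decays

for lag in (1, 2, 5, 20):
    print(lag, round(ri_plus_ar1_corr(lag, tau00, sigma2, rho), 3))
print("floor:", floor)
```

The correlation decays with lag but levels off at the floor set by the random intercept, rather than heading to zero as it would under AR-1 alone.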

**Sarah:** And so-called covariance pattern models drop the random effects entirely and rely just on the structured covariance of the errors?

**Kiffer:** Exactly. Covariance pattern models are repeated measures models that have no random effects. All the within-person correlation is captured through the residual covariance structure. They're a legitimate alternative to mixed models with random effects, particularly when the focus is on the population-average trajectory rather than individual variation.

**Sarah:** Now Section 3. Special considerations. Time-varying covariates first.

**Kiffer:** Time-varying covariates are predictors that change over time within a person. Body mass index, BMI, measured at every visit. Smoking status that can switch. Medication doses that get adjusted. Income that fluctuates. Daily mood ratings. All of these are within-person time-varying.

**Sarah:** Why do they need special handling?

**Kiffer:** Because they conflate two different effects, and a naive mixed model will give you their weighted average instead of either of them separately.

**Sarah:** Walk us through what those two effects are.

**Kiffer:** Suppose your outcome is systolic blood pressure measured at four visits per person, and your time-varying covariate is BMI, also measured at four visits per person. There are two distinct things you could be asking about.

**Sarah:** First?

**Kiffer:** First, the within-person effect. When this person's BMI goes up between visits, what happens to their blood pressure? That's a within-person association. It's about temporal change inside one body.

**Sarah:** Second?

**Kiffer:** Second, the between-person effect. Comparing people whose average BMI across all visits is higher to people whose average BMI is lower, what's the difference in average blood pressure? That's a between-person association. It's about contrasts across bodies.

**Sarah:** And these can absolutely differ.

**Kiffer:** They can differ in magnitude, and they can even differ in sign. Classic example. Suppose people with higher average BMI tend to be in worse cardiovascular shape generally, including higher blood pressure. So the between-person effect is positive, more BMI, more blood pressure across people. But within a single person, suppose they start a weight-loss medication that drops their BMI, and the medication also produces a slight blood pressure rise as a side effect. So within a person, when BMI goes down, blood pressure goes up. The within-person effect is negative. Between-person says higher BMI, higher pressure. Within-person says higher BMI, lower pressure. Same data, opposite signs.

**Sarah:** Yikes. And a standard mixed model with BMI as a single predictor?

**Kiffer:** Will give you a coefficient that's a weighted average of the two effects. It's not interpretable as either. You don't know whether you're estimating the within-person effect, the between-person effect, or some weird hybrid.

**Sarah:** What's the remedy?

**Kiffer:** Decompose the time-varying covariate into two pieces. For each person, compute their average BMI across all their visits. That's the person-mean. Then compute the deviation from that person-mean at each visit. That's the within-person component. Now you put both into the model. The coefficient on the person-mean is the between-person effect. The coefficient on the deviation is the within-person effect. Two separate coefficients, two separate scientific questions.

**Sarah:** And worked numerically?

**Kiffer:** Take a person whose BMI at the four visits is 28, 29, 30, 29. Their person-mean is 29. Their deviations are minus one, zero, plus one, zero. Across the whole sample, the person-mean varies between people, and that's the between-person variation. Within each person, the deviations vary across visits, and that's the within-person variation. The model now estimates both effects cleanly.
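
The decomposition itself is a couple of lines. In this sketch the first person's BMI values come from the example above, the second person is made up, and the identifiers are hypothetical.

```python
def person_mean_center(values):
    """Split one person's repeated values into (person-mean, per-visit deviations)."""
    mean = sum(values) / len(values)
    return mean, [v - mean for v in values]

bmi_by_person = {
    "A": [28, 29, 30, 29],  # the person from the worked example
    "B": [24, 24, 23, 25],  # a made-up second person
}

decomposed = {pid: person_mean_center(vals) for pid, vals in bmi_by_person.items()}
for pid, (mean, deviations) in decomposed.items():
    print(pid, "person-mean:", mean, "deviations:", deviations)
```

Both pieces would then enter the model as separate predictors: the person-mean repeated on every row for that person, and the deviation varying by visit.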

**Sarah:** This decomposition has different names in different fields.

**Kiffer:** Yes. In econometrics it's sometimes called the Mundlak decomposition or person-mean centering. In psychology it's sometimes called within-between centering or person-centering. Same idea. The point is that any time-varying covariate carries two distinct kinds of information, and a thoughtful analysis separates them.

**Sarah:** And this connects back to one of the recurring themes in this course.

**Kiffer:** It connects to the very first lesson of this material, where we talked about the difference between population-average effects and individual-level effects. Between-person and within-person. Marginal and conditional. These distinctions show up everywhere. Generalized estimating equations versus mixed models from Lesson 11. Population-attributable fractions versus individual risks. Public health policy versus clinical decision-making. The repeated-measures decomposition is one more place where the same distinction matters.

**Sarah:** Now let's talk about missing data and dropout. This is where longitudinal studies live or die.

**Kiffer:** Missing data is endemic in longitudinal studies. People miss a visit because they're sick. They drop out because they got better and don't see the point of follow-up. Or they got worse and can't make it. Or they moved. Or they died. The longer the follow-up, the more dropout you'll have. In a five-year study, you might lose forty percent of your sample by the end.

**Sarah:** And the question is whether that's a problem for inference.

**Kiffer:** Whether it's a problem depends on the missingness mechanism. We covered this earlier, when we talked about sampling and selection. Same framework applies to longitudinal dropout, just translated.

**Sarah:** Walk us through the three mechanisms in this context.

**Kiffer:** Missing completely at random, MCAR. The probability of missing a visit is unrelated to anything, observed or unobserved. Like a clerical error losing a chart. Rare in longitudinal data. If true, complete-case analysis is unbiased.

**Sarah:** Missing at random, MAR?

**Kiffer:** Missing at random says the probability of missing a visit can depend on observed data, including past values of the outcome and the predictors, but not on unobserved data, including the missing value itself. So if people with higher blood pressure at visit two are more likely to miss visit three, that's MAR, as long as visit two is observed. MAR is the assumption that justifies maximum-likelihood-based mixed models. They handle MAR data without bias, provided the model itself is correctly specified, because the likelihood uses all the available observations from every person, including those who later drop out.
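
A small simulation can illustrate the logic. This is a simplified stand-in, not a mixed model: it uses two visits, makes dropout depend only on the observed first visit (so the mechanism is MAR), and compares a complete-case mean with an estimate that conditions on the observed earlier value. All numbers, names, and the dropout rule are assumptions chosen for illustration.

```python
import random

rng = random.Random(12)  # fixed seed so the sketch is reproducible
n = 20000

# Simulate two visits; the visit-2 value depends on the visit-1 value.
visit1 = [rng.gauss(130, 15) for _ in range(n)]
visit2 = [130 + 0.7 * (y1 - 130) + rng.gauss(0, 10) for y1 in visit1]

# MAR dropout: anyone whose OBSERVED visit-1 value exceeds 140 misses visit 2.
observed = [y1 <= 140 for y1 in visit1]

true_mean = sum(visit2) / n  # known only because this is a simulation
completers = [(y1, y2) for y1, y2, keep in zip(visit1, visit2, observed) if keep]
complete_case_mean = sum(y2 for _, y2 in completers) / len(completers)

# Regress visit 2 on visit 1 among completers, then average the predictions
# over everyone's observed visit-1 value, dropouts included.
m1 = sum(y1 for y1, _ in completers) / len(completers)
m2 = complete_case_mean
slope = (sum((y1 - m1) * (y2 - m2) for y1, y2 in completers)
         / sum((y1 - m1) ** 2 for y1, _ in completers))
intercept = m2 - slope * m1
model_based_mean = sum(intercept + slope * y1 for y1 in visit1) / n

print("true visit-2 mean:       ", round(true_mean, 1))
print("complete-case mean:      ", round(complete_case_mean, 1))
print("conditioning on visit 1: ", round(model_based_mean, 1))
```

The complete-case mean is pulled downward because the people with high readings left, while the estimate that uses everyone's observed first visit recovers something close to the full-sample mean.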

**Sarah:** And missing not at random, MNAR?

**Kiffer:** Missing not at random says missingness depends on the unobserved value itself, even after conditioning on observed data. Classic example. People drop out because they're feeling so bad that they can't make it to the clinic, and feeling bad reflects an unmeasured aspect of their health. MNAR can produce serious bias and isn't fixed by maximum likelihood. The remedy is sensitivity analyses. Pattern-mixture models. Selection models. You essentially try multiple plausible MNAR scenarios and see how much your conclusions move.

**Sarah:** And the practical guidance?

**Kiffer:** First, design to minimize dropout. Make follow-up convenient. Build relationships with participants. Use short instruments where you can. Second, when dropout happens, document its pattern. Why are people leaving? When in the study? Compare characteristics of completers and non-completers. Third, fit a model that handles MAR by default, which mixed models do. Fourth, do at least one sensitivity analysis under an MNAR scenario to see whether your headline result is robust. If it isn't, you have to discuss that openly.

**Sarah:** And generalized estimating equations, GEE, has a stricter assumption?

**Kiffer:** GEE assumes missing completely at random by default. That's a stronger assumption than MAR. So if you have a lot of dropout that's plausibly related to health status, mixed models are usually the safer choice. There are weighted GEE methods that relax the MCAR assumption, but they're more advanced and require knowing or estimating the dropout probability.

**Sarah:** Let's spend a few minutes on the brief preview of dynamic models. Cross-lagged panel and latent change score.

**Kiffer:** Dynamic models are about reciprocal relationships across time. The mixed models we've been talking about model trajectories of a single outcome with time-varying predictors. Dynamic models go further. They model how a predictor at one time affects the outcome at a later time, while simultaneously asking whether the outcome at one time affects the predictor at a later time.

**Sarah:** Walk us through cross-lagged panel.

**Kiffer:** Suppose you measure depression and physical activity at four visits. A cross-lagged panel model regresses depression at visit two on depression at visit one, physical activity at visit one, and so on across all visits. It also regresses physical activity at visit two on physical activity at visit one, depression at visit one, and so on. The cross-lagged paths, depression at visit one predicting activity at visit two and activity at visit one predicting depression at visit two, let you ask which direction the influence runs more strongly. Does depression drive activity? Does activity drive depression? Or both?
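
In equation form, one common way to write the two regressions at each wave is shown below. The symbols are assumptions for illustration: $D_t$ is depression and $A_t$ is physical activity at visit $t$, the $\phi$ terms are the autoregressive paths, and the $\gamma$ terms are the cross-lagged paths.

```latex
\begin{aligned}
D_t &= \alpha_D + \phi_D\, D_{t-1} + \gamma_{A \to D}\, A_{t-1} + \varepsilon_{D,t} \\
A_t &= \alpha_A + \phi_A\, A_{t-1} + \gamma_{D \to A}\, D_{t-1} + \varepsilon_{A,t}
\end{aligned}
\qquad t = 2, 3, 4
```

Comparing $\gamma_{A \to D}$ with $\gamma_{D \to A}$ is what speaks to the question of which direction the influence runs more strongly.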

**Sarah:** And latent change score?

**Kiffer:** Latent change score models reformulate the analysis around the change between consecutive time points. Instead of modeling levels at each visit, you model the latent change from visit one to visit two, from visit two to visit three, and so on. And then you can ask whether predictors influence the change, or whether the level at one visit predicts the change to the next. It's a different angle on similar questions, and it forces you to think carefully about temporal sequence.

**Sarah:** These both go deeper in advanced longitudinal modeling courses?

**Kiffer:** They do. Dynamic structural equation modeling is a graduate-level topic. The reason to mention them here is just so students know the names and know what they're for. If your research question is about reciprocal influence across time, you have options beyond a simple mixed model.

**Sarah:** One more thing worth flagging before we wrap. Convergence.

**Kiffer:** Yeah. Repeated measures models with random intercepts, random slopes, and structured residual covariance can be hard to fit. Likelihood surfaces are not always well-behaved. You'll see warnings about singular fits, models failing to converge, or correlation parameters pinned at their boundaries. When that happens, the usual diagnosis is overparameterization. You're asking the data to identify more covariance parameters than it can support.

**Sarah:** And the remedies?

**Kiffer:** Simplify. Drop the random slope if it's barely identified. Switch from unstructured to AR-1. Centering the time variable at its mean often helps numerically. And read the warnings carefully. A boundary fit on the intercept-slope correlation, where it's pinned at minus one or plus one, often means the data don't actually support estimating that correlation, and you should fix it at zero. Convergence isn't usually a software problem. It's the model telling you that you're asking too much.

**Sarah:** Let's pull the takeaways together. There are four main ones I want students to leave with.

**Kiffer:** Go ahead.

**Sarah:** First takeaway. Repeated measures are clustered observations within persons, but with a temporal structure that generic clustering doesn't have. The cluster is the person. The observations within the cluster are the time points. The within-person observations are correlated, and that correlation has a pattern that depends on time gap. Mixed models with random intercepts and random slopes for time, also called growth curve models, are the standard tool. Random intercepts let baseline levels vary across people. Random slopes let rates of change vary. Together, they capture both individual differences in starting levels and individual differences in trajectories.

**Kiffer:** And the trajectory itself can be parameterized many ways. Linear in time. Quadratic. Splines. Piecewise linear with known change points. Choose the parameterization that matches the underlying biology and your scientific question.

**Sarah:** Second takeaway. Covariance structures matter. The within-person correlation pattern usually doesn't fit compound symmetry, the assumption baked into a random-intercept-only model. AR-1, autoregressive of order one, is the natural default for evenly spaced biological measurements, where correlations decay exponentially with time gap. Toeplitz allows lag-specific correlations without forcing geometric decay. Unstructured leaves all correlations free and is the most flexible but most data-hungry. Compound symmetry is rarely realistic. Independence ignores clustering and is almost always wrong.

**Kiffer:** Choose deliberately. Look at the empirical correlation matrix first. Fit candidate models. Compare with AIC for non-nested structures and likelihood ratio tests for nested ones. And remember that random effects and residual correlation structures interact. Random intercepts plus compound symmetry are redundant. Random intercepts plus AR-1 are complementary. Unstructured plus random effects is pointless.

**Sarah:** Third takeaway. Time-varying covariates can and should be decomposed into within-person and between-person components. The within-person effect, captured by deviations from a person's own mean, tells you how within-person change in the predictor relates to the outcome. The between-person effect, captured by the person-mean, tells you how people who differ on the average level of the predictor differ on the outcome. These can have different magnitudes and even different signs. A standard mixed model that uses the raw time-varying covariate conflates them and gives you an uninterpretable hybrid.

**Kiffer:** And this matters scientifically because the within-person effect is closer to a causal effect of within-person change. The between-person effect is more vulnerable to confounding by stable individual differences. Distinguishing them is essential for interpreting longitudinal data correctly.

**Sarah:** Fourth takeaway. Missing data and dropout are endemic in longitudinal studies. The mechanism matters. Missing completely at random is the strictest assumption and rarely realistic. Missing at random allows missingness to depend on observed data and is the assumption that justifies maximum-likelihood mixed models. Missing not at random allows missingness to depend on the unobserved value itself, and requires sensitivity analyses, not a clean fix. Likelihood-based mixed models give valid estimates under MAR without extra steps, as long as the model itself is correctly specified. Standard GEE assumes MCAR unless you add something like inverse probability weighting or multiple imputation. Document the dropout pattern, fit a model that respects the likely mechanism, and run at least one sensitivity analysis.
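
As one way to start on "document the dropout pattern," here is a minimal sketch that tabulates each participant's last observed visit and flags non-monotone missingness. The data layout (a dict mapping a participant ID to a list of outcomes, with `None` for a missed visit) and the function names are assumptions made for illustration.

```python
from collections import Counter


def last_observed_visit(outcomes):
    """Return the 1-based index of the last non-missing visit, or 0 if none."""
    last = 0
    for visit, value in enumerate(outcomes, start=1):
        if value is not None:
            last = visit
    return last


def is_monotone(outcomes):
    """True if, once a visit is missing, every later visit is missing too."""
    seen_missing = False
    for value in outcomes:
        if value is None:
            seen_missing = True
        elif seen_missing:
            return False
    return True


def dropout_table(data):
    """Count participants by their last observed visit."""
    return Counter(last_observed_visit(values) for values in data.values())


if __name__ == "__main__":
    # Made-up outcomes for four participants across four visits.
    example = {
        "p01": [120.0, 118.0, 117.0, 115.0],  # completer
        "p02": [135.0, 133.0, None, None],    # dropped out after visit 2
        "p03": [128.0, None, 126.0, 125.0],   # intermittent missingness
        "p04": [140.0, None, None, None],     # dropped out after visit 1
    }
    for visit, n in sorted(dropout_table(example).items()):
        print(f"last observed visit {visit}: {n} participant(s)")
    intermittent = [pid for pid, values in example.items()
                    if not is_monotone(values)]
    print("non-monotone missingness:", intermittent)
```

A table like this does not identify the missingness mechanism, but it shows how much dropout there is and when it happens, which is the starting point for choosing a sensitivity analysis.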

**Kiffer:** And keep in your back pocket the dynamic models. Cross-lagged panel for asking which direction the influence runs across time. Latent change score for modeling change between consecutive visits explicitly. Both are extensions of the longitudinal modeling toolkit and worth knowing exist.

**Sarah:** And one synthesizing point I want to make. The repeated measures lesson sits at the intersection of everything that came before in this course.

**Kiffer:** Yeah, walk us through that.

**Sarah:** Lesson 9 introduced clustering and the intraclass correlation coefficient. Lesson 10 built linear mixed models for continuous outcomes. Lesson 11 extended to discrete outcomes through generalized linear mixed models and to the marginal alternative through GEE. Each of those lessons treated the observations within a cluster as exchangeable. Lesson 12 takes the same machinery and adds the temporal structure that makes repeated measures their own special case.

**Kiffer:** And that temporal structure is what lets you ask longitudinal questions. Does this person change over time? Does the rate of change differ between groups? Does a within-person change in BMI predict a within-person change in blood pressure? Those are the questions repeated measures are uniquely suited to answer, and they're often the questions that drive both clinical research and public health surveillance.

**Sarah:** And one practical note before we close.

**Kiffer:** Yeah?

**Sarah:** Don't skip the activity in the lesson with the wellness trial dataset. Two hundred patients, four visits, both a continuous outcome and a binary outcome. Fit the random-intercept model, then add the AR-1 correlation structure, compare correlation structures with AIC, and fit the binary outcome with both a generalized linear mixed model and GEE. Watching how the coefficients and standard errors change across specifications is the best way to internalize what these structures actually do.

**Kiffer:** And if there's one piece of code from the activity to spend extra time on, it's the AIC comparison. You fit the model with AR-1 errors, you update it with compound symmetry, you update it with unstructured. Then you compare. The output is a table of AIC values that tells you which structure best balances fit and parsimony. That's where the abstract concept of choosing a covariance structure becomes a concrete model selection decision.
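
The comparison step itself is simple arithmetic once the models are fitted. The sketch below is not the activity's code. It assumes you already have a maximized log-likelihood and a parameter count for each candidate structure, and the numbers in the example are made-up placeholders, not results from the wellness trial data.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 * logLik (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def rank_by_aic(candidates):
    """Rank candidate models by AIC.

    candidates maps a structure name to (log_likelihood, n_params).
    Returns a list of (name, aic, delta_from_best), best model first.
    """
    scored = [(name, aic(loglik, k)) for name, (loglik, k) in candidates.items()]
    scored.sort(key=lambda item: item[1])
    best = scored[0][1]
    return [(name, value, value - best) for name, value in scored]


if __name__ == "__main__":
    # Placeholder log-likelihoods and parameter counts, for illustration only.
    hypothetical_fits = {
        "compound_symmetry": (-1250.0, 6),
        "ar1": (-1240.0, 6),
        "unstructured": (-1236.0, 14),
    }
    print(f"{'structure':<20}{'AIC':>10}{'delta':>10}")
    for name, value, delta in rank_by_aic(hypothetical_fits):
        print(f"{name:<20}{value:>10.1f}{delta:>10.1f}")
```

In this made-up example the unstructured model has the highest log-likelihood but pays for its extra parameters, which is the trade-off between fit and parsimony that the AIC table is meant to expose. If the models are fitted by REML, the comparison is only meaningful when every candidate shares the same fixed effects.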

**Sarah:** Anything else for students before we wrap?

**Kiffer:** Just the bigger picture. We've now built the full clustered and longitudinal analysis toolkit. Random effects for clustering. Generalized linear mixed models for discrete outcomes. Generalized estimating equations for marginal effects. Repeated measures with explicit covariance structures. Time-varying covariate decomposition. Missing data handling. That's the modern epidemiologic and biostatistical analyst's toolkit. With it, you can handle the vast majority of correlated-data scenarios you'll encounter in practice.

**Sarah:** And this isn't just the end of Lesson 12. This is the last lesson of the course, and the last lesson of the whole three-course sequence. So I want to spend a minute zooming out.

**Kiffer:** Please do. We've been at this for a long stretch with these students.

**Sarah:** Let's start with the arc of this course itself. Twelve lessons. Lesson 1 set up the structured analytic workflow, the discipline of moving from a question to a model to a defensible conclusion. Lessons 2 through 4 built the regression toolbox. Linear regression, then logistic, then survival and time-to-event. Lessons 5 through 8 worked through diagnostics, model building, and the harder cases like missing data and effect modification. And Lessons 9 through 12 tackled dependence head-on. Clustering, mixed models, generalized estimating equations, and finally repeated measures over time.

**Kiffer:** A structured workflow, a regression toolbox, and a way to handle dependence. That's the spine of this course.

**Sarah:** And then the larger arc of the series. The first course was about reading evidence. Learning to recognize what counts as a good study, what counts as bias, how to appraise a paper without being either too credulous or too cynical. The second course was about designing studies and surveillance systems. Sampling, measurement, screening, surveillance, causal frameworks. The producer's side of the table. And this third course was about analyzing the data those studies produce. Regression, survival, dependence, longitudinal change. So read, design, analyze. Three different vantage points on the same enterprise.

**Kiffer:** And the through-line in all three is the same one we've been pushing all year. Public health is about making defensible decisions under uncertainty. Every tool we've covered, from confidence intervals to causal diagrams to mixed models, is in service of that. You're learning to be honest about what the data can and cannot tell you, and to communicate that honestly to people who have to act on it.

**Sarah:** And to the students listening, thank you. Honestly. Showing up week after week, working through the readings, doing the activities, struggling through the harder lessons. None of this is easy material, and you've stuck with it.

**Kiffer:** You have. And whatever you do next, whether it's more graduate training, public health practice, clinical work, research, policy, you now have a vocabulary and a toolkit that lets you engage with quantitative evidence on your own terms. That's the goal. Not to make you statisticians, but to make you analysts and critics who can hold their own in a room full of experts.

**Sarah:** All right. Last sign-off of the series.

**Kiffer:** Take care, everyone. Be kind to your data and to each other.

**Sarah:** See you out there.