# Lesson 10 — Design-Specific & Temporal Biases (v3 expanded)

*Companion-podcast transcript • Sarah & Kiffer* 
*~5308 words • ~28.7 min audio*

---

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today we're working through Lesson 10, Design-Specific and Temporal Biases. This is the lesson where the bias material from Lessons 7 through 9 gets pushed into specific designs, and where time itself becomes the protagonist.

**Sarah:** Let me set the table. Lessons 7 through 9 catalogued the three classical sources of bias one at a time. Causal-specification bias from Lesson 7. Selection bias from Lesson 8. Information bias from Lesson 9.

**Kiffer:** Right. And Lesson 10 is not introducing a fourth category. It's a tour of biases that are specific to particular study designs, or that arise from how time itself is handled in the analysis. The biases here usually combine elements from the three classical categories in characteristic ways.

**Sarah:** So we're not learning a new alphabet. We're learning where the letters from the previous lessons combine into specific words that live in specific kinds of studies.

**Kiffer:** That's a nice way to put it. Three sections. Section 1 is about randomized trial biases, the ones that survive even when you randomize. Section 2 is about time-related biases that haunt observational pharmacoepidemiology and screening evaluation. Section 3 is about time-window bias and the famous age-period-cohort identification problem.

**Sarah:** Let's start with Section 1. Randomized trial biases. And I want to flag the surprise up front, because students often miss it. Even randomized controlled trials, often described as the gold standard of causal inference, are not immune to bias.

**Kiffer:** And this is genuinely surprising the first time you hear it. Because the entire pitch for randomized controlled trials, often shortened to RCTs, is that randomization handles confounding for you. You assign people to treatment or control by random chance, and on average, every other characteristic, measured and unmeasured, balances out between the groups.

**Sarah:** Quick definition for anyone new. A randomized controlled trial is a study where the investigator decides, by some random process, who receives the intervention and who receives a comparison treatment or placebo. The randomness is the magic. It's what makes the treated group and the control group exchangeable in expectation.

**Kiffer:** Right. And the key insight of this section is that randomization addresses confounding by balancing prognostic factors at baseline. But biases can still arise after randomization.

**Sarah:** Wait, slow down. What does after randomization mean here? If randomization happens at the start, what's the threat?

**Kiffer:** Great question. The randomization assignment is one moment in time. Everything that happens after that moment, who actually gets enrolled, who actually swallows the pills, who measures the outcome, who reports symptoms, all of that is downstream of the random draw. And every step downstream is an opportunity for bias to sneak back in.

**Sarah:** So the randomization is doing real work, but it's not a force field that protects everything that follows.

**Kiffer:** Exactly. Let's start with allocation concealment. Quick definition. Allocation concealment refers to procedures that prevent the people enrolling participants from knowing the upcoming treatment assignments.

**Sarah:** And to make this concrete, who are the people enrolling participants? Usually a clinician, a research nurse, a recruiter at a clinic. They're the ones meeting potential participants and deciding whether each person is eligible to join the trial.

**Kiffer:** Right. And the concern is that if those recruiters know what the next assignment is going to be, they can subvert randomization. Subtly or not.

**Sarah:** Walk me through how that subversion would work.

**Kiffer:** Imagine the recruiter sees the next slot is treatment. They might think, this patient is really sick, I want them to get the treatment, let me make sure they qualify. Or, the next slot is control, this patient is fragile, I'll wait and enroll them when the next treatment slot comes up.

**Sarah:** So even with the best of intentions, the recruiter can selectively channel sicker patients into the treatment arm if they expect the treatment to help. Or they might exclude patients they don't think will tolerate the protocol. The result is that the groups, which were supposed to be balanced by chance, end up systematically different.

**Kiffer:** Right. Selection bias creeping in after the random number was drawn. Adequate methods of allocation concealment block that channel. First, central telephone randomization. The recruiter calls a central office, gives the patient's identifier, and only then receives the assignment. The recruiter never sees a list.

**Sarah:** Second, sequentially numbered sealed opaque envelopes. The recruiter opens envelope number 47 only after the patient has consented and been enrolled. Sealed and opaque because if you can hold the envelope up to a light or peek through the flap, the concealment is broken.

**Kiffer:** Third, pharmacy-controlled allocation. The pharmacy holds the randomization list. The recruiter dispenses a coded bottle that even they can't trace back to a specific arm.

**Sarah:** And inadequate methods include open random number tables that anyone can see, unsealed envelopes, and alternation by day of the week or by bed number. Anything where the recruiter can predict what's coming next.
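
To make the subversion mechanism concrete, here is a minimal Python sketch. It is a toy model, not anything from the lesson: the function name, the severity scale, and the recruiter's rule (pick the sicker of two waiting candidates when the visible slot is treatment, the healthier when it is control) are all illustrative assumptions.

```python
import random

def mean_severity_by_arm(n_slots, concealed, seed=0):
    """Mean baseline severity (0 = healthiest, 1 = sickest) in each arm."""
    rng = random.Random(seed)
    severity = {"treatment": [], "control": []}
    for _ in range(n_slots):
        arm = rng.choice(["treatment", "control"])  # the random draw itself is fair
        first, second = rng.random(), rng.random()  # two eligible candidates waiting
        if concealed:
            enrolled = first  # recruiter cannot see the arm, so takes whoever is next
        elif arm == "treatment":
            enrolled = max(first, second)  # steers the sicker candidate to treatment
        else:
            enrolled = min(first, second)  # steers the healthier candidate to control
        severity[arm].append(enrolled)
    return {arm: sum(values) / len(values) for arm, values in severity.items()}

if __name__ == "__main__":
    print("concealed:  ", mean_severity_by_arm(20_000, concealed=True))
    print("unconcealed:", mean_severity_by_arm(20_000, concealed=False))
```

Under these assumptions the concealed arms both average about 0.5, while the unconcealed arms drift toward roughly two thirds and one third, even though the assignment sequence was random in both runs.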

**Kiffer:** Now here's the empirical evidence that this matters. Schulz and colleagues in 1995 analyzed 250 controlled trials drawn from 33 meta-analyses. Kenneth Schulz is an American methodologist who worked at the United States Centers for Disease Control and later helped design the Consolidated Standards of Reporting Trials, or CONSORT, guidelines, which are the standard reporting framework for randomized trials.

**Sarah:** And what did Schulz and colleagues find?

**Kiffer:** They found that trials with inadequate allocation concealment yielded odds ratios that were exaggerated by 30 to 40 percent on average compared to adequately concealed trials. So the concealment failure, on its own, inflates the apparent effect by roughly a third or more.

**Sarah:** That's a huge finding. It means a substantial chunk of the effect estimates in the medical literature, in trials with poor concealment, is bias rather than signal.

**Kiffer:** Then Wood and colleagues in 2008 extended the analysis across 146 meta-analyses. They confirmed Schulz's finding and added a refinement. Lack of blinding inflated effect estimates for subjective outcomes by 15 to 25 percent. Both flaws combined could inflate effect estimates by up to 50 percent.

**Sarah:** Okay, that's a good segue to blinding. Let's define it carefully. Blinding, sometimes called masking, is the procedure that prevents participants, clinicians, or outcome assessors from knowing which group a participant was assigned to.

**Kiffer:** And blinding has levels. Single-blind means one party doesn't know, usually the participant. Double-blind means two parties don't know, usually the participant and the treating clinician. Triple-blind means three parties don't know, adding the outcome assessor or the data analyst to the list.

**Sarah:** And each level of blinding prevents a specific kind of bias. Walk us through them.

**Kiffer:** First, performance bias. Clinicians who know the assignment may provide different co-interventions to the two arms. Extra attention. More frequent monitoring. Other treatments offered alongside. Anything that creates a systematic difference in care between the arms beyond the intervention being tested.

**Sarah:** Second, reporting bias. Participants who know what they were assigned to may report outcomes differently. If you think you're on the active drug, you might rate your pain as lower because you expect improvement. If you think you're on placebo, you might rate it higher because you're disappointed.

**Kiffer:** Third, detection bias. Outcome assessors who know the assignment may interpret ambiguous outcomes differently. If a radiologist reading a scan knows the patient was on the active drug, they may unconsciously read the scan as showing more improvement.

**Sarah:** And blinding matters most for subjective outcomes. Pain, functional status, quality of life. Outcomes that involve interpretation or self-report. For objective outcomes like all-cause mortality, lack of blinding matters less. Death is hard to fudge.

**Kiffer:** Right. Then placebo effects. Quick definition. The placebo effect refers to measurable improvements in participants who receive an inert treatment, driven by expectation, conditioning, and the therapeutic context.

**Sarah:** And the placebo effect is particularly pronounced in trials of pain and depression. Why those two specifically?

**Kiffer:** Because both are subjective experiences mediated by the brain. Pain perception is shaped by attention, expectation, and meaning. Depression involves mood, motivation, and self-perception, all of which are sensitive to context. So the act of being cared for, taking a pill, believing it might help, can produce real improvements that show up on rating scales.

**Sarah:** And the lesson also covers the nocebo effect. The mirror image. Adverse effects from inert treatments driven by negative expectation. People who think they might be getting an active drug report side effects even on placebo, because they're scanning for them.

**Kiffer:** Then the Hawthorne effect. Quick definition. The Hawthorne effect describes the phenomenon in which participants modify their behavior simply because they know they are being observed or studied.

**Sarah:** And there's a great backstory here. Where does the name come from?

**Kiffer:** It's named after the Western Electric Hawthorne Works, a factory in Cicero, Illinois, just outside Chicago. In the 1920s and 1930s, researchers ran a series of experiments there to test how various workplace conditions affected productivity. They changed lighting levels, break schedules, work hours, all sorts of things.

**Sarah:** And what they found was unexpected.

**Kiffer:** Productivity went up regardless of which change was introduced. Brighter lights, productivity up. Dimmer lights, productivity up. Longer breaks, productivity up. Shorter breaks, productivity up. The takeaway was that the workers were responding not to the specific intervention but to the fact that they were being studied. Being observed changed their behavior.

**Sarah:** And the modern textbook example is from Srigley and colleagues in 2014. They measured hand hygiene compliance among healthcare workers using two methods. Direct observation, where workers knew they were being watched. And electronic monitoring, which was covert.

**Kiffer:** And the contrast was dramatic. Compliance was around 70 percent or higher when workers knew they were being observed. About 25 percent with covert electronic monitoring. So the observed rate was nearly three times the unobserved rate.

**Sarah:** Which has serious implications for infection control studies. If the Hawthorne effect inflates compliance in all study arms, the true baseline behavior is obscured, and interventions may appear less effective than they would be in unmonitored real-world settings.

**Kiffer:** Okay. Then compliance and adherence bias. Sometimes called the healthy adherer effect.

**Sarah:** And the textbook case is the Coronary Drug Project. Tell us what that was.

**Kiffer:** The Coronary Drug Project was a large randomized trial run in the United States in the 1960s and 70s. It tested several drugs against placebo for preventing death after a heart attack. Standard randomized design. The intent was to figure out whether any of the drugs reduced mortality.

**Sarah:** And what did the trial reveal?

**Kiffer:** Among placebo-assigned participants, those who adhered to the placebo regimen had five-year mortality of about 15 percent, compared with about 28 percent among placebo non-adherent participants. Adherent placebo takers lived longer than non-adherent placebo takers.

**Sarah:** Wait, that's wild. The placebo is inert. It can't possibly be doing anything biological. So how does adhering to a sugar pill reduce mortality?

**Kiffer:** It doesn't. The placebo isn't doing anything. The survival difference reflects that adherence itself is a marker for overall health behavior. People who reliably take their pills also tend to exercise more, eat better, attend follow-up appointments, manage stress better, take other medications consistently. Adherence is a proxy for a constellation of health-promoting behaviors.

**Sarah:** And Simpson and colleagues did a meta-analysis in 2006 that confirmed this across many trials. Good adherence to placebo was associated with lower mortality, with a pooled odds ratio of 0.56. So adherent placebo takers had roughly half the odds of death of non-adherent placebo takers, across many studies pooled together.

**Kiffer:** Which is why intention-to-treat analysis, often shortened to ITT, is preferred over per-protocol analysis.

**Sarah:** Define those carefully because they're easy to mix up.

**Kiffer:** Intention-to-treat analysis keeps every randomized participant in their originally assigned group, regardless of whether they actually took the treatment. So if you were randomized to drug but never took it, in intention-to-treat you're still analyzed as part of the drug group.

**Sarah:** And per-protocol analysis only includes participants who actually completed the assigned treatment as planned. So non-adherent participants get dropped.

**Kiffer:** Per-protocol sounds reasonable. You want to know the effect of actually taking the drug, right? But it conflates drug effects with the healthy adherer effect. The drug looks better than it really is, because the people who actually took it are the ones who would have done better anyway.

**Sarah:** Whereas intention-to-treat preserves the comparison randomization created, and gives you an estimate of the effect of being assigned to the drug, in a real-world setting where some people don't adhere. That's usually closer to the question you actually care about.
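
A small simulation can show why the two analyses diverge. The sketch below is illustrative only, and every number in it is a made-up assumption: a latent "healthy" trait raises adherence and lowers mortality, the drug has zero true effect, and the control arm is assumed to be usual care with no pill to adhere to, so the per-protocol analysis drops non-adherers from the drug arm only.

```python
import random

def simulate_trial(n=100_000, seed=0):
    """Risk differences (drug minus control) for a drug with zero true effect."""
    rng = random.Random(seed)
    # counts[key] = [deaths, participants]
    counts = {"drug_all": [0, 0], "drug_adherent": [0, 0], "control_all": [0, 0]}
    for _ in range(n):
        healthy = rng.random() < 0.5                        # latent health behavior
        adheres = rng.random() < (0.9 if healthy else 0.4)  # healthy people adhere more
        died = rng.random() < (0.10 if healthy else 0.30)   # and die less; drug plays no role
        assigned_drug = rng.random() < 0.5                  # randomization
        if assigned_drug:
            keys = ["drug_all"] + (["drug_adherent"] if adheres else [])
        else:
            keys = ["control_all"]
        for key in keys:
            counts[key][0] += died
            counts[key][1] += 1
    risk = {key: deaths / total for key, (deaths, total) in counts.items()}
    return {
        "itt": risk["drug_all"] - risk["control_all"],
        "per_protocol": risk["drug_adherent"] - risk["control_all"],
    }

if __name__ == "__main__":
    print(simulate_trial())
```

With these inputs the intention-to-treat difference sits near zero, as it should for an inert drug, while the per-protocol difference comes out clearly negative: the adherers who remain in the drug arm are a healthier slice of the population than the full control arm.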

**Kiffer:** Right. Then contamination. Control group participants partially receive the intervention. Particularly common in community trials.

**Sarah:** And the standard example is the COMMIT smoking cessation trial. The Community Intervention Trial for Smoking Cessation, run in the early 1990s in the United States and Canada.

**Kiffer:** Communities, not individuals, were randomized to receive a multi-year community-level smoking cessation intervention or to serve as controls. The intervention package included media campaigns, healthcare provider training, school programs, workplace policies, all targeted at the community level.

**Sarah:** And what went wrong with the controls?

**Kiffer:** The control communities were exposed to national anti-smoking campaigns running simultaneously. Independent of the trial. The Surgeon General's reports, public service announcements, state-level tobacco taxes, all happening in the broader environment. So the contrast between intervention and control communities was diluted, because the controls weren't really unexposed.

**Sarah:** And contamination biases toward the null. The intervention looks less effective than it would be in a fully unexposed control group, because the control isn't really unexposed.
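
The dilution can be written as one line of arithmetic. This is a hedged sketch with hypothetical quit rates, not COMMIT's actual figures; `contamination` is an assumed parameter giving the fraction of the intervention effect that leaks into the control communities.

```python
def observed_difference(quit_rate_intervention, quit_rate_unexposed, contamination):
    """Observed quit-rate difference when controls receive part of the effect.

    contamination = 0.0 means truly unexposed controls;
    contamination = 1.0 means controls receive the full effect.
    """
    true_effect = quit_rate_intervention - quit_rate_unexposed
    quit_rate_controls = quit_rate_unexposed + contamination * true_effect
    return quit_rate_intervention - quit_rate_controls

if __name__ == "__main__":
    for leak in (0.0, 0.5, 1.0):
        print(leak, round(observed_difference(0.20, 0.10, leak), 3))
```

The observed contrast shrinks in proportion to the leak and never changes sign, which is what biasing toward the null means here.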

**Kiffer:** And in the case of COMMIT, the trial famously failed to show a community-level effect on adult smoking, partly for this reason.

**Sarah:** Okay. That's Section 1. RCTs aren't immune. Allocation concealment, blinding, placebo, Hawthorne, healthy adherer, contamination. All of these can distort even a randomized study.

**Kiffer:** Section 2. Time-related biases in observational pharmacoepidemiology and screening.

**Sarah:** And first let's define pharmacoepidemiology, because that word does a lot of work in this section.

**Kiffer:** Pharmacoepidemiology is the branch of epidemiology that studies how drugs affect outcomes in real-world populations. Real-world meaning the people who actually use the drug in clinical practice, not the highly selected participants who enroll in randomized trials. Pharmacoepidemiology relies almost entirely on observational data, things like prescription databases, electronic health records, insurance claims.

**Sarah:** And the common engine behind the biases in this section is this: one comparison group has been guaranteed extra survival, or extra opportunity to be exposed, by virtue of how the data were assembled. Not because of biology. Because of the way time was treated in the analysis.

**Kiffer:** Let's go deep on immortal time bias, because it's the most consequential bias in this entire section.

**Sarah:** Definition first. Immortal time bias occurs when a period of follow-up during which the outcome cannot occur is misclassified or improperly handled in the analysis.

**Kiffer:** Why is it called immortal? The term immortal refers to the fact that participants must survive, or remain event-free, long enough to be classified as exposed. During that pre-exposure window, they cannot, by definition, have died and still appeared in the exposed group. So that period is, definitionally, a period of guaranteed survival.

**Sarah:** Walk us through the textbook example. Statins after a heart attack.

**Kiffer:** Sure. Suissa in 2008 reanalyzed several widely cited observational studies. Samy Suissa is a Canadian pharmacoepidemiologist at McGill University in Montreal who has spent his career documenting time-related biases in drug studies. He's one of the most influential figures in this area.

**Sarah:** And what did Suissa show?

**Kiffer:** The original studies he reanalyzed had reported dramatic mortality reductions from post-heart-attack statin use. Reductions of 25 to 50 percent. Statins, just to define them, are a class of cholesterol-lowering drugs widely prescribed after cardiovascular events. The reported effect sizes were so large that they exceeded what randomized trials had shown for the same drugs.

**Sarah:** Which should have raised eyebrows immediately. If observational studies are showing bigger effects than randomized trials, something's usually off.

**Kiffer:** Right. And here's how the bias was built in. The studies classified patients as statin users based on prescriptions filled after hospital discharge. Patients who died before filling a prescription were automatically classified as non-users.

**Sarah:** Because they never filled a prescription, they couldn't be in the exposed group.

**Kiffer:** Exactly. And the survival time before the first prescription was either misattributed to the exposed group, inflating their person-time denominator, or excluded from analysis entirely. Either way, the exposed group ended up with built-in extra survival time that had nothing to do with the drug.

**Sarah:** Let me put it concretely. Imagine a patient is discharged from the hospital on January 1st and fills their first statin prescription on March 1st. Then they live for another two years. The two months between discharge and first prescription is immortal time. They couldn't have died during that window and still been counted as a statin user, because we defined statin user as someone who filled a prescription.

**Kiffer:** Right. And if those two months get credited to the statin group, you've manufactured a survival benefit out of pure bookkeeping. When Suissa reanalyzed the data using time-dependent exposure classification, where exposure status updates as the patient actually fills the prescription, the dramatic survival benefit was substantially reduced or eliminated.

**Sarah:** And in the lesson there's an R simulation that drives this home. Build a cohort of two thousand patients where the drug truly has zero survival benefit. Misclassify person-time the wrong way. A huge fake mortality reduction appears out of nothing. Run it correctly with time-dependent exposure and the rates are nearly identical.

**Kiffer:** The bias is engineered, not measured. Wrong person-time bookkeeping produced the apparent benefit. The fix is bookkeeping, not biology.
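
The lesson's own simulation is in R and is not reproduced here. What follows is a separate, minimal Python sketch of the same idea, with assumed parameters throughout (a constant death rate, five years of follow-up, first prescriptions filled at a uniformly random time in the first year). The drug has no effect on death in this model.

```python
import random

def simulate_cohort(n=2000, death_rate=0.2, follow_up=5.0, p_would_use=0.6, seed=1):
    """Return (end_of_follow_up, died, rx_time) per patient; rx_time is None for non-users."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        death_time = rng.expovariate(death_rate)  # unaffected by the drug
        end = min(death_time, follow_up)
        died = death_time <= follow_up
        rx_time = rng.uniform(0.0, 1.0) if rng.random() < p_would_use else None
        if rx_time is not None and rx_time >= end:
            rx_time = None  # died before filling, so recorded as a non-user
        patients.append((end, died, rx_time))
    return patients

def rate_ratio(patients, time_dependent):
    """Mortality rate ratio, exposed versus unexposed."""
    person_time = {"exposed": 0.0, "unexposed": 0.0}
    deaths = {"exposed": 0, "unexposed": 0}
    for end, died, rx_time in patients:
        if rx_time is None:
            person_time["unexposed"] += end
            deaths["unexposed"] += died
        else:
            if time_dependent:
                person_time["unexposed"] += rx_time      # pre-prescription time is unexposed
                person_time["exposed"] += end - rx_time
            else:
                person_time["exposed"] += end            # immortal time credited to users
            deaths["exposed"] += died                    # users can only die after filling
    exposed_rate = deaths["exposed"] / person_time["exposed"]
    unexposed_rate = deaths["unexposed"] / person_time["unexposed"]
    return exposed_rate / unexposed_rate

if __name__ == "__main__":
    cohort = simulate_cohort()
    print("misclassified: ", round(rate_ratio(cohort, time_dependent=False), 2))
    print("time-dependent:", round(rate_ratio(cohort, time_dependent=True), 2))
```

The misclassified analysis should report a rate ratio noticeably below 1 for a drug that does nothing, and the time-dependent analysis should land close to 1. The only thing that differs between the two calls is which group the pre-prescription person-time is credited to.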

**Sarah:** What are the corrections for immortal time bias?

**Kiffer:** First, time-dependent analysis. Treat exposure as a time-varying covariate. Before the first prescription, the patient contributes person-time to the unexposed group. After the prescription, they contribute to the exposed group. Same patient, exposure status switches over time.

**Sarah:** Second, landmark analysis. Pick a fixed time point after cohort entry, say six months. Classify exposure status at that landmark. Anyone who died or was lost before the landmark is excluded entirely. Follow-up begins at the landmark, so both groups start with the same survival requirement.

**Kiffer:** Third, new-user designs. Restrict the cohort to new initiators of the drug versus new initiators of an alternative drug. Both groups start their exposure clock at the same time.

**Sarah:** And fourth, target trial emulation. Design the analysis to mimic the randomized trial you'd ideally run. Specify time zero, eligibility, treatment assignment, follow-up rules, all simultaneously, then map them onto your observational data.
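
Of the four corrections, landmark analysis is the easiest to sketch in a few lines. The Python below is a toy version under assumed parameters (a one-year landmark, a constant death rate, all first prescriptions filled before the landmark, a drug with no true effect). It is not the lesson's code.

```python
import random

def landmark_rate_ratio(n=2000, landmark=1.0, death_rate=0.2, follow_up=5.0,
                        p_would_use=0.6, seed=2):
    """Mortality rate ratio (exposed vs unexposed) with follow-up starting at the landmark."""
    rng = random.Random(seed)
    person_time = {True: 0.0, False: 0.0}
    deaths = {True: 0, False: 0}
    for _ in range(n):
        death_time = rng.expovariate(death_rate)  # unaffected by the drug
        rx_time = rng.uniform(0.0, landmark) if rng.random() < p_would_use else None
        if death_time <= landmark:
            continue  # died before the landmark: excluded from both groups
        exposed = rx_time is not None  # survivors classified by status at the landmark
        end = min(death_time, follow_up)
        person_time[exposed] += end - landmark  # the clock starts at the landmark for everyone
        deaths[exposed] += death_time <= follow_up
    return (deaths[True] / person_time[True]) / (deaths[False] / person_time[False])

if __name__ == "__main__":
    print(round(landmark_rate_ratio(), 2))
```

Because both groups must survive to the same landmark and are followed from the same moment, neither side gets a head start, and the ratio should sit near 1 for an inert drug. The price is that deaths before the landmark are discarded.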

**Kiffer:** Then lead-time bias. The first of three biases that haunt screening evaluation.

**Sarah:** Quick definition. Lead-time bias occurs when screening or early detection appears to improve survival simply because the diagnosis is made earlier, without actually changing the time of death. The lead time is the interval between screen detection and the time the disease would have been diagnosed clinically.

**Kiffer:** Walk through the two-patient example.

**Sarah:** Imagine two identical patients with the same cancer that develops at age 55 and causes death at age 70. Patient A is screened at age 58 and diagnosed early. Their survival from diagnosis is 12 years.

**Kiffer:** Patient B is not screened. Their cancer is diagnosed at age 65 when it presents with symptoms. Their survival from diagnosis is 5 years.

**Sarah:** Patient A appears to have lived more than twice as long with the disease. But the actual moment of death was the same. Both patients lived to exactly 70. Screening just moved the diagnosis date earlier.
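
The two-patient arithmetic, written out as a short Python sketch using the ages from the example above (the function name is just for illustration):

```python
def survival_from_diagnosis(age_at_diagnosis, age_at_death):
    """Years lived after diagnosis."""
    return age_at_death - age_at_diagnosis

AGE_AT_DEATH = 70  # identical for both patients

screened = survival_from_diagnosis(58, AGE_AT_DEATH)    # Patient A
unscreened = survival_from_diagnosis(65, AGE_AT_DEATH)  # Patient B
lead_time = 65 - 58                                     # how much earlier screening diagnosed

print(screened, unscreened, lead_time)  # prints: 12 5 7
```

The entire 7-year gap in survival from diagnosis equals the lead time. Age at death, the quantity that matters, did not move.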

**Kiffer:** And the textbook empirical example is prostate-specific antigen screening, usually shortened to PSA screening, for prostate cancer. Following the introduction of PSA screening in the late 1980s, 5-year survival rates for prostate cancer in the United States rose from approximately 75 percent to over 99 percent. Eye-popping improvement.

**Sarah:** But prostate cancer mortality rates declined only modestly over the same period.

**Kiffer:** Right. And Etzioni and colleagues in 2002 estimated that lead-time bias accounted for the majority of the apparent survival improvement. Ruth Etzioni is an American biostatistician at the Fred Hutchinson Cancer Research Center in Seattle, who has spent her career modeling cancer screening biases.

**Sarah:** And the rule that comes out of this is, use mortality, not survival from diagnosis, as the endpoint when evaluating a screening program.

**Kiffer:** Then length-biased sampling. Screening preferentially detects slower-growing, less aggressive tumors.

**Sarah:** Why does that happen?

**Kiffer:** Because slower-growing tumors have a longer detectable preclinical phase. They sit in the body, large enough to be detected by screening, for longer. Which means at any given screening exam, they're more likely to be present and findable.

**Sarah:** And fast-growing tumors have a shorter preclinical window. They might progress from undetectable to symptomatic in a few months. So they're less likely to be caught at a routine screening, and more likely to present as what's called interval cancers, cancers that show up between scheduled screenings.

**Kiffer:** Right. So screen-detected cancers are systematically enriched for indolent disease. Even if screening did nothing for survival, screen-detected cancers would appear to have better outcomes simply because they're a different mix of biology.
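
A toy simulation makes the enrichment visible. Everything here is an assumption chosen for illustration: slow and fast tumors arise equally often, slow tumors are detectable for 4 years before symptoms and fast ones for half a year, and screening happens every 2 years on a fixed schedule.

```python
import math
import random

def caught_by_screen(onset, sojourn, interval):
    """True if a scheduled screen falls inside the detectable preclinical phase."""
    next_screen = math.ceil(onset / interval) * interval
    return next_screen < onset + sojourn

def slow_share_among_screen_detected(n=20_000, interval=2.0, slow_sojourn=4.0,
                                     fast_sojourn=0.5, seed=0):
    """Fraction of screen-detected tumors that are slow-growing."""
    rng = random.Random(seed)
    detected = {"slow": 0, "fast": 0}
    for _ in range(n):
        kind = "slow" if rng.random() < 0.5 else "fast"  # equal true incidence
        onset = rng.uniform(0.0, 20.0)                   # when the tumor becomes detectable
        sojourn = slow_sojourn if kind == "slow" else fast_sojourn
        if caught_by_screen(onset, sojourn, interval):
            detected[kind] += 1
    return detected["slow"] / (detected["slow"] + detected["fast"])

if __name__ == "__main__":
    print(round(slow_share_among_screen_detected(), 2))
```

Under these assumptions every slow tumor is caught by some screen but only about a quarter of fast tumors are, so slow tumors make up about 80 percent of screen-detected cases despite being 50 percent of all tumors. The tumors the screen misses surface as interval cancers.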

**Sarah:** And overdiagnosis is the limit case of length-biased sampling.

**Kiffer:** Yes. Overdiagnosis means detection of disease that would never have progressed to cause symptoms or death during the patient's lifetime. Some prostate cancers, for example, are so slow-growing that the patient dies with the cancer rather than from it.

**Sarah:** And how do you detect overdiagnosis?

**Kiffer:** The only way is to compare incidence in screened versus unscreened populations over long follow-up. If a screening program is causing real benefit by catching genuine cancers earlier, you'd expect cumulative incidence to converge between screened and unscreened populations over time. If the screened population shows persistent excess incidence that never converges, that excess is the signature of overdiagnosis. Cancers that would have stayed silent forever are being detected and counted.

**Sarah:** Okay. Section 3. Time-window bias and age-period-cohort effects.

**Kiffer:** Time-window bias is the case-control cousin of immortal time bias. Both involve unequal handling of time, but they show up in different study designs.

**Sarah:** Quick reminder. A case-control study selects cases who already have the outcome, selects controls who don't, and compares prior exposure between the two groups.

**Kiffer:** Right. And time-window bias occurs in case-control studies when the exposure opportunity period differs between cases and controls. If cases have a systematically shorter, or longer, time window during which exposure could be ascertained, the comparison of exposure prevalence is distorted.

**Sarah:** Suissa and colleagues in 2011 showed this in case-control studies of statins and cancer.

**Kiffer:** Several case-control studies had reported protective effects of statins on various cancers. Controls were matched to cases on calendar date but had systematically longer exposure opportunity periods.

**Sarah:** Why longer?

**Kiffer:** Because cases were people who had developed cancer, and their exposure was ascertained up to their diagnosis. Controls, matched on calendar date but not on the structure of follow-up, were observed over longer or more recent windows. And because statin use was increasing rapidly across this period, controls observed over longer recent windows had a higher probability of having ever been on a statin.

**Sarah:** So statin exposure in the controls was inflated relative to cases, purely because the controls had more time to accumulate exposure.

**Kiffer:** Right. That created an artificial protective association. When the exposure windows were properly aligned between cases and controls, the apparent effect attenuated or disappeared.

**Sarah:** And the difference between time-window bias and immortal time bias matters. Immortal time bias is in cohort studies, where pre-exposure person-time gets credited to the exposed group, giving them a guaranteed survival window. Time-window bias is in case-control studies, where cases and controls have unequal opportunities to accumulate exposure.

**Kiffer:** Different mechanisms, same family. Both are about time slipping into the comparison through the back door.

**Sarah:** Corrections for time-window bias?

**Kiffer:** Match on exposure opportunity. Use incidence density sampling, where controls are selected from the risk set at the time of each case's event. Define fixed exposure windows, like 90 days before the index date, identical for cases and controls. Run sensitivity analyses with varying window lengths.
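
Here is a stylized Python sketch of the bias and of the aligned-window fix. It is not a reconstruction of any published study. The assumptions are: a 10-year study period, half of all people start a statin at a uniformly random time in that period, statins have no effect on cancer, and each case is paired with one control.

```python
import random

def odds_ratio(exposed_cases, n_cases, exposed_controls, n_controls):
    case_odds = exposed_cases / (n_cases - exposed_cases)
    control_odds = exposed_controls / (n_controls - exposed_controls)
    return case_odds / control_odds

def simulate(n_pairs=20_000, study_years=10.0, p_ever=0.5, seed=3):
    """Return (biased OR, aligned OR) when the true odds ratio is 1."""
    rng = random.Random(seed)

    def statin_start():
        return rng.uniform(0.0, study_years) if rng.random() < p_ever else None

    case_exposed = control_exposed_biased = control_exposed_aligned = 0
    for _ in range(n_pairs):
        index_date = rng.uniform(0.0, study_years)  # the case's diagnosis date
        case_start, control_start = statin_start(), statin_start()
        # Case exposure can only be ascertained up to diagnosis.
        case_exposed += case_start is not None and case_start < index_date
        # Biased: control exposure ascertained over the whole study period.
        control_exposed_biased += control_start is not None
        # Aligned: control exposure ascertained up to the same index date.
        control_exposed_aligned += control_start is not None and control_start < index_date
    return (
        odds_ratio(case_exposed, n_pairs, control_exposed_biased, n_pairs),
        odds_ratio(case_exposed, n_pairs, control_exposed_aligned, n_pairs),
    )

if __name__ == "__main__":
    biased, aligned = simulate()
    print("unequal windows:", round(biased, 2))
    print("aligned windows:", round(aligned, 2))
```

With unequal windows the odds ratio falls far below 1, around one third under these inputs, which would read as strong protection. Cutting the control's window off at the case's index date brings it back to about 1.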

**Sarah:** Then the biggest conceptual idea in Section 3. Age-period-cohort effects, often shortened to APC effects. The famous identification problem.

**Kiffer:** Three kinds of effects to distinguish. Let me take each in turn.

**Sarah:** First, age effects.

**Kiffer:** Age effects are things that change with biological age. Cancer incidence rises with age, regardless of when you live. The risk of stroke increases with age. Bone density declines with age. These are biological processes tied to how long someone has been alive.

**Sarah:** Second, period effects.

**Kiffer:** Period effects are things that affect everyone alive at a particular calendar time. A vaccination campaign that protects all ages from measles in 2010. Tobacco regulations that reduce smoking exposure across the population in a given decade. The launch of a screening program. The introduction of seatbelt laws. Anything where the calendar year is the relevant axis.

**Sarah:** And third, cohort effects.

**Kiffer:** Cohort effects are things specific to a particular birth-year cohort. Growing up in an environment where smoking was already stigmatized affects lifetime smoking patterns differently than growing up when smoking was glamorous. Children born during a famine have lifelong elevated risk of certain conditions. Cohorts who came of age during the AIDS epidemic relate to safer-sex practices differently than cohorts before or after.

**Sarah:** And the smoking trends example is the classic worked example. Walk us through it.

**Kiffer:** Repeated cross-sectional surveys in the United States and the United Kingdom show that overall smoking prevalence has declined steadily since the 1960s. Surface-level read, that's a period effect. Anti-smoking campaigns took effect, smoking went down across the board.

**Sarah:** But examining trends by birth cohort reveals something more complex.

**Kiffer:** Right. Cohorts born from the 1920s through the 1940s had peak smoking rates above 50 percent for men. They came of age in an era when smoking was glamorized, in films, in advertising, in returning soldiers' culture after the Second World War. They picked up the habit early and many never quit. The decline in smoking among these cohorts, when it came, was a period effect, the anti-smoking campaigns of the 60s through 90s reducing their consumption.

**Sarah:** And cohorts born after 1970 never reached such high peak prevalence. That's a cohort effect.

**Kiffer:** Exactly. People who were 20 in 1990 had a different relationship to cigarettes than people who were 20 in 1950. The cultural environment they came of age in had already absorbed the lessons of the 1964 Surgeon General's report. Smoking was already on the way out as an aspirational behavior. Their lifetime smoking trajectory looks different, and that difference is locked in by birth year.

**Sarah:** And distinguishing the two matters for projection. A simple period model predicts everyone converging to the same low rate. A cohort model reveals that some older cohorts will carry elevated smoking rates, and the residual disease burden from heavy smoking in their youth, until they die out.

**Kiffer:** Which has consequences for everything downstream of smoking. Lung cancer incidence, chronic obstructive pulmonary disease, cardiovascular disease. The shape of the future depends on whether you read trends as period-driven or cohort-driven.

**Sarah:** And then the identification problem itself. This is where it gets technical, but the punchline is intuitive.

**Kiffer:** Age, period, and cohort are mathematically collinear. The relationship can be stated in plain language, your birth year equals the calendar year minus your age. Once you know any two of these factors, the third is determined.

**Sarah:** Which means if you try to fit a model with all three as separate linear effects, the model has infinitely many solutions. You can shift the effects around among the three, and the data will fit equally well.

**Kiffer:** It's not a data collection problem. You can have perfect data and still face this issue. It's a fundamental structural fact about how time works in repeated cross-sectional analysis.
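A minimal sketch of that non-identifiability, for readers following along in the transcript. The slopes and the age-by-period grid are made-up illustrative numbers, not estimates from any dataset, and the helper names are hypothetical. It shows that moving an arbitrary amount of linear trend between age, period, and cohort leaves every fitted value unchanged.

```python
# Illustrative only: all coefficients below are invented for the demonstration.

def linear_apc(age, period, cohort, b_age, b_period, b_cohort, intercept=0.0):
    """Linear predictor with separate slopes for age, period, and cohort."""
    return intercept + b_age * age + b_period * period + b_cohort * cohort


def shifted(b_age, b_period, b_cohort, k):
    """Move an arbitrary amount k of linear trend between the three slopes."""
    return b_age + k, b_period - k, b_cohort + k


# Every (age, period) cell of a small repeated cross-sectional grid.
# Cohort is not free: it is fixed by the other two.
cells = [
    (age, period, period - age)  # cohort = period - age
    for age in range(20, 81, 10)
    for period in range(1950, 2021, 10)
]

original = (0.030, -0.010, -0.020)
alternative = shifted(*original, k=0.5)  # a very different-looking set of slopes

# Largest disagreement between the two parameter sets across all cells.
max_gap = max(
    abs(linear_apc(a, p, c, *original) - linear_apc(a, p, c, *alternative))
    for a, p, c in cells
)

print("original slopes:   ", original)
print("alternative slopes:", alternative)
print(f"largest difference in fitted values: {max_gap:.2e}")
```

The difference is zero up to floating-point rounding for any value of `k`, which is why no amount of data can pick one set of slopes over the other.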

**Sarah:** And the consequence is, you need theory, external data, or constrained models to disentangle the effects. Researchers might assume that the period effect is smooth, or that the cohort effect is zero in some reference cohort, or that age effects follow a known biological curve. There's no purely empirical way around it. You're always smuggling in an assumption.

**Kiffer:** Which is humbling, because age-period-cohort modeling is one of the workhorses of demography and chronic-disease epidemiology, and at its core, every analysis is making a theoretical commitment.

**Sarah:** Okay. Pulling the takeaways together.

**Kiffer:** Yeah. Let me list them. There are seven I'd want a student to leave with.

**Sarah:** Go ahead.

**Kiffer:** First. Randomized controlled trials are not immune to bias. Randomization handles confounding by balancing prognostic factors at baseline. But everything that happens after randomization, the enrollment, the treatment, the measurement, the analysis, is still susceptible. Allocation concealment failures and lack of blinding inflate effects by 30 to 50 percent according to Schulz and colleagues and Wood and colleagues. Subjective outcomes are most affected. Objective outcomes like mortality are more robust.

**Sarah:** Second. Per-protocol analyses can be confounded by the healthy adherer effect, even in randomized trials. The Coronary Drug Project showed that placebo adherers had roughly 15 percent five-year mortality, against roughly 28 percent for placebo non-adherers, which can't be a drug effect because the drug was inert. Adherence is a marker for overall health behavior. Intention-to-treat analysis preserves the benefits of randomization. Per-protocol analysis gives them up.

**Kiffer:** Third. Placebo effects are real and large in trials of pain and depression. The Hawthorne effect, behavior change from being observed, can inflate compliance and obscure baseline behavior. Contamination, where controls partially receive the intervention, biases toward the null and was the central problem in the COMMIT smoking cessation trial.

**Sarah:** Fourth. Immortal time bias arises in cohort studies of drugs when survival time before exposure is misclassified as exposed time. Suissa's 2008 reanalysis showed that the dramatic post-heart-attack mortality reductions reported for statins in early observational studies were largely an artifact of how time was handled. The fix is treating exposure as time-varying, using landmark analysis, applying new-user designs, or doing target trial emulation.

**Kiffer:** Fifth. Lead-time bias and length-biased sampling are why mortality, not survival from diagnosis, is the right endpoint for screening evaluation. The PSA screening story, where 5-year survival went from 75 percent to over 99 percent while mortality moved only modestly, is the classic signature of these biases combined with overdiagnosis.
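A small worked example of the lead-time arithmetic, offered as a sketch. The ages and the four-year lead time are invented for illustration and do not come from the PSA data. Every patient dies at exactly the same age in both scenarios, and only the date of diagnosis moves.

```python
def five_year_survival(diagnosis_ages, death_ages):
    """Share of patients still alive five years after their diagnosis."""
    survived = sum(
        1 for dx, death in zip(diagnosis_ages, death_ages) if death - dx >= 5
    )
    return survived / len(death_ages)


# Hypothetical patients: age at death is fixed and screening does not change it.
death_ages = [70, 71, 72, 73, 74, 75, 76, 77]

# Age at diagnosis when the cancer is found from symptoms.
clinical_dx = [67, 67, 68, 68, 69, 69, 70, 70]

# Screening finds the same cancers earlier by an assumed lead time.
LEAD_TIME_YEARS = 4
screen_dx = [age - LEAD_TIME_YEARS for age in clinical_dx]

survival_clinical = five_year_survival(clinical_dx, death_ages)
survival_screened = five_year_survival(screen_dx, death_ages)
mean_death_age = sum(death_ages) / len(death_ages)

print(f"5-year survival, symptom-detected: {survival_clinical:.1%}")
print(f"5-year survival, screen-detected:  {survival_screened:.1%}")
print(f"mean age at death in both cases:   {mean_death_age}")
```

Survival from diagnosis improves purely because the clock starts earlier, while the mortality experience of the group is identical.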

**Sarah:** Sixth. Time-window bias is the case-control version of the immortal time problem. Always check that cases and controls had equal exposure ascertainment windows. The Suissa and colleagues 2006 statins-and-cancer reanalysis is the canonical example of how unequal windows can manufacture protective associations out of nothing.
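The unequal-window mechanism can be sketched with a short calculation. It assumes a constant, hypothetical drug-initiation rate that is identical for cases and controls, so the true odds ratio is exactly one. The two-year and five-year look-back windows are illustrative choices, not figures from the statin studies.

```python
import math


def prob_ever_exposed(initiation_rate, window_years):
    """Chance of at least one prescription inside the look-back window,
    assuming a constant initiation rate unrelated to the disease."""
    return 1.0 - math.exp(-initiation_rate * window_years)


def odds(p):
    return p / (1.0 - p)


def odds_ratio(p_cases, p_controls):
    return odds(p_cases) / odds(p_controls)


RATE = 0.10  # hypothetical initiations per person-year, same in both groups

# Cases are observed only up to diagnosis (short window), while controls
# are observed over the whole study period (long window).
biased_or = odds_ratio(
    prob_ever_exposed(RATE, 2.0),   # cases: 2-year window
    prob_ever_exposed(RATE, 5.0),   # controls: 5-year window
)

# Same window for both groups, for example by truncating controls at the
# matched case's diagnosis date.
matched_or = odds_ratio(
    prob_ever_exposed(RATE, 2.0),
    prob_ever_exposed(RATE, 2.0),
)

print(f"odds ratio with unequal windows: {biased_or:.2f}")
print(f"odds ratio with equal windows:   {matched_or:.2f}")
```

With unequal windows the drug looks protective even though exposure was generated independently of case status. Equalizing the windows removes the artifact.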

**Kiffer:** And seventh. Age, period, and cohort effects are mathematically collinear. Cohort equals period minus age; that is, your birth year equals the calendar year minus your age. The age-period-cohort identification problem is structural and requires theory or external constraints to resolve. There's no purely empirical solution. The smoking trends example shows why getting this right matters for projection.

**Sarah:** And one practical recommendation. Don't skip the R simulation on immortal time bias if it's in your version of the lesson. Building a 2,000-patient cohort where the drug has zero true survival benefit, then running the analysis the wrong way and watching a fake 25 to 50 percent mortality reduction appear, is the most concrete way to internalize the bookkeeping nature of this bias. You can't unsee it once you've made it happen on your own laptop.
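For readers without the lesson's R script to hand, here is a rough Python stand-in rather than a port of it. The death hazard, the five-year follow-up, the share of patients offered the drug, and the zero-to-two-year delay before the first dose are all assumptions chosen for illustration. The drug has no effect on survival by construction, and the same simulated cohort is analysed twice.

```python
import random


def simulate_cohort(n=2000, death_hazard=0.10, follow_up=5.0, seed=10):
    """Simulate patients for whom the drug has no effect on survival at all."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        death_time = rng.expovariate(death_hazard)   # same hazard for everyone
        end = min(death_time, follow_up)             # death or end of study
        died = death_time <= follow_up
        offered = rng.random() < 0.5                 # half are ever offered the drug
        start = rng.uniform(0.0, 2.0) if offered else None
        # A patient is only ever treated if still under follow-up at the start date.
        treated = start is not None and start < end
        patients.append((end, died, start if treated else None))
    return patients


def rate_ratio(patients, time_varying):
    """Mortality rate ratio, exposed versus unexposed person-time."""
    deaths = {"exposed": 0, "unexposed": 0}
    years = {"exposed": 0.0, "unexposed": 0.0}
    for end, died, start in patients:
        if start is None:
            years["unexposed"] += end
            group = "unexposed"
        elif time_varying:
            years["unexposed"] += start        # waiting time before the first dose
            years["exposed"] += end - start
            group = "exposed"                  # treated patients die after starting
        else:
            years["exposed"] += end            # immortal waiting time miscounted
            group = "exposed"
        if died:
            deaths[group] += 1
    exposed_rate = deaths["exposed"] / years["exposed"]
    unexposed_rate = deaths["unexposed"] / years["unexposed"]
    return exposed_rate / unexposed_rate


patients = simulate_cohort()
naive_rr = rate_ratio(patients, time_varying=False)
correct_rr = rate_ratio(patients, time_varying=True)

print(f"naive 'ever treated' rate ratio: {naive_rr:.2f}")
print(f"time-varying exposure rate ratio: {correct_rr:.2f}")
```

The naive ratio falls well below one because the waiting time before the first dose, during which treated patients could not have died, is credited to the drug. Counting that time as unexposed brings the ratio back toward one, apart from sampling noise.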

**Kiffer:** And the broader meta-lesson here. Time, in observational research, is not just a backdrop. It's an active variable that has to be handled with the same rigor as exposure or outcome. Immortal time, lead time, length-biased sampling, time-window mismatch, period-cohort confounding. All of these emerge when time enters the comparison through a route that wasn't planned.

**Sarah:** Which is why design trumps analysis. The same slogan we keep hitting. The decisions you make at the moment you draw the comparison, who counts as exposed, when does follow-up start, how long is the exposure window, how do you handle birth cohort, all of those are the decisions that determine whether the answer you get is signal or artifact.

**Kiffer:** Next up is Lesson 11. Confounding and Statistical Inference. Where we close out the bias-focused lessons by going deep on the most studied threat to causal inference, and bring statistical inference back into focus alongside it.

**Sarah:** Take care, everyone.

**Kiffer:** See you there.