HSCI 230 — Lesson 5

Cohort Studies

Evaluating Epidemiological Research

Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University

Learning objectives for this lesson:

  • Distinguish between open and closed source populations as they relate to cohort study design
  • Describe the major design features of risk-based and rate-based cohort studies
  • Identify hypotheses and population types consistent with risk-based and rate-based cohort studies
  • Elaborate the principles used to select and measure the exposure in cohort studies
  • Design and implement a valid cohort study to investigate a specific hypothesis

This course was developed by Kiffer G. Card, PhD, as a companion to Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Reference

Glossary — Key Terms, People & Concepts

This glossary collects the key concepts, people, and ideas you will meet in this lesson. Use it as a reference while you work through the material, or as a review before assessments.

Key Concepts & Ideas
Cohort: A defined group of people followed over time. In a cohort study, the cohort is classified by exposure status at baseline (or during follow-up) and then watched for outcomes.
Closed (Fixed) Cohort: A cohort with fixed membership, all of whom are followed (or attempted to be followed) for a defined window. The natural setting for risk-based analyses.
Open (Dynamic) Cohort: A cohort whose membership changes over time as people enter and leave. Person-time is the appropriate denominator and incidence rates the natural measure.
Prospective Cohort: A cohort study in which exposure is measured at the start, and outcomes are then watched for as time passes. Best protection against differential measurement of exposure.
Retrospective (Historical) Cohort: A cohort study assembled using existing records about exposure that occurred in the past, with outcomes also already observed. Faster than prospective designs but constrained by available data quality.
Ambidirectional Cohort: A cohort study that uses both retrospective and prospective elements, e.g., reconstructing past exposures from records, then continuing prospective follow-up for new outcomes.
Exposure: A factor whose effect on a health outcome is being investigated. In cohort studies, exposure status is fixed (or measured longitudinally) before the outcome occurs.
Outcome: The health state or event whose occurrence the cohort is being followed for. Must be defined operationally and measured consistently across exposure groups.
Person-Time: The sum of time each individual is observed and at risk for the outcome (e.g., person-years). Denominator for incidence-rate calculations in open cohorts.
Loss to Follow-Up: Participants who can no longer be observed before the study ends. Threatens validity if loss is differential by exposure or outcome.
Censoring: When follow-up ends before the outcome is observed. Right-censoring (most common) is handled in survival analysis; informative censoring biases estimates.
Risk Ratio (RR, Relative Risk): The cumulative incidence in the exposed divided by that in the unexposed. The natural effect measure from a closed cohort.
Incidence Rate Ratio (IRR): The incidence rate in the exposed divided by the incidence rate in the unexposed. The natural effect measure from an open cohort with person-time follow-up.
Hazard Ratio (HR): A ratio of instantaneous failure rates; the effect measure produced by Cox proportional-hazards models. Often interpreted similarly to an IRR when the proportional-hazards assumption holds.
Risk Difference (Attributable Risk): Cumulative incidence in the exposed minus that in the unexposed. Captures the absolute, not relative, public-health impact of exposure.
Population Attributable Fraction (PAF): The proportion of disease in the population that would be eliminated if the exposure were removed (assuming causality). Combines effect size with how common the exposure is.
Confounding: A distortion of the exposure–outcome association by a third variable associated with both. Cohort studies handle it through restriction, stratification, and multivariable adjustment.
Healthy-Worker Effect: A specific selection bias in occupational cohorts: people who are employed are systematically healthier than the general population, biasing comparisons against external referents.
Immortal Time Bias: A specific bias arising when, by design, members of one exposure group cannot have the outcome during a stretch of follow-up, e.g., classifying treatment status using post-baseline information.
External Comparison Group: A reference group drawn from outside the cohort (e.g., national rates) when an internal unexposed group is unavailable. Used in occupational cohorts; vulnerable to the healthy-worker effect.
Internal Comparison Group: An unexposed (or differently exposed) reference group sampled from the same source population as the exposed. Generally less biased than external comparisons.
STROBE Checklist: Reporting guideline (Strengthening the Reporting of Observational Studies in Epidemiology) with specific items for cohort designs: sampling, exposure measurement, follow-up, statistical methods (von Elm et al, 2007).
Methods & Study Designs
Cohort Study: An observational design that classifies individuals by exposure and follows them over time to compare outcome occurrence. The closest observational analogue to an experiment.
Risk-Based Cohort: Set in a closed cohort with full follow-up. Reports cumulative incidence and the risk ratio.
Rate-Based Cohort: Set in an open or dynamic cohort with person-time follow-up. Reports incidence rates and the incidence-rate ratio; well suited to long follow-up with entries, exits, and censoring.
Kaplan–Meier Estimator: A nonparametric estimator of the survival function from time-to-event data, accommodating right-censoring. Visualized as the familiar “step” survival curve.
Cox Proportional-Hazards Model: A semi-parametric regression for time-to-event data that estimates hazard ratios while leaving the baseline hazard unspecified. Workhorse of modern cohort analysis.
Key People & Cohorts
Framingham Heart Study (1948–): A landmark prospective cohort begun in Framingham, Massachusetts that gave epidemiology the term “risk factor” (Kannel et al, 1961) and identified hypertension, smoking, cholesterol, and diabetes as major drivers of cardiovascular disease; see Dawber, Meadors, & Moore (1951) for the original design paper and Mahmood et al (2014) for a historical overview.
British Doctors Study (1951–2001): Doll & Hill's (1954) prospective cohort of UK physicians that established smoking as a cause of lung cancer. Followed for half a century with extraordinary retention; the 50-year follow-up appears in Doll et al (2004).
Nurses' Health Study (1976–): A massive prospective cohort of US nurses that has produced foundational evidence on diet, hormones, and chronic disease (Colditz, Manson, & Hankinson, 1997).
Richard Doll (1912–2005): British epidemiologist whose case-control and cohort work with Bradford Hill established the smoking–lung cancer link and modeled rigorous long-term cohort follow-up.
Austin Bradford Hill (1897–1991): British statistician and epidemiologist; co-author with Doll of the British Doctors Study, and author of the “Hill viewpoints” for assessing causality from observational data (Hill, 1965).
David Cox (1924–2022): British statistician who introduced the proportional-hazards model that bears his name (Cox, 1972), one of the most cited statistical methods in cohort analysis.
Section 1

Introduction & Cohort Study Design

⏱ Estimated reading time: 15 minutes

Introduction and Overview

Lesson 4 ended with a promise: cohort studies invert the case-control logic by sampling on the exposure rather than the disease, and that inversion lets us measure incidence directly without the rare-disease assumption that complicated odds-ratio interpretation. This lesson delivers on that promise. Across four content sections we move from the basic logic of cohort design (Section 1), to the choice between risk-based and rate-based designs that should feel familiar from Lesson 4 (Section 2), to the surprisingly difficult problem of measuring exposure in a longitudinal setting (Section 3), and finally to the practical questions of comparability, follow-up, outcome ascertainment, and analysis (Section 4). The unified-design discipline from Lesson 3 still applies, and the Lesson 1 guidance on pre-specified analysis plans applies with extra force, because cohort studies often run for decades and offer many opportunities for selective reporting.

Learning Objectives

  • Describe the fundamental logic of the cohort study design.
  • Distinguish between open and closed source populations.
  • Differentiate between prospective and retrospective cohort designs.
  • Recognise how cohort studies relate to controlled trials.

What Is a Cohort Study?

The word cohort denotes a group of study subjects that has a defined characteristic in common. In epidemiological study design, that characteristic is usually exposure status. In a cohort study, we follow subjects from exposure to outcome (Grimes and Schulz, 2002).

Interactive story: The Town That Was Followed

Walk through the Framingham Heart Study from 1948 enrollment to three generations of follow-up.

A 7-scene retelling of the most famous cohort study ever launched: town enrollment (Dawber, Meadors, & Moore, 1951), baseline measurements, decades of follow-up, incidence comparisons, the birth of the term "risk factor" (Kannel et al, 1961), and the three generations still under study today (Mahmood et al, 2014).

Key Idea

A cohort study closely resembles a controlled trial — without the randomisation of exposure. We start with subjects who do not yet have the disease, classify them by exposure, follow them forward in time, and compare the frequency of the outcome between exposure groups.

Most frequently, the outcome is the occurrence of a specific disease, but cohort studies can also examine outcomes such as birth weight, body mass index, blood pressure, or quality of life. Subjects are usually individuals, but can also be groups (e.g., families).

The cohort design's modern reputation rests on a small number of landmark studies that the rest of this lesson will return to repeatedly: the Framingham Heart Study (Dawber, Meadors, & Moore, 1951), which gave epidemiology the term “risk factor” (Kannel et al, 1961); the British Doctors Study (Doll & Hill, 1954; Doll et al, 2004), which followed UK physicians for 50 years and pinned down the smoking–lung cancer link; the Whitehall II civil-servant cohort (Marmot et al, 1991), which exposed a graded socioeconomic gradient in chronic disease; the Nurses' Health Study (Colditz, Manson, & Hankinson, 1997); the multinational EPIC cohort (Riboli et al, 2002); and the recent generation of population biobanks — UK Biobank (Sudlow et al, 2015) and the Canadian Longitudinal Study on Aging (Raina et al, 2019).

[Figure: flow diagram. Source population (disease-free) → exposed cohort and non-exposed cohort → follow forward in time → diseased / non-diseased in each cohort → compare disease frequency.]

Figure — The logic of cohort design: classify disease-free subjects by exposure, follow them forward, compare the disease frequency between groups.

Selecting the Study Group

How we select the cohort depends on what we know in advance. The three standard choices are listed below; notice that the choice flows from the data already in hand, not from any abstract preference for one design over another.

  • Two-Cohort Design
  • Single (Longitudinal) Cohort
  • Virtual Cohort

In both two-cohort and single-cohort designs, after selecting subjects we (1) verify they meet inclusion criteria, (2) confirm exposure status, (3) ensure they do not yet have the outcome, then (4) follow them for a defined period and compare incidence between exposure groups.

Whichever cohort structure you pick, the next decision is whether the follow-up has already happened or whether you will be doing it as the study runs.

Prospective vs. Retrospective Designs

Cohort studies can be conducted either way, depending on whether suitable records already exist (Euser et al, 2009). The two tabs below put the trade-offs side by side.

In a prospective cohort study, the disease has not yet occurred when the study begins. Subjects are recruited, exposure is assessed at baseline, and they are followed forward in time as outcomes develop.

Advantages: Allows more detailed information-gathering and careful recording of exposure, confounders, and outcome timing (see Examples 8.6, 8.7, and 8.9).

Disadvantages: Time-consuming and expensive; vulnerable to losses to follow-up over long study periods.

In a retrospective cohort study, the follow-up period has already ended and the disease event has already occurred when subjects are selected (Hudson et al, 2005). Investigators reconstruct exposure and outcome from existing records.

Advantages: Faster and cheaper; useful when good historical records exist (Examples 8.1, 8.4, 8.5).

Disadvantages: Requires suitable existing databases; depth of information is limited to what was recorded.

Beyond the timing question is a structural one about the population itself: does its membership stay fixed for the duration of follow-up, or do people enter and leave? You met this distinction in Lesson 4; it returns here as a more central concern, because cohort follow-up is what makes it operational.

Open vs. Closed Source Populations

The nature of the source population determines the appropriate design. This is a critical decision that affects everything from sample-size calculations to the choice of analytic methods. Read the table below as a checklist for matching disease type to design type — chronic outcomes almost always require open-population, rate-based handling, and Section 2 builds out exactly what that requires.

Feature           | Closed Population                          | Open Population
Membership        | Fixed at start of study                    | Subjects can enter and leave
Follow-up         | All subjects observed for full risk period | Variable time-at-risk per subject
Best disease type | Short risk period (e.g., outbreaks)        | Long or chronic risk period (e.g., cancers)
Disease frequency | Risk (cumulative incidence)                | Rate (incidence density)
Acceptable losses | Few or none preferred (<10%)               | Time-at-risk accounted for explicitly

Key Examples

Three published examples bring the design choices we have just enumerated into one place. We will refer back to these throughout the lesson by number, so it is worth pausing on each one to identify which boxes the investigators ticked: prospective vs. retrospective, two-cohort vs. single, open vs. closed, risk-based vs. rate-based.

Example 8.1 — Retrospective Risk-Based (Discharge Against Medical Advice)

Choi et al (2011) conducted a hospital-based cohort study where the exposure was discharge against medical advice (DAMA) versus discharged with medical advice (DWMA). The outcome was readmission within 14 days. Each DAMA patient was matched with one DWMA patient by 10-year age group, gender, and clinical characteristics. Because all patients were observable for the full 14-day risk period, this is a classic risk-based design. Conditional logistic regression accounted for the matching. Result: 26% of DAMA patients were readmitted within 14 days versus only 3% of DWMA patients.

Example 8.2 — Continuous-Scale Outcome (Environmental Tobacco Smoke)

Crane et al (2011) conducted a retrospective cohort study based on interviews with 11,000+ women who gave birth in two Canadian provinces (2001–2009). Eleven per cent self-declared exposure to environmental tobacco smoke. Outcomes included infant body dimensions, Apgar scores, respiratory distress syndrome, and stillbirth. Multiple regression was used to control for confounders. Tobacco smoke was associated with lower birth weight, smaller body size, and increased stillbirths. Note: when outcomes are on a continuous scale (e.g., birth weight), the cohort design still applies — we just use linear rather than logistic regression.

Example 8.3 — Propensity-Score Matching (Antipsychotics & Falls)

Mehta et al (2010) used a population-based retrospective cohort to investigate falls and fractures in adults ≥50 years. The exposure was atypical versus typical antipsychotic agents. More than 60 covariates were combined into a propensity score, and the “Greedy 5-1 matching technique” was used to match subjects with similar scores. Each exposure group contained 5,580 people. While the hazard ratio did not differ significantly between drug classes, taking any antipsychotic for >90 days was associated with HR = 1.8 for falls or fractures.
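
The matching idea in Example 8.3 can be sketched in a few lines of R. This is a minimal illustration on simulated data, not a reproduction of Mehta et al: it fits a logistic model for exposure, then applies a simple greedy 1:1 nearest-neighbour match within a caliper, which is a simpler stand-in for the "Greedy 5-1" technique the authors used. The covariates (`age`, `frail`), the caliper of 0.05, and all variable names are assumptions made for the example.

```r
set.seed(230)

# Simulated data (entirely hypothetical): two covariates drive treatment choice.
n     <- 400
age   <- rnorm(n, mean = 70, sd = 8)
frail <- rbinom(n, 1, 0.3)
treat <- rbinom(n, 1, plogis(-0.5 + 0.04 * (age - 70) + 0.8 * frail))
dat   <- data.frame(id = seq_len(n), age, frail, treat)

# Step 1: propensity score = modelled probability of exposure given covariates.
ps_fit <- glm(treat ~ age + frail, family = binomial, data = dat)
dat$ps <- fitted(ps_fit)

# Step 2: greedy 1:1 nearest-neighbour matching without replacement,
# within a caliper on the propensity score (0.05 is an arbitrary choice).
caliper  <- 0.05
treated  <- dat[dat$treat == 1, ]
controls <- dat[dat$treat == 0, ]
used     <- rep(FALSE, nrow(controls))
pairs    <- list()

for (i in seq_len(nrow(treated))) {
  gap <- abs(controls$ps - treated$ps[i])
  gap[used] <- Inf                       # each control can be matched only once
  j <- which.min(gap)
  if (gap[j] <= caliper) {
    used[j] <- TRUE
    pairs[[length(pairs) + 1]] <- data.frame(
      treated_id = treated$id[i], control_id = controls$id[j], ps_gap = gap[j]
    )
  }
}
matched <- do.call(rbind, pairs)

nrow(matched)             # number of matched pairs
summary(matched$ps_gap)   # how close the matched scores are
```

The outcome comparison would then be run on the matched pairs only. Greedy matching depends on the order in which treated subjects are processed, so a real analysis would normally check covariate balance after matching.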

Stating the Study Objective

Each study should clearly specify:

  • The target population (to which inferences will be made)
  • The source population (from which the study group will be drawn)
  • The unit of observation (individuals or groups)
  • The exposure, the disease, and the follow-up period
  • The setting (context or venue) of interest
  • If biology is known: the amount or duration of exposure thought to cause disease, and the relevant time window for exposure (current vs. lifetime vs. historical)

Key Takeaways

  • Cohort studies follow disease-free subjects from exposure forward to outcome.
  • The design parallels a controlled trial, minus randomisation.
  • Two-cohort designs select by exposure status; longitudinal designs select a single group with a range of exposures.
  • Studies can be prospective or retrospective; the difference is timing relative to outcome occurrence.
  • Closed populations call for risk-based designs; open populations call for rate-based designs.

The takeaways above name what changed conceptually compared with case-control designs. The R box that follows makes the change concrete: because we sampled on exposure, the same kind of 2×2 table you met in Lessons 3 and 4 now yields a risk ratio and an incidence rate ratio directly — no rare-disease assumption required.

R Risk ratio and incidence rate from cohort data

What you'll do: compute risk, incidence rate, the risk ratio, and the incidence rate ratio from a small simulated cohort. What to take away: sampling on exposure unlocks measures of disease frequency that case-control designs simply cannot deliver — and Section 2 will show why the choice between risk-based and rate-based handling determines which of these two ratios is the right summary in any given study.

A cohort lets you compute a risk (cumulative incidence) or a rate (incidence density) directly — you sampled by exposure, not by outcome. Below is a hand calculation of both from a small simulated cohort.

# 1000 exposed and 1000 unexposed individuals followed for up to 5 years.
#   exposed: 80 events in 4500 person-years
# unexposed: 30 events in 4900 person-years

events <- c(exposed = 80,   unexposed = 30)
n      <- c(exposed = 1000, unexposed = 1000)
py     <- c(exposed = 4500, unexposed = 4900)

risk <- events / n
rate <- events / py * 1000          # per 1000 person-years

RR   <- risk["exposed"] / risk["unexposed"]   # risk ratio
IRR  <- rate["exposed"] / rate["unexposed"]   # incidence rate ratio

round(data.frame(risk, rate, RR = RR, IRR = IRR), 3)
Console output

          risk    rate    RR     IRR
exposed   0.080  17.778  2.667  2.904
unexposed 0.030   6.122  2.667  2.904

Why both? The risk ratio answers "how many times more likely is an exposed person to develop disease over the follow-up window?" The rate ratio answers "per unit of person-time, how much more frequent is the event in exposed people?" In an open cohort with variable follow-up, the rate-based answer is usually the right one.

R Reflect on what you just ran

Use the questions below to interpret the output you produced. Look at your console table before answering.

1. The risk in the exposed group was 0.080 (80/1000) and in the unexposed group 0.030 (30/1000), giving RR = 2.667. Translate that risk ratio into a plain-English sentence about how the cohort's 5-year cumulative incidence differs by exposure.

Model answer: Over 5 years of follow-up, 8 per 100 exposed people developed the outcome versus 3 per 100 unexposed, so exposed individuals had 2.67 times the cumulative risk of the outcome compared with unexposed individuals. In absolute terms, the risk difference is 5 per 100 (50 per 1000) over 5 years, attributable to the exposure if the association is causal. That is a meaningful effect size whether you report it as a ratio or a difference.

2. The incidence rate ratio (IRR = 2.904) is slightly larger than the risk ratio (RR = 2.667). Looking at the person-time denominators (4500 vs 4900), why does dividing by person-years instead of headcount nudge the ratio upward? Which group lost more person-time, and what does that suggest about follow-up in this cohort?

Model answer: The exposed group contributed 4500 person-years versus 4900 for the unexposed, so the exposed group lost more at-risk time on average, either through earlier or more frequent events (cases stop accruing person-time at the event) or through earlier loss to follow-up or censoring. Because a rate is cases / person-time, the smaller denominator in the exposed group pushes the exposed rate up relative to the unexposed rate. With equal headcounts, IRR = RR × (4900 / 4500), so the IRR (2.904) sits a bit above the RR (2.667). The pattern is a sign that events or censoring are unevenly distributed across exposure groups, which is exactly the situation rates were designed to handle.

3. If the cohort were instead a closed population with everyone followed exactly 5 years (no losses), would the RR and IRR converge? Explain which measure you would report and why.

Model answer: Approximately, but not exactly. Even in a closed cohort with no losses, cases stop contributing person-time once the event occurs, so person-time is slightly less than n × 5 years in each group, and more so in the group with more events. When the outcome is rare over the follow-up window, that shortfall is small and the RR and IRR are close. In that ideal scenario the risk ratio is the cleaner summary, because cumulative incidence (a probability) is more interpretable than a rate. Once you have differential censoring or open-cohort dynamics, the rate-based (IRR/HR) framework is required because the simple risk-based estimator becomes biased.

The reflection below asks you to use the timing distinction in a concrete research scenario. After working through it and the knowledge check, Section 2 returns to the risk-vs.-rate split that the R box just previewed and shows what each design buys, costs, and assumes.

Reflection

Think of a health question you find compelling. Would you address it with a prospective or retrospective cohort design? What records or recruitment infrastructure would you need? What might be lost or gained by each choice in your specific case?

Model answer: For a fast-moving exposure–outcome pair (e.g., antibiotic prescribing patterns and 30-day Clostridioides difficile infection), a retrospective cohort assembled from administrative data is faster, cheaper, and feasible: the records already exist, the inclusion window can be defined by ICD-coded events, and follow-up is short enough that linkage is reliable. Trade-offs: you inherit the data quality of the source records, miss any exposure that was not coded, and have no biomarker corroboration. For a slower-moving exposure (e.g., air pollution and dementia), a prospective cohort is the right choice: you can pre-specify the exposure assessment (personal monitors, residential geocoding), measure covariates before the outcome, and avoid recall bias, at the cost of decades of follow-up and study budget. The infrastructure you need is different too. Large prospective cohorts that also link to administrative data (UK Biobank, Sudlow et al, 2015; the Canadian Longitudinal Study on Aging, Raina et al, 2019) sit between the two.

Knowledge Check — Section 1

1. The fundamental logic of a cohort study is to:

Correct answer: B. In a cohort study we begin with disease-free subjects classified by exposure, follow them forward in time, and compare the frequency of the outcome (usually disease) between the exposed and non-exposed groups.

2. A cohort study most closely resembles which of the following designs?

Correct answer: C. Grimes and Schulz (2002) describe cohort studies as closely resembling controlled trials — the key difference is that exposure is not randomly assigned. This similarity is often cited as an advantage for causal inference.

3. In a retrospective cohort study, when has the outcome event occurred relative to the start of the study?

Correct answer: A. Retrospective cohort studies use existing records: by the time the investigators begin, the follow-up period has ended and outcome events have already occurred. Prospective designs are the opposite — the outcome has not yet occurred at study initiation.

4. Which of the following best describes a closed source population?

Correct answer: D. A closed population has fixed membership at the start of the study and all subjects are observed for the full risk period. This makes it appropriate for risk-based (cumulative incidence) designs and works best when the risk period is short and losses are few.
Section 2

Risk-Based & Rate-Based Designs

⏱ Estimated reading time: 15 minutes

Introduction and Overview

Section 1 established what a cohort study is and named the design choices its investigators have to make. This section drills into the most consequential of those choices: whether to count events per person (risk) or events per unit of person-time (rate). The two designs share a 2×2 layout but differ in what they assume about the population and what they let you say about disease frequency. Sample-size planning, surprisingly, is a useful place to start, because the same risk-based calculation usually suffices for initial planning under both designs, even when the eventual analysis differs.

Learning Objectives

  • Describe the design and assumptions of risk-based (cumulative incidence) cohort studies.
  • Describe the design and analysis of rate-based (incidence density) cohort studies.
  • Identify hypotheses and population types appropriate for each design.
  • Calculate and interpret the basic measures of disease frequency for each design.

Sample Size

Initial sample-size estimates are usually performed assuming an equal number of exposed and non-exposed subjects, and assuming the disease is measured by risk (Section 8.2.2). This approach is often sufficient for initial planning even if the population is open and a rate-based design must ultimately be used.
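
As a rough planning sketch, base R's `power.prop.test()` returns the per-group sample size for comparing two risks with equal group sizes. The risks below (3% in the non-exposed, 8% in the exposed) are illustrative assumptions borrowed from the simulated cohort in Section 1, and the 10% allowance for losses is a hypothetical planning choice rather than a rule from the text.

```r
# Planning sketch: equal-sized exposed and non-exposed groups, outcome measured as risk.
# Assumed inputs (illustrative only): R0 = 0.03, R1 = 0.08, 80% power, two-sided alpha = 0.05.
plan <- power.prop.test(p1 = 0.03, p2 = 0.08, power = 0.80, sig.level = 0.05)

n_per_group <- ceiling(plan$n)    # subjects needed in EACH exposure group

# Hypothetical allowance for 10% loss to follow-up: enrol enough that the
# expected number of completers still meets the target.
n_enrol <- ceiling(n_per_group / (1 - 0.10))

c(per_group = n_per_group, enrol_per_group = n_enrol)
```

A calculation like this only sets the order of magnitude; unequal groups, clustering, or a time-to-event analysis would call for the specialised methods mentioned in this section.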

Modern Sample-Size Software

Recent software allows for unequal sample sizes, repeated measures, multivariable regression models, and proportional hazards models. Specialised methods exist for competing risks (Latouche and Porcher, 2007), survival-time outcomes (Matsui, 2005), strata-matched designs (Mazumdar et al, 2006), and time-varying exposures (Basagana et al, 2011).

Risk-Based (Cumulative Incidence) Designs

This is the simplest form of cohort study, but several assumptions must hold:

  • Exposure groups are defined at the start of the study and remain unchanged (fixed cohorts).
  • The study groups are closed — all subjects must be observed for the full risk period.
  • There should be few or no losses (some authors use >10% losses as a cut-point that casts doubt on validity).

When Risk-Based Designs Work Best

Risk-based designs work best for diseases with a relatively short risk period (e.g., acute infections, post-surgical complications). For chronic diseases such as many cancers, where the risk period is lifelong and often longer than feasible follow-up, a rate-based design is preferred.

2×2 Table: Risk-Based Cohort Design

             | Exposed | Non-exposed | Total
Diseased     | a1      | a0          | m1
Non-diseased | b1      | b0          | m0
Total        | n1      | n0          | n

We select n1 exposed and n0 non-exposed individuals (free of disease) from the source population, follow them for the full follow-up period, and observe a1 exposed cases and a0 non-exposed cases. The two risks of interest are:

R1 = a1 / n1      and      R0 = a0 / n0      (Eq 8.1)

The Denominator

In risk-based designs, the denominator is the number of subjects in each exposure category. This is only valid because every subject is observed for the full risk period — otherwise, who you count and who you don’t would depend on follow-up time.
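
The risk calculation in Eq 8.1 can be sketched directly from the four counts in the 2×2 table. The counts below are hypothetical (they are not taken from any study cited in this lesson), and the confidence interval uses a common large-sample Wald approximation on the log scale, which is an addition for illustration rather than something the text prescribes.

```r
# Hypothetical counts from a closed cohort (not from any study cited here).
a1 <- 40; n1 <- 500    # exposed: cases, subjects
a0 <- 15; n0 <- 500    # non-exposed: cases, subjects

R1 <- a1 / n1          # Eq 8.1, risk in the exposed
R0 <- a0 / n0          # Eq 8.1, risk in the non-exposed

RR <- R1 / R0          # risk ratio
RD <- R1 - R0          # risk difference

# Wald interval for the risk ratio, computed on the log scale
se_log_rr <- sqrt(1 / a1 - 1 / n1 + 1 / a0 - 1 / n0)
ci_rr     <- exp(log(RR) + c(-1, 1) * qnorm(0.975) * se_log_rr)

round(c(R1 = R1, R0 = R0, RR = RR, RD = RD,
        lower = ci_rr[1], upper = ci_rr[2]), 3)
```

Note that the denominators are simply `n1` and `n0`; the calculation is only meaningful if every one of those subjects was observed for the full risk period.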

Risk-based designs are conceptually clean but operationally fragile — their assumptions break the moment people leave the cohort or the risk period extends beyond a few months. The rate-based alternative was developed precisely to handle the populations where those assumptions do not hold.

Rate-Based (Incidence Density) Designs

In many cohort studies, not every subject is under observation for the full risk period — especially when:

  • The source population is dynamic (subjects enter and leave).
  • The follow-up period is long.
  • Subjects are added part-way through the biological risk period.
  • A significant proportion of subjects withdraw from the study.
  • Exposure status itself changes during the study.

In these situations, we cannot just count exposed and non-exposed subjects. Instead, we accumulate the amount of ‘at-risk time’ contributed by each subject in each exposure category. The denominator becomes person-time, not persons.

2×2 Table: Rate-Based Cohort Design

                    | Exposed | Non-exposed | Total
Diseased            | a1      | a0          | m1
Person-time at risk | t1      | t0          | T

Each subject contributes ‘at-risk’ time until they develop the disease, are lost to follow-up, or the study ends. The two rates of interest are:

I1 = a1 / t1      and      I0 = a0 / t0      (Eq 8.2)
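
A small sketch of how person-time accumulates may help here. The eight follow-up records below are hypothetical, as are the column names (`entry`, `exit`, `event`); the point is only that each subject contributes the time between entering observation and leaving it, whatever the reason for leaving.

```r
# Hypothetical follow-up records: one row per subject.
# 'entry' and 'exit' are years since study start; 'event' is 1 if follow-up
# ended with disease, 0 if the subject was lost or the study ended (censored).
cohort <- data.frame(
  id      = 1:8,
  exposed = c(1, 1, 1, 1, 0, 0, 0, 0),
  entry   = c(0.0, 0.0, 1.0, 2.0, 0.0, 0.5, 1.0, 0.0),
  exit    = c(2.5, 5.0, 3.0, 5.0, 5.0, 5.0, 4.0, 3.5),
  event   = c(1,   0,   1,   0,   0,   1,   0,   0)
)

# Each subject contributes at-risk time from entry until disease, loss, or study end.
cohort$time_at_risk <- cohort$exit - cohort$entry

cases       <- tapply(cohort$event,        cohort$exposed, sum)   # a0, a1
person_time <- tapply(cohort$time_at_risk, cohort$exposed, sum)   # t0, t1

rates <- cases / person_time     # Eq 8.2: I0 and I1, per person-year
IRR   <- rates["1"] / rates["0"]

rates
IRR
```

In this toy data the exposed group contributes 12.5 person-years with 2 cases and the non-exposed group 16 person-years with 1 case, so late entrants and early exits are counted for exactly the time they were observed.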

Choice of Analysis

If follow-up is relatively short and rates are reasonably constant, Poisson models are appropriate. If follow-up is long and the assumption of a constant rate is not tenable, survival analysis (e.g., Cox proportional hazards) is preferred (Cox, 1972; see Chapter 19).
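
For the constant-rate case, a Poisson model with log person-time as an offset is one way to estimate the rate ratio. The sketch below reuses the simulated counts from the Section 1 R box (80 events in 4500 person-years versus 30 in 4900); the data-frame layout and variable names are assumptions for the example.

```r
# Aggregated cohort data (the same simulated counts used in the Section 1 R box).
d <- data.frame(
  exposed = c(1, 0),
  events  = c(80, 30),
  py      = c(4500, 4900)
)

# Poisson model for counts with log(person-time) as an offset, so the
# coefficients describe log incidence rates rather than log counts.
fit <- glm(events ~ exposed + offset(log(py)), family = poisson, data = d)

irr    <- exp(coef(fit)["exposed"])                # incidence rate ratio
irr_ci <- exp(confint.default(fit)["exposed", ])   # Wald 95% interval

round(c(IRR = unname(irr), lower = unname(irr_ci[1]), upper = unname(irr_ci[2])), 3)
```

With a single binary exposure the model reproduces the hand-calculated IRR; its value lies in adding covariates. For long follow-up where a constant rate is not plausible, the analogous tool would be a Cox model (for instance `coxph()` in the survival package), which is not shown here.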

You have now seen both designs from the inside. The next subsection puts them side by side; read it as a decision aid for matching design to research situation, not as a statement that one design is generally better than the other.

Comparing the Two Designs

Risk-based designs are best when:

  • The population is closed (fixed cohort).
  • The risk period is short (so all subjects can be observed for the full period).
  • Losses to follow-up are minimal (under ~10%).
  • Examples: acute outbreaks, surgical complications within 30 days, hospital readmissions within 14 days (Example 8.1).

Rate-based designs are best when:

  • The population is open (dynamic).
  • Follow-up is long or the risk period is chronic.
  • Subjects enter or leave the study at different times.
  • Exposure status may change during follow-up.
  • Examples: rugby injury rates over a season (Example 8.6), invasive breast cancer over 10+ years (Example 8.7), fracture incidence over decades (Example 8.8).

Risk denominator: the number of subjects in each exposure category. Counts people.

Rate denominator: the cumulative person-time at risk in each exposure category. Counts time.

This means risk is dimensionless (a proportion between 0 and 1), while a rate has units of cases per person-time (e.g., 4.0 per 1,000 person-years).
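
The two measures can be linked under simplifying assumptions. If the rate is constant over time and there are no losses or competing risks, the risk over a window of length t is approximately 1 − exp(−rate × t). The sketch below applies that relationship to the 4.0 per 1,000 person-years figure above; the follow-up windows are illustrative choices.

```r
rate_per_py <- 4.0 / 1000     # 4.0 cases per 1,000 person-years (value from the text)
years       <- c(1, 5, 10)    # illustrative follow-up windows (assumed)

# Assumes a constant rate and no competing risks or losses.
risk <- 1 - exp(-rate_per_py * years)

# For rare outcomes and short windows, risk is close to rate x time.
approx_risk <- rate_per_py * years

round(data.frame(years, risk, approx_risk), 4)
```

The gap between the two columns widens as the window lengthens, which is one reason risk and rate ratios drift apart for common outcomes or long follow-up.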

Key Examples

The four examples below illustrate the design choice with real published studies. The first two are risk-based; the last two are rate-based. As you expand each, ask yourself why the investigators chose what they chose — in every case the source population's behaviour and the follow-up window's length will be doing most of the work.

Example 8.4 — Risk-Based (Time-of-Day & Surgical Complications)

Kelz et al (2009) compared morbidity and mortality following 56,000+ general and vascular surgical procedures (2001–2004). Time of operation was grouped into seven 2-hour periods. Mortality within 30 days was modestly associated with start times after 9:30 pm (OR = 1.22), and morbidity had OR = 1.32 for late-night surgeries. However, when emergency cases were excluded, no odds ratios were significant. The excess crude risk was largely explained by the nature of the clinical cases, an important reminder about confounding by indication.
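
The mechanism behind that reminder can be shown with a toy stratified table. The counts below are invented for illustration and are not taken from Kelz et al: late-night surgery has no effect within either stratum, but late-night lists contain proportionally more emergency cases, and emergency cases carry higher mortality.

```r
# Hypothetical counts (invented to illustrate the mechanism, not taken from Kelz et al).
# One row per stratum x exposure group.
surg <- data.frame(
  stratum = c("elective", "elective", "emergency", "emergency"),
  late    = c(0, 1, 0, 1),               # 1 = late-night start
  deaths  = c(90, 10, 50, 50),
  n       = c(9000, 1000, 1000, 1000)
)
surg$alive <- surg$n - surg$deaths

# Odds ratio for late vs. not late, pooling whatever rows are passed in.
odds_ratio <- function(d) {
  e <- d[d$late == 1, ]
  u <- d[d$late == 0, ]
  (sum(e$deaths) / sum(e$alive)) / (sum(u$deaths) / sum(u$alive))
}

crude_or   <- odds_ratio(surg)                                # ignores case type
stratum_or <- sapply(split(surg, surg$stratum), odds_ratio)   # within each case type

round(c(crude = crude_or, stratum_or), 2)
```

In this constructed example the crude odds ratio is about 2.18 while both stratum-specific odds ratios are exactly 1, so the entire crude association is produced by the mix of cases rather than by the time of day.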

Example 8.5 — Risk-Based (Cervical Screening in HIV-Positive Women)

Leece et al (2010) followed approximately 250 HIV-positive women receiving care at the Ottawa Hospital General Campus Immunodeficiency Clinic (2002–2005). The outcome was undergoing cervical screening; predictors included demographics, HIV status, and primary care provider status. Analysis combined χ² tests with logistic regression. The 12 women without a primary-care provider were less likely to undergo screening (RR = 1.6) than the 84 women with providers. The authors noted that abnormal screening results were common and that recent low CD4 cell count was the only significant predictor.

Example 8.6 — Rate-Based (Rugby Injury Rates)

Chalmers et al (2011) followed 704 male amateur rugby players (aged 13+) over a season. The ‘time’ component was a game, with a total of 6,263 player-games of follow-up. Exposures included age, ethnicity, experience, BMI, smoking, previous injury, training, weather, ground conditions, foul play, and protective equipment. Because rates were reasonably constant over the period, Poisson regression was appropriate. Notable findings: Pacific Island vs. Maori ethnicity (IR = 1.5), ≥40 hours of strenuous activity weekly (IR = 1.5), playing while injured (IR = 1.5), foul play (IR = 1.9), and headgear use (IR = 1.2).

Example 8.7 — Rate-Based (Smoking & Breast Cancer)

Luo et al (2011) drew on the Women’s Health Initiative Observational Study: 90,000+ women aged 50–79 followed across 40 US clinical centres. Smoking exposure was characterised in detail (status, age started, age quit, cigarettes/day, pack-years). Over an average of 10.3 years of follow-up, 3,520 incident invasive breast cancers were identified. Because of the long follow-up, Cox proportional hazards models (rather than Poisson) were used. Findings: HR = 1.09 for former smokers and HR = 1.16 for current smokers. Among lifetime non-smokers, only those with the highest passive-smoke exposure had increased risk; no significant dose-response trend was seen.

Key Takeaways

  • Risk-based designs use number of subjects as the denominator and require a closed cohort followed for the full risk period.
  • Rate-based designs use person-time as the denominator and accommodate dynamic populations and variable follow-up.
  • The choice between Poisson and Cox proportional hazards depends on whether the rate is reasonably constant or changes substantially over follow-up.
  • Initial sample-size calculations can be done assuming a risk-based design even when the analysis will ultimately be rate-based.

The reflection below asks you to apply the choice from this section to a specific long-running occupational cohort. After working through it, Section 3 turns to a problem that stays mostly hidden in textbook treatments: how do you actually measure exposure when it can change over years or decades of follow-up?

Reflection

Suppose you are studying the effect of a workplace exposure (e.g., shift work) on cardiovascular disease over 20 years. Workers can join or leave the company at any time. Which design (risk-based or rate-based) is more appropriate, and why? What practical issues would arise that wouldn’t arise in a 14-day hospital readmission study?

Model answer: Rate-based is appropriate. Workers entering and leaving the company across a 20-year window create an open cohort where person-time, not headcount, is the natural denominator. Practical issues: (a) defining exposure time: how to handle workers who move in and out of shift work; (b) healthy-worker effect: people who are hired and remain employed tend to be healthier than the general population, biasing toward the null; (c) healthy-worker survivor effect: those who keep doing shift work are those who tolerate it, dynamically depleting the susceptible from the exposed group; (d) time-varying confounders like BMI or hypertension that may be affected by past shift work; (e) loss to follow-up when workers leave the company. None of these arise in a 14-day readmission study because the window is too short for healthy-worker dynamics to matter, the cohort stays effectively closed, and confounder profiles are fixed at index admission. Methods: standardised mortality ratios with internal comparators, g-methods for time-varying exposure, sensitivity analyses for unmeasured occupational confounders.

Knowledge Check — Section 2

1. Which of the following is a key requirement of a risk-based cohort study?

Correct answer: B. Risk-based designs require a closed cohort with all subjects observed for the full risk period; without this, the risk-as-proportion calculation is not valid. Few or no losses are preferred (some authors use >10% losses as a cut-point indicating doubt about validity).

2. In a rate-based cohort study, the denominator of the rate is:

Correct answer: D. Rate-based designs use person-time as the denominator. Each subject contributes time-at-risk until they develop the disease, are lost to follow-up, or the study ends. This accommodates variable follow-up and dynamic populations.

3. Which study design is most appropriate when the source population is dynamic and follow-up is long?

Correct answer: C. Rate-based (incidence density) designs handle dynamic populations and long follow-up periods because they explicitly account for variable time-at-risk per subject. Risk-based designs require all subjects to be observed for the full risk period.

4. When follow-up is long and the assumption of a constant rate is not tenable, which analysis is preferred?

Correct answer: A. When the constant-rate assumption is suspect, survival models such as Cox proportional hazards are the preferred approach. Poisson works when rates are reasonably stable across the follow-up period (e.g., the rugby season in Example 8.6); Cox is preferred for very long follow-up like the 10-year breast cancer study (Example 8.7).
Section 3

The Exposure

⏱ Estimated reading time: 15 minutes

Introduction and Overview

Sections 1 and 2 set up the architecture of the cohort study and the choice of risk-vs.-rate design. Both treated “exposed” as if it were a fixed property of each subject. In real cohorts that is rarely true. People start smoking, quit, switch jobs, change diets; doses accumulate. This section is about how exposure is actually measured and handled when it can vary across years or decades of follow-up — the technical problem that distinguishes a working cohort study from a textbook one.

Learning Objectives

  • Elaborate the principles used to select and measure exposure in cohort studies.
  • Distinguish between permanent and non-permanent exposures.
  • Describe the concept of an induction period and its role in time-at-risk calculations.
  • Identify how dichotomous, ordinal, and continuous exposure scales differ in measurement and analysis.

Why Exposure Measurement Matters

In cohort studies, the objective is to identify the consequences of a specific exposure factor. Exposures can range from study-subject characteristics (sex, age) to infectious or noxious agents, environmental exposures, or food-related factors. Exposures that can be manipulated are of special interest because they lead more directly to disease control.

Measurement Is Not Trivial

Although measuring exposure might seem simple at first glance, careful thought should be given to how it is measured and expressed. Each study should specify what constitutes exposure and, when possible, the ‘induction period’: how long after exposure is completed before disease might reasonably arise. The more complex the exposure, the more important it is to validate the assessment.

Scales of Exposure Measurement

Exposure status can be measured on different scales, each with implications for design and analysis. The four scales listed below run from the simplest binary contrast up to compound measures that combine intensity and duration. The pattern to take away: more granular measurement makes the design more powerful for detecting dose-response, but only if the underlying biology really has a graded effect.

  • Dichotomous
  • Ordinal
  • Continuous
  • Compound

The choice of measurement scale is independent of a second question: does the exposure stay fixed for the rest of the subject's life, or can it change?

Permanent vs. Non-Permanent Exposures

Permanent exposures are time-invariant — factors such as sex, race, or one-time exposures such as a vaccination. These are relatively easy to measure, but a moment’s thought reveals subtleties:

  • Age at exposure may matter (e.g., age at vaccination, age at smoking initiation).
  • Even ‘one-time’ exposures may have a threshold or dosage requirement to count as ‘exposed’.
  • If the disease event occurs before exposure is completed, it should not be counted as an outcome event — the exposure could not have caused it.

In early studies, the goal may be to determine the threshold at which exposure becomes biologically meaningful (Rohan et al, 2007).

Non-permanent exposures change over time: food intake, lifestyle factors, environmental exposures, or any exposure where the timing matters. These add complexity:

  • When did exposure start? (e.g., age started smoking)
  • When did exposure stop? (e.g., age quit smoking)
  • How intense was exposure? (cigarettes per day)
  • How long did it last? (years smoked)

Sometimes a simple summary will suffice (e.g., years smoked); sometimes a compound measure (e.g., pack-years) is needed. The more information collected, the more credibly causal relationships can be inferred and the more useful the findings are for prevention.
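
As a minimal sketch of a compound measure, pack-years can be computed from intensity and duration. The calculation uses the usual convention that one pack holds 20 cigarettes; the two smokers and the function name are hypothetical.

```python
def pack_years(cigarettes_per_day: float, years_smoked: float,
               cigarettes_per_pack: int = 20) -> float:
    """Compound exposure: packs smoked per day multiplied by years smoked."""
    return cigarettes_per_day / cigarettes_per_pack * years_smoked


# Two hypothetical smokers with the same duration but different intensity.
light_smoker = pack_years(cigarettes_per_day=10, years_smoked=30)
heavy_smoker = pack_years(cigarettes_per_day=40, years_smoked=30)

print(light_smoker, heavy_smoker)
```

A simple summary such as years smoked would treat these two subjects as identical; the compound measure separates them.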

Both permanent and non-permanent exposures share a complication that becomes visible only when you start counting person-time: the gap between when exposure happens and when disease can plausibly result.

The Induction Period & Time-at-Risk

An important concept in cohort design is the induction period — the time after exposure is completed before disease can reasonably arise.

[Figure: timeline of life experience. Exposure occurs until exposure is complete; the induction period follows; the at-risk period begins once the induction period is over.]

Figure 8.1 — Life experience: exposure, induction period, and time-at-risk.

Handling the Induction Period

If there is a known induction period following completion of exposure, then until that period is over, the time-at-risk of ‘exposed’ individuals should be added to the non-exposed group. Some researchers prefer to discard disease experience during the induction period because of uncertainty about its duration; this is often the safest choice provided sufficient time-at-risk remains in the exposed group to maintain precision.

Changing Exposure Status

If exposure status changes during follow-up, an individual subject can contribute time-at-risk to both exposed and non-exposed categories:

  • Previously non-exposed subjects contribute to the exposed category after the exposure threshold is reached.
  • Previously exposed subjects contribute to the non-exposed category after any lag effects have ended.
  • If a subject develops the disease, the exposure category assigned is the one they were in at the time the outcome occurred.
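
The allocation rules above can be sketched for a single subject who moves from non-exposed to exposed. This is an illustration under stated assumptions, not a definitive implementation: times are in years since the start of the study, the function and field names are hypothetical, and the reverse transition (exposed to non-exposed after lag effects end) is not handled.

```python
from typing import NamedTuple, Optional


class Allocation(NamedTuple):
    unexposed_time: float
    exposed_time: float
    discarded_time: float
    event_category: Optional[str]  # category at the time of the outcome, if any


def allocate_time(start: float, end: float, had_event: bool,
                  exposure_complete: Optional[float] = None,
                  induction: float = 0.0,
                  discard_induction: bool = False) -> Allocation:
    """Split one subject's follow-up between exposure categories."""
    if exposure_complete is None:
        # Never exposed: all follow-up is non-exposed time.
        exposure_start = at_risk_start = end
    else:
        # Clip both boundaries to the subject's own follow-up window.
        exposure_start = min(max(exposure_complete, start), end)
        at_risk_start = min(max(exposure_complete + induction, start), end)

    before = exposure_start - start            # before exposure is complete
    induction_time = at_risk_start - exposure_start
    exposed = end - at_risk_start              # after the induction period

    if discard_induction:
        unexposed, discarded = before, induction_time
    else:
        # Default rule: induction-period time is added to the non-exposed group.
        unexposed, discarded = before + induction_time, 0.0

    category = None
    if had_event:
        # Assign the event to the category the subject was in when follow-up ended.
        if exposed > 0:
            category = "exposed"
        elif induction_time > 0:
            category = "discarded" if discard_induction else "unexposed"
        else:
            category = "unexposed"
    return Allocation(unexposed, exposed, discarded, category)


# Hypothetical subject: exposure complete at year 4, 2-year induction period,
# outcome at year 10.
added = allocate_time(0.0, 10.0, True, exposure_complete=4.0, induction=2.0)
dropped = allocate_time(0.0, 10.0, True, exposure_complete=4.0, induction=2.0,
                        discard_induction=True)
print(added)
print(dropped)
```

The `discard_induction` flag switches between the two options described in the text: adding the induction-period experience to the non-exposed group or discarding it.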

Losses to Follow-Up

For subjects lost to follow-up, time-at-risk accumulates until the last date their exposure status is known. If the precise time of loss is unknown, the midpoint of the last known exposure period is conventionally used.

One last twist on the meaning of “exposure” is worth flagging before we move on, because it shows up surprisingly often in chronic-disease cohorts.

Disease as Exposure

Disease itself can serve as an exposure for other outcomes such as additional diseases, mortality, or quality-of-life measures. Lazo et al (2011) followed 11,000+ adults for 18 years using non-alcoholic fatty-liver disease (NAFLD) as the exposure. Those with NAFLD, whether or not they had elevated liver enzymes, had a mortality hazard similar to that of those without NAFLD, an interesting null result.

Key Examples

The three published cohorts below illustrate the full range of exposure handling we just covered: a continuous diet exposure validated against multiple recalls, a multi-exposure design with biospecimens, and a multinational study that pushes from estimated relative risk to population-attributable fraction. Expand each one and notice which exposure-measurement decisions shape the rest of the design.

Example 8.8 — Continuous Exposure (Calcium Intake & Fracture)

Warensjo et al (2011) studied women in the Swedish Mammography Cohort. Calcium intake from diet, supplements (1 dose = 500 mg), and multivitamins (1 dose = 120 mg) was the major exposure. Total calcium intake correlated well (r = 0.77) between food-frequency questionnaire and 14 repeated 24-hour recalls, an example of careful exposure validation. Cumulative dietary calcium intake (in quintiles) was related to fracture and osteoporosis incidence using Cox proportional hazards and logistic models. Findings: chronically low dietary calcium intake was associated with increased fracture and osteoporosis. Above the base level, only minor differences were observed; in the highest-intake group, hip fracture rate was somewhat increased.

Example 8.9 — Multiple Exposures (Canadian Diet, Lifestyle & Health Study)

Rohan et al (2007) recruited alumni from three Ontario universities (1995–1998). The major outcome was new cancer incidence. Participants completed lifestyle and food-frequency questionnaires, measured waist and hip circumferences, and provided hair and toenail specimens (for trace element and DNA analysis). Of the 73,000+ recruits, 97% provided biological specimens. Exposures included exercise, lifestyle factors, molecular markers, and dietary characteristics. The paper includes a particularly good discussion of creating compound nutritional variables from food-frequency data and verifying them.

Example 8.10 — Population Attributable Fraction (Alcohol & Cancer)

Schutze et al (2011) reported on the European Prospective Investigation into Cancer and Nutrition (EPIC; Riboli et al, 2002), a multicentre prospective cohort that recruited 520,000 men and women aged 35–70 from 10 European countries (1992–2000). Alcohol consumption was measured at recruitment in grams/day and classified as never, former, or lifetime consumer. Cancer incidence was identified through cancer registries (follow-up ended 2002–2005). The analysis combined regression coefficients with population prevalence of consumption above recommended levels. If causality is assumed, alcohol consumption beyond recommended levels was responsible for an estimated 10% of all cancers in men and 3% of all cancers in women.
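
To show how a relative risk and an exposure prevalence combine into a population attributable fraction, the sketch below uses Levin's formula, PAF = p(RR − 1) / [1 + p(RR − 1)]. This is a simpler, unadjusted calculation than the regression-based approach described for EPIC, and the prevalence and relative risk are hypothetical values, not results from that study.

```python
def population_attributable_fraction(prevalence: float,
                                     relative_risk: float) -> float:
    """Levin's formula: share of cases attributable to the exposure,
    assuming the association is causal and unconfounded."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 20% of the population exposed, RR = 1.5.
paf = population_attributable_fraction(prevalence=0.20, relative_risk=1.5)
print(f"PAF = {paf:.1%}")
```

The same relative risk yields a larger attributable fraction when the exposure is more common, which is why the population prevalence enters the calculation.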

Key Takeaways

  • Exposure can be dichotomous, ordinal, continuous, or compound — choose the scale that best captures the biology.
  • Permanent exposures (sex, race, vaccination) are easier but still require careful operational definitions.
  • Non-permanent exposures require timing, intensity, and duration data; the more detail, the more credible the inferences.
  • Time before the induction period ends should not be counted as ‘exposed’ time-at-risk.
  • Subjects can contribute time-at-risk to multiple exposure categories if their status changes.
  • Disease itself can serve as an exposure for downstream outcomes.

The reflection below puts the measurement-scale decision into a real research scenario. After working through it and the knowledge check, Section 4 closes the lesson by addressing the four practical questions that any cohort investigator faces once exposure is settled: keeping the groups comparable, managing the follow-up period, ascertaining outcomes, and analysing the data.

Reflection

Imagine designing a cohort study of physical activity and cardiovascular disease. How would you measure activity — dichotomous (active vs. inactive), ordinal (low/moderate/high), or continuous (MET-hours/week)? What might be lost or gained at each level of measurement? Consider how you would handle people whose activity changes substantially during follow-up.

Model answer: Continuous (MET-hours/week) is the most informative measurement and the one to default to: it preserves dose-response information, can be analysed flexibly (linear, splines, categorical), and avoids losing power to arbitrary cut-offs. A dichotomous active/inactive split throws away most of the signal and depends on the cut-off you choose; an ordinal three-level scheme is a reasonable compromise for reporting. Whatever you record, store it continuous and categorise only for presentation. People whose activity changes substantially: model it as a time-varying exposure (each interval gets its own MET-hours value), and pre-specify how you'll handle measurement protocols (e.g., re-administered every 2 years). If reverse causation is a worry (people reduce activity because they're getting sick), use a lag (e.g., assign exposure status from 2 years before the outcome window) and run sensitivity analyses for different lag lengths.

Knowledge Check — Section 3

1. The induction period refers to:

Correct answer: C. The induction period is the time after exposure is completed before disease can reasonably arise from that exposure. During this period, the time-at-risk of ‘exposed’ individuals should be added to the non-exposed group, or this experience may be discarded.

2. Which exposure measurement is an example of a compound variable?

Correct answer: B. Pack-years is a compound variable that combines duration and intensity of exposure. Luo et al (2011) used this measure in their study of smoking and breast cancer (Example 8.7).

3. If a subject’s exposure status changes during the study, how is their time-at-risk handled?

Correct answer: D. When exposure status changes, the subject contributes time-at-risk to both categories. Time before the change goes to the original category; after the change (allowing for any lag effects) goes to the new category. If they develop the disease, they are assigned to whichever category they were in at the time of the outcome.

4. Why is categorising a continuous exposure variable often discouraged in analysis?

Correct answer: A. Categorising a continuous exposure variable usually results in loss of information. When possible, modelling the exposure-outcome relationship on the continuous scale (e.g., dose-response) preserves more of the underlying information.
Section 4

Comparability, Follow-up, Outcomes & Analysis

⏱ Estimated reading time: 18 minutes

Introduction and Overview

Sections 1–3 walked through the design choices: cohort architecture, risk-vs.-rate handling, and exposure measurement. With those locked in, the remaining work is operational. Four practical questions structure this section in turn: how do you make sure the exposed and non-exposed groups are comparable on the variables that aren't your exposure of interest, how long should you follow them, how do you confirm the outcome happened, and how do you analyse what you collect?

Learning Objectives

  • Identify the three approaches to ensuring comparability of exposed and non-exposed groups.
  • Describe principles for unbiased follow-up and outcome ascertainment.
  • Recognise appropriate analytic approaches for risk-based and rate-based cohort designs.
  • Apply STROBE reporting guidelines to a cohort study.

Ensuring Exposed & Non-Exposed Groups Are Comparable

If exposed and non-exposed groups are not comparable (i.e., not exchangeable) with respect to factors related to both exposure and outcome, a biased (confounded) assessment results (Klein-Geltink et al, 2007).

The Achilles Heel of Observational Studies

As Hernan (2012) notes, this is a key reason to prefer randomised experiments — exchangeability is expected when exposure is randomised. In observational studies, investigators must use expert knowledge to identify and measure all potential confounders in hopes of achieving exchangeability conditional on the measured covariates. Unfortunately, exchangeability cannot be empirically tested, so we never know with certainty whether we have succeeded.

Three approaches help ensure comparability:

  • Restriction
  • Matching
  • Analytic Control

Comparability addresses the cross-sectional baseline. The next question is what happens over time — specifically, how long the follow-up should last and what counts as time-at-risk for each subject.

Follow-Up Period

To enhance validity, the follow-up process must be as complete as possible and unbiased with respect to exposure. Achieving unbiased follow-up may require some form of blinding to exposure status:

  • In prospective studies: assign follow-up tasks to researchers unaware of exposure status.
  • In retrospective studies: keep record reviewers unaware of exposure status when possible.

Active vs. Passive Surveillance

With passive surveillance, cases are identified when they present (e.g., date of first symptoms or physician examination). With active surveillance and regular evaluation of subjects, more accurate timing of outcome occurrence is feasible. Tooth et al (2005) recommend enumerating the at-risk population at specified times during the study period.

Losses to follow-up are a perennial concern. Chang et al (2009) describe shared parameter models to reduce bias when losses are not randomly distributed.

Even careful follow-up only matters if the outcome itself is measured well. Cohort studies tend to be ambitious about exposure characterization but surprisingly casual about outcome ascertainment, and the next subsection is the corrective.

Measuring the Outcome

Although the most frequent outcome in cohort studies is the occurrence of a specific disease (measured as risk or rate), outcomes can also be:

  • Ordinal: e.g., none, mild, moderate, severe.
  • Continuous: e.g., birth weight, blood pressure, BMI, quality of life index (Example 8.2).

Harley et al (2011), for instance, examined polybrominated diphenyl ethers (PBDEs — a flame retardant) and infant birth weight, length, and head circumference — both exposure and outcome on continuous scales.

Diagnostic Criteria

Each study needs explicit protocols for determining outcome occurrence and timing. Clear diagnostic criteria minimise diagnostic errors. In retrospective studies, this can be challenging when only summary diagnostic information is available.

Incidence Requires Two Examinations

Strictly, measuring incidence requires:

  1. An examination at the start of follow-up to ensure subjects do not yet have the disease.
  2. A second examination to determine whether (and when) the disease developed.

Why Incidence, Not Prevalence?

Including only new disease events avoids the reverse-causation problem from measuring prevalence and ensures that associations are not biased by duration-of-disease effects and survival bias (see Chapter 12). In retrospective studies, freedom from disease at the start of follow-up often must be assumed; in prospective studies, it should be formally verified.

If clinical diagnostic data are used, the incident date is usually based on the time of diagnosis, not the time of underlying disease occurrence — an important caveat for diseases with long subclinical phases. If subjects are screened at regular intervals, the time of disease occurrence is conventionally placed at the midpoint between examinations.
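
The midpoint convention for regularly screened subjects can be sketched with the standard library. The examination dates are hypothetical, and so is the function name.

```python
from datetime import date


def midpoint_onset(last_negative: date, first_positive: date) -> date:
    """Place disease onset halfway between the last disease-free
    examination and the first examination at which disease was found."""
    if first_positive < last_negative:
        raise ValueError("first_positive must not precede last_negative")
    # Integer division keeps the result a whole number of days.
    return last_negative + (first_positive - last_negative) // 2


# Hypothetical subject: disease-free on 1 Jan 2020, diagnosed on 1 Jul 2020.
onset = midpoint_onset(date(2020, 1, 1), date(2020, 7, 1))
print(onset)
```

The subject's time-at-risk would then end at `onset` rather than at the date of the examination that detected the disease.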

Multiple Outcomes

The Multiple-Comparisons Problem

One advantage of cohort studies is the ability to assess multiple outcomes from a given exposure. However, with multiple outcomes, some may be statistically significant by chance alone. The remedy is either to consider the study as hypothesis-generating rather than hypothesis-testing, or to apply a penalty to the P-value threshold — unless outcomes were specified a priori.
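
One common way to apply a penalty to the P-value threshold is the Bonferroni correction, which divides the threshold by the number of outcomes tested. The text does not name a specific method, so this choice is an assumption, and the outcome names and P-values below are hypothetical.

```python
def bonferroni_threshold(alpha: float, n_outcomes: int) -> float:
    """Per-outcome significance threshold after a Bonferroni penalty."""
    return alpha / n_outcomes


# Hypothetical P-values for five outcomes examined in one cohort.
p_values = {
    "outcome_A": 0.004,
    "outcome_B": 0.030,
    "outcome_C": 0.200,
    "outcome_D": 0.012,
    "outcome_E": 0.600,
}

threshold = bonferroni_threshold(alpha=0.05, n_outcomes=len(p_values))
unadjusted = [name for name, p in p_values.items() if p < 0.05]
adjusted = [name for name, p in p_values.items() if p < threshold]

print("Significant at 0.05:", unadjusted)
print("Significant after penalty:", adjusted)
```

In this made-up example three outcomes pass the unadjusted threshold but only one survives the penalty, which is the pattern the multiple-comparisons problem warns about.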

Comparability, follow-up, and outcome ascertainment are all design-stage decisions. The analysis stage is where those decisions translate into a quantitative result — and where pre-specified analysis plans (the EGAP-style discipline from Lesson 1) earn their keep, because cohort data tempt almost limitless slicing.

Analysis

For closed populations, average risk and survival times can be measured during follow-up.

  • Bivariable: methods in Chapter 6.
  • Stratified analysis to control confounding: Chapter 13.
  • Multivariable: traditionally logistic regression, which uses odds ratios as the base measure.
  • For estimating risk ratios directly in multivariable settings: see Section 18.4.1.
  • For risk differences as the association measure: linear regression as described by Cheung (2007).
  • For population attributable fractions: log-linear models (Cox, 2006) or various models including logistic, log-linear, and Poisson (Greenland, 2004; Example 8.10).
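
As a sketch of stratified analysis for a risk-based design, the code below compares a crude risk ratio with a Mantel-Haenszel risk ratio pooled across two strata of a confounder. The Mantel-Haenszel estimator is one standard choice and may not be the only one covered in Chapter 13; the counts are hypothetical and were constructed so that the stratum-specific risk ratio is 2.0 in both strata.

```python
from typing import List, Tuple

# (exposed cases, exposed total, unexposed cases, unexposed total)
Stratum = Tuple[int, int, int, int]


def crude_risk_ratio(strata: List[Stratum]) -> float:
    """Risk ratio after collapsing all strata into a single 2x2 table."""
    a = sum(s[0] for s in strata)
    n1 = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    n0 = sum(s[3] for s in strata)
    return (a / n1) / (c / n0)


def mantel_haenszel_risk_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel pooled risk ratio across strata."""
    numerator = sum(a * n0 / (n1 + n0) for a, n1, c, n0 in strata)
    denominator = sum(c * n1 / (n1 + n0) for a, n1, c, n0 in strata)
    return numerator / denominator


# Hypothetical data: exposure is much more common in the high-risk stratum.
strata = [
    (10, 100, 20, 400),   # low-risk stratum: risks 10% vs 5%
    (160, 400, 20, 100),  # high-risk stratum: risks 40% vs 20%
]

print(f"Crude RR: {crude_risk_ratio(strata):.2f}")
print(f"Mantel-Haenszel RR: {mantel_haenszel_risk_ratio(strata):.2f}")
```

The crude estimate is inflated because exposed subjects are concentrated in the high-risk stratum; pooling within strata recovers the common stratum-specific value.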

For open populations, rates measure disease frequency. The choice of model depends mainly on whether the rate stays reasonably constant over follow-up:

  • If the rate is reasonably constant over follow-up: Poisson regression is appropriate. Subject time-at-risk is included as the offset.
  • If the rate varies substantially over follow-up: Cox proportional hazards models are preferred (most rate-based cohort analyses in the medical literature use these).
  • For grouped data with multivariable analysis: Poisson with offsets gives direct incidence rate ratio estimates.
  • If time of occurrence matters more than just whether the outcome occurred: survival models (Case et al, 2002).
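
For a single dichotomous exposure, the incidence rate ratio that a Poisson model with a log person-time offset would estimate equals the crude ratio of the two rates, so it can be sketched without a regression library. The confidence interval below is the usual Wald interval on the log scale; the counts and person-years are hypothetical.

```python
import math
from typing import Tuple


def incidence_rate_ratio(cases_exposed: int, time_exposed: float,
                         cases_unexposed: int, time_unexposed: float,
                         z: float = 1.96) -> Tuple[float, float, float]:
    """Crude incidence rate ratio with a Wald confidence interval."""
    irr = (cases_exposed / time_exposed) / (cases_unexposed / time_unexposed)
    # Standard error of ln(IRR) depends only on the two case counts.
    se_log = math.sqrt(1.0 / cases_exposed + 1.0 / cases_unexposed)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Hypothetical cohort: 30 cases in 10,000 exposed person-years,
# 15 cases in 10,000 unexposed person-years.
irr, lower, upper = incidence_rate_ratio(30, 10_000.0, 15, 10_000.0)
print(f"IRR = {irr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With several covariates the same quantity would come from a fitted Poisson model, where person-time enters as the offset rather than as a divisor.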

Callas et al (1998) compared proportional hazards, Poisson, and logistic models, concluding that the first two are preferable to logistic for cohort data — a finding confirmed by Greenland (2004).

Hernan (2010) describes two drawbacks to hazard ratios (HRs):

  1. Average HRs can change over time. The reported average HR depends on the duration of follow-up.
  2. Period-specific HRs have a built-in bias. The HR at time t is conditional on not having developed the outcome before time t. As follow-up lengthens, susceptible people in the exposed group are progressively depleted, so the apparent risk in the exposed group decreases relative to the non-exposed group.
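
The depletion of susceptibles behind the second drawback can be illustrated with a toy calculation. This is a sketch under strong assumptions, not Hernan's own example: each arm is a 50/50 mixture of a susceptible and a resistant subgroup with constant hazards, and exposure doubles every individual's hazard. All numbers are hypothetical.

```python
import math
from typing import Dict, List


def population_hazard(t: float, hazards: List[float],
                      proportions: List[float]) -> float:
    """Hazard among survivors at time t in a mixture of subgroups,
    each with its own constant hazard."""
    surviving = [p * math.exp(-h * t) for h, p in zip(hazards, proportions)]
    event_rate = sum(h * s for h, s in zip(hazards, surviving))
    return event_rate / sum(surviving)


# Hypothetical baseline hazards per year: susceptible 0.20, resistant 0.02.
baseline = [0.20, 0.02]
proportions = [0.5, 0.5]
individual_hr = 2.0
exposed = [h * individual_hr for h in baseline]

# Period-specific hazard ratio among those still outcome-free at each time.
period_hr: Dict[int, float] = {
    t: population_hazard(t, exposed, proportions)
       / population_hazard(t, baseline, proportions)
    for t in (0, 5, 10)
}

for t, hr in period_hr.items():
    print(f"year {t:>2}: period-specific HR = {hr:.2f}")
```

Every individual's hazard is doubled throughout, yet the ratio among survivors falls below 2.0 after baseline because the exposed arm loses its susceptible members faster than the non-exposed arm.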

Hernan describes how to circumvent these problems using covariate-adjusted survival curves. Hernan et al (2008) propose subdividing follow-up into shorter intervals and treating each as a ‘trial’ — an analytic strategy that more closely approximates a randomised trial. Danaei et al (2011) elaborate this approach.

Time-Dependent Confounders

Gran et al (2010) describe a sequential Cox technique for data with time-dependent confounders — covariates that are affected by past exposure and predict future exposure and outcome (e.g., CD4 count when assessing HIV treatment effects).

Repeated Measurements

Xue et al (2010) explain how marginal and mixed-effect models can handle cohort data with repeated measurements of exposure and covariates, contrasting these with logistic and proportional hazards approaches.

The last operational concern, as in Lessons 3 and 4, is reporting. STROBE returns one more time, now with the items that matter specifically for cohort designs.

Reporting of Cohort Studies (STROBE)

The STROBE statement (von Elm et al, 2007) provides reporting guidelines for observational studies. Tooth et al (2005) elaborate criteria specific to cohort studies. These should be used both to plan and report studies, and to assess the validity of published work.

Table 8.1 — Selected Criteria for Assessing Cohort Studies

Study definition:

  • Are objectives or hypotheses stated?
  • Are the target population, sampling frame, and study population defined?
  • Are the setting, geographic location, and dates stated?
  • Are eligibility criteria stated?
  • Is the number of participants justified?

Recruitment & participation:

  • Are numbers meeting and not meeting eligibility criteria stated?
  • Are reasons for ineligibility and refusal given?
  • Were responders compared with non-responders?

Measurement:

  • Are methods of data collection stated?
  • Was the reliability and validity of measurement methods mentioned?
  • Were any confounders mentioned?

Follow-up & analysis:

  • Was the number of participants at each stage specified?
  • Were reasons for loss to follow-up quantified?
  • Was the type of analysis stated, including longitudinal methods?
  • Were absolute and relative effect sizes reported?
  • Were confounders and missing data accounted for in analyses?

Discussion:

  • Was the impact of biases assessed (qualitatively or quantitatively)?
  • Did authors relate results to a target population?
  • Was generalisability discussed?

Reporting Study Designs Before Results

Tooth et al (2005) and others note an increased frequency of reporting on the design and implementation of proposed or early-stage studies (e.g., Gern et al, 2009; Hermsen et al, 2011; Origasa et al, 2011; Poulos et al, 2011; Schuz et al, 2011). This is a positive trend — it allows assessment of strengths and weaknesses without the bias that comes from already knowing the results.

Key Takeaways

  • Three tools for comparability: restriction (before selection), matching (at selection), analytic control (during analysis).
  • Exchangeability cannot be empirically tested in observational studies — we always work in some uncertainty.
  • Blinded follow-up reduces bias; active surveillance gives more accurate event timing than passive.
  • Always measure incidence (not prevalence) to avoid reverse-causation, duration, and survival biases.
  • Risk-based studies are typically analysed with logistic or log-linear regression; rate-based studies use Poisson or Cox proportional hazards.
  • Hernan (2010) cautions about hazard ratios changing over time and being conditional on prior survival.
  • STROBE provides a checklist for both planning and assessing cohort studies.

The reflection below is the section's exit ticket — a realistic review-the-paper prompt that requires you to use everything from this section. After working through it and the knowledge check, the final assessment integrates the four content sections into one 15-question exam.

Reflection

You are reviewing a published cohort study reporting that a dietary supplement reduced cardiovascular events with HR = 0.75. Drawing on this section, what specific methodological aspects would you scrutinise to decide whether you trust this result? Consider comparability of groups, follow-up completeness, outcome ascertainment, choice of analysis, and STROBE reporting.

Model answer: HR = 0.75 from an observational cohort needs five scrutiny points. (a) Comparability of groups: was the analysis a new-user / active-comparator design with adjustment for indication, or a prevalent-user design that conflates initiation with healthy adherence? (b) Follow-up completeness: what was the rate of loss to follow-up and was it differential by exposure? Substantial differential loss would bias the HR (typically toward stronger apparent benefit if dropouts are non-responders). (c) Outcome ascertainment: validated endpoints adjudicated blind to exposure, or self-report? Misclassification at the outcome end usually dilutes effects, so an HR away from null with poor measurement is suspicious. (d) Analytic choice: was a Cox model with proportional hazards checked, and were the covariates time-fixed or time-varying? (e) STROBE reporting: pre-specified protocol, registry entry, full reporting of attrition, sensitivity analyses, and confounder lists; if any are missing, the result is provisional.

Knowledge Check — Section 4

1. Which approach to ensuring comparability is applied before subject selection?

Correct answer: B. Restriction (also called restricted sampling or exclusion) is applied before subject selection — we limit the study to subjects with one level of the confounder. Matching is applied at selection; analytic control is applied during analysis.

2. Why are incident cases preferred over prevalent cases for measuring outcomes?

Correct answer: D. Measuring incidence (new events) instead of prevalence avoids the reverse-causation problem and ensures that associations are not biased by duration-of-disease effects and survival bias. Prevalent cases include only those who have lived long enough with the disease to be counted, which distorts the picture.

3. According to Hernan (2010), which is a documented ‘hazard’ of the hazard ratio?

Correct answer: C. Hernan (2010) describes two drawbacks: (1) the average HR is dependent on the duration of follow-up, and (2) period-specific HRs are conditional on the subject not having developed the outcome before time t, which introduces a built-in bias as susceptible people in the exposed group are depleted.

4. According to STROBE-style criteria, which of the following should be reported in a cohort study?

Correct answer: A. STROBE-style criteria (Tooth et al, 2005) emphasise comprehensive reporting: reasons for loss to follow-up, methods of data collection, reliability and validity of measurements, how confounders and missing data were accounted for, and discussion of biases and generalisability.
Section 5

Final Review & Assessment

⏱ Estimated time: 20 minutes

Bringing It All Together

Where Lesson 4 looked backward from outcome to exposure, this lesson followed the arrow forward. Section 1 framed the cohort study as something close to a controlled trial without randomisation: pick a source population, classify exposure, follow people, and watch incidence accumulate. Section 2 then forced the central design choice — risk-based (cumulative incidence) for closed populations followed for a fixed time, or rate-based (incidence density) for open populations where person-time is the natural denominator — and connected each to its analytic toolkit (binomial / log-binomial vs. Poisson / survival).

Sections 3 and 4 zoomed in on the operational details that decide whether a cohort study survives appraisal. How exposure is scaled (dichotomous, ordinal, continuous, compound), whether it changes over time, how the induction period is handled, and how comparability is engineered through restriction, matching, and analytic control all determine whether the rate ratio you eventually report is estimating what you think it is. Blinded outcome ascertainment, careful handling of loss to follow-up, and STROBE-aligned reporting then turn a defensible design into a study other researchers can actually use.

The final reflection asks you to put the entire arc to work as a brief cohort proposal of your own; the 15-question assessment then checks the conceptual content directly. Lesson 6 will pull the camera back further to look at ecological and group-level designs — where the unit of analysis stops being the person — and the comparability and inference logic you just built will keep paying off there.

Key Takeaways from Lesson 5

  • Cohort studies follow exposed and unexposed people forward in time, giving you direct access to incidence — something case-control studies cannot deliver.
  • The choice between risk-based (closed cohort, cumulative incidence) and rate-based (open cohort, incidence density) designs is dictated by the source population and the follow-up structure, not by preference.
  • Person-time is the right denominator whenever follow-up is uneven or membership is dynamic; it makes Poisson and survival analyses possible.
  • Exposure must be measured on a meaningful scale, with explicit handling of induction periods and time-varying status — misclassification here flows through the whole analysis.
  • Comparability is engineered through restriction, matching, and analytic control; blinded outcome ascertainment and rigorous tracking of loss-to-follow-up protect internal validity.
  • STROBE-aligned reporting closes the loop: a cohort study is only as useful as its design, conduct, and analysis are visible to the reader.
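
The person-time and induction-period takeaways can be made concrete with a short R sketch. The three subjects, their follow-up times, and the 1-year lag are hypothetical values chosen for illustration, and the helper name split_person_time is an assumption rather than part of the course script. Time before exposure starts, plus the lag, is counted here as unexposed person-time, which is one of the two options this lesson describes (the other is to discard that experience).

```r
# Hypothetical follow-up records (years since study entry).
# exposure_start = NA means the subject was never exposed.
cohort <- data.frame(
  id             = 1:3,
  exposure_start = c(0, 2, NA),
  exit           = c(5, 5, 4),   # time of event, loss, or end of study
  event          = c(0, 1, 0)    # 1 = outcome occurred at exit
)
lag <- 1  # assumed induction period, in years

split_person_time <- function(d, lag) {
  # Exposed time begins only after exposure start plus the induction lag.
  exposed_from <- ifelse(is.na(d$exposure_start), Inf, d$exposure_start + lag)
  unexposed_py <- pmin(d$exit, exposed_from)
  exposed_py   <- pmax(d$exit - exposed_from, 0)
  # An event is assigned to the category the subject was in when it occurred.
  event_exposed   <- d$event * (d$exit > exposed_from)
  event_unexposed <- d$event * (d$exit <= exposed_from)
  data.frame(
    group  = c("exposed", "unexposed"),
    events = c(sum(event_exposed), sum(event_unexposed)),
    py     = c(sum(exposed_py), sum(unexposed_py))
  )
}

pt <- split_person_time(cohort, lag)
pt
```

Subject 2 illustrates the rule for changing exposure: three years (two before exposure plus the one-year lag) go to the unexposed denominator, the remaining two years and the event go to the exposed group.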
R Activity — Risk, rate, RR/IRR, and a 95% CI for the rate ratio

The companion R script r-activities/HSCI_230_Lesson_5_Cohort_Studies.R walks through a small simulated cohort end-to-end: compute cumulative incidence (risk) and incidence rate (per 1000 person-years) for exposed vs. unexposed groups, derive the risk ratio (RR) and incidence rate ratio (IRR), then build a Wald 95% confidence interval for the IRR on the log scale — the same workflow you will reach for when appraising a published cohort.

# 1000 exposed and 1000 unexposed individuals followed for up to 5 years.
#   exposed:  80 events in 4500 person-years
# unexposed:  30 events in 4900 person-years

events <- c(exposed = 80,   unexposed = 30)
n      <- c(exposed = 1000, unexposed = 1000)
py     <- c(exposed = 4500, unexposed = 4900)

risk <- events / n
rate <- events / py * 1000          # per 1000 person-years

RR  <- risk["exposed"] / risk["unexposed"]   # risk ratio
IRR <- rate["exposed"] / rate["unexposed"]   # incidence rate ratio

round(data.frame(risk, rate), 3)                  # one row per exposure group
round(c(RR = unname(RR), IRR = unname(IRR)), 3)   # ratios compare exposed with unexposed

## -----------------------------------------------------------------------------
## Stretch: 95% CI for the rate ratio (Wald approximation on the log scale)
## -----------------------------------------------------------------------------
log_irr   <- log(IRR)
se_logirr <- sqrt(1/events["exposed"] + 1/events["unexposed"])
ci_irr    <- exp(log_irr + c(-1, 1) * 1.96 * se_logirr)
round(c(IRR = unname(IRR), lower = ci_irr[1], upper = ci_irr[2]), 3)  # unname() keeps the label "IRR" rather than "IRR.exposed"
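
The activity builds a confidence interval only for the IRR. A matching Wald interval for the risk ratio can be sketched the same way. The standard error below is the usual large-sample formula for the log risk ratio, and the object names (a, n, rr, se_logrr, ci_rr) are illustrative rather than taken from the companion script.

```r
# Same simulated cohort as the activity: events and group sizes.
a <- c(exposed = 80,   unexposed = 30)
n <- c(exposed = 1000, unexposed = 1000)

# Risk ratio: cumulative incidence in the exposed over the unexposed.
rr <- unname((a["exposed"] / n["exposed"]) / (a["unexposed"] / n["unexposed"]))

# Large-sample standard error of log(RR) for cumulative-incidence data.
se_logrr <- unname(sqrt(1 / a["exposed"]   - 1 / n["exposed"] +
                        1 / a["unexposed"] - 1 / n["unexposed"]))

# Wald 95% CI built on the log scale, then back-transformed.
ci_rr <- exp(log(rr) + c(-1, 1) * 1.96 * se_logrr)
round(c(RR = rr, lower = ci_rr[1], upper = ci_rr[2]), 3)
```

This interval is meaningful only under the risk-based assumption that every subject was observable for the full risk period. With uneven follow-up, as the differing person-years in this simulated cohort suggest, the IRR interval is the safer summary.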

Final Reflection

Design a brief cohort study proposal for a health question of your choice. Specify: (1) the research question and hypothesis, (2) whether the source population is open or closed, (3) whether you would use a risk-based or rate-based design and why, (4) how you would define and measure exposure (and on what scale), (5) how you would ensure comparability of groups, and (6) what analytic approach you would use.

Model answer
(1) Question/hypothesis: Among adults 18–45, does sustained intake of ultra-processed food (NOVA-4) > 30% of total energy increase 10-year incidence of metabolic syndrome?
(2) Source population: open cohort of community-recruited Vancouver-area adults; movers can be retained.
(3) Design: rate-based with risk-set methods (Cox PH), to use person-time and handle censoring/loss cleanly.
(4) Exposure measurement: 24-h dietary recalls (3 per year) coded under NOVA, then collapsed to % energy from NOVA-4; analysed continuously with restricted cubic splines, plus a pre-specified clinical threshold at 30%.
(5) Comparability: baseline equivalence on income, education, ethnicity, physical activity, family history; restriction to participants without metabolic syndrome at baseline; DAG-guided adjustment for SES and physical activity (NOT for BMI, which is a mediator).
(6) Analysis: Cox PH with the continuous exposure, multiple imputation for missing covariates, sensitivity analyses lagging exposure 2 years to address reverse causation, and pre-registration on OSF.


Final Assessment

This assessment covers all sections of this lesson. You must score 100% to complete the lesson. Review the feedback after each attempt.

Final Assessment — Cohort Studies (15 Questions)

1. The fundamental logic of a cohort study is to:

Correct answer: B. The cohort design follows disease-free subjects classified by exposure forward in time and compares disease frequency between exposed and non-exposed groups (Grimes and Schulz, 2002).

2. A cohort study most resembles which other design?

Correct answer: C. Cohort studies closely resemble controlled trials except that exposure is not randomly assigned. This similarity is often cited as an advantage for causal inference.

3. In a closed source population, all subjects:

Correct answer: A. A closed (fixed) cohort requires that all subjects be observable for the full risk period. This is the assumption that makes risk-based (cumulative incidence) designs valid.

4. In a risk-based cohort study, what is the denominator of the risk?

Correct answer: D. In risk-based designs, R1 = a1/n1 and R0 = a0/n0 (Eq 8.1). The denominator is the number of subjects, not person-time. This is only valid because every subject is observable for the full risk period.

5. In a rate-based cohort study, the denominator is:

Correct answer: B. Rate-based designs accumulate person-time at risk: I1 = a1/t1 and I0 = a0/t0 (Eq 8.2). Each subject contributes time-at-risk until they develop the disease, are lost, or the study ends.

6. Which design is best suited for studying a chronic disease with a long, lifelong risk period?

Correct answer: A. For chronic diseases like many cancers, where the risk period is lifelong and often longer than feasible follow-up, a rate-based design is preferred. Risk-based designs require all subjects to be observable for the full risk period.

7. Which is an example of a compound exposure variable?

Correct answer: D. Pack-years is a compound variable that combines duration and intensity of cigarette exposure into a single cumulative-dose measure. Luo et al (2011) used this in their breast cancer study (Example 8.7).

8. The induction period is:

Correct answer: B. The induction period is the time after exposure is completed before disease might reasonably arise. During the induction period, time-at-risk of exposed individuals should be assigned to the non-exposed group, or that experience may be discarded altogether.

9. If a subject’s exposure status changes during follow-up:

Correct answer: C. When exposure status changes, time-at-risk before the change is assigned to one category and time after the change (allowing for any lag) to the other. If they develop the disease, they are assigned to the category they were in at the time the outcome occurred.

10. Which approach to comparability is applied during analysis rather than design?

Correct answer: A. Analytic control identifies and measures confounders, then uses statistical control (Mantel-Haenszel stratification through to multivariable regression) during analysis. Restriction is applied before subject selection; matching is applied at the time of selection.

11. Which statement about exchangeability in observational studies is correct?

Correct answer: B. As Hernan (2012) notes, exchangeability cannot be empirically tested in observational studies, so we never know with certainty whether we have achieved it. This is a key reason to prefer randomised experiments when feasible.

12. Why are incident cases preferred over prevalent cases for cohort outcomes?

Correct answer: D. Including only new disease events (incidence) circumvents the reverse-causation problem from measuring prevalence and ensures that associations are not biased by duration-of-disease effects and survival bias.

13. For a rate-based cohort study where rates can be assumed reasonably constant, which model is appropriate?

Correct answer: A. Poisson regression with person-time as the offset is appropriate when rates can be assumed reasonably constant over follow-up. The Poisson coefficients give direct estimates of the incidence rate ratio. When constant rates are not tenable (e.g., long follow-up), Cox proportional hazards models are preferred.
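
A minimal R sketch of the Poisson approach described in this answer, using base R's glm(). The counts and person-years are the simulated values from this lesson's R activity; the data frame and object names (d, fit, irr, se, ci) are illustrative assumptions.

```r
# Aggregated cohort data: one row per exposure group.
d <- data.frame(
  exposed = c(1, 0),
  events  = c(80, 30),
  py      = c(4500, 4900)
)

# Poisson regression with log person-time as the offset.
fit <- glm(events ~ exposed + offset(log(py)), family = poisson, data = d)

# The exponentiated exposure coefficient is the incidence rate ratio.
irr <- unname(exp(coef(fit)["exposed"]))

# Wald 95% CI from the model-based standard error of the log IRR.
se <- unname(sqrt(diag(vcov(fit)))["exposed"])
ci <- exp(log(irr) + c(-1, 1) * 1.96 * se)
round(c(IRR = irr, lower = ci[1], upper = ci[2]), 3)
```

With a single binary exposure and no covariates, the model reproduces the hand-calculated IRR and its Wald interval. The regression form becomes useful once confounders are added to the right-hand side.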

14. Hernan (2010) describes which problem with the average hazard ratio?

Correct answer: C. Hernan (2010) points out that the average HR depends on the duration of follow-up. Period-specific HRs are also conditional on the subject not developing the outcome before time t, introducing a built-in bias as susceptible people in the exposed group are progressively depleted.

15. According to STROBE-style criteria for cohort studies, which item should be reported?

Correct answer: B. STROBE-style criteria (Tooth et al, 2005) require comprehensive reporting including numbers of participants at each stage, reasons for loss to follow-up, methods of data collection, validity of measurements, and how confounders were accounted for.

🎉 Congratulations!

You have completed Lesson 5: Cohort Studies.

You now understand the design, implementation, analysis, and reporting of cohort studies, including risk-based and rate-based designs, exposure measurement principles, comparability strategies, and STROBE reporting guidelines.

Lessons 3–5 covered the three workhorse observational designs at the level of the individual: cross-sectional, case-control, and cohort. Lesson 6 changes the unit of analysis. Ecological and Group-Level Studies uses populations rather than individuals as the unit of observation — a strategy that opens up routinely-collected data for epidemiology but introduces a famous interpretive trap (the ecological fallacy) you will need to recognize.