Validity in Observational Studies

Fundamental Epidemiological Concepts and Approaches

Learning objectives for this lesson:

Identify different types of selection bias and assess whether a study is likely to suffer from it
Determine the likely direction and magnitude of selection bias using sampling fractions or sampling odds
Apply principles of bias prevention in study design, including secondary-base studies
Explain differences between non-differential and differential misclassification bias in terms of sensitivity and specificity
Evaluate misclassification of exposure, disease, or both in 2×2 tables
Evaluate the likely impact of misclassification using sensitivity analysis
Apply validation studies and regression calibration to adjust observed data
Modify sample-size estimates to account for misclassification

This course was developed by Dr. Kiffer G. Card, Faculty of Health Sciences, Simon Fraser University based on Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Reference

Glossary: Key Terms, People & Concepts

📚 Reference page, available throughout the lesson

This glossary collects the key concepts, people, and ideas you will meet in this lesson. Use it as a reference while you work through the material, or as a review before assessments.

Validity Concepts

Validity The extent to which a study's inferences correspond to truth. Subdivided into internal validity (correct inference about the study sample) and external validity (correct inference about a target population).

Internal Validity The degree to which the observed association reflects the true causal relationship in the study sample, free of bias and confounding.

External Validity (Generalisability) The extent to which study findings apply to populations, settings, or conditions beyond the study sample. Sometimes called transportability when applied to specific new populations.

Bias A systematic error in the design, conduct, or analysis of a study that produces an estimate that systematically differs from the truth. Distinguished from random error (chance).

Random Error (Imprecision) Variation in an estimate due to chance sampling fluctuations. Reduced by larger samples; quantified by confidence intervals.

Selection Bias

Selection Bias Systematic differences between the study sample and the source population produced by the way participants are selected, retained, or analysed. Distorts the exposure-outcome association.

Berkson's Bias (Admission-Rate Bias) A selection bias in hospital-based case-control studies in which differential admission probabilities for cases and controls produce a spurious exposure-disease association.

Healthy Worker Effect Workers tend to be healthier than the general population because the unwell are filtered out of employment. Comparing mortality and morbidity in occupational cohorts with general-population rates can therefore underestimate the true effects of exposure.

Loss to Follow-Up (Attrition) Bias Bias arising when participants who drop out differ systematically from those who remain on the joint distribution of exposure and outcome.

Self-Selection (Volunteer) Bias Bias arising because people who volunteer or respond tend to differ from non-respondents on characteristics related to exposure or outcome.

Neyman (Incidence-Prevalence) Bias In prevalent case-control studies, fatal or short-duration cases are underrepresented; the resulting sample over-represents survivors and milder forms of disease.

Information Bias

Information (Measurement) Bias Bias arising from systematic error in measuring exposure, outcome, or covariates. Encompasses recall, interviewer, and surveillance biases as well as misclassification.

Misclassification Errors in assigning subjects to exposure or outcome categories. Non-differential misclassification (random across groups) typically biases toward the null; differential misclassification can bias in either direction.

Recall Bias A differential information bias in which cases recall past exposures more (or less) thoroughly than controls. Common in case-control studies of birth defects, injuries, or stigmatised behaviours.

Interviewer Bias Systematic differences in how interviewers solicit, record, or interpret information by case-control status. Mitigated by blinding interviewers to status.

Surveillance (Detection) Bias When exposed individuals are screened, tested, or examined more intensively than the unexposed, producing apparent excess outcome detection independent of true incidence.

Ecological Fallacy Drawing individual-level conclusions from group-level data. Aggregate associations need not hold within individuals because of within-group heterogeneity.

Methods, Measures & Diagnostics

Sensitivity & Specificity Sensitivity: probability that a true case tests positive. Specificity: probability that a true non-case tests negative. Together they characterise the misclassification produced by an imperfect measurement instrument.

Cohen's Kappa A chance-corrected measure of agreement between two raters or instruments. Used to quantify reliability of exposure or outcome ascertainment.

Quantitative Bias Analysis A formal sensitivity-analysis framework that uses bias parameters (e.g., misclassification rates, selection probabilities, unmeasured-confounder strengths) to estimate the magnitude and direction of bias in observed estimates.

E-Value The minimum strength of association (on the risk-ratio scale) that an unmeasured confounder would need to have with both exposure and outcome to fully explain away the observed association. Higher E-values imply more robustness.

Triangulation Strengthening causal inference by combining results from study designs whose biases differ in direction or source (e.g., RCTs, cohort, Mendelian randomisation, negative-control analyses).

Negative Control An exposure or outcome chosen to share the same suspected biases as the primary analysis but that should plausibly have no causal relationship with it. A non-null association implies residual bias.

Sensitivity Analysis Re-running an analysis under alternative assumptions about missing data, model specification, or unmeasured confounding to assess how robust conclusions are to those assumptions.

Section 1 of 5

Introduction & Selection Bias

⏱ Estimated reading time: 20 minutes

Lesson 11 · HSCI 341

Validity in Observational Studies

Careful design and analysis are necessary. Validity asks whether the observed association actually reflects the true one.

Core definitions

Internal validity, external validity, generalisability

Internal validity

The study allows unbiased inferences about associations in the source population.

External validity

The study allows correct inferences to populations beyond the source population.

Generalisability

Extending valid theories to broadly defined populations, potentially across settings or species.

DAG representation

How selection bias enters: two scenarios

Conditioning on selection creates a spurious E-D association even when none exists in the source population.

Sampling fractions

Quantifying selection bias

Eq 11.1: Sampling Fractions

\[ sf_{11} = \frac{\color{#C2410C}{a_1}}{\color{#C2410C}{A_1}},\quad sf_{12} = \frac{\color{#1D4ED8}{a_0}}{\color{#1D4ED8}{A_0}},\quad sf_{21} = \frac{\color{#C2410C}{b_1}}{\color{#C2410C}{B_1}},\quad sf_{22} = \frac{\color{#1D4ED8}{b_0}}{\color{#1D4ED8}{B_0}} \]

Lowercase: count in the study; UPPERCASE: count in the source population; subscript 1: exposed; subscript 0: unexposed.

If all four fractions are equal, there is no selection bias. They can be unequal and still produce no bias in the observed OR, provided:

No-bias condition

\[ \color{#0B7B6B}{OR_{sf}} = \frac{\color{#C2410C}{sf_{11} / sf_{21}}}{\color{#1D4ED8}{sf_{12} / sf_{22}}} = 1 \]

OR_sf: odds ratio of the sampling fractions; numerator: selection odds in the exposed; denominator: selection odds in the unexposed.

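Because the observed OR equals the true OR multiplied by \(OR_{sf}\), a value of \(OR_{sf} > 1\) inflates the observed OR and \(OR_{sf} < 1\) deflates it. For a true OR above 1, that means bias away from the null and toward the null, respectively.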

Sampling odds

A parallel formulation

Eq 11.2: Sampling Odds

\[ \color{#C2410C}{so_{D+|E+}} = \frac{sf_{11}}{sf_{21}},\qquad \color{#1D4ED8}{so_{D+|E-}} = \frac{sf_{12}}{sf_{22}} \]

so_D+|E+: selection odds of disease among the exposed; so_D+|E−: selection odds of disease among the unexposed.

If \(so_{D+|E+} = so_{D+|E-}\), there is no selection bias.

Example 11.1 in brief

Non-response related to exposure only: observed RR 2.04 vs. true RR 2.08, negligible bias. Non-response related to both exposure and outcome: observed RR 1.73 vs. true RR 2.04, biased toward the null; OR_sf = 0.8.

Carry forward

Next: the named patterns

Validity = absence of systematic bias; internal concerns the source population, external the target.
Selection bias arises from conditioning on a collider linking exposure and disease through participation.
Sampling fractions quantify the bias: if \(OR_{sf} = 1\), no bias in the observed odds ratio.

Introduction and Overview

Earlier lessons covered how to design, conduct, analyse, and synthesize epidemiologic studies. This lesson turns to validity, the property that determines whether all that work produces a defensible answer. The premise that well-conducted observational studies can deliver effect estimates comparable to randomised trials when bias is properly addressed is the long-standing claim of Concato, Shah, & Horwitz (2000); the equally enduring counter-claim that most published findings are false unless biases are explicitly quantified comes from Ioannidis (2005). The four content sections walk through the bias inventory you first met in an earlier course, now consolidated for this course as a systematic appraisal framework. This section defines validity and introduces selection bias; a later section works through canonical examples and reduction strategies for selection bias; a later section covers information bias and misclassification; a later section closes with validation studies, measurement error, and how information bias affects required sample size. By the end, you should be able to read any observational study and decide where its validity is at risk.

Learning Objectives

Define validity in epidemiologic research and distinguish internal validity, external validity, and generalisability.
Name the three major bias categories (selection, information, confounding) and locate where each enters the study process.
Define selection bias and explain how it differs from random sampling error.
Identify the design features (single-cohort design, comparable groups, source-population matching) that reduce selection bias at the planning stage.

11.1 Introduction to Validity

An awareness of the key features of study design, implementation, and analysis should help ensure we obtain valid results from research. The term validity relates to the absence of a systematic bias in results, that is, a valid measure of association in the study group will have the same value as the true measure in the source population (except for variation due to sampling error).

To the extent that the study group and source population measures differ systematically, the result is said to be biased.

Key Concept: Internal vs. External Validity

Internal validity means the study allows unbiased inferences about associations in the source population.

External validity relates to making correct inferences to populations beyond the source population (the target population).

Generalisability is an inferential step beyond external validity, extending valid scientific theories to broadly defined populations (e.g., across populations and/or species).

The Three Major Types of Bias

The three major categories of bias that threaten the validity of observational studies are:

Selection bias
Information bias
Confounding bias

11.2 Selection Bias

Selection bias arises when the composition of the study group differs from that of the source population in a way that distorts the observed association between the exposure and the outcome of interest. The distortion can be large enough to change a study's conclusions.

From a sampling and study-design perspective, each study will have an objective that relates to a defined target population. Ideally, the study group would completely reflect the source population, which in turn would reflect the target population. In practice, this is rarely the case.

Bias Variables and DAGs

Bias variables influence participation in a study in a way that causes the initial or final composition of the study group to differ from the source population, thus biasing the observed association.

The basic conditions for selection bias can be shown using directed acyclic graphs (DAGs):

In Scenario 1, both E and D independently affect selection. When we condition on selection (study only the responders), E and D become associated even though they are independent in the source population. In Scenario 2, a bias variable related to both exposure and disease directly affects selection, creating a spurious association. This unified causal-graph framing of selection bias as conditioning on a common effect was crystallised by Hernán, Hernández-Díaz, & Robins (2004); Greenland (2003) had earlier quantified how collider-stratification bias compares in magnitude with classical confounding, and Greenland, Pearl, & Robins (1999) provide the foundational DAG primer for epidemiologic use.

Sampling Fractions & Sampling Odds

We can understand selection bias by examining sampling fractions. The source population and study group follow the structure shown below:

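Source Population	E+	E−	Total
D+	A₁	A₀	M₁
D−	B₁	B₀	M₀
Total	N₁	N₀	N

The study group has the same layout, written with lowercase entries (a₁, a₀, b₁, b₀, with row totals m₁ and m₀).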

The four sampling fractions (sf) represent the proportion selected from each cell:

Eq 11.1: Sampling fractions

sf₁₁ = a₁/A₁ sf₁₂ = a₀/A₀
sf₂₁ = b₁/B₁ sf₂₂ = b₀/B₀

Each sampling fraction is the share of a source-population cell that actually entered the study (study count over source count). Under random sampling all four are equal and there is no selection bias.

When all four sampling fractions are equal, the odds ratio of the sampling fractions, OR_sf = (sf₁₁ / sf₂₁) / (sf₁₂ / sf₂₂), equals 1, and there is no bias in the observed OR.

Key Insight

The four sampling fractions can be unequal and still produce no bias in the observed OR, provided OR_sf = 1. Also, if OR_sf = 1, there is no bias to the risk ratio (RR) if disease is infrequent.
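A minimal Python sketch of Eq 11.1 and the OR_sf check follows. The counts, dictionary keys, and function names are hypothetical and chosen only for illustration; they do not come from the lesson. The example is built so that the four fractions differ while OR_sf still equals 1.

```python
def sampling_fractions(study, source):
    """Eq 11.1: share of each source-population cell that entered the study."""
    return {
        "sf11": study["a1"] / source["a1"],  # exposed cases
        "sf12": study["a0"] / source["a0"],  # unexposed cases
        "sf21": study["b1"] / source["b1"],  # exposed non-cases
        "sf22": study["b0"] / source["b0"],  # unexposed non-cases
    }


def or_sf(sf):
    """Odds ratio of the sampling fractions."""
    return (sf["sf11"] / sf["sf21"]) / (sf["sf12"] / sf["sf22"])


def odds_ratio(t):
    return (t["a1"] * t["b0"]) / (t["a0"] * t["b1"])


def risk_ratio(t):
    risk_exposed = t["a1"] / (t["a1"] + t["b1"])
    risk_unexposed = t["a0"] / (t["a0"] + t["b0"])
    return risk_exposed / risk_unexposed


# Hypothetical counts (not from the lesson).
source = {"a1": 200, "a0": 100, "b1": 800, "b0": 900}
study = {"a1": 160, "a0": 40, "b1": 320, "b0": 180}

fractions = sampling_fractions(study, source)
print(fractions)
print("OR_sf:", or_sf(fractions))
print("OR source vs study:", odds_ratio(source), odds_ratio(study))
print("RR source vs study:", risk_ratio(source), risk_ratio(study))
```

In this made-up population the fractions are 0.8, 0.4, 0.4, and 0.2, yet OR_sf is 1 and the study OR equals the source OR (2.25). The risk ratio does shift (2.00 in the source, about 1.83 in the study) because disease is common here, which is the caveat attached to the RR above.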

Sampling Odds

In practice, sampling odds may be easier to conceptualise than individual sampling fractions. For a cohort study, we compare the sampling odds of disease among exposed versus non-exposed subjects:

Eq 11.2: Sampling odds

so_D+|E+ = sf₁₁ / sf₂₁
so_D+|E− = sf₁₂ / sf₂₂

The sampling odds compare how cases and non-cases were sampled within the exposed and the unexposed groups. Equal selection odds mean no bias; a ratio above 1 biases away from the null, below 1 toward it.

If these sampling odds are equal, there is no bias. Here the null means the no-association value that a ratio measure takes when exposure and outcome are unrelated, namely 1, so bias toward the null makes an association look weaker (closer to 1) than it really is, while bias away from the null makes it look stronger. The observed OR equals the true OR multiplied by the ratio of the sampling odds (so_D+|E+ / so_D+|E−, which is OR_sf). For a true OR above 1, a ratio greater than 1 therefore biases the estimate away from the null, and a ratio less than 1 biases it toward the null.
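The role of the ratio of sampling odds can be seen by writing each study cell as its source cell times its sampling fraction (Eq 11.1). The labels \(OR_{obs}\) and \(OR_{true}\) are introduced here only for this derivation:

\[ OR_{obs} = \frac{a_1 b_0}{a_0 b_1} = \frac{(sf_{11} A_1)(sf_{22} B_0)}{(sf_{12} A_0)(sf_{21} B_1)} = \frac{A_1 B_0}{A_0 B_1} \times \frac{sf_{11} / sf_{21}}{sf_{12} / sf_{22}} = OR_{true} \times \frac{so_{D+|E+}}{so_{D+|E-}} = OR_{true} \times OR_{sf} \]

So the observed OR is unbiased exactly when the two sampling odds are equal, whatever the individual fractions are.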

Example 11.1: Selection Bias Due to Non-Response

Consider a source population where 10% are exposed, with disease risk of 25% in the exposed and 12% in the non-exposed. If non-response is related to exposure only (30% non-response in exposed, 10% in non-exposed) and unrelated to outcome, the study group RR (2.04) is close to the source population RR (2.08), as is the OR (2.49 vs 2.44), so there is essentially no bias.

However, if non-response is related to both exposure and outcome (disease risk twice as high in non-responders), then:

Study group RR = 1.73 vs true RR = 2.04 (biased toward the null)
Study group OR = 1.90 vs true OR = 2.38 (biased toward the null)
OR_sf = 0.8, so observed OR = true OR × 0.8 = 2.38 × 0.8 = 1.90
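The source-population figures in Example 11.1 can be checked with a short Python sketch. It assumes a hypothetical source size of 10,000 (the lesson gives only percentages) and works with expected counts, so under exposure-only non-response it returns the source RR and OR unchanged rather than the slightly different study-group values quoted above. The last step repeats the lesson's multiplication of the true OR by OR_sf. Function and variable names are illustrative.

```python
def rr_and_or(a1, a0, b1, b0):
    """Risk ratio and odds ratio from 2x2 cell counts."""
    rr = (a1 / (a1 + b1)) / (a0 / (a0 + b0))
    odds_ratio = (a1 * b0) / (a0 * b1)
    return rr, odds_ratio


N = 10_000  # hypothetical source size; not stated in the lesson
n_exposed, n_unexposed = 0.10 * N, 0.90 * N

# Expected source-population cells from the stated risks (25% and 12%).
A1, B1 = 0.25 * n_exposed, 0.75 * n_exposed
A0, B0 = 0.12 * n_unexposed, 0.88 * n_unexposed
true_rr, true_or = rr_and_or(A1, A0, B1, B0)

# Response depends on exposure only: 70% of exposed and 90% of
# non-exposed respond, regardless of disease status.
resp_exposed, resp_unexposed = 0.70, 0.90
study_rr, study_or = rr_and_or(
    A1 * resp_exposed, A0 * resp_unexposed,
    B1 * resp_exposed, B0 * resp_unexposed,
)

print(round(true_rr, 2), round(true_or, 2))
print(round(study_rr, 2), round(study_or, 2))

# Second scenario, using the lesson's own figures: true OR x OR_sf.
observed_or = 2.38 * 0.8
print(round(observed_or, 2))
```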

Reflection

Think about a study you have encountered (from previous lessons or your own reading). Could selection bias have affected the results? How would you assess whether the study group was representative of the source population?

Model answer: Pick a specific study (e.g., the cohort of UK Biobank participants and dementia, or a hospital-recruited case-control of pancreatic cancer). Selection-bias risk: UK Biobank participants are healthier, wealthier, and more educated than the UK population; the ‘source population’ for any analysis is implicitly ‘people willing to enrol in a large biobank,’ not the UK. To assess representativeness: compare baseline distributions of age, sex, region, education, smoking, BMI to national survey data (CCHS, NHANES analogue); compute non-response rates by these strata; perform inverse-probability-of-selection weighting using national survey margins; run sensitivity analyses showing how the effect estimate changes under plausible selection scenarios. For a case-control study, also compare cases to all incident cases in the source population, beyond the sampled ones, to detect referral or selection bias.

Section 2 of 5

Examples & Reduction of Selection Bias

⏱ Estimated reading time: 25 minutes

Named patterns from the literature and practical strategies for prevention and evaluation.

Design principle

Same source population

Cohort studies

Non-exposed group must be comparable on other risk factors. A single-cohort design is preferred: both groups come from the same population by definition.

Case-control studies

Controls should reflect exposure prevalence among non-cases in the source population, not among all-comers at a convenient hospital.

Selective entry

Healthy worker effect and healthy donor effect

Healthy worker effect

Workers must be well enough to work. Comparing them to the general population underestimates occupational risk. (McMichael, 1976)

Healthy donor effect

Donors are screened for health. Studies using donors as a comparison group observe lower disease rates than in the true reference population.

Hospital-based studies

Berkson's fallacy

In a hospital-based case-control study, both cases and controls share the common feature of hospital admission, a collider that creates spurious associations between exposure and disease.

Cohort-specific patterns

Loss to follow-up and detection bias

Loss to follow-up

Dropout related to both exposure and outcome biases the risk ratio. Sicker exposed participants leaving produces underestimation of risk.

Detection bias

More intensive screening in exposed participants inflates apparent incidence in that group, biasing the ratio away from the null.

Common solution: ensure equal surveillance effort across all exposure groups throughout follow-up.

Carry forward

Next: how measurement goes wrong

Example 11.2 demonstrates deterministic sensitivity analysis: specify plausible sampling fractions for each cell and compute the adjusted odds ratio.

CRD study result

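Observed OR: 2.33 → Adjusted OR: 1.40 after applying plausible sampling fractions. The adjusted OR is about 40% lower than the observed OR (equivalently, the observed OR is about 67% higher than the adjusted one), illustrating how large the impact can be.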

Key prevention principles: same source population, incident cases and concurrent controls, minimized non-response, equal follow-up across groups.

Introduction and Overview

An earlier section defined selection bias in the abstract. This section is the case-study half of the topic. The examples below are the canonical ones, including the healthy worker effect (McMichael, 1976), Berkson's bias (Berkson, 1946/2014), and others that you should be able to recognize in any observational study you read going forward. The reduction strategies that follow them are practical tools you can apply at the design stage.

Learning Objectives

Recognize canonical patterns of selection bias: non-response bias, the healthy worker effect, healthy donor effect, Berkson's bias, and survivor treatment bias.
Explain how loss to follow-up and differential retention produce selection bias in cohort studies.
Choose comparison groups from the same source population to minimize selection bias.
Apply practical reduction strategies (single-cohort design, exhaustive sampling frames, follow-up protocols, sensitivity analyses) at the design and analysis stage.

11.3 Examples of Selection Bias

Selection bias can manifest in many ways across different study designs. Understanding these patterns helps researchers anticipate and prevent bias during the design phase.

11.3.1 Choice of Comparison Groups

A general principle is that study groups should be selected from the same source population. In cohort studies, it is important that the non-exposed group be comparable with respect to other risk factors for the outcome. In case-control studies, the control group should reflect the prevalence of exposure in the ‘non-case’ members of the population from which the cases arose.

Design Principle

A single-cohort design (where exposed and non-exposed come from the same population) is generally less susceptible to selection bias than a two-group cohort design, since both groups come from the same population by definition.

Types of Selection Bias

The main types of selection bias are:

Non-response bias
Healthy worker effect
Berkson’s fallacy
Loss to follow-up
Detection bias
Missing data bias

11.4 Reducing Selection Bias

Prevention Strategies

Be aware of potential pitfalls in selecting study subjects from the proposed source population
In cohort studies, take care when selecting the comparison group and ensure equal follow-up of both exposed and non-exposed groups
Minimise non-response bias, missing data, and detection bias
Case-control studies are particularly susceptible; minimise differential response to study participation between cases and potential controls
Where possible, use only incident cases and obtain controls from the same source population as the cases

Evaluating and Correcting Selection Bias

For valid control of selection bias, one of two conditions must be met:

The factors associated with selection must be antecedents of both exposure and disease, and the distributions must be known in the source population, which allows the bias to be controlled like confounding.
A bias breaker (a variable strongly related to selection and study participation that produces the bias) can be identified. Unbiased estimates of its population distribution can then be obtained, and the ‘corrected’ estimates are not associated with ‘selection’.

Additionally, the potential impact of selection bias can be assessed by examining sampling fractions using deterministic or stochastic sensitivity analysis (as in Example 11.2).

Example 11.2: Evaluating Potential Selection Bias

In a study of childhood respiratory disease (CRD) and regular daycare attendance, the observed OR was 2.33 (95% CI: 1.04–5.19). Using deterministic sampling fractions (sf) to assess the impact of possible selection bias:

Cell	Deterministic sf
Exposed cases (E+D+)	0.5
Non-exposed cases (E−D+)	0.6
Exposed controls (E+D−)	0.05
Non-exposed controls (E−D−)	0.1

The ‘adjusted’ OR (after accounting for the sampling fractions) was 1.40, about 40% lower than the observed OR of 2.33; put the other way, the observed OR overstates the adjusted OR by about 67% (OR_sf = 1.67). The true association would be considerably weaker than what was observed if this selection bias were present.
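The adjustment in Example 11.2 can be reproduced in a few lines of Python. This is a sketch that assumes the adjusted OR is the observed OR divided by OR_sf, as in the sampling-fraction framework introduced earlier; the dictionary keys are illustrative labels for the four cells in the table.

```python
# Deterministic sampling fractions from the table in Example 11.2.
sf = {"E+D+": 0.5, "E-D+": 0.6, "E+D-": 0.05, "E-D-": 0.1}
observed_or = 2.33

# OR of the sampling fractions: selection odds of disease in the exposed
# divided by selection odds of disease in the non-exposed.
or_sf = (sf["E+D+"] / sf["E+D-"]) / (sf["E-D+"] / sf["E-D-"])

# Observed OR = true OR x OR_sf, so divide to recover the adjusted OR.
adjusted_or = observed_or / or_sf

print(round(or_sf, 2), round(adjusted_or, 2))
```

With these fractions OR_sf is about 1.67, which returns the adjusted OR of 1.40 reported above. A stochastic version would draw each fraction from a plausible distribution and repeat the calculation many times.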

Reflection

Consider Berkson’s fallacy in the context of hospital-based case-control studies. Why might using hospital controls lead to biased estimates of the exposure-disease association? Can you think of an example where this might occur?

Model answer: Berkson's fallacy in hospital-based case-control studies: hospital controls have at least one condition that brought them in (otherwise they wouldn't be there). Suppose the exposure of interest is alcohol use and the case condition is liver cancer; if many of the hospital controls were admitted for other alcohol-related conditions (pancreatitis, falls, hepatitis), the prevalence of alcohol use in controls is artificially high, pulling the OR toward the null and underestimating the true risk. Conversely, if controls were admitted for conditions negatively correlated with the exposure, the OR is overestimated. The mechanism: hospital admission is a collider on the exposure–condition path. Example: a case-control study of smoking and lung cancer using hospital controls admitted for cardiovascular conditions would underestimate the smoking effect because smokers are over-represented among cardiovascular patients.

Section 3 of 5

Information Bias & Misclassification

⏱ Estimated reading time: 25 minutes

How errors in measuring exposure or outcome distort effect estimates, and in which direction.

Core vocabulary

Misclassification and measurement error

Categorical variables: misclassification

Quantified by sensitivity (correctly classifying true positives) and specificity (correctly classifying true negatives).

Continuous variables: measurement error

From systematic inaccuracy or imprecision. Non-differential measurement error biases dose-response relationships toward the null.

Definitions

\[ \color{#0B7B6B}{Se} = P(\text{classified}+ \mid \text{truly}+),\qquad \color{#1D4ED8}{Sp} = P(\text{classified}- \mid \text{truly}-) \]

Se: sensitivity (true positives correctly classified); Sp: specificity (true negatives correctly classified).

Nondifferential exposure errors

Non-differential misclassification of exposure

Non-differential condition

\[ \color{#0B7B6B}{Se_{E|D+}} = \color{#0B7B6B}{Se_{E|D-}} = Se_E \quad \text{and} \quad \color{#1D4ED8}{Sp_{E|D+}} = \color{#1D4ED8}{Sp_{E|D-}} = Sp_E \]

Se: exposure sensitivity (equal in cases and non-cases); Sp: exposure specificity (equal in cases and non-cases).

When \(Se_E + Sp_E > 1\), non-differential errors with dichotomous variables bias measures of association toward the null.

Observed cell counts

\[ \color{#C2410C}{a_1'} = \color{#0B7B6B}{Se_E} \cdot a_1 + (1-\color{#1D4ED8}{Sp_E})\cdot a_0 \qquad \color{#6D28D9}{a_0'} = (1-\color{#0B7B6B}{Se_E})\cdot a_1 + \color{#1D4ED8}{Sp_E} \cdot a_0 \]

a₁′: observed exposed cases; a₀′: observed unexposed cases; Se_E: exposure sensitivity; Sp_E: exposure specificity.

Example 11.3: True OR = 3.86; with Se = 0.80, Sp = 0.90 → observed OR = 2.57.

Correcting the bias

Back-calculating true cell counts

Eq 11.3: Correcting non-cases

\[ \color{#C2410C}{b_1} = \frac{\color{#6D28D9}{b_1'} - (1-\color{#1D4ED8}{Sp_E})\cdot m_0}{\color{#0B7B6B}{Se_E} + \color{#1D4ED8}{Sp_E} - 1} \]

b₁: true exposed non-cases; b₁′: observed exposed non-cases; Se_E: sensitivity; Sp_E: specificity.

Eq 11.4: Correcting cases

\[ \color{#C2410C}{a_1} = \frac{\color{#6D28D9}{a_1'} - (1-\color{#1D4ED8}{Sp_E})\cdot m_1}{\color{#0B7B6B}{Se_E} + \color{#1D4ED8}{Sp_E} - 1} \]

a₁: true exposed cases; a₁′: observed exposed cases; Se_E: sensitivity; Sp_E: specificity.

Exposure misclassification leaves disease-status totals unchanged. Only exposure category totals are affected, so the corrections alter the exposure split, not the disease counts.

Differential errors

Differential misclassification: any direction

Differential condition

\[ \color{#0B7B6B}{Se_{E|D+}} \neq \color{#0B7B6B}{Se_{E|D-}} \quad \text{and/or} \quad \color{#1D4ED8}{Sp_{E|D+}} \neq \color{#1D4ED8}{Sp_{E|D-}} \]

Se: exposure sensitivity (differs by case status); Sp: exposure specificity (differs by case status).

Nondifferential

Direction predictable: toward the null for dichotomous variables. Analytically tractable.

Differential

Direction unpredictable. Bias can go either way. Classic source: recall bias in case-control studies.

Carry forward

Next: from prevention to correction

Use explicit classification criteria and standardised, trained personnel.
Validate instruments before the main study; apply blinding so that any errors are equalised across comparison groups (that is, remain non-differential).
Collect specific exposure data; general categories attenuate associations.
Non-differential errors are predictable (toward the null); differential errors are not. Prevention is more tractable than post-hoc correction for the latter.

Introduction and Overview

Earlier sections covered selection bias, the errors arising from who ends up in the study. This section turns to information bias: errors arising from how exposure or outcome is measured once participants are in. The misclassification framework that follows distinguishes nondifferential and differential errors and quantifies their predictable consequences for effect estimates.

Learning Objectives

Define information bias and distinguish misclassification (categorical variables) from measurement error (continuous variables).
Compute and interpret sensitivity and specificity for exposure or outcome classification.
Distinguish nondifferential from differential misclassification and predict the direction of the resulting bias.
Recognize recall bias, interviewer bias, and surveillance bias as common sources of differential misclassification in observational studies.

11.5 Information Bias

The previous discussion was concerned with whether study subjects had the same exposure-disease association as that in the source population. Now we review the effects of incorrectly classifying or measuring the study subjects’ exposure, extraneous factors, and/or outcome status.

When describing errors in classification of categorical variables, the resultant bias is called misclassification bias. The errors can be described in terms of sensitivity (Se) and specificity (Sp):

Sensitivity (Se): the probability that an individual with the event (e.g., exposed) will be correctly classified as having it
Specificity (Sp): the probability that an individual without the event will be correctly classified as not having it
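As a small illustration of the two definitions, the following Python sketch computes Se and Sp from a validation table. The counts are hypothetical (they are not from the lesson) and the function name is illustrative; the sketch assumes a gold-standard classification is available for every subject.

```python
def se_sp(true_pos, false_neg, false_pos, true_neg):
    """Sensitivity and specificity from validation counts.

    true_pos / false_neg: truly exposed subjects classified as
    exposed / unexposed.
    false_pos / true_neg: truly unexposed subjects classified as
    exposed / unexposed.
    """
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical validation data: 100 truly exposed, 200 truly unexposed.
se, sp = se_sp(true_pos=80, false_neg=20, false_pos=20, true_neg=180)
print(se, sp)
```

Here 80 of 100 truly exposed subjects are correctly classified (Se = 0.80) and 180 of 200 truly unexposed subjects are correctly classified (Sp = 0.90).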

When variables of interest are continuous, classification errors are termed measurement error or bias. The bias can arise from:

A lack of accuracy (systematic bias in the measurement)
A lack of precision (variability in repeated measurements)

Non-differential measurement error tends to bias the dose-response curve towards the null.
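The attenuation can be illustrated with a small simulation. This is a sketch under stated assumptions, not a result from the lesson: a linear dose-response with slope 2, and classical additive measurement error that is independent of both the true exposure and the outcome, with the same variance as the true exposure. Under that classical model the expected slope is multiplied by var(X) / (var(X) + var(error)), which is 0.5 here. All names and parameter values are illustrative.

```python
import random


def slope(x, y):
    """Least-squares slope of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


rng = random.Random(11)  # fixed seed so the run is repeatable
n = 20_000

true_x = [rng.gauss(0, 1) for _ in range(n)]            # true exposure
y = [2.0 * xi + rng.gauss(0, 1) for xi in true_x]       # outcome, true slope 2
noisy_x = [xi + rng.gauss(0, 1) for xi in true_x]       # exposure measured with error

slope_true = slope(true_x, y)
slope_noisy = slope(noisy_x, y)
print(round(slope_true, 2), round(slope_noisy, 2))
```

The slope estimated from the error-prone exposure comes out at roughly half the slope estimated from the true exposure, a flatter dose-response curve, which is the bias toward the null described above.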

11.6 Bias from Misclassification

Misclassification bias results from a rearrangement of study individuals into incorrect categories because of errors in classifying exposure, outcome, or both.

11.6.1 Non-Differential Misclassification of Exposure

If misclassification of the exposure and outcome are independent (i.e., errors in classifying exposure are the same in diseased and non-diseased subjects, and vice versa), the misclassification is called non-differential.

Non-differential exposure misclassification

Se_E|D+ = Se_E|D− = Se_E and Sp_E|D+ = Sp_E|D− = Sp_E

Misclassification is non-differential when the sensitivity and specificity of exposure measurement are the same in cases and non-cases. For a binary exposure this biases the association toward the null.

With dichotomous exposures and outcomes, non-differential errors will bias measures of association toward the null (given Se_E + Sp_E > 1). Intuitively, mislabelling some truly exposed people as unexposed, and some truly unexposed as exposed, blends the two groups together, and blending makes them look more alike, which drags the measured association toward no association at all. The observed cell values are a mixture of correctly and incorrectly classified subjects:

True Number	Observed (Incorrectly Classified)
a₁	a₁′ = Se_E·a₁ + (1−Sp_E)·a₀
a₀	a₀′ = (1−Se_E)·a₁ + Sp_E·a₀
b₁	b₁′ = Se_E·b₁ + (1−Sp_E)·b₀
b₀	b₀′ = (1−Se_E)·b₁ + Sp_E·b₀

Important

Exposure misclassification does not affect the disease status totals. Only the exposure category totals change. Relatively small errors (10–20%) can have sizable effects on relative risks.

Example 11.3: Impact of Non-Differential Exposure Misclassification

Consider a study with a true OR of 3.86 (90 exposed cases, 70 non-exposed cases, 210 exposed non-diseased, 630 non-exposed non-diseased). If we assume Se_E = 0.80 and Sp_E = 0.90:

Observed exposed cases: a₁′ = 90×0.8 + 70×0.1 = 79
Observed non-exposed cases: a₀′ = 90×0.2 + 70×0.9 = 81
Observed exposed non-diseased: b₁′ = 210×0.8 + 630×0.1 = 231
Observed non-exposed non-diseased: b₀′ = 210×0.2 + 630×0.9 = 609
Observed OR = (a₁′ × b₀′) / (a₀′ × b₁′) = (79 × 609) / (81 × 231) = 2.57

As predicted, the non-differential errors have reduced the OR from 3.86 to 2.57, a bias toward the null. Notice that the case total (79 + 81 = 160) and the non-diseased total (231 + 609 = 840) are unchanged: misclassifying exposure only shuffles people between the exposed and unexposed columns, which is why the corrected estimate keeps the same disease-status totals we began with.
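The arithmetic in Example 11.3 can be reproduced with a short R sketch. The function names `misclassify_exposure()` and `odds_ratio()` are illustrative and not part of the course script; the counts, Se_E, and Sp_E are those given in the example.

```r
# Apply non-differential exposure misclassification to true 2x2 counts.
# a1/a0: exposed/unexposed cases; b1/b0: exposed/unexposed non-diseased.
misclassify_exposure <- function(a1, a0, b1, b0, Se, Sp) {
  c(a1 = Se * a1 + (1 - Sp) * a0,
    a0 = (1 - Se) * a1 + Sp * a0,
    b1 = Se * b1 + (1 - Sp) * b0,
    b0 = (1 - Se) * b1 + Sp * b0)
}

# Cross-product odds ratio from a named vector of cell counts
odds_ratio <- function(n) unname((n["a1"] * n["b0"]) / (n["a0"] * n["b1"]))

true_counts <- c(a1 = 90, a0 = 70, b1 = 210, b0 = 630)
obs_counts  <- misclassify_exposure(90, 70, 210, 630, Se = 0.80, Sp = 0.90)

obs_counts                 # a1 = 79, a0 = 81, b1 = 231, b0 = 609
odds_ratio(true_counts)    # about 3.86
odds_ratio(obs_counts)     # about 2.57
```

Changing `Se` and `Sp` in the call shows how quickly the observed OR moves toward 1 as classification gets worse.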

11.6.2 Evaluating Non-Differential Exposure Misclassification

If the most likely values of Se_E and Sp_E are known, we can correct the observed classifications. Since b₁′ + b₀′ = b₁ + b₀ = m₀, we can solve for the true cell values:

Eq 11.3: Correcting for exposure misclassification

b₁ = [b₁′ − (1 − Sp_E) × m₀] / (Se_E + Sp_E − 1)

Given the sensitivity and specificity of exposure measurement, the true count of exposed non-diseased subjects (b₁) is recovered by removing the misclassified contribution and rescaling; the unexposed count then follows as b₀ = m₀ − b₁. The denominator (Se_E + Sp_E − 1) must be positive.

Eq 11.4: Correcting exposed cases

a₁ = [a₁′ − (1 − Sp_E) × m₁] / (Se_E + Sp_E − 1)

The same correction recovers the true count of exposed cases (a₁) from the observed count, where m₁ = a₁′ + a₀′ is the total number of cases; the unexposed cases follow as a₀ = m₁ − a₁.
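As a check on Eq 11.3 and Eq 11.4, the sketch below back-corrects the observed counts from Example 11.3 (79, 81, 231, 609) under the same assumed Se_E = 0.80 and Sp_E = 0.90. The function name `correct_exposure()` is hypothetical, introduced only for illustration.

```r
# Back-correct observed 2x2 counts for non-differential exposure misclassification.
# Assumes Se and Sp are known and identical in cases and non-cases.
correct_exposure <- function(a1_obs, a0_obs, b1_obs, b0_obs, Se, Sp) {
  stopifnot(Se + Sp > 1)                      # denominator must be positive
  m1 <- a1_obs + a0_obs                       # total cases (unchanged by exposure errors)
  m0 <- b1_obs + b0_obs                       # total non-diseased (unchanged)
  a1 <- (a1_obs - (1 - Sp) * m1) / (Se + Sp - 1)   # Eq 11.4
  b1 <- (b1_obs - (1 - Sp) * m0) / (Se + Sp - 1)   # Eq 11.3
  c(a1 = a1, a0 = m1 - a1, b1 = b1, b0 = m0 - b1)
}

corrected <- correct_exposure(79, 81, 231, 609, Se = 0.80, Sp = 0.90)
corrected      # recovers a1 = 90, a0 = 70, b1 = 210, b0 = 630

corrected_or <- unname((corrected["a1"] * corrected["b0"]) /
                       (corrected["a0"] * corrected["b1"]))
corrected_or   # about 3.86, the true OR of the example
```

If a corrected cell comes out negative, the assumed Se_E and Sp_E are not compatible with the observed data and should be reconsidered.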

11.6.3 Non-Differential Misclassification of Disease

In cohort studies, with non-differential misclassification of disease:

Non-differential disease misclassification

Se_D|E+ = Se_D|E− = Se_D and Sp_D|E+ = Sp_D|E− = Sp_D

Disease misclassification is non-differential when outcome sensitivity and specificity do not depend on exposure status. For a binary outcome this again biases the association toward the null.

There are two components: establishing initial health status (to exclude prevalent cases) and identifying new cases during follow-up. Imperfect sensitivity fails to exclude subjects with the outcome at the study outset; imperfect specificity has less impact.

For binary outcomes, non-differential errors bias the association measure toward the null.

In case-control studies, the conclusions about diagnostic errors that apply to cohort studies hold only when Sp_D = 1.00. With perfect specificity, imperfect disease sensitivity does not bias the RR or IR, and it biases the OR only if the disease is common.

The key is to verify diagnoses so there are no false positive cases. When Sp_D < 1, non-cases will be included as cases. The case-control sensitivity and specificity differ from the population values:

Eq 11.5 and 11.6: Case-control sensitivity and specificity

Se_cc = Se_D / [Se_D + sf·(1 − Se_D)]
Sp_cc = sf·Sp_D / [(1 − Sp_D) + sf·Sp_D]

In a case-control study the effective sensitivity and specificity of the case definition depend on the sampling fraction, which is why external population estimates cannot be used directly to correct misclassification here.

Thus, external estimates of Se_D and Sp_D cannot be used to correct misclassification in case-control studies.
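To see why the sampling fraction matters, the sketch below tracks counts rather than formulas. It assumes, hypothetically, a source population of 1,000 true cases and 99,000 true non-cases, population values Se_D = 0.90 and Sp_D = 0.99, and a design that enrols every apparent case while sampling a fraction sf of apparent non-cases as controls. All numbers and the function name `effective_accuracy()` are illustrative.

```r
# Effective sensitivity/specificity of the case definition inside a case-control sample.
effective_accuracy <- function(n_case, n_noncase, Se, Sp, sf) {
  # Counts by true status and apparent (diagnosed) status in the source population
  tp <- Se * n_case            # true cases diagnosed as cases
  fn <- (1 - Se) * n_case      # true cases diagnosed as non-cases
  fp <- (1 - Sp) * n_noncase   # true non-cases diagnosed as cases
  tn <- Sp * n_noncase         # true non-cases diagnosed as non-cases
  # All apparent cases are enrolled; only a fraction sf of apparent non-cases is sampled
  c(Se_cc = tp / (tp + sf * fn),
    Sp_cc = sf * tn / (fp + sf * tn))
}

census <- effective_accuracy(1000, 99000, Se = 0.90, Sp = 0.99, sf = 1)
cc     <- effective_accuracy(1000, 99000, Se = 0.90, Sp = 0.99, sf = 0.02)

round(census, 3)   # Se_cc = 0.900, Sp_cc = 0.990 (population values)
round(cc, 3)       # Se_cc = 0.998, Sp_cc = 0.664
```

With only 2% of apparent non-cases sampled, the false positives among the cases are no longer diluted by a large pool of correctly classified non-cases, so the effective specificity in the study falls far below the population value.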

11.6.5 Misclassification of Both Exposure and Disease

When both exposure and disease are misclassified, we need to pay close attention to reducing these errors whenever possible. Most researchers prefer to evaluate errors for the more important one first, conducting a “what if?” analysis one set of errors at a time.

11.6.6 Differential Misclassification

If the errors in exposure classification are related to the status of the outcome under study, the errors are called differential:

Differential exposure misclassification

Se_E|D+ ≠ Se_E|D− and/or Sp_E|D+ ≠ Sp_E|D−

Misclassification is differential when measurement accuracy differs between cases and non-cases (recall bias is the classic example). The resulting bias can run in either direction.

The resulting bias may be in any direction, either exaggerating or underestimating the true association. In case-control studies, recall bias is one common illustration: ‘affected’ subjects (cases) may have increased sensitivity, and perhaps lower specificity, than non-affected subjects in recalling previous exposures.

Figure 11.1. Non-differential misclassification of a binary measure pulls the observed estimate predictably toward the null; differential misclassification can move it either way and by an unknown amount.
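A small sketch makes the "either direction" point concrete. It reuses the true counts from Example 11.3 and applies exposure Se and Sp that differ between cases and non-cases; the specific values (for instance, cases recalling exposure with higher sensitivity but lower specificity) are hypothetical, as is the function name `observed_or()`.

```r
# Observed OR when exposure Se/Sp may differ between cases and non-cases.
observed_or <- function(a1, a0, b1, b0, Se_case, Sp_case, Se_ctrl, Sp_ctrl) {
  a1_obs <- Se_case * a1 + (1 - Sp_case) * a0   # cases classified as exposed
  a0_obs <- (a1 + a0) - a1_obs                  # case total is preserved
  b1_obs <- Se_ctrl * b1 + (1 - Sp_ctrl) * b0   # non-cases classified as exposed
  b0_obs <- (b1 + b0) - b1_obs                  # non-case total is preserved
  (a1_obs * b0_obs) / (a0_obs * b1_obs)
}

true_or <- (90 * 630) / (70 * 210)              # about 3.86

# Non-differential: same Se/Sp in both groups
or_nondiff <- observed_or(90, 70, 210, 630, 0.80, 0.90, 0.80, 0.90)  # about 2.57

# Differential, over-reporting by cases (higher Se, lower Sp among cases)
or_away <- observed_or(90, 70, 210, 630, 0.95, 0.80, 0.80, 0.90)     # about 4.34

# Differential, under-reporting by cases (lower Se among cases)
or_toward <- observed_or(90, 70, 210, 630, 0.60, 0.90, 0.80, 0.90)   # about 1.62

round(c(true = true_or, nondiff = or_nondiff, away = or_away, toward = or_toward), 2)
```

The same true table yields an observed OR above or below the truth depending only on how the errors differ between cases and non-cases.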

11.6.7 Reducing Misclassification Errors

Strategies for Reducing Misclassification

Use clear and explicit guidelines for classification
Have well-trained, consistent research personnel
Double-check exposure and disease status when possible (e.g., lab confirmations, confirmatory records)
Validate the test or survey instrument prior to widespread use
Collect specific rather than general exposure data (to reduce attenuation)
Use blinding techniques so that any errors made by survey personnel are equalised across comparison groups (i.e., kept non-differential)
Reduce misclassification of extraneous variables (confounders) as well, since poorly measured confounders cannot be fully controlled

Reflection

Why is non-differential misclassification generally considered less “dangerous” than differential misclassification? Under what circumstances might non-differential misclassification still be problematic?

Model answer: Non-differential misclassification is less ‘dangerous’ in the sense that the direction of bias is predictable: for a binary exposure (with Se + Sp > 1), it pulls the OR toward 1, so a non-null finding is at worst an underestimate of the true effect. Differential misclassification (different sensitivities or specificities in exposed vs. unexposed, or in cases vs. controls) can move the estimate in either direction unpredictably and can even fabricate associations that aren't real. However, non-differential misclassification is still problematic when (a) the exposure has more than 2 categories, since bias is not guaranteed toward the null in 3+ category exposures, (b) you are testing a null hypothesis (Type II error inflation), (c) you need precise effect estimates for policy, or (d) errors in both exposure and outcome compound (the bias can move away from the null under specific correlated-error structures).

Section 4 of 5

Validation, Measurement Error & Correction

⏱ Estimated reading time: 20 minutes

Sub-studies, regression calibration, surrogate measures, and sample-size adjustments.

The workhorse method

Validation sub-studies

Validation direction

True → Observed
\(P(D'=1 \mid D=1)\)
Probability of observed state given true state.

Correction direction

Observed → True
\(P(D=1 \mid D'=1)\)
Probability of true state given observed state.

Critical requirement: sensitivity and specificity must be transportable from the validation dataset to the main study population.

The correction toolkit

Regression calibration

Eq 11.7: Naive model (biased)

\[ \color{#0B7B6B}{Y} = \beta_{0u} + \color{#C2410C}{\beta_{1u}}\color{#6D28D9}{X_1'} + \color{#C2410C}{\beta_{2u}}\color{#6D28D9}{X_2'} \]

Y: outcome; β_u: uncorrected (attenuated) slopes; X′: error-prone measured exposures

Eqs 11.8-11.9: Calibration equations

\[ \color{#047857}{X_1} = \beta_0 + \color{#BE185D}{\lambda_{11}}\color{#6D28D9}{X_1'} + \color{#BE185D}{\lambda_{12}}\color{#6D28D9}{X_2'},\qquad \color{#047857}{X_2} = \beta_0 + \color{#BE185D}{\lambda_{21}}\color{#6D28D9}{X_1'} + \color{#BE185D}{\lambda_{22}}\color{#6D28D9}{X_2'} \]

X: calibrated (true-scale) exposure; λ: calibration coefficients; X′: measured exposures

Eq 11.10: Calibrated model (less biased)

\[ \color{#0B7B6B}{Y} = \beta_{0rc} + \color{#C2410C}{\beta_{1rc}}\color{#047857}{X_{1rc}} + \color{#C2410C}{\beta_{2rc}}\color{#047857}{X_{2rc}} \]

Y: outcome; β_rc: regression-calibration slopes; X_rc: calibrated exposures

A real-world example

Surrogate exposures: CANUE

CANUE assigns PM_2.5 concentrations by postal-code centroid, a surrogate for true personal exposure.

Sources of error:

Residential mobility across postal codes
Indoor-outdoor concentration differences
Commuting exposure outside the residential area

Result: non-differential measurement error, biasing health-effect estimates toward the null.

Implication

CANUE-based estimates are conservative. The true association between air pollution and health is likely stronger than reported.

Sample size consequences

Adjusting sample size for misclassification

Adjusted observed frequencies

\[ \color{#C2410C}{p_1'} = \color{#0B7B6B}{Se} \cdot p_1 + (1-\color{#1D4ED8}{Sp})(1-p_1) \]
\[ \color{#C2410C}{p_2'} = \color{#0B7B6B}{Se} \cdot p_2 + (1-\color{#1D4ED8}{Sp})(1-p_2) \]

p′: observed disease proportion; Se: sensitivity; Sp: specificity

Because \(p_1' - p_2' = (Se + Sp - 1)(p_1 - p_2)\) under non-differential misclassification, the apparent difference is smaller than the true one whenever classification is imperfect.

Use adjusted frequencies \(p_1'\) and \(p_2'\) in the sample-size formula, not the true population values. Failing to do so produces an underpowered study.

Wrapping up

Reading administrative-data studies

Always ask whether the case algorithm was validated in a population similar to the study population (transportability).
CCDSS diabetes definition: Se ~86%, Sp ~99%, well-validated. Mental-health diagnoses in claims: Se often 50-70%, non-differential misclassification problem.
Post-hoc corrections are sensitive to Se/Sp choices. Quantitative bias analysis (Lash et al., 2014) provides the formal framework.

Introduction and Overview

Earlier sections named the bias types and described their predictable effects. This section turns to the practical question of what an investigator can do about them. Validation sub-studies, formal measurement-error models, surrogate-measure adjustments, and sample-size adjustments for misclassification all give the working epidemiologist tools for converting awareness of bias into corrected estimates. These methods are now grouped under quantitative bias analysis, summarised in good-practice form by Lash et al. (2014), building on Greenland's (1996) basic methods for sensitivity analysis of biases and his multiple-bias modelling framework (Greenland, 2005).

Learning Objectives

Design a validation sub-study (two-stage sampling) to estimate sensitivity and specificity directly.
Distinguish validation (true→observed) from correction (observed→true) and apply regression calibration, maximum likelihood, semi-parametric, or Bayesian methods.
Identify error structures in surrogate measures of exposure (e.g., diet recall, occupational records).
Adjust required sample size to compensate for the loss of statistical power caused by nondifferential misclassification.

11.7 Validation Studies to Correct Misclassification

A thorough review of validation studies to correct misclassification identified four main approaches: regression calibration, maximum likelihood, semi-parametric, and Bayesian methods. One key finding is that the more advanced methods are not user-friendly, while ‘simple’ approaches have important limitations.

For validation, we select a subsample of study subjects and verify their exposure and/or disease status. For direct estimates of sensitivity and specificity, we are determining:

Validation: observed given true

p(D′ = 1 | D = 1): probability of the observed state given the true state

Validation moves from truth to measurement: among people who truly have the condition, how often does the measurement record them as positive? This is the measurement sensitivity.

Whereas when correcting for misclassification, we attempt to determine the reverse:

Correction: true given observed

p(D = 1 | D′ = 1): probability of the true state given the observed state

Correction runs the other way: among people the measurement flags as positive, how many truly have the condition? This is the predictive value used to adjust the data.

Two-stage samples (Chapter 10) are useful for validation. We select a subsample and verify their true status to obtain direct estimates of Se and Sp.
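The two directions can be made concrete with a hypothetical validation sub-study. The counts below are invented for illustration (chosen to give Se = 0.86 and Sp = 0.99), and the sketch assumes the subsample is a simple random sample, so that its prevalence is meaningful; `ppv_at()` is an illustrative helper, not a course function.

```r
# Hypothetical validation sub-study: rows = true status, columns = observed status
validation <- matrix(c(86,  14,
                        9, 891),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(true = c("D+", "D-"),
                                     observed = c("D'+", "D'-")))

# Validation direction: observed given true
Se  <- validation["D+", "D'+"] / sum(validation["D+", ])    # P(D' = 1 | D = 1)
Sp  <- validation["D-", "D'-"] / sum(validation["D-", ])    # P(D' = 0 | D = 0)

# Correction direction: true given observed
PPV <- validation["D+", "D'+"] / sum(validation[, "D'+"])   # P(D = 1 | D' = 1)
NPV <- validation["D-", "D'-"] / sum(validation[, "D'-"])   # P(D = 0 | D' = 0)

# Bayes' rule: the predictive value depends on prevalence; Se and Sp do not
ppv_at <- function(prev, Se, Sp) Se * prev / (Se * prev + (1 - Sp) * (1 - prev))

round(c(Se = Se, Sp = Sp, PPV = PPV, NPV = NPV), 3)
ppv_at(0.10, Se, Sp)   # equals PPV above (prevalence in this sub-study is 10%)
ppv_at(0.02, Se, Sp)   # about 0.64: same Se and Sp, rarer condition, lower PPV
```

This is one reason predictive values estimated in one population transport less well than Se and Sp: they shift with prevalence even when the measurement itself performs identically.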

Approach	Description	Limitations
Regression Calibration	Use a validation subsample to calibrate measurement errors; regress true values on observed values	Assumes non-differential errors; needs modification for differential errors
Maximum Likelihood	Jointly model the true and observed data using likelihood functions	Complex; not user-friendly
Semi-parametric	Fewer distributional assumptions than maximum likelihood	Still technically demanding
Bayesian	Incorporate prior information about error rates; can use hidden Markov models	Requires specification of priors; can be sensitive to prior choices

Caution: Sensitivity to Error Rate Estimates

Post-hoc adjustments for misclassification are very sensitive to changes in the error rate estimates used. Unless there is an extremely thorough validation procedure, different ‘corrected’ results could arise from a range of apparently sensible choices of the correction factor.

It is very important that the sensitivity and specificity of the measurement be equivalent (‘transportable’) in the two datasets (validation and study) before attempting to adjust for errors.
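The sensitivity of post-hoc corrections can be illustrated with a small grid. The sketch below takes the observed counts from Example 11.3 (79, 81, 231, 609) and back-corrects them over a range of plausible-looking Se_E and Sp_E values; the grid values and the function name `corrected_or()` are assumptions made for illustration.

```r
# Corrected OR from observed counts, assuming non-differential exposure errors
corrected_or <- function(a1_obs, a0_obs, b1_obs, b0_obs, Se, Sp) {
  m1 <- a1_obs + a0_obs
  m0 <- b1_obs + b0_obs
  a1 <- (a1_obs - (1 - Sp) * m1) / (Se + Sp - 1)
  b1 <- (b1_obs - (1 - Sp) * m0) / (Se + Sp - 1)
  (a1 * (m0 - b1)) / ((m1 - a1) * b1)
}

# Every combination of three Se values and three Sp values
grid <- expand.grid(Se = c(0.75, 0.80, 0.85), Sp = c(0.85, 0.90, 0.95))
grid$OR_corrected <- mapply(
  corrected_or, Se = grid$Se, Sp = grid$Sp,
  MoreArgs = list(a1_obs = 79, a0_obs = 81, b1_obs = 231, b0_obs = 609)
)

grid                       # one corrected OR per (Se, Sp) combination
range(grid$OR_corrected)   # roughly 3.2 to 5.1
```

Shifting Se or Sp by only 0.05 moves the "corrected" OR substantially around the value of 3.86 obtained at Se = 0.80 and Sp = 0.90, which is why a single corrected estimate should be reported alongside a range.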

Validating Case Definitions in Canadian Administrative Data

When studies use administrative data (physician billings, hospital discharges, prescription claims), cases are identified by an algorithm, not a clinical diagnosis. Each algorithm has been (or should have been) validated against a chart-review or clinical-registry gold standard. Examples:

CCDSS diabetes case definition: one hospital discharge OR two physician claims with an ICD diabetes code in two years; sensitivity ~86%, specificity ~99%, validated against primary-care chart review.
Asthma, COPD, and hypertension definitions used in CCDSS and PopData BC studies similarly have published Se/Sp.
Mental-health diagnoses in claims have notoriously lower sensitivity (often 50–70%) because many cases are managed without a billing event, an important non-differential misclassification problem.

Whenever you read or run an administrative-data study, check whether the case algorithm was validated in a population similar to the study population; transportability is exactly the issue flagged in the caution above.

11.8 Measurement Error

Errors in measuring quantitative factors can lead to biased measures of association. The bias can arise because the variable is not measured accurately (systematic bias) or due to a lack of precision (variability).

Regression Calibration Estimate (RCE)

To introduce the concepts of correcting measurement errors, suppose we have 2 quantitative exposure factors (X₁ and X₂) and a binary or continuous outcome (Y). The uncorrected ‘naive’ model is:

Eq 11.7: Naive model

Y = β_0u + β_1uX₁′ + β_2uX₂′

The naive regression uses the error-prone measured exposure in place of the true exposure, so its slope is attenuated toward zero.

where the subscript ‘u’ indicates the coefficients are biased because the predictor variables (X′) are measured with error. The regression calibration estimate (RCE) involves:

Step 1: Perform a Validation Study

Take a random subset of study subjects and obtain the true values for X₁ and X₂. Regress each true X variable on the set of observed predictor variables:

Eq 11.8 and 11.9: Calibration model

X₁ = β₀ + λ₁₁X₁′ + λ₁₂X₂′
X₂ = β₀ + λ₂₁X₁′ + λ₂₂X₂′

A calibration (or validation) sub-study regresses the true exposure on the measured one, supplying the factors needed to undo the attenuation.

Step 2: Predict and Regress

Calculate the estimated (predicted) X values for all study subjects (X_1rc and X_2rc) using the calibration equations. Then regress Y on these estimated values:

Eq 11.10: Calibrated model

Y = β_0rc + β_1rcX_1rc + β_2rcX_2rc

Regression calibration plugs the calibrated exposure back into the outcome model, recovering a slope closer to the true association.

The coefficients β_1rc and β_2rc should provide less biased estimates of the true X–Y associations than the naïve estimates. Standard errors need to be adjusted for the calibration process.
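The two steps can be sketched on simulated data. Everything below is hypothetical: the sample sizes, the true slopes (0.5 and 0.3), the classical error model with reliability of about 0.5, and the variable names. It is a minimal illustration of the regression calibration idea under those assumptions, not a complete analysis.

```r
set.seed(341)   # arbitrary seed, for reproducibility only

# Simulated data: true exposures, error-prone measurements, continuous outcome
n  <- 5000
x1 <- rnorm(n)
x2 <- rnorm(n)
dat <- data.frame(
  x1 = x1, x2 = x2,
  x1_obs = x1 + rnorm(n),                  # classical measurement error
  x2_obs = x2 + rnorm(n),
  y = 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n)   # true slopes: 0.5 and 0.3
)

# Naive model (Eq 11.7): outcome regressed on the error-prone measurements
naive <- lm(y ~ x1_obs + x2_obs, data = dat)

# Step 1 (Eqs 11.8-11.9): validation subset in which the true values are known
val  <- dat[sample(n, 500), ]
cal1 <- lm(x1 ~ x1_obs + x2_obs, data = val)
cal2 <- lm(x2 ~ x1_obs + x2_obs, data = val)

# Step 2 (Eq 11.10): predict calibrated exposures for everyone, then refit
dat$x1_rc <- predict(cal1, newdata = dat)
dat$x2_rc <- predict(cal2, newdata = dat)
rc <- lm(y ~ x1_rc + x2_rc, data = dat)

# Compare intercept and slopes: naive slopes are attenuated, calibrated ones are not
round(rbind(naive = coef(naive), calibrated = coef(rc)), 2)
```

The standard errors that `lm()` reports for the calibrated model ignore the uncertainty in the calibration step; a bootstrap that resamples both the validation subset and the main study is one common way to account for it.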

11.9 Errors in Surrogate Measures of Exposure

Often, epidemiologists study the effects of a complex exposure using surrogate measures. For example, in air pollution studies, what is the ‘appropriate’ measure? It could be a complex mixture of agents, doses, and durations.

Key Considerations for Surrogate Measures

Should exposure be measured on a continuous scale (preferred) or categorised as dichotomous/ordinal?
If specific agents are highly correlated, which one should be analysed, or should a composite variable be created?
Even if variables are measured “without error,” they may still be surrogates that fail to reflect true exposure
One solution: ask about the effects of measurable components (e.g., sulphur dioxide) rather than the broad concept (“air pollution”)

Surrogate Exposures in Practice: CANUE

The Canadian Urban Environmental Health Research Consortium (CANUE) assigns environmental exposures to subjects by their postal code. This is a textbook surrogate-measure problem:

The CANUE PM_2.5 value is the modelled annual average at the postal-code centroid, not what the participant actually breathed.
People move; postal codes change; indoor/outdoor differences are large; commuting exposes people to pollution outside their residential postal code.
The result is non-differential exposure measurement error, biasing health-effect estimates toward the null (the same direction discussed in 11.6).

This does not mean CANUE-based studies are wrong; it means their estimates are conservative and should be interpreted with the surrogate-measure framework above.

11.10 Impact of Information Bias on Sample Size

Classification and measurement errors can have a serious impact on measures of association. With non-differential misclassification, measures are biased toward the null; with classical measurement error models, the same is true for continuous variables. This leads to an important conclusion:

Sample Size Implications

The projected loss of power due to information errors should be considered and the sample size increased accordingly. The formulae for sample size estimation assumed that p₁ and p₂ were true population levels. However, with an imperfect test, the observed disease frequencies would be:

p₁′ = Se·p₁ + (1 − Sp)(1 − p₁)
p₂′ = Se·p₂ + (1 − Sp)(1 − p₂)

When Se and Sp are the same in both groups, the difference p₁′ − p₂′ equals (Se + Sp − 1)(p₁ − p₂), so it is smaller than p₁ − p₂ whenever the test is imperfect, and it is the adjusted estimates that should be used to calculate sample size. Obuchowski (1998) generalises sample-size estimation to account for misclassification, response bias, and other features of clinical trials.
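The effect on required sample size can be sketched with base R's `power.prop.test()`. The true risks (0.30 and 0.15), Se = 0.85, Sp = 0.95, 80% power, and the 5% significance level are all hypothetical choices made for illustration; `observed_p()` is an illustrative helper.

```r
# Observed proportion under imperfect classification of the outcome
observed_p <- function(p, Se, Sp) Se * p + (1 - Sp) * (1 - p)

p1 <- 0.30; p2 <- 0.15     # hypothetical true disease frequencies
Se <- 0.85; Sp <- 0.95     # hypothetical test characteristics

p1_obs <- observed_p(p1, Se, Sp)   # 0.29
p2_obs <- observed_p(p2, Se, Sp)   # 0.17

# The observed difference shrinks by the factor (Se + Sp - 1)
c(true_diff = p1 - p2, observed_diff = p1_obs - p2_obs)

# Per-group sample size for 80% power, two-sided alpha = 0.05
n_true <- power.prop.test(p1 = p1, p2 = p2,
                          power = 0.80, sig.level = 0.05)$n
n_obs  <- power.prop.test(p1 = p1_obs, p2 = p2_obs,
                          power = 0.80, sig.level = 0.05)$n

ceiling(c(using_true_p = n_true, using_observed_p = n_obs))
```

Planning with the true proportions would understate the required sample size; the calculation based on p₁′ and p₂′ is the one that matches what the study will actually observe.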

Summary: Types of Information Bias

Type	Definition	Direction of Bias	Example
Non-differential misclassification	Classification errors are equal across comparison groups	Toward the null (for dichotomous variables)	Self-reported smoking with same error rate in cases and controls
Differential misclassification	Classification errors differ by disease or exposure status	Any direction (unpredictable)	Recall bias in case-control studies
Non-differential measurement error	Errors in continuous variables are equal across groups	Toward the null	Random variability in blood pressure readings
Misclassification of confounders	Errors in measuring extraneous variables	Incomplete control of confounding	Poorly categorised socioeconomic status

R Activity: E-values for unmeasured confounding + bias-adjusted OR

The companion R script r-activities/HSCI_341_Lesson_11_Validity_in_Observational_Studies.R walks through two quantitative bias analyses: (A) compute the E-value for an observed OR and HR using the EValue package (VanderWeele & Ding, 2017), and (B) apply Greenland's simple bias-adjustment for non-differential exposure misclassification to see how observed effects compare to corrected effects. A complementary diagnostic for residual confounding is the negative-control design (Lipsitch, Tchetgen Tchetgen, & Cohen, 2010).

# PART A -- E-values for unmeasured confounding
library(EValue)

# Study reports OR = 2.4 (95% CI 1.6, 3.5); treat as rare-disease RR
evalues.OR(est = 2.4, lo = 1.6, hi = 3.5, rare = TRUE)

# Same idea for a hazard ratio
evalues.HR(est = 1.8, lo = 1.3, hi = 2.5, rare = FALSE)

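# PART B -- Greenland's simple bias adjustment for misclassification
# P_E_unexp is read here as the observed exposure prevalence among non-cases
# (an assumption: the original function accepted this argument but never used it).
correct_OR <- function(OR_obs, Se = 0.95, Sp = 0.95, P_E_unexp = 0.30) {
  odds0_obs <- P_E_unexp / (1 - P_E_unexp)        # observed exposure odds, non-cases
  odds1_obs <- OR_obs * odds0_obs                 # observed exposure odds, cases
  p_obs  <- c(odds1_obs / (1 + odds1_obs), P_E_unexp)
  p_true <- (p_obs - (1 - Sp)) / (Se + Sp - 1)    # back-correct each prevalence
  odds_true <- p_true / (1 - p_true)
  odds_true[1] / odds_true[2]                     # bias-adjusted OR
}
correct_OR(OR_obs = 2.4, Se = 0.85, Sp = 0.95)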

What you should be able to do after this activity: compute and interpret an E-value (for point estimate AND CI bound), and apply a simple misclassification correction to an observed OR to see how the corrected estimate compares.

R Reflect on what you just ran

Use the questions below to interpret the actual numbers evalues.OR(), evalues.HR(), and correct_OR() produced. Look at your console output before answering.

1. From evalues.OR(2.4, lo = 1.6, hi = 3.5, rare = TRUE), report both E-values (point estimate and CI lower bound). Explain in one sentence what each one means in terms of the strength of confounding required to nullify the finding.

Model answer: evalues.OR(2.4, lo=1.6, hi=3.5, rare=TRUE) returns an E-value of about 4.23 for the point estimate and about 2.58 for the lower CI bound. The point E-value means that an unmeasured confounder would need risk-ratio associations of at least 4.23 with both the exposure AND the outcome (above and beyond measured confounders) to fully explain away the observed OR of 2.4. The CI-bound E-value (2.58) is the smaller hurdle: the confounding strength needed to move the lower confidence limit to the null. Together they tell you how robust the finding is to unmeasured confounding.

2. The HR example (HR = 1.8) returned a smaller E-value than the OR example. Why does a weaker observed effect lead to a smaller E-value, and what does that imply about how easy it would be to explain the HR away with an unmeasured confounder?

Model answer: A weaker observed effect (HR 1.8 vs. OR 2.4) requires a weaker unmeasured confounder to nullify it; the E-value scales monotonically with effect size. Algebraically, E-value = RR + √(RR × (RR − 1)) for RR > 1, so smaller RR → smaller E-value. Treating HR = 1.8 as a risk ratio (rare outcome) gives 1.8 + √(1.8 × 0.8) = 3.0; with rare = FALSE, as in the script, the HR is first converted to an approximate RR of about 1.5, which gives an E-value of about 2.4. Either way, a confounder with associations of roughly 2.4- to 3-fold with both exposure and outcome would suffice. Implication: the HR result is more vulnerable to an unmeasured confounder of modest strength than the OR = 2.4 result. The weaker an observational finding, the less robust it is to plausible alternative explanations.

3. correct_OR(2.4, Se = 0.85, Sp = 0.95) returned a bias-adjusted OR. Was it larger or smaller than 2.4, and why does non-differential exposure misclassification typically bias the OR toward the null? What would correct_OR(2.4, Se = 1, Sp = 1) equal, and why?

Model answer: correct_OR(2.4, Se=0.85, Sp=0.95) returns a bias-adjusted OR larger than 2.4, typically around 2.9–3.1. Non-differential exposure misclassification dilutes the observed OR toward 1.0 (the null) because mislabelling some truly-exposed as unexposed (and vice versa) mixes the two groups and reduces the contrast between them. The correction backs out the ‘true’ OR by inverting the misclassification matrix. correct_OR(2.4, Se=1, Sp=1) equals exactly 2.4: when both sensitivity and specificity are perfect, no correction is needed because there is no misclassification to undo.
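The E-value formula quoted in the answer to question 2 is simple enough to check by hand. The base-R sketch below implements it for a risk ratio, following VanderWeele & Ding (2017); the helper names `e_value()` and `e_value_ci()` are hypothetical and are not functions from the EValue package.

```r
# E-value for a risk ratio; estimates below 1 are inverted first.
e_value <- function(rr) {
  rr <- ifelse(rr < 1, 1 / rr, rr)
  rr + sqrt(rr * (rr - 1))
}

# E-value for the confidence limit closest to the null.
# If the interval includes 1, no confounding is needed to reach the null.
e_value_ci <- function(lo, hi) {
  if (lo <= 1 && hi >= 1) return(1)
  if (lo > 1) e_value(lo) else e_value(hi)
}

e_value(2.4)           # about 4.23
e_value_ci(1.6, 3.5)   # about 2.58
e_value(1.8)           # 3 (treating the HR as a risk ratio, i.e. a rare outcome)
```

For a common outcome, an OR or HR would first need to be converted to an approximate risk ratio before this formula is applied; that conversion is not implemented in the sketch.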

Reflection

Consider the practical challenges of conducting a validation study to correct for misclassification. Why might it be difficult to obtain “true” values, and how could the sensitivity of corrections to error rate estimates affect your confidence in the adjusted results?

Model answer: Obtaining ‘true’ values for validation is hard because the gold standard is often expensive, invasive (biopsies, recovery biomarkers), or impossible (recall of long-ago exposures). Sample size for validation studies is constrained because each validated subject is costly, so estimates of Se and Sp themselves have wide CIs. Sensitivity to misspecification: if your assumed Se = 0.85 is off by ± 0.05, the corrected OR can shift considerably (correction is non-linear in Se/Sp near boundaries). Consequence: an adjusted OR appears confident but its real CI is much wider than reported once Se/Sp uncertainty propagates. Best practice: include the validation sub-sample's standard errors in a bootstrap analysis of the corrected estimate; present a range of plausible corrections; never report a single ‘corrected’ OR without quantifying the additional uncertainty introduced by the correction.

HSCI 341, Lesson 11
