Foundations of Epidemiology
Evaluating Epidemiological Research — HSCI 230
Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University
Learning objectives for this lesson:
- Trace the development of epidemiology from ancient plagues to modern public-health surveillance
- Identify foundational figures, methods, and innovations that shaped the discipline
- Critically examine the colonial, racial, and political origins of epidemiological knowledge
- Use Actor-Network Theory to explain how scientific advances emerge from networks of people, technologies, and institutions
- Distinguish epistemology, ontology, and axiology, and locate epidemiology among the major research paradigms
- Evaluate the strengths and limits of Western quantitative science as a way of knowing
- Describe how Indigenous knowledge systems, lived experience, and qualitative methods contribute to public-health evidence
- Define research misconduct, questionable research practices, and the manufactured-doubt strategy used by industry
- Explain the replication crisis and the open-science reforms (preregistration, registered reports, data and code sharing) it has prompted
- Apply OCAP® and EGAP principles to evaluate the trustworthiness and ethical conduct of public-health research
This course was developed by Dr. Kiffer G. Card, Faculty of Health Sciences, Simon Fraser University.
History of Epidemiology
⏱ Estimated reading time: ~40 minutes
Introduction and Overview
To understand what epidemiology is today, it helps to see how it became a discipline. In this section we will travel from the ancient Mediterranean to the cholera-stricken streets of Victorian London, pausing at four turning points: the first naturalistic theories of disease, the first systematic counts of the dead, the first vaccines, and the sanitary investigations of the 19th century. Each step adds a new tool to the epidemiological toolkit. Just as importantly, each step shows that progress depends on more than a single brilliant mind — it depends on the alignment of people, institutions, and technologies. We will return to that idea, the actor-network view of science, again and again.
Section 1 ended with a 19th-century discipline that knew how to gather data and use it to argue for sanitation, vaccination, and reform. This section asks what happened next. We will trace four phases of 20th-century development, look at the data behind the claim that the world has gotten dramatically healthier, and meet the actor-network perspective again — this time as a way of understanding why those gains were possible at all. By the end of the section you should be able to defend a careful, evidence-based version of the optimistic story, while also being primed to ask, in Section 3, who has been left out of it.
This section reads against the grain of Sections 1 and 2. We will start by introducing a conceptual lens — Foucault's idea of biopower — that lets us see population-level health management as a form of political power. We then revisit the very same period covered earlier (the colonial 18th and 19th centuries, the modern 20th) and ask what the standard story leaves out: which populations supplied the bodies, the suffering, and the data that produced "modern epidemiology." We will close with a contemporary example showing that this is not just a historical concern — the health disparities we measure today still bear the imprint of the systems we are about to discuss.
Learning Objectives
- Trace the development of epidemiology from its ancient roots through the 19th-century sanitary era and into the chronic-disease and molecular eras of the 20th and 21st centuries.
- Explain how networks of people, institutions, and technologies — not lone geniuses — produce advances in public health.
- Use evidence to evaluate claims about global health progress, including the shift from infectious to chronic disease.
- Apply Foucault’s concepts of biopower and biopolitics to analyse how colonialism, slavery, and war shaped the development of epidemiological knowledge.
- Connect specific historical cases of research exploitation to contemporary patterns of health inequity.
Ancient Roots: Disease and Environment
Long before the germ theory of disease, humans tried to make sense of why some people fell ill and others did not. The earliest epidemiological thinking emerged from the recognition that disease is not random — it follows patterns connected to the environment, seasons, and ways of life.
Hippocrates (c. 460–377 BC) is often cited as the first epidemiologist. In his treatise Airs, Waters, and Places, he urged physicians to consider the effects of climate, water quality, and geography on health. He introduced the terms epidemic (diseases that visit a community) and endemic (diseases that reside within a community), distinctions that remain foundational today.
Key Concept: Environment and Disease
Hippocrates' insight was radical for its time: rather than attributing disease to the wrath of gods, he proposed that illness had natural causes tied to environmental conditions. This shift — from supernatural to natural explanations — was one of the most consequential intellectual moves in the history of medicine.
For centuries after Hippocrates, the dominant Western theory of disease was miasma theory: the idea that illness was caused by "bad air" rising from rotting organic matter and swamps. Although wrong about the mechanism, miasma theory was productive: it motivated sanitation reforms, clean water infrastructure, and drainage projects that genuinely improved health, a case of doing the right thing for the wrong theoretical reason.
Hippocratic and miasmatic thinking gave epidemiology its first big idea — that disease has natural, environmental causes — but it gave us no way to measure health at the level of a population. For that, we have to wait roughly two thousand years, until parish clerks in 17th-century London started keeping careful records of who had died, and of what.
The 17th Century: Counting the Dead
The emergence of epidemiology as a quantitative discipline required a seemingly simple innovation: the systematic recording of births and deaths.
John Graunt (1620–1674), a London haberdasher with no formal medical training, published Natural and Political Observations Made upon the Bills of Mortality in 1662. Graunt analyzed London's weekly death records and discovered remarkable patterns: the regularity of sex ratios at birth, seasonal variations in mortality, and differences in urban versus rural death rates. He constructed the first known life table, estimating the probability of survival to each age.
Actor-Network Insight: Bureaucracy as a Tool of Knowledge
Graunt's breakthrough was only possible because of an existing bureaucratic infrastructure: the London Bills of Mortality, which had been compiled since 1532, originally to track plague outbreaks. The administrative apparatus of parish clerks recording deaths, week by week, created a dataset that no individual physician could have assembled. This is a pattern we will see repeatedly: epidemiological advances depend on networks of institutions, technologies, and actors — not just individual brilliance.
Graunt did his work with a quill pen and a ledger; today, every analysis you will read in this course was produced with software. Before we move on, it is worth pausing to see how the kind of arithmetic Graunt did by hand looks in modern practice. The next box is the first of many short hands-on detours into R, the statistical language used throughout epidemiology. Treat these boxes as optional but encouraged: they are how the conceptual material in this course connects to the work of an actual analyst.
What you'll do: reproduce Graunt's 1662 life table in five lines of code, then plot the survival curve it implies. What to take away: the same arithmetic that took Graunt months can now be done in seconds — and the data structures you meet here (vectors, data frames) are the same ones you will use for every analysis in HSCI 341 and 410.
Throughout this course, you will see orange “R” boxes like this one. They show how the concepts you are learning translate into R — the open-source statistical language used by epidemiologists worldwide. You don't need to be fluent in R to follow this course, but exposure now will pay dividends later. Install R from cran.r-project.org and the friendlier RStudio Desktop from posit.co/download/rstudio-desktop. Open RStudio, paste the code below into the Console (or a script), and press Enter.
# Graunt's 1662 life table: of 100 people born, how many survive to each age?
# These are Graunt's original (rough) figures, written into two vectors.
age <- c(0, 6, 16, 26, 36, 46, 56, 66, 76, 80)
survivors <- c(100, 64, 40, 25, 16, 10, 6, 3, 1, 0)
# Bundle them into a data.frame — R's table-like object.
graunt <- data.frame(age = age, survivors = survivors)
# Probability of surviving from birth to each age:
graunt$survival_prob <- graunt$survivors / 100
# Print and plot the life table:
print(graunt)
plot(graunt$age, graunt$survival_prob,
     type = "b", pch = 19,
     xlab = "Age (years)",
     ylab = "Probability of surviving from birth",
     main = "Graunt's 1662 Life Table for London")
What just happened? The arrow <- assigns a value to a name; c() combines values into a vector; data.frame() builds a table; $ picks a column. You just did, in five lines, what took Graunt months of manual tabulation. The same logic underlies modern actuarial tables and survival analysis.
R Reflect on what you just ran
Use the questions below to interpret the output you produced. Look at your console / plot before answering.
1. Look at the printed graunt data frame. What is the survival probability at age 16, and how does it compare to the survival probability at age 26? What does the difference suggest about childhood mortality in 17th-century London?
At age 16, the survival_prob column reads 0.40 (40 of 100 original births still alive); at age 26 it reads 0.25 (25 alive). So 15 out of every 100 children born died between their 16th and 26th birthdays, but a much larger drop has already happened earlier (60% of the original cohort is gone by age 16). The combined picture is brutal: surviving to working age in 17th-century London required outrunning a death rate that was front-loaded into early childhood. Graunt's data is what convinced 1660s readers that infant and child mortality, not just adult disease, drove population dynamics: the founding intuition of demographic epidemiology.
2. Examine the plot. Between which two ages does the survival curve drop the most steeply? What does that tell you about the riskiest part of the lifespan in Graunt's data?
3. Now look at the conditional survival you computed in the stretch challenge. Which age interval had the lowest conditional survival - i.e., the riskiest decade of life given that you had already reached the start of it?
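The conditional survival that question 3 refers to can be sketched as follows. This is one minimal way to compute it, assuming "conditional survival" means the proportion of people alive at the start of an age interval who are still alive at its end; the object names (n, cond_surv, intervals) are illustrative and not part of the original box.

```r
# Graunt's figures, repeated so this block runs on its own.
age       <- c(0, 6, 16, 26, 36, 46, 56, 66, 76, 80)
survivors <- c(100, 64, 40, 25, 16, 10, 6, 3, 1, 0)

n <- length(survivors)

# Conditional survival: survivors at the END of each interval divided by
# survivors at the START of that interval.
cond_surv <- survivors[-1] / survivors[-n]

# One row per age interval.
intervals <- data.frame(
  from      = age[-n],
  to        = age[-1],
  cond_surv = cond_surv
)

print(intervals, digits = 3)
```

Note that the intervals are not all the same length (the first spans 6 years and the last only 4), so the rows are not strictly comparable "decades"; keep that in mind when you read the table.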
Graunt showed how the dead could be made to speak. The next century brought a different kind of breakthrough — not a way of counting illness, but a way of preventing it.
The 18th Century: Vaccination and Empirical Observation
Edward Jenner (1749–1823) is celebrated for developing the first vaccine in 1796, when he demonstrated that inoculation with cowpox material protected against smallpox. But Jenner's innovation, too, emerged from a broader network. Dairy workers had long observed that milkmaids who contracted cowpox seemed immune to smallpox — this was folk knowledge, circulating informally for decades before Jenner tested it systematically.
Jenner's contribution was to formalize this observation into an experimental test and to advocate for its widespread adoption. The success of vaccination also depended on state infrastructure: governments needed to organize distribution, manage public trust, and enforce compliance. In 1840, Britain made vaccination free, and in 1853 it became compulsory — an early example of how public health requires the coordination of science, governance, and social institutions.
A Note on Whose Knowledge Counts
Jenner's story also illustrates a recurring tension in the history of science: the folk knowledge of working-class dairymaids was essential to the discovery, yet it is Jenner — the physician who formalized and published the finding — who receives the credit. Throughout the history of epidemiology, we will encounter this pattern of knowledge extraction, where the observations and experiences of ordinary people (often marginalized communities) become the raw material for scientific advances attributed to elite professionals. We will return to this pattern in Section 3.
By the close of the 18th century, epidemiology had its first set of working tools: a naturalistic theory of disease, a way of counting deaths, and a way of preventing at least one infection. What it still lacked was an organized professional community willing to use those tools to confront the urban epidemics that the Industrial Revolution was about to unleash. That community took shape in the 19th century.
The 19th Century: The Golden Age of Sanitation
The 19th century saw the emergence of epidemiology as a recognizable discipline, driven by industrialization, urbanization, and devastating epidemics of cholera, typhus, and tuberculosis.
Four figures dominate the textbooks of this era: John Snow, Ignaz Semmelweis, William Farr, and Florence Nightingale. Each tackled a different problem — cholera, childbed fever, mortality classification, military health — but together they show the era's defining move: turning systematic data into arguments for change. Click through the tabs below to meet each figure, and watch for what they share rather than only what made them famous.
John Snow (1813–1858)
Often called the "father of modern epidemiology," Snow famously traced a London cholera outbreak in 1854 to a contaminated water pump on Broad Street. By carefully mapping cases and comparing water sources, he demonstrated that cholera was spread through contaminated water — not miasma.
Snow's work was groundbreaking because it used systematic evidence to identify a cause of disease, even without understanding the underlying biological mechanism (the cholera bacterium would not be identified until 1884). His approach — mapping cases, comparing exposed and unexposed groups, identifying a specific source — established the basic logic that epidemiology still follows.
However, Snow's work did not happen in a vacuum. His investigation depended on the Registrar General's mortality records (compiled by William Farr), on the cooperation of local officials, and on a growing public demand for sanitary reform. The removal of the Broad Street pump handle was as much a political act as a scientific one.
A 7-scene pixel-art retelling of the 1854 Broad Street investigation: the outbreak, Snow's door-to-door survey, the spot map, the contaminated pump, and the famous handle removal that helped found modern epidemiology.
Ignaz Semmelweis (1818–1865)
Working in the maternity wards of Vienna General Hospital, Semmelweis observed that the mortality rate from puerperal (childbed) fever was dramatically higher in the ward staffed by physicians than in the ward staffed by midwives. He hypothesized that physicians were transmitting "cadaverous particles" from the autopsy room to laboring women.
In 1847, Semmelweis introduced mandatory handwashing with chlorinated lime solution, and the mortality rate dropped from approximately 10–18% to around 1–2%. Despite this dramatic evidence, his findings were rejected by much of the medical establishment, who were offended by the suggestion that their hands could be instruments of death.
Semmelweis's story illustrates how institutional resistance, professional ego, and entrenched hierarchies can delay the adoption of life-saving knowledge. His ideas were only widely accepted after germ theory was established by Pasteur and Koch decades later.
A 7-scene retelling of the 1847 Vienna General Hospital discovery: the two-ward mortality gap, the autopsy-room clue, the chlorinated-lime intervention, the dramatic drop in maternal deaths, and the establishment's tragic rejection.
William Farr (1807–1883)
As the first compiler of statistical abstracts for the Registrar General of England and Wales, Farr developed standardized methods for classifying diseases and analyzing mortality rates. He created techniques for comparing death rates across populations — the forerunner of modern vital statistics and disease surveillance.
Farr's work made epidemiology quantitative and systematic. By standardizing how diseases were categorized and deaths were recorded, he built the infrastructure that allowed Snow and others to do their investigative work. Farr also demonstrated the concept of excess mortality — comparing observed deaths to expected deaths — a technique that remains central to epidemiology today.
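The observed-versus-expected logic behind excess mortality can be sketched in a few lines of R. The numbers below are entirely hypothetical (they are not Farr's data), and the variable names are illustrative; the point is only the arithmetic of comparing what was observed with what a reference death rate would predict.

```r
# HYPOTHETICAL figures for one district in one year (not historical data).
population      <- 50000        # people living in the district
baseline_rate   <- 22 / 1000    # reference death rate: 22 deaths per 1,000 per year
observed_deaths <- 1400         # deaths actually registered

# Expected deaths if the district had experienced the reference rate.
expected_deaths <- population * baseline_rate

# Excess mortality: observed minus expected.
excess_deaths <- observed_deaths - expected_deaths

# The same comparison expressed as a ratio of observed to expected.
obs_exp_ratio <- observed_deaths / expected_deaths

cat("Expected:", expected_deaths,
    "| Observed:", observed_deaths,
    "| Excess:", excess_deaths,
    "| Ratio:", round(obs_exp_ratio, 2), "\n")
```

The choice of reference rate drives the answer: a different baseline (another district, another year) would give a different expected count, which is why the comparison group always has to be stated.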
A 7-scene retelling of how Farr built the data infrastructure of epidemiology at the Registrar General's Office: standardizing disease classification, computing death rates, inventing excess mortality, and enabling everyone who came after.
Florence Nightingale (1820–1910)
Nightingale is best known as a nursing reformer, but she was also a pioneering statistician who used data visualization to advocate for public health reform. Her famous "coxcomb" (polar area) diagrams showed that far more British soldiers in the Crimean War died from preventable infectious disease than from battle wounds.
After the war, Nightingale turned her statistical methods to studying disease in British India, collaborating with investigators, publishing papers, and developing theories about sanitation and disease transmission. She was among the first to use statistical evidence systematically to drive policy change, demonstrating that data could be a powerful tool for advocacy.
A 7-scene retelling of the Crimean War sanitary crisis: arrival at Scutari Barrack Hospital, the disease-vs-wounds gap, the iconic coxcomb diagram, and how data visualization drove parliamentary reform.
Notice the common thread across all four. None of these figures worked alone, and none worked from theory alone. Snow needed Farr's mortality records; Nightingale needed the army's casualty registers; Semmelweis needed a hospital that already kept ward-by-ward statistics. Each took an existing stream of data and turned it into a public argument that something must change. That move — from numbers to action — is what marks the 19th century as the discipline's first real flowering, and it is the move that 20th-century epidemiologists will inherit and refine.
Networks of Progress: An Actor-Network Perspective
If you take a step back from the four figures you just met, you can see why the standard textbook way of telling this story is incomplete. The history of epidemiology is commonly told as a series of "great men" moments (Hippocrates, Graunt, Jenner, Snow), but this narrative obscures the networks of actors that made each breakthrough possible.
Actor-Network Theory (ANT), developed by sociologists Bruno Latour and Michel Callon, offers a richer framework. ANT proposes that scientific advances emerge from networks that include not just human actors (scientists, bureaucrats, patients) but also non-human actors: technologies (the microscope, the printing press, the water pump), institutions (the Registrar General's office, hospitals, parish churches), documents (bills of mortality, death certificates), and even organisms (cholera bacteria, cowpox virus). No single actor produces knowledge alone; innovation is the product of a network.
Thinking with ANT
Consider the Broad Street pump investigation. Snow's "discovery" required: (1) a bureaucratic system producing mortality records; (2) cartographic technology to map cases spatially; (3) a cultural context of sanitary reform that made people receptive to environmental explanations; (4) a water infrastructure that connected households to identifiable pumps; (5) local knowledge from residents about their water habits; and (6) political will to act on the findings. Remove any node from this network and the "breakthrough" does not happen.
This perspective is important because it helps us see that progress in public health is never inevitable. It depends on the alignment of scientific knowledge, technological capacity, institutional infrastructure, cultural readiness, and political will. When these networks function well, health improves. When they break down — or when they are organized to benefit some populations and not others — people suffer.
Where this leaves us. By the close of the 19th century, epidemiology had a recognizable identity: a discipline that gathers population data, looks for patterns, and uses what it finds to argue for change. Section 2 picks up that story in the 20th century, when the focus shifts from infectious outbreaks to chronic disease and when the discipline begins to make a credible claim that the world is, on many measures, getting healthier. Before moving on, take a moment with the knowledge check below to consolidate the names, dates, and ideas from this section.
The Evolution of Modern Epidemiology
The 20th century transformed epidemiology from a discipline focused primarily on tracking infectious disease outbreaks into a sophisticated science of population health (Susser & Susser, 1996a, 1996b). This transformation occurred in recognizable phases, each building on the networks of knowledge, technology, and institutions established by predecessors. The accordion below walks through these phases in order; expand each one to see how the discipline's methods and questions evolved, and notice how each phase reuses infrastructure built in the previous one.
A 6-scene historical montage tracing the postwar pivot: the smoking-and-lung-cancer mystery, Doll & Hill's (1950) case-control study, the 1948 Framingham cohort, decades of risk-factor discovery, and the 1964 Surgeon General's report that established chronic-disease epidemiology.
This phase, which we covered in Section 1, established the foundational practices of systematic data collection. The Bills of Mortality, vital registration systems, and census taking created the raw materials that epidemiology would analyze. The key innovation was not any single discovery but the creation of bureaucratic systems that generated population-level data over time.
The early 20th century saw the development of formal epidemiological investigation methods. Kermack & McKendrick (1927) published the SIR (Susceptible-Infected-Recovered) model, establishing mathematical techniques for predicting the spread of infectious diseases through populations. This model, and its many descendants, would later become essential tools during the HIV/AIDS, SARS, and COVID-19 epidemics.
This era also saw the rise of occupational epidemiology, studying how working conditions affected health, and the beginnings of cohort study methodology.
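To make the SIR idea mentioned above concrete, here is a minimal sketch in base R. It is a common textbook form of the model (new infections proportional to S × I / N), stepped forward with simple fixed-step arithmetic rather than a formal differential-equation solver. The function name sir_euler and every parameter value (beta, gamma, N, I0) are assumptions chosen for illustration; they do not describe any real outbreak.

```r
# Minimal SIR sketch using fixed-step (Euler) updates in base R.
# beta  = transmission rate per day, gamma = recovery rate per day.
sir_euler <- function(beta, gamma, N, I0, days, dt = 0.1) {
  steps <- round(days / dt)
  S <- numeric(steps + 1)
  I <- numeric(steps + 1)
  R <- numeric(steps + 1)
  S[1] <- N - I0
  I[1] <- I0
  for (k in seq_len(steps)) {
    new_infections <- beta * S[k] * I[k] / N * dt   # S -> I
    new_recoveries <- gamma * I[k] * dt             # I -> R
    S[k + 1] <- S[k] - new_infections
    I[k + 1] <- I[k] + new_infections - new_recoveries
    R[k + 1] <- R[k] + new_recoveries
  }
  data.frame(day = (0:steps) * dt, S = S, I = I, R = R)
}

# HYPOTHETICAL scenario: 1,000 people, 1 initial case.
N   <- 1000
out <- sir_euler(beta = 0.3, gamma = 0.1, N = N, I0 = 1, days = 160)

# Plot the three compartments over time.
plot(out$day, out$S, type = "l", lwd = 2, ylim = c(0, N),
     xlab = "Day", ylab = "People",
     main = "SIR model (illustrative parameters)")
lines(out$day, out$I, lwd = 2, col = "firebrick")
lines(out$day, out$R, lwd = 2, col = "steelblue")
legend("right", legend = c("Susceptible", "Infected", "Recovered"),
       col = c("black", "firebrick", "steelblue"), lwd = 2)

# When does the number of infected people peak?
peak <- out[which.max(out$I), ]
cat("Peak of about", round(peak$I), "infected around day", round(peak$day), "\n")
```

Try changing beta or gamma and re-running: the ratio beta / gamma controls whether the infected curve rises at all, which is the basic intuition behind the reproduction numbers reported during modern epidemics.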
As infectious diseases were brought under better control through antibiotics, vaccination, and sanitation, epidemiology turned its attention to chronic diseases — heart disease, cancer, stroke, diabetes — which became the leading causes of death in industrialized nations.
The landmark study of this era was the British Doctors' Study, launched in 1951 by Richard Doll & Austin Bradford Hill (1954). By following over 40,000 physicians over decades, they established the causal link between cigarette smoking and lung cancer — one of the most consequential findings in public health history. This study also helped establish the prospective cohort study as a powerful tool for investigating chronic disease causation.
In the United States, the Framingham Heart Study, begun in 1948, similarly followed an entire community over time and identified major cardiovascular risk factors including high blood pressure, high cholesterol, smoking, obesity, and diabetes. These are concepts and risk factors you have almost certainly heard of — they became common knowledge because of this era of epidemiological research.
Bradford Hill also formalized the Bradford Hill criteria (Hill, 1965) for evaluating causation from epidemiological evidence — a set of considerations (strength, consistency, specificity, temporality, biological gradient, plausibility, coherence, experiment, and analogy) that remain widely used today.
Since the mid-1970s, epidemiology has become increasingly methodologically sophisticated. Key developments include:
- Causal inference methods: Directed Acyclic Graphs (DAGs), counterfactual reasoning, and potential outcomes frameworks for reasoning about causation more rigorously
- Molecular epidemiology: Integration of genetic and biomarker data to understand disease mechanisms at the molecular level
- Social epidemiology: Formal study of how social structures, institutions, and inequalities shape health outcomes
- Global health epidemiology: Tracking and responding to pandemics, from HIV/AIDS to SARS to COVID-19
- Data science and computational methods: Machine learning, large-scale administrative data linkage, and real-time disease surveillance systems
The randomized controlled trial (RCT), while first developed in the late 1940s (the 1948 streptomycin trial for tuberculosis is considered the first modern RCT), became established as the "gold standard" for testing interventions during this period.
Across these four phases, two patterns are worth holding on to. First, the discipline broadened: from infectious disease to chronic disease, from individual outbreaks to global pandemics, from descriptive statistics to formal causal inference. Second, the discipline got better at quantifying its own claims — and that mattered, because it set the stage for a different kind of question: not just "what causes disease?" but "are we, collectively, making any progress against it?"
The Case for Optimism: Evidence-Based Progress
Looking at the arc of history, there is a strong — if sometimes surprising — case to be made that the world has gotten dramatically healthier, wealthier, and safer. This is the argument made powerfully by the late Swedish physician and statistician Hans Rosling in his book Factfulness: Ten Reasons We're Wrong About the World – and Why Things Are Better Than You Think (2018).
Rosling argued that most people, including the highly educated, hold systematically distorted views of the world — believing things are worse than they actually are. His work centered on letting the data tell the story. Consider some of the evidence below: each card flips open to a short data summary on a different dimension of global health and well-being. As you click through, ask yourself which numbers surprise you, and try to notice when your gut reaction differs from what the data actually show.
Rosling's Core Message
Rosling did not argue that the world is fine or that we should be complacent. He argued that the world is both better than most people think and still deeply flawed — and that recognizing progress is essential for making further progress. If we believe nothing works, we become fatalistic. If we see that vaccination campaigns, sanitation investments, and education programs have produced measurable improvements, we have reason to invest further in evidence-based interventions.
His key insight for epidemiology students: data literacy matters. The ability to read data accurately, resist cognitive biases, and communicate evidence clearly is itself a public health skill.
That last point — the importance of working with the numbers yourself rather than taking them on trust — is best learned by doing. The next R box invites you to reproduce Rosling's headline child-mortality claim from a handful of data points. The point is not the chart; the point is that you can interrogate, verify, and re-tell a global statistic in a few lines of code.
What you'll do: plot global under-5 mortality from 1800 to 2020 and compute the total drop. What to take away: Rosling's headline number isn't magic — it is a few rows of data and one line of arithmetic, and you can now produce it (and challenge it) yourself.
Rosling's argument rests on real numbers. Below we use a few hand-entered data points (you could easily replace them with the full Gapminder series) to recreate the kind of trend chart he made famous.
# Approximate global under-5 mortality rate (% of children dying before age 5)
# Source: Gapminder / UN IGME estimates, rounded for illustration.
year <- c(1800, 1900, 1950, 1980, 2000, 2020)
u5_pct <- c(43.0, 36.0, 22.5, 12.0, 7.6, 3.7)
# A quick visual: black line + red points.
plot(year, u5_pct, type = "l", lwd = 2,
     ylim = c(0, 50),
     xlab = "Year",
     ylab = "Under-5 mortality (%)",
     main = "Global Child Mortality, 1800-2020")
points(year, u5_pct, pch = 19, col = "firebrick")
# Total drop in percentage points over the whole period, and the number of years it spans:
total_drop_pct <- u5_pct[1] - u5_pct[length(u5_pct)]
years_span <- max(year) - min(year)
cat("Drop of", total_drop_pct, "percentage points over", years_span, "years.\n")
A handful of lines of code lets you reproduce — and interrogate — the headline claim of an entire book. That is the skill epidemiology asks of you: don't take a number at face value; compute it for yourself and look at how it was constructed.
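One way to look at how the number was constructed is to break the total drop into the intervals between data points. The sketch below reuses the same rounded, hand-entered figures and reports each interval's drop in three forms: absolute percentage points, percentage points per decade, and relative (proportional) reduction. The column names are illustrative assumptions, not part of the original box.

```r
# Same rounded illustration data as in the box above.
year   <- c(1800, 1900, 1950, 1980, 2000, 2020)
u5_pct <- c(43.0, 36.0, 22.5, 12.0, 7.6, 3.7)

k <- length(year)

abs_drop <- -diff(u5_pct)   # percentage-point drop within each interval
span     <- diff(year)      # length of each interval in years

intervals <- data.frame(
  from          = year[-k],
  to            = year[-1],
  abs_drop_pp   = abs_drop,
  pp_per_decade = abs_drop / span * 10,
  rel_drop_pct  = 100 * abs_drop / u5_pct[-k]  # drop as % of the starting value
)

print(intervals, digits = 3)

# Overall fold reduction from the first to the last data point.
fold_reduction <- u5_pct[1] / u5_pct[k]
cat("Overall fold reduction:", round(fold_reduction, 1), "\n")
```

Comparing the abs_drop_pp and rel_drop_pct columns shows why the choice of scale matters: an interval can look modest in percentage points and still be large as a proportion of where it started.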
R Reflect on what you just ran
Use the questions below to interpret the output you produced. Look at your console / plot before answering.
1. The cat() line printed a single sentence: "Drop of X percentage points over Y years." Fill in the values. What is the average drop per decade?
total_drop_pct = 43.0 − 3.7 = 39.3 percentage points; years_span = 2020 − 1800 = 220 years. The average drop is 39.3 / 22 ≈ 1.79 pp per decade. That single number is what Rosling argued the public most consistently underestimates: a nearly 12-fold reduction in under-5 mortality (43.0 / 3.7 ≈ 11.6), among the largest improvements in any single human health indicator in recorded history, played out at a rate slow enough that no single generation experienced it as dramatic.
2. Looking at the plotted points, between which two consecutive years did under-5 mortality fall the fastest in absolute percentage points? Does the decline look linear, or does the rate of decline change over time?
3. If you replaced the 2020 value (3.7) with 1.0, what would happen to total_drop_pct? Would Rosling's headline claim become more or less dramatic?
total_drop_pct would change from 39.3 to 42.0 pp, so the headline would become more dramatic. This is the standard caution about percentage-point arithmetic on rates near zero: a change at the tail that looks small in pp terms (3.7→1.0 is only 2.7 pp) can be a larger proportional improvement (about a 73% reduction) than the earlier 43→22.5 leg, which moves 20.5 pp but is a reduction of only about 48%. Rosling's pedagogical point is that the public underestimates progress, but the same data can be told to make progress look either modest or dramatic depending on which transformation is used. Honest reporting names both.
So the optimistic claim survives a careful look at the data. But that raises a follow-up question: why did things get better? It would be tempting to credit epidemiology, or medicine more broadly, for the gains. The actor-network perspective we introduced in Section 1 pushes back against that simple story.
Networks of Progress in the Modern Era
From an actor-network perspective, the health improvements of the past two centuries were not produced by epidemiology alone. They emerged from the alignment of multiple systems:
- Science and technology: Germ theory, antibiotics, vaccines, diagnostic tools, statistical methods
- Infrastructure: Clean water systems, sewage treatment, food safety regulation, cold chain logistics
- Governance and bureaucracy: Vital registration, disease surveillance, public health agencies (WHO, CDC), regulatory frameworks
- Culture and education: Literacy, scientific thinking, norms around hygiene and health-seeking behavior
- Economic development: Rising incomes, improved nutrition, housing, and working conditions
- Social movements: Sanitary reform, women's suffrage (leading to investment in maternal and child health), labor rights, anti-tobacco advocacy
When these networks operate in concert, health improves — sometimes dramatically. But this optimistic narrative, while grounded in real data, tells only part of the story. In the next section, we will ask: whose health improved, at whose expense, and what knowledge was built on the suffering of the marginalized?
Before turning to that harder question, take a moment to sit with the optimistic story on its own terms. The reflection below is your chance to test how the data you just saw lands against your prior expectations — that gap is itself an important piece of evidence.
Complicating the Narrative
In the previous two sections, we told a largely optimistic story: epidemiology emerged from ancient roots, was refined through centuries of innovation, and contributed to dramatic global health improvements. That story is true — but it is incomplete.
A critical examination of the history of epidemiology reveals a shadow history: one in which the accumulation of medical knowledge depended on the suffering, exploitation, and dehumanization of enslaved people, colonized populations, and marginalized communities. The institutions and systems that produced "progress" for some populations simultaneously produced harm for others. Understanding this is not about rejecting epidemiology; it is about understanding the discipline more honestly and building a more just practice going forward.
To make that argument carefully, we need a vocabulary. The most useful one comes from a French philosopher who never called himself an epidemiologist but who thought hard about what it means for a state to know — and govern — the bodies of its citizens.
Foucault, Biopower, and the Politics of Population Health
The French philosopher Michel Foucault (1926–1984) developed a set of concepts that are essential for thinking critically about the history of epidemiology. In The History of Sexuality, Volume 1 (1976) and his lectures at the Collège de France, Foucault introduced two interrelated ideas. The two boxes below define them in his own terms; together they describe the kind of power that any modern public health system — including ours — exercises every day.
Biopower
Foucault used the term biopower to describe a form of political power that emerged in the 18th century: the power to manage life itself at the level of entire populations. Unlike earlier forms of sovereign power (the king's right to kill or let live), biopower operates through the regulation of birth rates, mortality rates, fertility, reproduction, health, and longevity. Biopower is the "set of mechanisms through which the basic biological features of the human species became the object of a political strategy."
Biopolitics
Biopolitics refers to the specific political practices that emerge when life and population become objects of governance. Census-taking, vital statistics, public health campaigns, quarantine measures, immigration screening, and disease surveillance are all biopolitical practices — they manage populations by monitoring, categorizing, and intervening in their biological existence.
Why does Foucault matter for epidemiology? Because epidemiology is, at its core, a biopolitical science. It emerged alongside — and in service of — the modern state's need to know, count, categorize, and manage populations. This is not inherently sinister; surveillance systems save lives. But Foucault invites us to ask uncomfortable questions:
- Who gets counted? Whose deaths are recorded and whose are invisible?
- Who gets studied? Whose bodies become the raw material for scientific knowledge?
- Who benefits? Do the populations studied also benefit from the knowledge produced?
- Who decides? Who controls the categories, the surveillance systems, the research agendas?
As we will see, the answers to these questions reveal deep inequities in how epidemiological knowledge was produced and for whom.
Foucault gives us the lens. The historical work of Jim Downs gives us the cases. Where Foucault tells us what kind of power is at stake, Downs shows us where, geographically, that power was actually built — and it is not where the standard textbook says.
Maladies of Empire: Epidemiology's Colonial Origins
In Maladies of Empire: How Colonialism, Slavery, and War Transformed Medicine (2021), historian Jim Downs argues that the standard origin story of epidemiology — centered on John Snow and the Broad Street pump — obscures a deeper, more troubling history. The systematic study of disease patterns in populations did not begin in London; it was developed through the infrastructures of colonial empire, the slave trade, and military campaigns. The accordion below summarizes three of his central claims; expand each one to see the evidence Downs draws on, and notice how each item directly mirrors a 19th-century episode you already met in Section 1.
Colonial plantations created conditions that functioned as unintentional epidemiological laboratories. Large numbers of people, concentrated in defined geographic areas, under systematic observation by plantation owners and their physicians, exposed to identifiable environmental conditions — these were the conditions necessary for studying disease patterns at the population level.
Downs documents how physicians working in slave societies in the Caribbean and American South developed theories about the transmission of yellow fever, cholera, and other infectious diseases by observing patterns of illness among enslaved populations. These observations — made possible by the total control and surveillance that slavery afforded — contributed to medical knowledge that benefited free populations while doing nothing for the enslaved people whose suffering generated it.
One of Downs' most striking arguments is that the epidemiological methods typically credited to Snow were used earlier in colonial settings. British physician James McWilliam investigated a yellow fever outbreak on the island of Boa Vista (Cape Verde) and aboard the ship Eclair, interviewing more than 100 people and assembling an explanatory framework for yellow fever transmission — more than a decade before Snow's Broad Street investigation.
Similarly, Gavin Milroy drew attention to the water supply as a source of cholera in Jamaica before Snow traced cholera to the Broad Street well in London. These investigations are largely absent from standard histories of epidemiology because they took place in colonial settings, involving colonized and enslaved populations whose contributions were not valued or recorded.
Downs further argues that military campaigns — themselves instruments of colonial expansion — produced bureaucracies that collected health data on unprecedented scales. Between 1756 and 1866, colonialism, slavery, and war created administrative systems that allowed physicians to develop theories about disease causation, transmission, and prevention.
Florence Nightingale's pioneering statistical work on disease in the British military, for instance, drew heavily on data generated by Britain's colonial and military infrastructure. Her study of sanitary conditions in India depended on the colonial bureaucracy's capacity to track sickness and death across a vast colonized territory.
The central argument is that the knowledge systems we celebrate as "modern epidemiology" were built, in significant part, on the suffering of people who were enslaved, colonized, or conscripted into wars of imperial expansion. Acknowledging this is not about discrediting the knowledge itself, but about recognizing whose labor and suffering produced it.
Research as Exploitation: Case Studies
The entanglement of epidemiology with racial exploitation did not end with the colonial era. The 20th century — the very period that gave us the British Doctors' Study, the Framingham Heart Study, and the Bradford Hill criteria — also saw some of the most egregious abuses of research ethics, with lasting consequences for trust in medical institutions. The two cases below are not historical curiosities. They are the events that produced the consent frameworks every modern study still operates under, and their fallout still shapes who participates in research today.
The Tuskegee Syphilis Study (1932–1972)
In one of the most infamous episodes in the history of medical research, the United States Public Health Service conducted a 40-year study on 399 African American men with syphilis in Macon County, Alabama. The men were told they were being treated for "bad blood" but were in fact deliberately denied treatment — even after penicillin became the standard cure in the late 1940s — so that researchers could observe the natural progression of the disease.
By the time the study was exposed by whistleblower Peter Buxtun in 1972, 28 men had died directly of syphilis, 100 had died of related complications, 40 wives had been infected, and 19 children had been born with congenital syphilis.
The Tuskegee Study led directly to the National Research Act (1974) and the Belmont Report (1979), which established the foundational ethical principles of modern research: respect for persons, beneficence, and justice. In 1997, President Bill Clinton issued a formal apology to the surviving participants.
The legacy of Tuskegee is not merely historical. Research consistently shows that African Americans report lower trust in medical institutions and clinical research, and this distrust has been linked to lower rates of participation in clinical trials and, more recently, to COVID-19 vaccine hesitancy. The harm done in Tuskegee reverberates across generations.
Nutritional Experiments on Indigenous Children in Canada (1942–1952)
Documented by historian Ian Mosby (2013), these experiments involved at least 1,300 Indigenous people, approximately 1,000 of whom were children in six residential schools across Alberta, British Columbia, Manitoba, Nova Scotia, and Ontario.
Government researchers, aware that the children were already malnourished, divided them into experimental and control groups. Some received vitamin and mineral supplements; others were deliberately kept on deficient diets to serve as controls. In some cases, dental care that had previously been available was withdrawn so that researchers could observe the progression of dental disease unchecked.
No consent was sought from the children or their families. The children were already confined in residential schools — institutions designed to forcibly assimilate Indigenous peoples by removing children from their families and cultures — making any notion of voluntary participation meaningless.
These experiments were part of a broader pattern. Historian Maureen Lux documented how experimental BCG tuberculosis vaccines were tested on Cree and Nakoda Oyadebi infants in Saskatchewan in the 1930s–40s, partly because vaccines were cheaper than improving the appalling conditions on reserves and in residential schools. In Canada's racially segregated "Indian Hospitals," patients were subjected to experimental surgical and drug treatments for tuberculosis — including lung removal — while being denied standard antibiotics available to non-Indigenous patients.
In 2024, the Canadian Medical Association formally apologized to Indigenous Peoples for its role in medical racism and research misconduct since 1867.
The Tuskegee Study and the Canadian residential-school experiments are extreme cases, but they share a feature with the colonial investigations Downs describes: the harm did not stay where the research was done. Distrust seeded in Macon County in 1932 still shapes vaccine uptake today; nutritional deprivation imposed in residential schools shows up in the metabolic profiles of survivors and their descendants. To see how far that "long tail" can stretch, we turn next to one of the most striking examples in social epidemiology — a chain of causation that runs from prehistoric geology to modern cardiovascular mortality.
The Long Shadow: From Slavery to Heart Disease
One of the most striking examples of how historical injustice produces contemporary health disparities involves the Black Belt of the American South — and it begins, remarkably, in the Cretaceous period, 100 million years ago.
A 100-Million-Year Chain of Causation
During the Cretaceous period, a shallow sea covered much of what is now the southeastern United States. Over millions of years, the remains of marine organisms were compressed into a crescent-shaped band of unusually rich, dark soil stretching from eastern Mississippi through central Alabama into western Georgia. This soil — the Black Belt, named for its color — was ideal for growing cotton.
Because it was ideal for cotton, it became the region where enslaved Africans were concentrated in the largest numbers. After the Civil War, these counties retained large Black populations but experienced little economic development. Structural racism, Jim Crow laws, disinvestment, and the systematic exclusion of Black communities from economic opportunity created persistent poverty that continues to this day.
The epidemiological consequences are measurable. A landmark study by Kramer et al. (2017), published in SSM – Population Health, found that southern counties with higher concentrations of enslaved people in 1860 experienced significantly slower declines in heart disease mortality in the late 20th century. The mechanisms linking slavery to modern heart disease include persistent poverty, lower educational attainment, limited healthcare access, food deserts, environmental exposures, and the chronic stress of ongoing racial discrimination.
A 2022 study by Rebbeck in Health Equity extended this analysis, finding that Black Belt counties had significantly higher age-adjusted mortality rates (181.8 per 100,000) compared to non-Black Belt counties (171.6 per 100,000). Rebbeck argues that "geohistorical" factors — the chain from ancient geology to slavery to structural racism — represent fundamental causes of health inequity that cannot be addressed by individual-level interventions alone.
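The two rates quoted above can be compared on both an absolute and a relative scale. The sketch below is a minimal illustration that uses only the two figures reported in the text; the variable names are assumptions introduced here, and because the underlying counts are not given, no confidence interval is attempted.

```r
# Age-adjusted mortality rates per 100,000, as quoted in the text
rate_black_belt     <- 181.8
rate_non_black_belt <- 171.6

# Absolute gap: extra deaths per 100,000 in Black Belt counties
rate_difference <- rate_black_belt - rate_non_black_belt

# Relative gap: how many times higher the Black Belt rate is
rate_ratio <- rate_black_belt / rate_non_black_belt

cat(sprintf("Rate difference: %.1f per 100,000\n", rate_difference))
cat(sprintf("Rate ratio: %.2f\n", rate_ratio))
```

A rate ratio close to 1 can still correspond to a large number of excess deaths when it applies to a whole region, which is one reason epidemiologists usually report both measures.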
Thinking Structurally
The Black Belt example illustrates a key principle of social epidemiology: health disparities are not natural facts. They are produced by historically specific systems of power, exploitation, and exclusion. Geology created the conditions for a particular kind of agriculture; that agriculture depended on slavery; slavery created demographic patterns; those patterns were maintained by structural racism; and structural racism produces the poverty, stress, and lack of access that drive health disparities today (Krieger, 1994). No amount of individual behavior change can undo this chain without also addressing its structural roots.
Biopower Revisited: Who Gets to Be Healthy?
We have now walked through the case material: colonial investigations, plantation medicine, Tuskegee, the residential-school experiments, the Black Belt. With those examples in hand, the abstract concept we opened the section with becomes much sharper.
Returning to Foucault, we can now see how the concept of biopower illuminates the full history of epidemiology. The same systems of surveillance, classification, and population management that produced genuine health improvements also functioned as instruments of control and exploitation:
- Vital statistics counted some populations and ignored others. Indigenous deaths in residential schools were often unrecorded or attributed to "natural" causes.
- Disease surveillance in colonial settings served the health of colonizers, not colonized peoples.
- Research subjects were drawn disproportionately from populations that had no power to refuse — enslaved people, institutionalized children, prisoners, and racialized communities.
- The benefits of knowledge flowed primarily to the populations that controlled the research apparatus, while the harms fell on those who were studied.
This is not an argument against epidemiology. It is an argument for a more self-aware, ethical, and equitable epidemiology — one that asks not only "what causes disease?" but also "who benefits from this knowledge, who is harmed by the research process, and whose priorities shape the research agenda?"
That double set of questions sets up the work of the rest of this course. The reflection below asks you to do the synthesis explicitly — to put the optimistic story of Section 2 and the critical story of this section into conversation rather than treating them as rivals. The final assessment that follows will then test the full sweep of the lesson.
Section Reflection
Take a few minutes to reflect on each of the prompts below — you may write a single integrated response that addresses them together.
- Reflection: Think about the global health progress described above. Are you surprised by any of these statistics? Why do you think most people tend to underestimate how much the world has improved? What might be the consequences of this "negativity bias" for public health policy?
- Reflection: Consider the relationship between the "optimistic" narrative of Section 2 and the "critical" narrative of this section. Are they contradictory, or can they both be true simultaneously? How should epidemiologists hold both of these perspectives as they conduct research today?
1. Hippocrates' key contribution to early epidemiological thinking was:
2. What infrastructural element made John Graunt's 1662 analysis of mortality patterns possible?
3. From an Actor-Network Theory perspective, John Snow's Broad Street pump investigation succeeded because of:
1. The British Doctors' Study by Doll and Hill was historically significant because it:
2. According to Hans Rosling, a key reason people systematically underestimate global health progress is:
3. Which of the following best reflects the actor-network perspective on health improvements?
1. Foucault's concept of "biopower" refers to:
2. Jim Downs argues in Maladies of Empire that:
3. The connection between the Black Belt region and contemporary health disparities illustrates:
4. The Tuskegee Syphilis Study (1932–1972) directly led to which major development in research ethics?
Ways of Knowing
⏱ Estimated reading time: ~40 minutes
Introduction and Overview
Lesson 1 ended with a question that has been waiting for us: if epidemiology was built by some communities and on the suffering of others, then what counts as knowledge in the first place? That is the question this lesson is about. Before we ask whether a study is well-designed, we have to ask what it is trying to do, what kind of reality it assumes, and whose values it serves. This first section gives you the vocabulary to do that: three philosophical pillars, then five research paradigms that get built out of them, and finally a frank look at where epidemiology itself sits on that map.
Section 1 placed epidemiology firmly in the post-positivist tradition. That is not a neutral starting point — it is a paradigm with real strengths and real blind spots. This section walks through both halves of that ledger. We will start with where positivist quantitative science came from and what it does well, dip briefly into R to feel what quantitative summary actually looks like, and then turn to four critiques that any honest practitioner has to grapple with. By the end of the section, the goal is not for you to abandon quantitative science — it is for you to be able to defend it and see past it.
Section 2 ended with a problem: quantitative science cannot, on its own, see context, meaning, standpoint, or the categories it inherits. Section 3 is the constructive answer to that problem. We will move outward from the discipline of epidemiology in widening circles — first to qualitative research traditions that share many of epidemiology's institutional homes, then to Indigenous knowledge systems that are older than any of them, then to lived experience as evidence in its own right. The section closes by asking how all of these can sit alongside quantitative methods in a single research project, and what ethical frameworks — Two-Eyed Seeing, OCAP, CBPR — make that integration possible without reproducing the harms Lesson 1 documented.
Learning Objectives
- Use epistemology, ontology, and axiology to distinguish among the major research paradigms (positivism, post-positivism, constructivism, critical theory, pragmatism) and locate epidemiology within them.
- Articulate the strengths of Western quantitative science — objectivity, reproducibility, generalizability — and the conditions under which they hold.
- Explain key critiques of positivist science, including reductionism, context-stripping, the “view from nowhere,” and the reproduction of power structures.
- Describe qualitative research traditions, Indigenous knowledge systems, and lived experience as valid ways of knowing, including the principle of Two-Eyed Seeing (Etuaptmumk).
- Identify ethical frameworks for integrative, community-driven research, including OCAP® principles and community-based participatory research (CBPR).
Before We Begin: What Is a “Way of Knowing”?
Every time you read a research paper, listen to an Elder share traditional teachings, or reflect on your own life experience, you are engaging with a way of knowing. But not all ways of knowing operate by the same rules. Some prioritize measurement and replication; others prioritize relationship and story. Some insist that the researcher remain detached; others hold that detachment is itself a value-laden choice.
To navigate this landscape, we need a vocabulary for talking about how knowledge is produced, what it assumes about reality, and whose values it serves. This vocabulary comes from three branches of philosophy that underpin every research paradigm: epistemology, ontology, and axiology.
The Three Pillars
Three branches of philosophy — epistemology, ontology, and axiology — sit underneath every research paradigm you will ever read about. Click each card below to see the question it asks and how that question shows up inside epidemiology. As you read them, notice that the three are linked: a position on what counts as real (ontology) usually implies a position on how we can know it (epistemology), which in turn implies a stance on what role values should play (axiology).
Why These Pillars Matter
Together, epistemology, ontology, and axiology form the foundation of a research paradigm — a worldview that shapes every decision a researcher makes, from what questions they ask to what methods they use to what counts as a valid answer. When researchers disagree, the disagreement often runs deeper than methodology; it is a disagreement about the very nature of reality, knowledge, and values.
Five Research Paradigms
The three pillars are useful, but most working researchers don't pick their epistemology, ontology, and axiology from a menu. They inherit a package — a paradigm — in which all three are already specified. The next step is to meet the five packages most relevant to public health.
Thomas Kuhn (1962) introduced the concept of a “paradigm” to describe the shared framework of assumptions, values, and methods that define a scientific community at any given time. In the social and health sciences, several paradigms coexist, each offering a different lens on the world. The tabs below summarize five of them in the same four-line format — ontology, epistemology, axiology, and a public-health example — so you can compare them directly. Click through each tab and watch how the example research question changes as the underlying assumptions shift.
Ontology: There is a single, objective reality that exists independently of human perception.
Epistemology: Knowledge is obtained through direct observation and measurement. Only that which can be empirically verified is considered true knowledge.
Axiology: Research is (or should be) value-free. The researcher is a detached observer.
Methods: Experiments, surveys, statistical analysis. The goal is to discover universal laws.
Example: A randomized controlled trial measuring the effect of a drug on blood pressure. The researcher assumes the drug either works or it does not (objective reality), measures the outcome with standardized instruments, and aims to eliminate bias.
Ontology: An objective reality exists, but our understanding of it is always imperfect and fallible (critical realism).
Epistemology: Knowledge is conjectural — we can never fully prove a theory, only fail to disprove it (falsificationism). Multiple methods and triangulation improve approximation.
Axiology: Complete objectivity is an ideal we strive toward but can never fully achieve. Bias must be acknowledged and minimized.
Methods: Modified experiments, quasi-experiments, systematic reviews, meta-analyses. The goal is to get as close to truth as possible while acknowledging limitations.
Example: Most modern epidemiology operates from a post-positivist stance — acknowledging confounding, bias, and measurement error while still aiming for causal inference.
Ontology: Reality is socially constructed. Multiple realities exist, shaped by culture, context, and individual experience (relativism).
Epistemology: Knowledge is co-created between the researcher and participants. Understanding is sought, not prediction or control.
Axiology: Values are integral to the research process. The researcher’s positionality shapes every aspect of the inquiry.
Methods: Interviews, focus groups, ethnography, narrative analysis. The goal is rich, contextual understanding of meaning.
Example: A phenomenological study exploring how people living with HIV experience stigma in healthcare settings. The researcher seeks to understand the lived meaning of stigma, not to measure its prevalence.
Ontology: Reality is shaped by social, political, economic, and historical forces. What is taken as “natural” or “normal” often reflects power structures (historical realism).
Epistemology: Knowledge is never neutral — it is produced within power relations. The goal of inquiry is emancipation and social transformation.
Axiology: Research should explicitly serve social justice. Whose interests does research serve? Whose voices are centered or marginalized?
Methods: Participatory action research, critical discourse analysis, community-based methods. The goal is to change oppressive conditions, not merely describe them.
Example: A community-based participatory research project in which residents of a neighbourhood affected by environmental racism co-design a study of pollution exposure and use the findings to advocate for policy change.
Ontology: Reality is what works in practice. The debate between objectivism and relativism is less important than solving real-world problems.
Epistemology: Knowledge is judged by its practical consequences. The best method is whichever method best answers the question at hand.
Axiology: Values are important insofar as they orient research toward useful outcomes. The research question drives everything.
Methods: Mixed methods, combining quantitative and qualitative approaches. The goal is actionable knowledge.
Example: A mixed-methods evaluation of a harm-reduction program that uses survey data to measure health outcomes and interviews to understand participants’ experiences, combining both to inform program improvement.
If you scanned all five tabs, you may have noticed that the same public-health topic could be studied from any of them, and each would produce a legitimate but very different kind of knowledge. That observation sets up the obvious follow-up: where, on this map, is epidemiology?
Where Does Epidemiology Fit?
Epidemiology has its roots firmly in the positivist tradition — the idea that disease has causes that can be identified through systematic observation and measurement. Most practising epidemiologists today operate from a post-positivist perspective: they believe in an objective reality but acknowledge that their methods are imperfect and that bias, confounding, and chance can distort findings.
A Key Insight
The dominance of positivism and post-positivism in epidemiology is not because these paradigms are inherently “better” — it is a historical and sociological fact. The paradigm a discipline adopts reflects the values, power structures, and intellectual traditions of the society in which it developed. Recognizing this is the first step toward understanding why other ways of knowing are equally valuable and why epidemiology can be enriched by engaging with them.
That last point is the bridge into Section 2. Before we look at other ways of knowing in Section 3, we need to look honestly at the paradigm epidemiology already uses — what it does superbly, and where it falls short. The takeaways below summarize what you should be able to articulate before moving on; the knowledge check then gives you a quick chance to test that understanding in your own words.
Key Takeaways
- Epistemology (how we know), ontology (what is real), and axiology (what values guide us) are the three pillars of every research paradigm.
- Five major paradigms — positivism, post-positivism, constructivism, critical theory, and pragmatism — offer different lenses for understanding reality and producing knowledge.
- Epidemiology is primarily post-positivist, but this is a historical choice, not a necessary one.
- Understanding paradigms helps us see that disagreements between researchers are often not about data, but about deeper assumptions regarding reality, knowledge, and values.
The Rise of Positivism and Empiricism
The Enlightenment of the 17th and 18th centuries produced a powerful idea: that human reason and systematic observation could replace tradition, superstition, and religious authority as the basis for understanding the world. Auguste Comte coined the term positivism in the 19th century to describe a philosophy in which the only authentic knowledge is scientific knowledge, obtained through observation and experiment.
In public health, this tradition produced extraordinary achievements. Epidemiology’s reliance on systematic data collection, standardized measurement, and statistical analysis has enabled researchers to identify causes of disease, evaluate interventions, and guide policy at a population level. The eradication of smallpox, the decline of cholera, and the identification of tobacco as a cause of lung cancer are all triumphs of quantitative, empirical science.
Those triumphs are not accidents. They follow from four properties that quantitative methods deliver especially well, and that the next subsection unpacks one card at a time.
Strengths of Quantitative Science
Click through the four cards below. Each names a property that quantitative methods deliver well; as you read, notice how each one addresses a specific weakness of unaided human judgment — the inconsistency of personal observation, the difficulty of generalizing from a single case, the trouble of distinguishing real effects from chance.
Standardization
What you'll do: build a 10-row dataset of blood pressure and smoking status, compute the standard centre-and-spread summaries, and compare smokers to non-smokers. What to take away: the four strengths you just read about are not abstractions — they collapse, in practice, to about six lines of code that anyone with the same data can rerun and get the same answer.
The strengths above (standardization, reproducibility, generalizability) have a concrete pay-off: anyone with the same data can compute the same numbers. R makes that easy. The snippet below builds a tiny dataset of 10 patients and produces the kinds of summaries quantitative epidemiology relies on.
```r
# A toy dataset: 10 adults' systolic blood pressure (mmHg) and smoking status
sbp <- c(118, 132, 145, 128, 155, 120, 142, 138, 125, 160)
smoker <- c("no", "yes", "yes", "no", "yes", "no", "yes", "no", "no", "yes")

# Centre & spread: the standard quantitative summary.
mean(sbp)    # average
median(sbp)  # middle value
sd(sbp)      # standard deviation
range(sbp)   # minimum and maximum

# Group comparison: does mean SBP differ by smoking status?
tapply(sbp, smoker, mean)
```
What the numbers do, and don't, tell you. The smokers' mean SBP is 21 mmHg higher than the non-smokers' (146.8 vs 125.8 mmHg). That is the kind of clean comparison quantitative methods do well. What the numbers cannot tell you is why; that is where qualitative ways of knowing become essential.
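As a hedged sketch, the size of the smoker vs non-smoker gap can be computed directly rather than read off by eye, and the confidence interval from a Welch t-test gives a rough sense of how uncertain that gap is with only five people per group. The names `group_means`, `gap`, and `welch` are introduced here for illustration and are not part of the lesson's original code; the t-test is one reasonable choice among several, not a prescribed method.

```r
# Same toy data as above, repeated so this block runs on its own
sbp <- c(118, 132, 145, 128, 155, 120, 142, 138, 125, 160)
smoker <- c("no", "yes", "yes", "no", "yes", "no", "yes", "no", "no", "yes")

# Group means as a named vector with entries "no" and "yes"
group_means <- tapply(sbp, smoker, mean)

# Difference in mean SBP, smokers minus non-smokers (mmHg)
gap <- unname(group_means["yes"] - group_means["no"])
gap

# Welch two-sample t-test (R's default for t.test with two samples);
# used here only for its 95% confidence interval around the gap
welch <- t.test(sbp[smoker == "yes"], sbp[smoker == "no"])
welch$conf.int
```

The interval is wide relative to the point estimate, which is one concrete way of seeing why a 10-person sample cannot settle the question on its own. It also says nothing about confounding or causation.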
Reflect on what you just ran
Use the questions below to interpret the output you produced. Look at your console output before answering.
1. What were the mean, median, and SD of sbp for the full 10-person sample? When the mean and median differ, what does that tell you about the shape of the distribution?
mean(sbp) = 136.3, median(sbp) = 135 (the average of the 5th and 6th sorted values, 132 and 138), and sd(sbp) ≈ 14.3 mmHg. The mean and median are within about 1 mmHg of each other, which is what a roughly symmetric distribution looks like. When the mean exceeds the median by more than a few units in real data, that usually signals right skew (a few high outliers pulling the average up); when the mean falls below the median, left skew. With n=10 you cannot push these diagnostics very far, but the rough symmetry here is consistent with what large SBP surveys find in adult populations.
2. The tapply(sbp, smoker, mean) output gave you two group means. Which group had the higher mean SBP, and by how many mmHg? Is that difference clinically meaningful?
3. Sample size matters: only 5 people are in each group. List one reason why this 21-mmHg difference is NOT yet evidence that smoking causes higher blood pressure, and one question about these 10 patients that the numbers alone cannot answer.
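Question 1 links the gap between mean and median to skew. A minimal sketch of that idea follows, assuming a hypothetical extreme reading of 220 mmHg that is not in the lesson's data; `sbp_skewed` is an illustrative name.

```r
# Same toy SBP values as in the lesson
sbp <- c(118, 132, 145, 128, 155, 120, 142, 138, 125, 160)

# Hypothetical variation: swap the highest reading (160) for an extreme 220
sbp_skewed <- replace(sbp, which.max(sbp), 220)

# Original data: mean and median sit close together
c(mean = mean(sbp), median = median(sbp))

# With one extreme value: the mean is pulled upward, the median does not move
c(mean = mean(sbp_skewed), median = median(sbp_skewed))
```

A single high value shifts the mean from 136.3 to 142.3 while the median stays at 135, which is the right-skew pattern the answer to question 1 describes.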
That “cannot tell you why” gap is not a footnote — it is where the next half of the section lives. The same techniques that make quantitative science powerful also impose specific costs, and over the past half-century several traditions have made those costs explicit.
The Critiques: What Quantitative Science Cannot See
While the strengths of quantitative science are real, they come at a cost. Several important critiques have emerged from philosophy, feminist theory, postcolonial studies, and the social sciences more broadly. The next four subsections walk through them in order: first the cost of reducing reality to measurable units, then the trouble with claiming a perspective from nowhere, then two specific ways those problems show up inside epidemiological practice.
Reductionism and Context-Stripping
To measure something, you must first reduce it to measurable units. A person’s health becomes a set of biomarkers; a community’s wellbeing becomes a mortality rate; a complex social experience becomes a survey response on a Likert scale. This process of operationalization is necessary for quantitative research, but it inevitably strips away context, meaning, and nuance.
Example: Measuring “Food Insecurity”
A national survey measures food insecurity using a standardized 18-item questionnaire. It produces reliable, comparable data across provinces. But it cannot capture the experience of an Inuit family for whom “food security” is inseparable from access to traditional lands, cultural practices of harvesting and sharing, and the intergenerational trauma of forced relocation. The number tells us that food insecurity exists, but not what it means to the people experiencing it.
Reductionism is a problem about what we measure. The next critique is a problem about who is doing the measuring — and whether the standpoint of the researcher can really be made invisible.
The “View from Nowhere”
Quantitative science aspires to objectivity — a perspective free from bias, positionality, or vested interest. The philosopher Thomas Nagel (1986) called this the “view from nowhere.” But critics argue that no such view exists. Every researcher occupies a social position — defined by race, gender, class, nationality, institutional affiliation — and this position shapes what they see as interesting, important, and normal.
Two thinkers in particular have shaped how this critique plays out in health research. The first — Donna Haraway — tells us what is wrong with the “view from nowhere.” The second — Sandra Harding — tells us what to do about it.
Donna Haraway’s Situated Knowledges
In an influential essay, feminist philosopher of science Donna Haraway (1988) argued that the claim to objectivity in science is not neutral: it is a form of what she called the “god trick,” a claim to see everything from nowhere. Haraway proposed instead the concept of situated knowledges: all knowledge is produced from a particular location, body, and perspective. Rather than pretending to be objective, researchers should be accountable for their perspective, transparent about where they stand and how that shapes what they see.
Haraway did not argue that all perspectives are equally valid or that science is meaningless. She argued that better science comes from acknowledging the partiality of every perspective and from building knowledge across multiple situated positions.
Sandra Harding’s Standpoint Theory
Philosopher Sandra Harding (1991) extended this argument through standpoint theory: people who occupy marginalized social positions — women, racialized communities, people with disabilities, Indigenous peoples — often have epistemological advantages. Because they must navigate both their own world and the dominant world, they develop a “double consciousness” that can reveal aspects of reality invisible to those in dominant positions.
In public health terms: people who experience health inequities often understand the systems producing those inequities better than the researchers who study them from the outside. Standpoint theory argues that their knowledge should not be a footnote to quantitative findings — it should be a starting point for inquiry.
How Quantitative Methods Can Reproduce Power Structures
Haraway and Harding offer a philosophical case. The next critique brings that case down to the everyday choices an epidemiologist makes — the variables they include, the comparison groups they pick, the studies their journals will publish.
Michel Foucault (1980) showed how knowledge and power are inseparable: the categories science uses to classify people (normal/abnormal, healthy/sick, compliant/non-compliant) are not neutral descriptions — they are instruments of governance. Quantitative research that uncritically adopts these categories can reinforce the very inequities it claims to study. The accordion below works through three concrete examples of how this happens; expand each one to see what the routine, technically-correct practice looks like and where the harm enters.
Epidemiologists routinely include “race” as a variable in statistical models. But race is a social construct, not a biological category. When researchers control for race without theorizing why race is associated with health outcomes (i.e., racism, not race), they can produce findings that naturalize health disparities and obscure their structural causes.
Much quantitative research on Indigenous health compares Indigenous populations to non-Indigenous benchmarks, producing a narrative of deficit, dysfunction, and disparity. Linda Tuhiwai Smith (2012) argues that this research model treats Indigenous peoples as problems to be studied rather than as communities with their own knowledge, strengths, and self-determination. The data may be accurate, but the framing reproduces colonial power relations.
The conventional “hierarchy of evidence” places randomized controlled trials (RCTs) at the top and qualitative or experiential evidence at the bottom. This hierarchy implicitly devalues knowledge that does not conform to the positivist paradigm. It can also marginalize the research questions, communities, and phenomena that do not lend themselves to experimental designs — which are disproportionately those affecting marginalized populations.
Pulling the four critiques together: quantitative science is powerful, but it works best when its practitioners can name what it leaves out. The takeaways below distill the section into the four claims you should be able to make in your own words; the reflection that follows asks you to apply those critiques to a health issue you actually care about, and the knowledge check tests the conceptual content one more time before we move on.
Key Takeaways
- Western quantitative science offers real strengths: objectivity (standardization), reproducibility, generalizability, and statistical power.
- But it also has limitations: reductionism, context-stripping, and the illusion of a “view from nowhere.”
- Haraway’s situated knowledges and Harding’s standpoint theory remind us that all knowledge is produced from somewhere, and that marginalized perspectives can reveal truths invisible to dominant ones.
- Quantitative methods can reproduce existing power structures when they uncritically adopt categories like race, frame communities through deficit narratives, or enforce a hierarchy of evidence that devalues non-positivist knowledge.
Qualitative Research Traditions
Qualitative research is not a single method — it is a family of approaches, each with its own epistemological commitments and methods. What they share is a commitment to understanding meaning, context, and experience in ways that quantitative methods cannot access. The tabs below introduce four of the most common traditions in health research. As you click through, ask yourself which one would be the right fit for the health issue you wrote about in the Section 2 reflection.
Phenomenology seeks to understand the lived experience of a phenomenon from the perspective of those who experience it. It asks: What is it like to live with chronic pain? To be a new immigrant navigating the healthcare system? To receive a cancer diagnosis?
By bracketing preconceptions and attending closely to the structures of experience, phenomenological research can reveal dimensions of health and illness that surveys and biomarkers simply cannot capture.
Grounded theory develops theory from data rather than testing pre-existing hypotheses. Researchers collect data (often through interviews), code it systematically, and iteratively build theoretical frameworks that are “grounded” in participants’ experiences.
This approach is especially valuable when existing theory is inadequate — for example, when studying the health experiences of communities that have been underrepresented in prior research.
Ethnography involves prolonged immersion in a community or setting, combining observation, interviews, and participation to produce a rich, holistic account of social and cultural life. In health research, ethnography has been used to study hospital cultures, the social dynamics of substance use, and the everyday practices of healthcare delivery.
Ethnography takes seriously the idea that health is a social and cultural phenomenon, not just a biological one.
Participatory Action Research (PAR) dissolves the boundary between researcher and researched. Community members are co-investigators who help define the research question, design the study, collect and interpret data, and take action based on findings.
PAR is aligned with critical theory: its purpose is not just to produce knowledge but to produce change. It has been used extensively in Indigenous health research, harm reduction, and community health promotion.
Phenomenology, grounded theory, ethnography, and PAR are all academic traditions: they live inside universities and journals, even if they push against the conventions of those institutions. The next step is to widen the circle further — to knowledge traditions that long predate the academy and that operate by very different rules of validation.
Indigenous Knowledge Systems
Indigenous knowledge systems represent some of the oldest and most sophisticated traditions of understanding the world. They are not a single, monolithic system but rather a diverse collection of knowledge traditions, each rooted in specific lands, languages, and cultures. Despite this diversity, many Indigenous knowledge systems share several characteristics that distinguish them from Western scientific traditions. The four cards below name those characteristics; click each to see how it contrasts with the assumptions baked into Western biomedical research.
A Note on Terminology
The term “Indigenous knowledge” is used here with respect and with the acknowledgment that no single term can capture the diversity of knowledge traditions held by the hundreds of distinct Indigenous nations across Turtle Island and beyond. We use this term in the way it is used in the academic literature, while recognizing that many Indigenous scholars prefer more specific terms rooted in their own languages and traditions.
Once you accept that Indigenous knowledge systems are valid on their own terms, an obvious question follows: how should they sit alongside Western science when both are needed to address a health problem? The most influential answer in Canadian public health comes from a single Mi’kmaw teaching.
Two-Eyed Seeing (Etuaptmumk)
One of the most powerful frameworks for integrating Indigenous and Western knowledge comes from Mi’kmaw Elder Albert Marshall (Bartlett, Marshall, & Marshall, 2012), who coined the term Etuaptmumk — Two-Eyed Seeing.
Two-Eyed Seeing
“Two-Eyed Seeing refers to learning to see from one eye with the strengths of Indigenous knowledges and ways of knowing, and from the other eye with the strengths of Western knowledges and ways of knowing, and learning to use both eyes together, for the benefit of all.”
— Albert Marshall, as cited in Bartlett, Marshall, & Marshall (2012)
Two-Eyed Seeing is not about blending two knowledge systems into one or about validating Indigenous knowledge by Western standards. It is about holding both systems side by side, respecting the integrity of each, and drawing on the strengths of both to address complex problems. It requires humility: the recognition that neither system alone can see everything.
In public health, Two-Eyed Seeing has been applied to environmental health research, mental health and addiction services, and community-based health promotion. It challenges researchers to go beyond “including” Indigenous perspectives as data points and instead to restructure the research process itself so that Indigenous knowledge shapes the questions, methods, and interpretations.
Two-Eyed Seeing is built around two specific knowledge systems. The same underlying move — treating people's own understanding as evidence rather than as data to be validated — applies more broadly, to anyone living a condition that researchers are studying from the outside.
Lived Experience as Valid Evidence
Beyond formal knowledge systems, lived experience — the firsthand knowledge that comes from directly living through a condition, situation, or social position — is increasingly recognized as a legitimate and important form of evidence. People who live with mental illness, who use drugs, who experience homelessness, or who navigate systems as racialized minorities possess knowledge that cannot be obtained any other way.
The peer support movement, patient advisory committees, and the motto “Nothing about us without us” all reflect a growing recognition that lived experience should inform research, policy, and practice — not as an anecdote to be validated by “real” evidence, but as evidence in its own right.
By this point in the section we have introduced four candidate ways of knowing alongside quantitative epidemiology. The practical question for a working researcher is how to combine them inside a single project. The standard methodological vocabulary for that is mixed methods.
Mixed Methods and Knowledge Integration
Increasingly, health researchers are turning to mixed methods designs that combine quantitative and qualitative approaches within a single study or program of research. Creswell and Plano Clark (2018) describe several models for integration:
- Convergent design: Quantitative and qualitative data are collected simultaneously and merged to provide a more complete picture.
- Explanatory sequential design: Quantitative data are collected first, then qualitative data are used to explain or elaborate on the quantitative findings.
- Exploratory sequential design: Qualitative data are collected first to develop themes or instruments that are then tested quantitatively.
Mixed methods embody a pragmatist epistemology: the research question, not a paradigmatic commitment, determines the method. When done well, they can honor multiple ways of knowing within a single project.
But "when done well" is doing a lot of work in that last sentence. As Section 1 made clear, integrating Indigenous knowledge into a research project has historically been a vehicle for extraction rather than partnership. The next subsection introduces two frameworks that are designed to make sure that does not happen again.
Frameworks for Ethical, Community-Driven Research
The two frameworks below address different parts of the same problem. OCAP gives a community legal and ethical authority over its own data; CBPR restructures the research relationship itself so that the community is a partner from the first question to the last finding. They are most powerful when used together.
OCAP Principles
The OCAP principles — Ownership, Control, Access, and Possession — were developed by the First Nations Information Governance Centre to assert First Nations jurisdiction over their own data and research. They represent a direct response to a long history of extractive research in which outsiders collected data from Indigenous communities without consent, benefit, or accountability. Each of the four principles is summarized in the table below.
| Principle | Meaning |
|---|---|
| Ownership | A community or group owns its cultural knowledge, data, and information collectively, just as an individual owns their personal information. |
| Control | First Nations peoples have the right to control all aspects of research and information management processes that affect them. |
| Access | First Nations peoples must have access to information and data about themselves and their communities, regardless of where the data are held. |
| Possession | Physical control of data. While ownership identifies the relationship, possession (or stewardship) is the mechanism by which ownership is protected. |
Community-Based Participatory Research (CBPR)
OCAP says who the data belong to. CBPR says how the research itself should be done so that the data end up with the right people in the first place.
CBPR is an approach that equitably involves community members, organizational representatives, and researchers in all aspects of the research process. Partners contribute their expertise and share responsibility and ownership. CBPR is not a method in itself but a set of principles that can be applied to any methodology. The two examples below show what that looks like in practice; both are widely cited in Canadian public health and both illustrate how CBPR and Two-Eyed Seeing can reinforce each other.
Example: The Kahnawake Schools Diabetes Prevention Project
This long-running CBPR partnership between the Mohawk community of Kahnawake (near Montreal) and academic researchers co-developed a diabetes prevention program rooted in Mohawk values and knowledge. Community members helped define research questions, design interventions, interpret data, and own the findings. The project exemplifies how research can be both scientifically rigorous and community-driven.
Example: The DRUM (Diverse, Resilient, Understanding, Motivated) Project
In this CBPR project, urban Indigenous youth co-designed a mental health promotion intervention that integrated traditional cultural practices (drumming, storytelling, ceremony) with evidence-based therapeutic approaches. The project was guided by Two-Eyed Seeing: Western evaluation methods measured outcomes while Indigenous knowledge shaped the intervention itself.
Pulling the section together. We started inside the academy with qualitative traditions, widened out to Indigenous knowledge and lived experience, and then asked how all of these can be combined ethically in a single project. The takeaways below capture the five claims you should be able to make in your own words; the reflection that follows asks you to translate Two-Eyed Seeing — the section's most ambitious idea — into a concrete public health context. Take the reflection seriously: the final assessment will assume you have already worked through this kind of applied thinking.
Key Takeaways
- Qualitative traditions (phenomenology, grounded theory, ethnography, PAR) offer ways of knowing that centre meaning, context, and experience.
- Indigenous knowledge systems are holistic, relational, land-based, and intergenerational — they represent valid and sophisticated ways of understanding the world.
- Two-Eyed Seeing (Etuaptmumk) offers a framework for holding Indigenous and Western knowledge side by side, drawing on the strengths of each.
- Lived experience is a legitimate form of evidence that should inform research, policy, and practice.
- OCAP principles and CBPR provide frameworks for conducting research that is ethical, community-driven, and accountable.
Section Reflection
Take a few minutes to reflect on each of the prompts below — you may write a single integrated response that addresses them together.
- Reflection: Think of a health issue you care about. How might a purely quantitative approach to that issue miss something important? What kinds of knowledge — experiential, cultural, qualitative — might fill the gap? Write at least a few sentences explaining your reasoning.
- Reflection: Consider the concept of Two-Eyed Seeing. What might it look like to apply this framework to a public health issue in your own community? What strengths would each “eye” bring? What challenges might arise in practice?
1. Epistemology is best defined as:
2. A post-positivist researcher differs from a positivist researcher primarily in that the post-positivist:
3. Which paradigm holds that the purpose of research is not merely to describe the world but to change oppressive conditions?
1. Donna Haraway’s concept of “situated knowledges” argues that:
2. Sandra Harding’s standpoint theory suggests that marginalized groups may have epistemological advantages because:
3. A key problem with including “race” as a variable in epidemiological models without theorizing its role is that:
1. Two-Eyed Seeing (Etuaptmumk) is best described as:
2. The OCAP principles — Ownership, Control, Access, and Possession — were developed primarily to:
3. Which qualitative research tradition involves community members as co-investigators who help define questions, design studies, and take action based on findings?
4. Indigenous knowledge systems are distinct from Western scientific traditions in several ways. Which of the following is NOT a characteristic typically associated with Indigenous knowledge systems?
Research Integrity & Reform
⏱ Estimated reading time: ~60 minutes
Introduction and Overview
Section 2 closed with an uncomfortable conclusion: knowledge is produced by particular people, in particular institutions, with particular incentives, and the choice of paradigm is itself shaped by power. Section 3 follows that thread to its hardest case — what happens when those incentives go wrong. Across six sections, this lesson surveys five overlapping ways the published record becomes unreliable: outright fraud, industry-funded manufactured doubt, undetected analytical errors, replication failure, and the absence of community-led ethical infrastructure. Each section ends by handing off to the next, so that by the time you reach Section 6 you should be able to read a published study not just for its findings but for the conditions under which those findings were produced. We begin where the problem is most stark: the small number of cases in which researchers simply lie.
Section 1 covered the rare cases in which individual researchers simply fabricate or falsify data. This section turns to a more pervasive and better-organized threat to the public-health literature: the deliberate, well-funded manufacture of scientific uncertainty by industries whose products are implicated in disease. The historical record assembled by Naomi Oreskes and Erik Conway in Merchants of Doubt shows that the same playbook — first developed by tobacco in the 1950s — has been redeployed against acid rain, the ozone hole, secondhand smoke, anthropogenic climate change, opioids, and ultra-processed foods.
The point of this section is not to argue that all industry-funded research is wrong. The point is to give you a working description of the manufactured-doubt strategy, to show why epidemiology is the methodological surface that gets attacked, and to equip you to distinguish a genuine scientific dispute from a manufactured one when you read the literature.
Section 1 covered explicit fraud; Section 2 covered the manufacture of doubt around honest findings. This section sits between the two: papers that pass peer review, are written in good faith, and still mislead the field — either because of an underlying data-integrity failure that reviewers cannot detect, or because of methodological errors that produce dramatically wrong answers without anyone lying. We open with a case from the COVID-19 pandemic that exposes both problems at once.
The previous three sections surveyed three distinct ways the literature gets corrupted: outright fraud (Section 1), industry-funded manufactured doubt (Section 2), and undetected analytical errors (Section 3). This section addresses a fourth, subtler problem: even when nobody is lying and no industry is involved, the system that produces published findings has been shown, repeatedly, to produce a substantial fraction of results that other researchers cannot reproduce. The reforms surveyed in the second half of this section — preregistration, data sharing, registered reports, structural change — are best understood as responses to the crisis documented in the first half.
The previous four sections documented four ways the published record can become unreliable: explicit misconduct (Section 1), industry-funded manufactured doubt (Section 2), undetected analytical errors (Section 3), and the structural failure of replication (Section 4). The reforms surveyed at the end of Section 4 — preregistration, data sharing, registered reports, EQUATOR-network reporting guidelines — are field-wide responses to those problems. They are necessary, but they are not the only kinds of ethical framework that have emerged. This section introduces two community-developed frameworks that complement the open-science movement and, in some respects, predate it.
Learning Objectives
- Distinguish among outright misconduct (FFP), questionable research practices (QRPs), manufactured doubt, and honest analytical error, and explain how each corrupts the published record.
- Use landmark cases (Wakefield, Fujii, the tobacco strategy, Surgisphere) to recognise the recurring patterns of fraud, industry-funded doubt, and systemic vulnerabilities in peer review.
- Explain the replication crisis in epidemiology and the empirical findings (Ioannidis 2005; the Reproducibility Projects) that motivated open-science reform.
- Match the major open-science reforms — preregistration, data and code sharing, registered reports, EQUATOR/STROBE reporting guidelines — to the specific failure modes each was designed to address.
- Compare OCAP® and EGAP as community-developed transparency frameworks, articulating the harms each was designed to prevent and why most public-health research with community partners needs both.
Forms of Research Misconduct
Research misconduct undermines the foundation on which evidence-based health policy and clinical practice are built. The U.S. Office of Research Integrity defines research misconduct as fabrication, falsification, or plagiarism (FFP) in proposing, performing, or reviewing research, or in reporting research results. Beyond these core violations, a broader spectrum of questionable research practices (QRPs) can distort the scientific record even without crossing the threshold of formal misconduct. The three flip cards below define each FFP category in turn; click through them and notice that the categories are not equivalent — fabrication, falsification, and plagiarism each corrupt the record in distinct ways.
Questionable Research Practices
Fabrication, falsification, and plagiarism are the bright-line cases — rare, prosecutable, and unambiguous. The harder problem for evaluating the published literature is the much larger grey zone of practices that distort findings without ever crossing that line.
QRPs occupy a grey zone between honest error and outright misconduct. While each individual practice might seem minor, their cumulative effect on the literature can be substantial. QRPs are far more common than FFP and may contribute more to the overall distortion of the evidence base. Expand each item in the accordion below to see how the practice works in everyday research, and notice that the harm from each is structural rather than individual — the same incentive structures push many researchers toward similar shortcuts.
Researchers often face many defensible analytical choices: which covariates to include, how to handle missing data, which subgroups to examine, and which statistical tests to run. When these choices are made after seeing the data and only the most favorable result is reported, this constitutes p-hacking or the “garden of forking paths.” Simmons, Nelson, & Simonsohn (2011) demonstrated that even with a completely null effect, undisclosed flexibility can produce statistically significant results over 60% of the time.
Selective outcome reporting occurs when researchers measure multiple outcomes but only report those that achieved statistical significance or supported their hypothesis. Similarly, selective analysis reporting involves running multiple models and reporting only the one with the “best” results. Studies comparing registered protocols with published papers have found that roughly 50% of trials have at least one primary outcome that was changed, introduced, or omitted between registration and publication.
Gift authorship (granting authorship to individuals who did not meaningfully contribute), ghost authorship (omitting individuals who made significant contributions, often industry-funded writers), and coercive authorship (senior researchers demanding authorship credit) all violate the ICMJE criteria for authorship. These practices distort accountability and can conceal conflicts of interest. Surveys suggest that inappropriate authorship affects a substantial proportion of published articles across health sciences.
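The first two items above (undisclosed analytic flexibility and selective outcome reporting) can be illustrated with a small simulation. This is a simplified sketch, not a reproduction of the analysis in Simmons, Nelson, & Simonsohn (2011): it assumes a two-arm study with no true effect, five independent outcomes, and 20 participants per arm. All names and parameter values (`n_studies`, `n_per_arm`, `n_outcomes`, `one_study`) are illustrative assumptions.

```r
set.seed(230)            # arbitrary seed so the run is repeatable

n_studies  <- 2000       # number of simulated null studies (assumed)
n_per_arm  <- 20         # participants per group (assumed)
n_outcomes <- 5          # outcomes measured in each study (assumed)

# One simulated study: every outcome is drawn from the same distribution in
# both arms (true effect = 0); returns one p-value per outcome.
one_study <- function() {
  replicate(n_outcomes, {
    treated <- rnorm(n_per_arm)
    control <- rnorm(n_per_arm)
    t.test(treated, control)$p.value
  })
}

# Matrix of p-values: one row per outcome, one column per study
p_values <- replicate(n_studies, one_study())

# Honest reporting: only the first (pre-specified) outcome counts
honest_rate <- mean(p_values[1, ] < 0.05)

# Selective reporting: the study "works" if ANY outcome reaches p < 0.05
selective_rate <- mean(apply(p_values, 2, min) < 0.05)

c(honest = honest_rate, selective = selective_rate)
```

With five independent outcomes, the chance that at least one crosses p < 0.05 under the null is 1 − 0.95^5 ≈ 0.23, so the selective rate should land well above the nominal 5% even though nothing was fabricated. Adding further undisclosed choices (covariates, subgroups, stopping rules) pushes the rate higher still.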
How Common Are QRPs in Practice?
The QRP categories defined above can feel abstract until they are measured. Entradas, Feng & Sousa (2026) surveyed 1,573 researchers at Portuguese universities across six fields of research and asked, in plain language, how often each respondent had personally engaged in twelve specific practices — without ever using the loaded term “questionable research practice” in the questionnaire. The results, published in PLOS One, are reported in an accessible summary in The Scientist, and they are quietly damning. 91% of respondents admitted to at least one QRP, and 32% admitted to six or more. The figure below shows how engagement was distributed across the twelve practices the survey asked about.
Figure: Self-reported engagement in twelve QRPs (n = 1,573)
Researchers were asked how frequently they had engaged in each practice. Bars are ordered by descending engagement (smallest “Never” segment at the top). Values are visually reconstructed from Fig 1 of Entradas et al. (2026) and may differ from the published values by ±2 percentage points; the engagement totals at the right match the percentages quoted in the source.
Source: Entradas M, Feng B, Sousa JJ (2026). The ‘shades of grey’ in research integrity — Researchers admit to questionable research practices that they do not perceive to be serious. PLOS One, 21(1): e0339056. doi:10.1371/journal.pone.0339056. Recreated for HSCI 230.
Three patterns in this figure are worth flagging explicitly, because each connects directly to a section that follows.
First, the most-admitted practices — gift authorship, citing without reading the primary source, skipping a real literature review — cluster around the manuscript-writing stage rather than data collection or analysis. These behaviours are almost invisible to peer review and are exactly the kind of thing that an EQUATOR-network reporting checklist (Section 4) or a registered report (Section 4) is designed to make harder.
Second, “developed hypotheses after seeing the results” — sometimes called HARKing (Hypothesising After the Results are Known) — sits in the middle of the chart at 46% lifetime engagement. HARKing is the textbook example of why preregistration was invented: a hypothesis written down before data collection cannot quietly rotate to match whatever pattern the data happened to produce.
Third, the gradient from top to bottom of the chart maps almost perfectly onto perceived seriousness. Entradas et al. report that researchers ranked “using a researcher’s idea without giving credit” as the most serious of the twelve breaches — and only 4% admitted to it. They ranked “including authors who had not contributed sufficiently” as least serious — and 73% admitted to it. That correlation is doing real work: the practices that have been formally codified as misconduct are rare, while the practices a research community has come to treat as normal can become almost universal without ever crossing the bright line of fraud. This is what the paper means by the “shades of grey” in its title, and it is the structural reason why the field-wide reforms in Section 4 matter as much as the individual-misconduct cases in the next two subsections.
What the survey cannot tell you
Self-report surveys of misconduct are biased downward: respondents have an incentive to under-report behaviour they consider shameful. Entradas et al. note that admission rates probably understate true prevalence. The 91% figure is therefore best read as a floor, not a ceiling — and the comparison to earlier meta-analyses (which estimated 13–34% QRP engagement) reflects, at least in part, the difficulty of getting researchers to admit to behaviours they know are problematic.
Definitions are useful, but their force comes through cases. The two case studies below were chosen deliberately. Wakefield is the textbook example of how a single falsified paper, amplified by media, can produce damage that outlives its retraction by decades. Fujii is the textbook example of how systematic fabrication can run for an entire career under conventional peer review — and how it eventually gets caught not by a whistleblower but by statistics.
Case Study: The Vaccine-Autism Controversy
Perhaps no case illustrates the lasting public health damage of research misconduct more vividly than Andrew Wakefield’s 1998 Lancet paper suggesting a link between the measles-mumps-rubella (MMR) vaccine and autism.
In 1998, Wakefield and colleagues published a case series of 12 children in The Lancet, claiming that MMR vaccination was linked to a new syndrome of autism and gastrointestinal disease. The paper received extraordinary media coverage and triggered a sharp decline in MMR vaccination rates across the UK and beyond.
Investigative journalism by Brian Deer subsequently revealed extensive data falsification: medical records showed that children’s conditions had been misrepresented, timelines between vaccination and symptom onset had been altered, and pathology results had been selectively reported. Wakefield also had undisclosed financial conflicts of interest, including funding from lawyers pursuing litigation against vaccine manufacturers, and had filed a patent for a competing single measles vaccine.
The Lancet retracted the paper in 2010, and Wakefield was struck from the UK medical register for serious professional misconduct, including subjecting children to invasive procedures (lumbar punctures and colonoscopies) without ethical approval.
Durable Public Health Consequences
Despite the retraction, the fraudulent findings produced lasting damage. Multiple large-scale epidemiological studies—including a Danish cohort of over 650,000 children (Hviid et al., 2019) and a meta-analysis of over 1.2 million children (Taylor et al., 2014)—have found no association between MMR vaccination and autism. Yet vaccine hesitancy linked to the discredited study persists, contributing to measles outbreaks decades later. This case demonstrates how a single fraudulent paper, amplified by media attention, can produce public health consequences that persist long after the science has been corrected.
Wakefield was caught by a journalist. The next case shows what it takes to catch the same kind of misconduct at scale, when no individual paper is dramatic enough to attract reporters.
Case Study: Yoshitaka Fujii
Yoshitaka Fujii, a Japanese anesthesiologist, is responsible for one of the largest known cases of data fabrication in the history of biomedical research. Over a career spanning decades, Fujii published approximately 200 papers, of which at least 183 were identified as containing fabricated data.
Suspicions arose when meta-epidemiologic analyses revealed implausibly consistent results across his clinical trials. Carlisle (2012) examined the baseline characteristics reported in Fujii’s randomized controlled trials and found distributions that were statistically near-impossible: the uniformity of results across multiple trials far exceeded what would be expected by chance. The probability of obtaining such consistent baseline characteristics honestly was calculated at less than 1 in 10³³.
An institutional investigation confirmed the fabrication, and Fujii’s papers were retracted en masse. The case highlighted the failure of conventional peer review to detect systematic fabrication and spurred interest in statistical forensic methods for identifying fraudulent data.
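The logic of the baseline-characteristics check can be sketched in a few lines of Python. This is a simplified illustration, not Carlisle's published procedure: it assumes that, under genuine randomization, p-values for baseline differences in continuous variables are roughly uniform between 0 and 1, and it uses a Kolmogorov-Smirnov distance with the large-sample 5% critical value of about 1.36/√n. The function names and all numbers are hypothetical.

```python
import math


def baseline_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p-value for a baseline difference (normal approximation)."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / standard_error
    return math.erfc(abs(z) / math.sqrt(2))


def ks_distance_from_uniform(p_values):
    """Kolmogorov-Smirnov distance between the p-values and Uniform(0, 1)."""
    ordered = sorted(p_values)
    n = len(ordered)
    return max(
        max((i + 1) / n - p, p - i / n)
        for i, p in enumerate(ordered)
    )


def looks_non_uniform(p_values):
    """Flag the set if the distance exceeds the approximate 5% critical value."""
    critical = 1.36 / math.sqrt(len(p_values))  # large-sample approximation
    return ks_distance_from_uniform(p_values) > critical


# One hypothetical baseline row: mean age 54.1 vs 54.0, SD 10, 50 per arm.
print(round(baseline_p_value(54.1, 10.0, 50, 54.0, 10.0, 50), 2))

# Hypothetical p-value sets: one spread evenly, one clustered near 1
# (groups that are "too similar" at baseline, trial after trial).
plausible = [(i + 0.5) / 20 for i in range(20)]
too_similar = [0.90 + 0.005 * i for i in range(20)]
print(looks_non_uniform(plausible))    # False
print(looks_non_uniform(too_similar))  # True
```

A real forensic analysis has to handle rounding in the reported summary statistics and correlation between baseline variables, so a flag from a check like this is a reason to look closer rather than proof of fabrication.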
Side by side, Wakefield and Fujii anchor the two extremes of the misconduct spectrum — one paper with enormous public-health consequences, hundreds of papers with comparatively quiet ones. The table below makes the comparison explicit and previews the techniques that Section 3 will revisit in detail.
| Feature | Wakefield Case | Fujii Case |
|---|---|---|
| Type of misconduct | Falsification + ethical violations | Extensive fabrication |
| Scale | 1 retracted paper | 183+ retracted papers |
| Detection method | Investigative journalism | Statistical forensics |
| Public health impact | Global vaccine hesitancy | Corrupted anesthesia evidence base |
| Systemic lesson | Conflicts of interest disclosure | Peer review limitations |
One thing about the takeaways below should make you uncomfortable: individual fraud cases like Wakefield and Fujii, however damaging, are rare. Most published findings are not produced by liars. The next section asks why the literature can still be systematically distorted (on smoking, on climate, on opioids) even when no individual researcher is committing misconduct in this strict sense.
Key Takeaways
- Research misconduct encompasses fabrication, falsification, and plagiarism, along with a range of questionable research practices
- Even a single fraudulent study can produce lasting damage to public health when amplified by media
- Statistical forensic methods can reveal implausible patterns that conventional peer review may miss
- QRPs such as p-hacking and selective reporting may cumulatively distort the literature more than rare cases of outright fraud
The Tobacco Strategy
Individual fraud (Wakefield, Fujii) is rare and usually punished. A more pervasive threat to public-health science is the deliberate, well-funded manufacture of scientific uncertainty by industries whose products are implicated in disease. Historians of science Naomi Oreskes and Erik Conway documented this pattern in their 2010 book Merchants of Doubt, which traces a single playbook applied across tobacco, acid rain, the ozone hole, DDT, secondhand smoke, and anthropogenic climate change — often by an overlapping network of physicists and industry-funded scientists.
“Doubt is our product”
The strategy was articulated explicitly in a 1969 internal memo from cigarette maker Brown & Williamson, leaked decades later: “Doubt is our product, since it is the best means of competing with the ‘body of fact’ that exists in the minds of the general public. It is also the means of establishing a controversy.” The point was never to win the science. The point was to prevent the science from looking settled long enough for regulation to be delayed by years or decades.
The Same Playbook, Different Hazards
Oreskes and Conway show that once the tobacco industry built the infrastructure of manufactured doubt — front groups, friendly journalists, in-house scientific advisors — the same machinery was rented out to other industries facing inconvenient findings. The four cards below trace the playbook's redeployment across four scientifically unrelated hazards. As you click through them, watch for the recurring features — the same individuals (Seitz, Singer), the same institutional homes (the Marshall Institute, TASSC), the same rhetorical demand for ever more certainty.
If the same individuals show up across tobacco, ozone, and climate, the obvious follow-up question is why. Why is this the surface attacked, and why does it work? The answer comes from looking carefully at what observational epidemiology can and cannot do, and where its honest limits become exploitable.
Why Epidemiology Is the Surface That Gets Attacked
This history matters specifically for epidemiologists. Industries cannot easily fabricate primary observational data — cancer registries, vital statistics, and cohort studies are run by independent investigators. What they can do is exploit the genuine epistemic features of observational research. The five tactics in the box below all share that structure: each takes a real, legitimate feature of epidemiological practice and weaponizes it.
The Methodological Attack Surface
- Flood the literature. Fund a steady stream of small, low-quality counter-studies whose only function is to be cited as “contradictory evidence” in policy debates.
- Attack the methodology. Highlight the inherent limits of observational designs — confounding, ecological fallacy, healthy-user bias, recall bias — as if these were unique to the inconvenient finding rather than features of the field.
- Fund favorable secondary analyses. Reanalyze public datasets with different model specifications, cherry-pick the analyses that show null or reversed effects, and publish those.
- Demand RCT-grade evidence where it cannot be obtained. Insist that the causes of lung cancer, climate change, or asbestosis cannot be “proven” without a randomized trial that nobody could ethically run.
- Exploit any genuine uncertainty. Small samples, wide confidence intervals, ecological designs, and replication failures are all legitimate features of frontier science. Manufactured-doubt campaigns weaponize them as proof that “the science isn’t settled.”
This is why the technical material in the rest of HSCI 230 — how to recognize confounding, how to read a forest plot, how to evaluate a study design against its question — is not merely an academic exercise. Methodological literacy is what allows a public-health professional to distinguish a genuine scientific dispute from a manufactured one.
The tobacco strategy is not, unfortunately, a closed historical chapter. The same pattern is recognisable today across at least three contemporary debates — opioids, ultra-processed foods, and vaping — the first of which is examined in detail below.
Modern Parallels
Through the 1990s and 2000s, Purdue Pharma promoted OxyContin using a small set of pre-existing studies, notably a letter to the editor by Porter & Jick (1980) reporting addiction in about 0.03% of hospitalized patients with no history of addiction, as if they established that opioids prescribed for chronic non-cancer pain carried negligible addiction risk. The letter was repeatedly cited in continuing-medical-education materials, marketing copy, and even pain-management guidelines as evidence for a claim that its data could not support.
This was not data fabrication. It was the strategic amplification of weak evidence in a direction favorable to the sponsor, paired with attacks on the methodology of studies (including the early addiction-medicine literature) that pointed the other way. The pattern is recognizably the tobacco strategy applied to a new product class.
Industry-funded nutrition research has been shown, repeatedly, to produce conclusions more favorable to the sponsor’s products than independently-funded studies on the same questions (sugar-sweetened beverages, sodium, saturated fat). The vaping debate features a parallel structure: industry-aligned researchers emphasize harm-reduction relative to combustible cigarettes; independent researchers emphasize uptake among adolescents and uncertain long-term cardiopulmonary effects. Methodological criticisms cut both ways and are often legitimate, which is exactly what makes the manufactured-doubt strategy effective.
A Generalizable Warning
Whenever a regulated industry produces health effects that are slow, statistical, and observable mainly in cohort and case-control studies, expect a manufactured-doubt strategy. The hallmarks are: an asymmetric demand for certainty, persistent methodological criticism that targets the inconvenient finding without engaging the underlying science, networks of overlapping front groups and think tanks, and a steady supply of industry-aligned scientists who reappear across unrelated hazards.
So far the lesson has covered two distinct ways the literature gets corrupted: a small number of researchers who fabricate (Section 1), and a larger but well-organized network of actors who manufacture uncertainty without fabricating anything (this section). Section 3 turns to a third source of distortion that requires neither fraud nor industry pressure — the analytical errors that working researchers make in good faith, and the statistical-forensic tools used to detect them after the fact.
Key Takeaways
- Manufactured doubt is a documented, well-funded historical strategy — not a conspiracy theory but a record reconstructed from leaked internal documents and litigation discovery
- The strategy is generalizable: tobacco, acid rain, ozone, secondhand smoke, climate change, opioids, and the food and vaping industries have all deployed recognizably similar tactics
- Epidemiology is the methodological surface that gets attacked because primary observational data are hard to fabricate but easy to criticize
- Industry-funded research is not automatically wrong, but conflict-of-interest disclosure and independent replication are essential to interpret it
- The technical material in the rest of this course is the toolkit you need to distinguish real scientific disagreement from manufactured uncertainty
The Surgisphere Scandal
The COVID-19 pandemic created unprecedented pressure to publish rapidly, which exposed critical vulnerabilities in the peer review process. In 2020, two high-profile papers (Mehra et al., 2020) relying on data from Surgisphere, a small analytics company, were published in The Lancet and the New England Journal of Medicine.
The Lancet paper reported that hydroxychloroquine was associated with increased mortality and cardiac arrhythmias in hospitalized COVID-19 patients, drawing on a purported global registry of over 96,000 patients from nearly 700 hospitals across six continents. The NEJM paper examined cardiovascular disease and drug therapy in relation to COVID-19 mortality using a smaller extract from the same Surgisphere database.
When independent researchers attempted to verify the data, critical problems emerged: Surgisphere could not provide the underlying data or identify the hospitals involved. The reported data from some countries exceeded official national case counts. The company had very few employees and no apparent capacity to manage such a massive dataset. Both papers were retracted within weeks of publication.
Lessons from Surgisphere
The Surgisphere episode exposed several systemic failures: (1) data transparency—proprietary datasets that cannot be independently verified pose serious risks; (2) peer review limitations—reviewers typically cannot access raw data and must trust authors; (3) speed vs. rigor—pandemic urgency accelerated publication timelines, reducing scrutiny; and (4) journal prestige is not a guarantee—even the most prestigious journals can be deceived.
Surgisphere was, ultimately, about data — numbers that could not be verified because the underlying records did not exist. The next set of failures is about analysis — what happens when the data are real but the way time is handled silently inflates the apparent benefit of a treatment.
Immortal Time Bias and Misclassification
Not all errors in the literature result from intentional misconduct. Immortal time bias is a common methodological error in observational pharmacoepidemiology that can produce dramatically exaggerated treatment effects. The three tabs below introduce the bias, the related problem of misclassification, and what corrected reanalyses tend to show. The take-away to carry into the rest of the lesson: an honest researcher using real data can still publish a finding that is substantively wrong.
Immortal time bias occurs when the period between cohort entry and the start of treatment is misclassified or excluded. During this “immortal” period, participants in the treatment group could not have experienced the outcome (because they had to survive long enough to receive the treatment). When this time is either excluded from analysis or incorrectly attributed to the treatment group, it artificially inflates the apparent benefit of treatment.
Suissa (2008) demonstrated this bias in studies of inhaled corticosteroids and COPD mortality. The original studies suggested a 30–40% reduction in mortality, but after correcting for immortal time bias, the apparent protective effect was substantially reduced or eliminated entirely.
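The mechanism can be reproduced in a small simulation. The sketch below is a hypothetical cohort, not a reanalysis of any study cited here: everyone shares the same constant death hazard, so the treatment has no effect at all, and half the cohort starts treatment after a random waiting time if they survive that long. All parameter values and names are assumptions chosen for illustration.

```python
import random


def simulate_immortal_time(n=20000, hazard=0.1, follow_up=10.0,
                           max_wait=5.0, seed=7):
    """Return (naive_rate_ratio, time_dependent_rate_ratio).

    The treatment has NO effect: everyone shares the same death hazard.
    """
    rng = random.Random(seed)
    # [deaths, person-years] under each way of classifying exposure time.
    naive = {"treated": [0, 0.0], "untreated": [0, 0.0]}
    correct = {"treated": [0, 0.0], "untreated": [0, 0.0]}

    for _ in range(n):
        death_time = rng.expovariate(hazard)
        observed = min(death_time, follow_up)
        died = death_time < follow_up
        # Half the cohort is scheduled to start treatment after a wait.
        scheduled = rng.random() < 0.5
        wait = rng.uniform(0, max_wait)
        treated = scheduled and death_time > wait  # must survive the wait

        if treated:
            # Naive: the waiting (immortal) time is credited to treatment.
            naive["treated"][0] += died
            naive["treated"][1] += observed
            # Time-dependent: the waiting time counts as untreated time.
            correct["untreated"][1] += wait
            correct["treated"][0] += died
            correct["treated"][1] += observed - wait
        else:
            for table in (naive, correct):
                table["untreated"][0] += died
                table["untreated"][1] += observed

    def rate_ratio(table):
        treated_rate = table["treated"][0] / table["treated"][1]
        untreated_rate = table["untreated"][0] / table["untreated"][1]
        return treated_rate / untreated_rate

    return rate_ratio(naive), rate_ratio(correct)


naive_rr, correct_rr = simulate_immortal_time()
print(f"Naive (ever-treated) rate ratio: {naive_rr:.2f}")
print(f"Time-dependent rate ratio:       {correct_rr:.2f}")
```

Under these assumptions the naive rate ratio falls well below 1, suggesting a protective effect that does not exist, while the time-dependent classification stays close to 1. The only difference between the two analyses is where the waiting time is counted.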
Exposure misclassification can arise when treatment status is measured at a single point in time rather than updated over follow-up, or when claims databases are used as proxies for actual medication use. Time-related misclassification is particularly problematic in observational studies comparing treated and untreated groups, as it can systematically favor the treated group.
Outcome misclassification—errors in how disease events are identified—can similarly bias results. If misclassification is non-differential (equally likely in exposed and unexposed groups), it typically biases toward the null. If differential, it can bias in either direction.
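A short worked example shows both behaviours. The sketch below uses hypothetical true risks of 20% in the exposed and 10% in the unexposed (a true risk ratio of 2) and applies assumed sensitivity and specificity values for outcome ascertainment; none of these numbers come from a real study.

```python
def observed_risk(true_risk, sensitivity, specificity):
    """Risk that would be recorded given imperfect outcome ascertainment."""
    true_positives = true_risk * sensitivity
    false_positives = (1 - true_risk) * (1 - specificity)
    return true_positives + false_positives


def observed_risk_ratio(risk_exposed, risk_unexposed,
                        accuracy_exposed, accuracy_unexposed):
    """Risk ratio after misclassification; accuracy = (sensitivity, specificity)."""
    exposed = observed_risk(risk_exposed, *accuracy_exposed)
    unexposed = observed_risk(risk_unexposed, *accuracy_unexposed)
    return exposed / unexposed


TRUE_RR = 0.20 / 0.10  # hypothetical true risks: 20% vs 10%

# Non-differential: the same error rates in both groups.
non_differential = observed_risk_ratio(0.20, 0.10, (0.80, 0.90), (0.80, 0.90))

# Differential: events found more readily (and over-called) in the exposed.
differential_away = observed_risk_ratio(0.20, 0.10, (0.95, 0.80), (0.80, 0.95))

# Differential: events missed in the exposed and over-called in the unexposed.
differential_reversed = observed_risk_ratio(0.20, 0.10, (0.60, 0.98), (0.95, 0.85))

print(f"True risk ratio:           {TRUE_RR:.2f}")
print(f"Non-differential:          {non_differential:.2f}")       # about 1.41
print(f"Differential (inflated):   {differential_away:.2f}")      # about 2.80
print(f"Differential (reversed):   {differential_reversed:.2f}")  # about 0.59
```

With identical error rates in both groups the observed ratio moves from 2 toward 1. When the error rates differ by exposure, the same true effect can be exaggerated or even reversed, depending on the pattern of errors.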
Several published pharmacoepidemiological findings have been overturned or substantially attenuated through careful reanalysis. For example, early observational studies suggesting that statins reduced cancer risk were later shown to be affected by immortal time bias; time-dependent analyses found no such association. Similarly, apparent benefits of certain respiratory medications were substantially reduced when time-related biases were appropriately handled.
These corrections highlight the importance of methodological scrutiny and the value of independent replication and reanalysis in the self-correcting process of science.
Immortal time bias and misclassification corrupt the analysis itself. The next category of distortion is more subtle — the analysis is technically correct, but only some of the analyses that were actually run end up being reported.
Selective Outcome Reporting and “Spin”
Widespread inconsistencies between registered protocols and published outcomes represent a pervasive problem in clinical and epidemiological research. When researchers modify their primary outcomes, add or drop secondary outcomes, or change their analytic plan after seeing the data, the published results may not faithfully represent the study as originally designed.
Protocol–Publication Discrepancies
Systematic reviews comparing trial registrations with published papers consistently find that 40–60% of studies have at least one discrepancy in primary outcomes. “Spin”—the use of specific reporting strategies to present results as more favorable than the data support—has been documented in approximately 40% of RCTs with non-significant primary outcomes. Common spin strategies include focusing on statistically significant secondary outcomes, reporting within-group changes rather than between-group differences, and using misleading titles or conclusions.
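At its simplest, a protocol-publication audit is a comparison of two lists. The sketch below is a deliberately naive illustration: the function name and the trial outcomes are hypothetical, and it matches outcomes by exact wording after normalizing case and whitespace. Real audits need human judgment about changed time points, measurement instruments, and rewording.

```python
def compare_outcomes(registered_primary, published_primary):
    """Classify primary outcomes as omitted, introduced, or retained."""
    registered = {outcome.strip().lower() for outcome in registered_primary}
    published = {outcome.strip().lower() for outcome in published_primary}
    return {
        "omitted": sorted(registered - published),     # registered, never reported
        "introduced": sorted(published - registered),  # reported, never registered
        "retained": sorted(registered & published),
    }


# Hypothetical trial: what the registry entry says vs what the paper reports.
registry_entry = ["All-cause mortality at 12 months", "Hospital readmission"]
journal_article = ["Hospital readmission", "Quality-of-life score"]

audit = compare_outcomes(registry_entry, journal_article)
for label, outcomes in audit.items():
    print(f"{label}: {outcomes}")
```

In this hypothetical case one registered primary outcome has disappeared and a new one has appeared, which is exactly the pattern the systematic reviews count as a discrepancy.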
Surgisphere, immortal time bias, and selective reporting are all problems that surface after publication. The good news is that the same fact — that a paper is now in the public record — means the research community has a shared object to interrogate. Over the past decade a small but powerful toolkit of forensic methods has been built to do exactly that.
Statistical Forensics and Error Detection
A growing toolkit of statistical forensic methods allows researchers to detect anomalies in published data without access to the raw dataset. The three cards below introduce the most widely used categories. None of these methods require insider information; all work from numbers that are already in the published paper.
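One widely cited check of this kind is the GRIM test (granularity-related inconsistency of means, Brown & Heathers). When the underlying data are whole numbers, such as Likert responses or counts, the mean of n values must equal some integer total divided by n, so certain reported means are arithmetically impossible. The sketch below is a minimal version; the function name and the tolerance handling are illustrative choices rather than a reference implementation.

```python
import math


def grim_consistent(reported_mean, n, decimals=2):
    """Return True if some integer total of n whole-number scores
    rounds to the reported mean at the given number of decimals."""
    tolerance = 0.5 * 10 ** (-decimals) + 1e-9  # half a unit in the last place
    approximate_total = reported_mean * n
    candidates = range(math.floor(approximate_total) - 1,
                       math.ceil(approximate_total) + 2)
    return any(abs(total / n - reported_mean) <= tolerance
               for total in candidates)


# With 28 participants, totals of 145 and 146 give means of 5.18 and 5.21,
# so a reported mean of 5.19 cannot arise from whole-number responses.
print(grim_consistent(5.19, 28))  # False
print(grim_consistent(5.18, 28))  # True
```

The test is only informative when the sample is small relative to the reporting precision (roughly n below 100 for two decimals), and an inconsistency may reflect a typo or unreported exclusions rather than misconduct.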
Forensic tools work on individual papers. The institutional infrastructure that turns those tools into changes in the published record — retractions, corrections, public commentary — lives somewhere else.
Retraction Watch and Post-Publication Peer Review
The Retraction Watch database tracks retractions across the scientific literature, providing a public resource for identifying retracted papers and understanding the reasons for retraction. As of recent counts, the database has catalogued over 40,000 retractions. Post-publication peer review platforms—including PubPeer, journal comment sections, and social media—have become increasingly important mechanisms for identifying errors and misconduct that escaped prepublication peer review.
The Self-Correcting Ideal
Science is often described as self-correcting, but corrections can take years or decades. Papers that are eventually retracted continue to be cited long after retraction, sometimes at rates similar to non-retracted papers. The lag between error and correction highlights the need for both faster detection mechanisms and better systems for propagating corrections through the literature.
The reflection below puts the diagnostic skills from this section into practice. Treat it as rehearsal: when you encounter a striking observational finding in real practice, the questions you ask should be roughly the ones the prompt invites here. Section 4 then steps back from individual papers to ask why the field, on average, produces a steady fraction of findings that other researchers cannot reproduce.
The Replication Crisis: A Brief History
The crisis did not arrive all at once. The four short subsections below walk through it in order: a theoretical provocation that named the problem, two large-scale empirical projects that measured it in psychology and cancer biology, and the specific shape it took in epidemiology. Each step pushed the conversation from “this might be a problem” toward “this is a quantifiable feature of how we publish.”
Ioannidis (2005): The Foundational Provocation
In a now-famous PLOS Medicine paper titled “Why Most Published Research Findings Are False,” John Ioannidis argued from first principles — using prior probabilities, power, bias, and the number of teams pursuing a question — that under realistic conditions in biomedical research, the majority of published positive findings should be expected to be false. The paper was widely cited and widely contested, but it crystallized a discomfort that had been building in many fields and set the agenda for the empirical replication projects that followed.
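The core of the argument is a positive-predictive-value calculation, which can be sketched as follows. This is the simplest, bias-free version: it takes the prior probability that a tested hypothesis is true, the statistical power, and the significance threshold, and ignores the bias and multiple-teams terms that the paper adds. The function name and the scenario values are illustrative assumptions.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Share of statistically significant findings that reflect a true effect."""
    true_positives = power * prior          # true hypotheses that reach significance
    false_positives = alpha * (1 - prior)   # null hypotheses that reach significance
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios: (description, prior probability, power).
scenarios = [
    ("Well-powered confirmatory trial", 0.50, 0.80),
    ("Well-powered exploratory study", 0.10, 0.80),
    ("Underpowered exploratory study", 0.10, 0.20),
    ("Underpowered long-shot search", 0.01, 0.20),
]
for label, prior, power in scenarios:
    ppv = positive_predictive_value(prior, power)
    print(f"{label}: PPV = {ppv:.2f}")
```

Written in terms of pre-study odds R = prior / (1 − prior), the same quantity is (1 − β)R / ((1 − β)R + α). Once priors are low and power is modest, fewer than half of significant findings are true even before any bias is introduced.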
Reproducibility Project: Psychology (2015)
Open Science Collaboration (Nosek et al., 2015)
A coordinated effort by 270 researchers attempted to replicate 100 psychology experiments published in three top-tier journals in 2008. The result: only about 36% of the replications produced statistically significant findings in the same direction as the originals, and the average effect size in the replications was roughly half that of the original studies. Even where replications were “successful” in a binary sense, effects shrank substantially.
Reproducibility Project: Cancer Biology (2017–2021)
A parallel project, led by the Center for Open Science and Science Exchange, set out to replicate 193 experiments from 53 high-impact cancer-biology papers. The headline numbers were even worse than psychology’s: only a fraction of originally-reported effects could be reproduced, effect sizes were on average about 85% smaller, and only 50 experiments from 23 papers could be completed at all, largely because the original methods were insufficiently described or reagents were unavailable.
The Crisis in Epidemiology Specifically
The Reproducibility Projects did most of their work in psychology and cancer biology. Epidemiology had its own version of this reckoning, often visible as discordance between large observational studies and the randomized trials that eventually tested the same hypothesis. The five rows in the table below are not an exhaustive list — they are the cases that any working epidemiologist is now expected to know.
| Observational claim | RCT result |
|---|---|
| Hormone replacement therapy reduces cardiovascular risk in post-menopausal women (Nurses’ Health Study and others, 1980s–90s) | The Women’s Health Initiative RCT (Writing Group, 2002) found increased cardiovascular and breast-cancer risk; the trial was halted early |
| Beta-carotene supplementation reduces lung-cancer incidence | The CARET and ATBC trials found increased lung-cancer incidence in supplemented smokers; CARET was halted early |
| Vitamin E supplementation reduces cardiovascular events | Multiple RCTs (HOPE, GISSI-Prevenzione) found no benefit; some meta-analyses suggested slight harm at high doses |
| Many nutritional epidemiology findings (single nutrients linked to single diseases) | Ioannidis (2013, 2018) catalogued the field’s replication failures and argued that the typical effect sizes claimed were biologically implausible given measurement error in dietary recall instruments |
| Hundreds of early candidate-gene association studies linking specific SNPs to disease | The shift to well-powered genome-wide association studies (GWAS) showed that the great majority of pre-2007 candidate-gene findings did not replicate |
None of these were cases of fraud. They were the cumulative product of confounding, selective reporting, publication bias, and the structural pressure to publish positive findings — the same forces Ioannidis had described in 2005.
Once the field had a measurable replication problem, it had to decide what to do about it. The reform movement that followed has been remarkably structured: each milestone in the timeline below is paired with a specific failure it was designed to address.
The Reform Movement: A Timeline
Key Milestones
- 2005 — Ioannidis, “Why Most Published Research Findings Are False” (PLOS Medicine)
- 2007 — STROBE statement for reporting observational studies in epidemiology
- 2011 — Simmons, Nelson & Simonsohn, “False-Positive Psychology,” demonstrating that undisclosed analytic flexibility can produce significance > 60% of the time under the null
- 2013 — Center for Open Science founded (Brian Nosek); Registered Reports launched at Cortex by Chris Chambers
- 2015 — Reproducibility Project: Psychology published (Science); TOP Guidelines (Transparency and Openness Promotion) released; RECORD statement published as a STROBE extension for routinely-collected health data
- 2017–2018 — ICMJE data-sharing requirements for clinical trials; preregistration becomes a default expectation in psychology and an increasingly common one in epidemiology and clinical research
- 2017–2021 — Reproducibility Project: Cancer Biology reports published
- Ongoing — The EQUATOR Network (Enhancing the QUAlity and Transparency Of health Research) maintains the umbrella registry of reporting guidelines, including CONSORT (RCTs), STROBE (observational), PRISMA (systematic reviews), RECORD (routinely-collected data), STARD (diagnostic accuracy), and dozens more
The remainder of this section covers the four concrete reforms that constitute the field’s practical response: preregistration, data and code sharing, registered reports, and a frank discussion of the structural barriers that limit how far any of these can go.
Preregistration
Preregistration involves publicly specifying the research hypotheses, study design, and analysis plan before data are collected or analyzed. By creating a time-stamped, public record of the intended analysis, preregistration allows the scientific community to distinguish confirmatory (hypothesis-testing) analyses from exploratory (hypothesis-generating) analyses. The three tabs below cover where preregistration happens in practice, what it accomplishes, and — just as importantly — what it does not accomplish, so that you do not over-claim for it.
The Open Science Framework (OSF) is a free, open platform that supports preregistration for any type of study. Researchers can create detailed preregistration documents that are time-stamped and can be made public immediately or after an embargo period. For clinical trials, registration in a public registry such as ClinicalTrials.gov has been a condition of publication in ICMJE member journals since 2005, with registration required before enrollment of the first participant, and a legal requirement for many U.S. trials since the Food and Drug Administration Amendments Act (FDAAA) of 2007.
Despite these requirements, compliance remains imperfect. Studies have found that a substantial proportion of published trials either were not registered or were registered retrospectively (after data collection had begun).
Preregistration reduces selective reporting by creating a public benchmark against which the published analysis can be compared. It discourages p-hacking by committing the researcher to a specific analytic strategy. It also makes deviations from the original plan transparent—researchers can still conduct exploratory analyses, but these must be clearly labelled as such.
Evidence suggests that preregistered studies are more likely to report null or negative results, consistent with reduced publication bias and selective reporting.
Preregistration is a valuable tool but not a panacea. It does not prevent data fabrication, nor does it guarantee that the registered analysis plan is appropriate. Vague or poorly specified preregistrations offer limited protection against analytic flexibility. Additionally, preregistration is more straightforward for confirmatory research; for exploratory or qualitative research, the approach requires adaptation.
Preregistration disciplines the analysis you intend to do; data and code sharing makes the analysis you actually did inspectable by others. The two reforms work as a pair.
Data and Code Sharing
Making datasets and analytic code publicly available enables other researchers to reproduce the original findings, check for errors, and conduct novel analyses. This transparency is a cornerstone of open science and a key mechanism for scientific self-correction.
When shared datasets have been reanalyzed, errors have frequently been identified. In some cases, reanalysis has confirmed the original findings but revealed computational errors that did not change the conclusions. In other cases, reanalyses have found substantive errors that changed the interpretation entirely. For example, Herndon, Ash, and Pollin (2014) reanalyzed the influential Reinhart and Rogoff (2010) economics dataset after obtaining the original spreadsheet, discovering both a coding error and selective exclusion of available data that substantially altered the conclusions.
Sharing code and data is a commitment that has to be supported by everyday practice. Three concrete tools have become standard for making sharing actually reproducible; expand each item below to see what each one buys you.
Reproducible research practices involve documenting every step of the analytical process so that another researcher (or a future version of yourself) can reproduce the same results from the same data. This includes writing clean, well-commented code; using relative file paths; documenting software versions and dependencies; and avoiding manual data manipulation steps that cannot be replicated.
Version control systems such as Git track every change made to code and documents over time. Platforms like GitHub and GitLab provide repositories where entire analysis histories can be stored, shared, and reviewed. Version control creates a transparent audit trail, allows collaboration without conflicts, and makes it possible to reproduce any previous version of an analysis.
Computational notebooks (such as R Markdown, Jupyter Notebooks, and Quarto documents) combine narrative text, code, and results in a single document. When executed, the code runs from the data and produces the figures, tables, and statistics reported in the manuscript. This “literate programming” approach makes the connection between data, code, and results explicit and reproducible.
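The practices described above can be sketched in a few lines of R. This is a minimal illustration rather than a required workflow: the folder and file names are hypothetical, `tempdir()` stands in for a real project folder, and the small data frame is made up for the example. The block builds paths relative to a project root, replaces a manual cleaning step with a scripted one, and saves a record of the software environment next to the output.

```r
# Hypothetical project root; in a real project this would be the project folder
project_dir <- tempdir()
out_dir <- file.path(project_dir, "output")   # path built relative to the root
dir.create(out_dir, showWarnings = FALSE)

# Made-up raw data with one missing blood-pressure value
raw <- data.frame(id = 1:6, sbp = c(118, 132, NA, 128, 155, 120))

# Scripted exclusion rule (instead of deleting rows by hand in a spreadsheet)
clean <- raw[!is.na(raw$sbp), ]

# Save the cleaned data and the software versions used to produce it
clean_path <- file.path(out_dir, "clean.csv")
write.csv(clean, clean_path, row.names = FALSE)
session_path <- file.path(out_dir, "session_info.txt")
writeLines(capture.output(sessionInfo()), session_path)
```

Because every step is in code, committing this script to a Git repository records both the exclusion rule and the environment it ran in.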
Preregistration and code-sharing change what individual researchers do. Registered reports go further — they change what journals commit to.
Registered Reports
Registered reports represent a fundamental rethinking of the publication process. Under this model, peer review occurs before data collection. Researchers submit their introduction, methods, and analysis plan for review. If accepted at this stage, the journal provides an in-principle acceptance—a commitment to publish the final paper regardless of whether the results support the hypothesis.
How Registered Reports Reduce Publication Bias
Because the publication decision is made before results are known, registered reports eliminate the incentive to produce positive or significant findings. This directly addresses publication bias—the tendency for journals to preferentially publish positive results—and removes the motivation for p-hacking, HARKing (Hypothesizing After Results are Known), and outcome switching. Studies published as registered reports have been shown to report a substantially higher proportion of null results compared to standard publications, consistent with reduced bias.
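A small simulation can make the mechanism concrete. It is an illustrative sketch, not an analysis from any study cited here: the true effect (0.2 standard deviations), the sample size (30 per arm), and the rule that only significant positive results get "published" are all assumptions chosen for the example.

```r
set.seed(230)

n_studies   <- 2000   # assumed number of independent studies
n_per_arm   <- 30     # assumed participants per arm
true_effect <- 0.2    # assumed true difference in means (SD units)

# Simulate one two-arm study and return its estimate and p-value
one_study <- function() {
  treat <- rnorm(n_per_arm, mean = true_effect, sd = 1)
  ctrl  <- rnorm(n_per_arm, mean = 0, sd = 1)
  c(estimate = mean(treat) - mean(ctrl),
    p        = t.test(treat, ctrl)$p.value)
}

results <- t(replicate(n_studies, one_study()))   # one row per study

# Registered-report world: every study is published
all_mean <- mean(results[, "estimate"])

# Conventional world (assumed rule): only significant positive results appear
is_published   <- results[, "p"] < 0.05 & results[, "estimate"] > 0
published_mean <- mean(results[is_published, "estimate"])

c(true = true_effect, all_studies = all_mean, published_only = published_mean)
```

Under these assumptions the average across all studies stays close to the true effect, while the average of the filtered "published" studies sits well above it. Committing to publish regardless of outcome corresponds to reporting the unfiltered average.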
Structural Barriers to Open Science
The reforms above describe the open-science ideal. The next subsection is a deliberate counterweight: it is also true that open science is harder to live by than to praise, and that the difficulties fall unevenly across researchers and communities. The three cards below name the most consequential barriers; in public health specifically, none of them are minor.
The barriers above point to the same conclusion: no individual researcher, however virtuous, can fix the replication problem alone. Integrity has to be built into the system around them — which sets up the closing argument of this section.
Publication Integrity as Systemic Responsibility
Research integrity is not solely the responsibility of individual researchers. It is a systemic property that depends on the actions and incentives created by multiple actors in the research ecosystem. The table below maps the five stakeholder groups that share that responsibility. As you read it, notice that no single row is dispensable — if any of these groups defects, the others' contributions are weakened.
| Stakeholder | Role in Publication Integrity |
|---|---|
| Researchers | Honest data collection, transparent reporting, preregistration, data sharing, appropriate authorship practices |
| Journals | Rigorous peer review, open-data policies, registered reports, timely correction and retraction, conflict of interest disclosure |
| Funders | Requiring registration and data sharing, funding replication studies, supporting open-access publication, rewarding transparency over novelty |
| Institutions | Training in research ethics, promotion criteria that value rigor over publication count, robust misconduct investigation processes |
| Regulators | Enforcing registration requirements, mandating data availability, protecting whistleblowers |
Incentive Structures Matter
When researchers are evaluated primarily on publication count, journal impact factor, and grant income, the incentive structure rewards quantity and novelty over rigor and transparency. Reforming academic incentive structures—through initiatives like DORA (Declaration on Research Assessment)—is essential for sustainable improvements in publication integrity. Open science should be rewarded, not penalized, in hiring and promotion decisions.
Section 4 has surveyed reform from the inside of the open-science movement. The reflection below is the section's exit ticket — a chance to commit, in your own words, to a structural change you would prioritize. Section 5 then introduces two community-led ethical frameworks that were built outside that movement and address harms it does not fully address.
Two Frameworks, One Structural Insight
Ethical conduct of research is not only about avoiding lying. It is also about who controls the data, who benefits from the findings, and what discipline is brought to the analytic process before results are seen. The two frameworks introduced here — OCAP® and EGAP — both restructure the conditions under which research is done. Each emerged from a research community responding to specific, documented harms in their own field.
- OCAP® — the First Nations Principles of Ownership, Control, Access, and Possession, developed in response to a long history of extractive research on Indigenous peoples.
- EGAP — Evidence in Governance and Politics, a researcher-led network whose pre-analysis-plan registry, design-declaration tools, and methods guides made design transparency operational well before mainstream adoption.
OCAP®: Indigenous Data Sovereignty as a Research-Ethics Framework
The First Nations Principles of OCAP® were articulated in the late 1990s by what is now the First Nations Information Governance Centre (FNIGC) and were developed in direct response to centuries of research that treated Indigenous communities as objects of study rather than partners. The acronym is a registered trademark of the FNIGC, signalling that the principles themselves are owned by First Nations — they are not a checklist that an outside researcher can adopt unilaterally. The four cards below define each letter in turn; click through them and notice that the principles are not interchangeable — ownership is a question of right, possession is a question of physical custody, and the difference between the two is exactly where harms have historically entered.
OCAP® is not abstract. In Canada it is operationalized through the FNIGC’s OCAP® training and certification, through research agreements negotiated between communities and investigators, and through provisions in the Tri-Council Policy Statement on the Ethical Conduct of Research Involving Humans, particularly TCPS 2 Chapter 9 (“Research Involving the First Nations, Inuit and Métis Peoples of Canada”). Internationally, parallel frameworks include the global CARE Principles for Indigenous Data Governance (Collective benefit, Authority to control, Responsibility, Ethics) and the Māori Data Sovereignty Network principles in Aotearoa/New Zealand. These are not interchangeable with OCAP®, but they share its core insight.
OCAP® was not articulated in the abstract. It crystallized as a response to a series of documented harms in which biological samples and survey data collected with limited consent were used, decades later, for purposes the originating communities had never agreed to. Two of those harms are foundational North-American teaching cases.
The Harm OCAP® Responds To
In 1989, researchers at Arizona State University began collecting blood samples from members of the Havasupai Tribe, a small community living in the Grand Canyon experiencing high rates of type 2 diabetes. Members consented to research on diabetes. Over the following decade, however, the same samples were used — without further consent — for studies of schizophrenia, of consanguinity (“inbreeding”), and of population-genetic migration history. The migration findings were published as supporting the Bering Strait migration theory, directly contradicting the Havasupai’s own origin narratives.
Tribe members learned of the unauthorized uses largely by accident, when a graduate student presented the findings at a public lecture attended by a Havasupai member. The tribe sued. After about six years of litigation, the Arizona Board of Regents settled in 2010 for US$700,000, returned the remaining samples for ceremonial reburial, and accepted a list of remediation conditions.
The Havasupai case is now the foundational North American teaching case for ethical research with Indigenous communities. Its lesson is not that the diabetes research was wrong; the lesson is that consent obtained for one purpose does not transfer to other purposes, and that biological samples and the data derived from them carry community-level meaning that no individual consent form can fully address.
In the early 1980s, geneticist Richard Ward, then at the University of British Columbia, collected approximately 880 blood samples from members of the Nuu-chah-nulth Nation on the west coast of Vancouver Island. The stated purpose was research on the high prevalence of rheumatoid arthritis in the community. No useful arthritis genetics emerged.
When Ward moved to the University of Utah and later to Oxford, the samples moved with him — and were used over the next two decades by collaborators around the world for unrelated research, including studies of HIV evolution and ancient human migration patterns. None of these uses had been disclosed to or approved by the community. The Nuu-chah-nulth learned of the unauthorized research only when a researcher unconnected to Ward contacted them seeking consent for further studies of stored samples that the community had not known still existed.
The case became a central reference point in the development of TCPS 2 Chapter 9 and in the broader articulation of OCAP®. It illustrates the “possession” principle in concrete form: once samples leave a community’s physical control, the community’s ability to assert ownership is reduced to a legal and political negotiation with outsiders, often across decades and jurisdictions.
The harms above describe what OCAP® was built against. The next box describes what OCAP® looks like when it is taken seriously from the outset of a study, in a working public-health surveillance system that has run for nearly three decades.
OCAP® Applied: The First Nations Regional Health Survey
A Working Example, Not a Hypothetical
The First Nations Regional Health Survey (RHS), operated by FNIGC since 1997, is the first and only national health survey in Canada designed and managed entirely by First Nations. Communities decide which questions are asked, hold the data, approve every analysis before publication, and benefit directly from the findings through community-level reports. The RHS demonstrates that OCAP® is not a barrier to public-health research — it is a model for how public-health research can produce findings that the communities studied actually trust and use. Public-health datasets that violate OCAP® principles produce a familiar pattern: low community participation, distrust of subsequent researchers, and findings that are challenged or ignored even when methodologically sound.
OCAP® addresses who the data and findings belong to. EGAP, the second framework in this section, addresses how the analysis is committed to before any data are seen. The two come from very different research communities — First Nations governance bodies on one hand, field-experimental social scientists on the other — but they share the structural insight that ethical research is something a community builds, not something individual virtue alone can deliver.
EGAP: Designing Transparency into the Research Process
EGAP — Evidence in Governance and Politics — is a researcher-led network founded in 2009 around field experiments in political science and policy evaluation. It is mentioned here because it is one of the clearest examples of a research community building its own ethical infrastructure: a community-maintained pre-analysis-plan registry (later integrated with the OSF Registries), a series of Methods Guides codifying best practices, the DeclareDesign tooling for formal design transparency, and norms for data and code sharing well before these became journal-mandated.
EGAP’s relevance to public-health methodology is direct. Field experiments and policy evaluations in global health, behavioural intervention trials, and community-based participatory research face exactly the kind of analytic-flexibility risks (covered in Section 3) that EGAP was built to discipline. The same logic applies to large observational and natural-experiment studies in epidemiology. The three tabs below explain the tools EGAP built, what a pre-analysis plan in this tradition actually contains, and how EGAP and OCAP® relate to one another in practice.
EGAP’s most influential outputs are not policies but shared infrastructure. The Methods Guides, written by senior researchers and openly licensed, codified concrete practices for randomization, attrition handling, spillover detection, and ethical design. The PAP registry, hosted at EGAP and later integrated with the OSF Registries, became the field-standard place to time-stamp an analysis plan before fielding a study. The DeclareDesign R package allows researchers to formally declare a study’s data-generating process, sampling, and analysis as code, and then simulate the design’s properties before any data are collected.
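The idea of simulating a design before fielding it can be sketched in base R. The block below does not use the DeclareDesign package or its API; `simulate_design()` is a hypothetical helper written for this example, and the effect size (0.3 standard deviations) and sample sizes are assumptions. It declares a data-generating process, an assignment rule, and an estimator, then estimates the design's statistical power by repeated simulation.

```r
set.seed(230)

# Hypothetical helper: estimate the power of a two-arm design by simulation
simulate_design <- function(n, effect, sims = 500) {
  p_values <- replicate(sims, {
    z <- sample(rep(c(0, 1), length.out = n))  # complete random assignment
    y <- effect * z + rnorm(n)                 # assumed data-generating process
    fit <- summary(lm(y ~ z))                  # planned estimator: OLS
    fit$coefficients["z", "Pr(>|t|)"]
  })
  mean(p_values < 0.05)                        # share of significant runs
}

power_small <- simulate_design(n = 50,  effect = 0.3)
power_large <- simulate_design(n = 400, effect = 0.3)

c(n_50 = power_small, n_400 = power_large)
```

Running a check like this before data collection shows whether a planned sample size gives the design a realistic chance of detecting the effect it was built to find.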
A Pre-Analysis Plan (PAP) in the EGAP tradition is more than a registration of hypotheses. It is a written, time-stamped commitment to a specific operationalization of every analytic decision: which estimator, which standard errors, which covariate adjustments, which subgroup analyses, what counts as a non-finding, and what the analysis will not do. By committing in advance, the researcher converts what would otherwise be undisclosed analytic flexibility (the “garden of forking paths” from Section 1) into either a planned, confirmatory analysis or an honestly labelled exploratory one. The PAP does not eliminate the right to explore data — it forces the exploration to be visible.
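A short simulation shows what undisclosed flexibility costs. This is a sketch under stated assumptions, not a reproduction of any published analysis: there is no true effect on any outcome, each study measures ten independent outcomes, and the "flexible" analyst reports whichever outcome gives the smallest p-value.

```r
set.seed(230)

n_sims     <- 2000   # assumed number of simulated studies
n_per_arm  <- 40     # assumed participants per arm
n_outcomes <- 10     # assumed number of outcomes measured per study

# One study in which the null hypothesis is true for every outcome
one_null_study <- function() {
  p <- replicate(n_outcomes,
                 t.test(rnorm(n_per_arm), rnorm(n_per_arm))$p.value)
  c(prespecified = p[1],     # outcome committed to in advance
    best_of_all  = min(p))   # outcome chosen after seeing the results
}

p_matrix <- replicate(n_sims, one_null_study())   # 2 rows, n_sims columns

# False-positive rate under each reporting strategy
false_positive_rate <- rowMeans(p_matrix < 0.05)
false_positive_rate
```

With these assumptions the prespecified analysis should produce false positives at roughly the nominal 5% rate, while picking the best of ten outcomes should push the rate toward 1 - 0.95^10, or about 40%. A PAP does not forbid looking at all ten outcomes; it makes clear which one was the confirmatory test.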
The two frameworks are complementary rather than competing. Both arose from research communities responding to specific harms; both are community-led rather than regulator-imposed; both restructure the conditions of research rather than merely policing the outputs. They differ in what they centre: OCAP® centres collective community sovereignty over data and findings; EGAP centres methodological transparency at the design stage. A study with Indigenous communities that pre-registers its analysis but ignores OCAP® is not ethical; a study that respects OCAP® but allows undisclosed analytic flexibility is not rigorous. Public-health research with community partners typically needs both.
As with OCAP®, EGAP-style transparency was articulated in response to specific harms. The two cases below illustrate complementary failure modes: in the first, fabrication that survived peer review and was caught only because the data were partially shared; in the second, decades of analytic flexibility passed off as discovery, exposed by the same statistical-forensic tools you met in Section 3.
The Harm EGAP-style Frameworks Respond To
In December 2014, Science published a paper by Michael LaCour (UCLA) and Donald Green (Columbia) reporting that brief, in-person canvassing conversations by gay canvassers durably changed voters’ attitudes toward same-sex marriage. The finding was politically and scientifically high-profile, widely covered in the press, and quickly cited in subsequent advocacy.
Within months, two graduate students — David Broockman (then Berkeley) and Joshua Kalla (then Berkeley) — tried to design a follow-up study using LaCour’s published methods. When they asked the survey firm LaCour had named for cost estimates, the firm replied that the protocol he described was not one they had run. Broockman and Kalla then re-examined the publicly available data and found statistical impossibilities: response distributions that were essentially identical to those of the Cooperative Campaign Analysis Project (CCAP) reference survey, suggesting the “data” had been simulated from CCAP rather than collected. They published a 26-page report. Donald Green requested retraction of the paper; Science retracted it in May 2015. LaCour’s job offer at Princeton was withdrawn.
The decisive feature of this case is that LaCour partially complied with transparency norms — he published his survey instrument, named his vendor, and posted his data. That partial transparency is what gave Broockman and Kalla the trail to follow. A study with no transparency at all would have been impossible to expose without inside-source journalism. The case is a working argument for EGAP-style design transparency: it is what allows fraud to be detected by the research community itself, not by chance.
Brian Wansink, a high-profile food psychologist at Cornell, ran a popular research program on the behavioural drivers of eating. In November 2016 he published a now-infamous blog post praising a graduate student for repeatedly reanalyzing a null dataset until “significant” results emerged, a textbook description of p-hacking. Outside researchers, including Tim van der Zee, Nick Brown, and Jordan Anaya, applied statistical forensic tools (including the GRIM test, or Granularity-Related Inconsistency of Means, and the related GRIMMER check for standard deviations, introduced in Section 3) to dozens of Wansink’s papers and identified statistical impossibilities, duplicated data, and inconsistencies between text and tables.
Cornell investigated; Wansink resigned in 2019. Eighteen of his papers were retracted and many more corrected. No fabrication of raw data has been alleged in most cases; the misconduct here is the structured analytic flexibility that EGAP-style pre-analysis plans are designed to make visible. A pre-registered analysis plan does not eliminate the temptation to keep trying alternative analyses; it eliminates the option of doing so silently.
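The logic of the GRIM test used in this case can be written in a few lines of R. The function below is a minimal sketch, not the published implementation: `grim_consistent()` is a hypothetical name, and it assumes the underlying responses are whole numbers (for example, Likert items or counts), so the true mean must equal an integer total divided by the sample size.

```r
# Hypothetical helper: can a reported mean arise from n integer-valued responses?
# Only informative when n is smaller than 10^digits.
grim_consistent <- function(reported_mean, n, digits = 2) {
  target     <- reported_mean * n                 # implied sum of responses
  candidates <- c(floor(target), ceiling(target)) # nearest achievable integer sums
  # Consistent if some achievable mean rounds to the reported value
  half_unit <- 0.5 * 10^(-digits)
  any(abs(candidates / n - reported_mean) <= half_unit + 1e-9)
}

grim_consistent(3.48, n = 25)   # TRUE:  87 / 25 = 3.48
grim_consistent(3.47, n = 25)   # FALSE: nearest achievable means are 3.44 and 3.48
grim_consistent(5.19, n = 28)   # FALSE: 145 / 28 = 5.18 and 146 / 28 = 5.21
```

A single inconsistent mean can be a typo; the forensic signal in cases like this one comes from many such inconsistencies across tables and papers.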
Why Both Matter for Public Health
The case studies show what the two frameworks were built to prevent. The table below puts them side by side so the contrast — and the complementarity — is unmissable.
| Feature | OCAP® | EGAP |
|---|---|---|
| Originating community | First Nations of Canada (FNIGC) | Field-experimental researchers in governance and political science |
| Year articulated | Late 1990s | Founded 2009 |
| Centres | Collective community sovereignty over data and findings | Pre-analytic methodological transparency |
| Concrete tools | OCAP® certification; community research agreements; FNIGC Regional Health Survey | PAP registry; Methods Guides; DeclareDesign software |
| Status | Binding when communities require it; supported by TCPS 2 Chapter 9 | Voluntary; community-enforced through citation and review norms |
| Foundational case | Havasupai (1989–2010); Nuu-chah-nulth (1980s–2000s) | LaCour & Green (2014–15); Wansink (2017–19) |
| Failure mode it addresses | Extractive, non-consented secondary use of community data | Undisclosed analytic flexibility, p-hacking, HARKing |
A Generalizable Lesson
The four sections that preceded this one made the case that misconduct, manufactured doubt, analytical error, and replication failure are best understood as structural problems — they are not solved by exhortations to individual virtue. OCAP® and EGAP make the same point in a positive form: ethical conduct is something a research community builds, through governance arrangements, time-stamped commitments, registries, and shared tools. The technical material in the rest of HSCI 230 is necessary but not sufficient. A study can be methodologically rigorous and still ethically fraught if it was conducted without community consent, without pre-registered analytic discipline, or without commitment to transparency.
The takeaways below distill the section into six claims you should be able to state in your own words. The reflection that follows is the most demanding in the lesson — it asks you to put OCAP® and EGAP to work together on a public-health study you might actually run — and the knowledge check then tests the conceptual material before Section 6 pulls everything together.
Key Takeaways
- OCAP® (Ownership, Control, Access, Possession) is a First Nations framework articulated by the FNIGC in response to documented harms from extractive research, foundationally illustrated by the Havasupai and Nuu-chah-nulth cases
- OCAP® is operationalized through community research agreements, the FNIGC’s Regional Health Survey, OCAP® certification, and TCPS 2 Chapter 9; parallel global frameworks include the CARE Principles and the Māori Data Sovereignty Network principles
- EGAP (Evidence in Governance and Politics) is a researcher-led network whose pre-analysis-plan registry, Methods Guides, and DeclareDesign tooling provided early, concrete infrastructure for design transparency
- The LaCour & Green and Wansink cases show, respectively, how partial transparency enabled fraud detection by the research community and how analytic flexibility — absent EGAP-style commitments — produces a stream of unreplicable findings
- The two frameworks are complementary: a study that respects OCAP® but allows undisclosed analytic flexibility is not rigorous; a study with a pre-registered analysis plan that ignores community sovereignty is not ethical
- Both frameworks share the structural insight that ethical research is built — through governance, registries, and shared tools — not merely policed
Section Reflection
Take a few minutes to reflect on each of the prompts below — you may write a single integrated response that addresses them together.
- Reflection: Consider a scenario where a major observational study in a top-tier journal reports a dramatic protective effect of a commonly used medication against a serious disease. What specific red flags would you look for to assess whether the finding might be affected by immortal time bias, data integrity issues, or selective reporting? How would you investigate further?
- Reflection: Think about the research ecosystem in which epidemiological studies are produced. If you could implement one structural change to improve publication integrity and replicability across the field, what would it be and why? Consider the roles of researchers, journals, funders, and institutions, and the lessons of the replication crisis, in your answer.
- Reflection: Imagine you are designing a public-health study on a sensitive health outcome (mental health, substance use, infectious-disease exposure, or chronic disease) in a community that has historically been over-researched and under-served. Drawing on OCAP® and EGAP, describe two concrete decisions you would make at the design stage — one that addresses community sovereignty, and one that addresses analytic transparency. Be specific about why these decisions matter and which case study from this section informs your reasoning.
1. Which of the following best distinguishes falsification from fabrication?
2. What was a key finding of investigations into Wakefield’s 1998 Lancet paper?
3. How was Yoshitaka Fujii’s extensive data fabrication initially detected?
1. According to Oreskes and Conway’s Merchants of Doubt, what was the central goal of the “tobacco strategy”?
2. Why is epidemiology specifically the methodological surface targeted by manufactured-doubt campaigns?
3. Which feature is a hallmark of a manufactured-doubt campaign as opposed to legitimate scientific disagreement?
1. What was a central failure exposed by the Surgisphere retractions?
2. Immortal time bias in pharmacoepidemiological studies typically produces which effect?
3. Which statistical forensic method checks whether reported means are mathematically possible given the sample size?
1. What was the headline finding of the Open Science Collaboration’s Reproducibility Project: Psychology (Nosek et al., 2015)?
2. Which observational–RCT discordance is a classic example used to illustrate the replication problem in epidemiology?
3. Which of the following correctly pairs a reform with what it is intended to address?
4. What is the EQUATOR Network?
5. Why is publication integrity described as a systemic rather than individual responsibility?
1. The OCAP® principles articulated by the First Nations Information Governance Centre stand for:
2. What is the central ethical lesson of the Havasupai Tribe case for public-health research?
3. How does an EGAP-style Pre-Analysis Plan reduce the risk of p-hacking and selective reporting?
4. The LaCour & Green retraction (2015) is often cited as evidence that:
5. Which statement most accurately describes the relationship between OCAP® and EGAP?
Lesson 1: Foundations — Final Assessment
⏱ Estimated time: 20 minutes
Bringing It All Together
This lesson has taken you through the full arc of epidemiology's history — from Hippocrates' environmental theories to the colonial laboratories of empire, from the optimism of global health gains to the persistent disparities produced by structural racism. As you complete this final assessment, draw on all three sections.
The list below distills the six ideas the rest of the course will keep coming back to. Read them as a checklist: if any feel unfamiliar, jump back into the relevant section before you take the assessment, since later lessons will assume each of them as common ground.
Key Takeaways from Lesson 1
- Epidemiology is the study of disease patterns, causes, and effects in populations — it emerged over centuries through the convergence of science, technology, and governance.
- Progress in public health has been driven by networks of actors and institutions, not by individual geniuses alone.
- Global health has improved dramatically by many measures — child mortality, life expectancy, poverty — and recognizing this progress is essential for sustaining it.
- The same institutions that produced health improvements also produced exploitation: colonialism, slavery, and war were central to the development of epidemiological methods.
- Historical injustices — from the Tuskegee Study to nutritional experiments on Indigenous children — have lasting consequences for health, trust, and equity.
- A critical epidemiology asks not only what causes disease but who benefits from knowledge, who is harmed by research, and whose priorities shape the agenda.
The final reflection below asks you to translate those takeaways into a personal stance. There is no single right answer; the goal is to leave the lesson with an articulated view of your own, because the questions that follow in HSCI 230, 341, and 410 will keep pushing on it.
The companion R script r-activities/HSCI_230_Lesson_1_Foundations_of_Epidemiology.R walks through three short blocks that mirror the lesson's three sections: (A) reproducing Graunt's 1662 London life table to see how the very first epidemiologists turned counts into population-level probabilities; (B) producing a quick descriptive summary of systolic blood pressure by smoking status to feel where pure quantitative summaries stop and other ways of knowing begin; and (C) using set.seed() plus a bootstrap CI to demonstrate the simplest form of reproducible analysis.
```r
# PART A -- Graunt's 1662 life table for London
# Of 100 people born, how many survive to each age?
age       <- c(0, 6, 16, 26, 36, 46, 56, 66, 76, 80)
survivors <- c(100, 64, 40, 25, 16, 10, 6, 3, 1, 0)
graunt <- data.frame(age = age, survivors = survivors)
graunt$survival_prob <- graunt$survivors / 100
print(graunt)
plot(graunt$age, graunt$survival_prob,
     type = "b", pch = 19,
     xlab = "Age (years)",
     ylab = "Probability of surviving from birth",
     main = "Graunt's 1662 Life Table for London")

# PART B -- ways of knowing: a quantitative summary of SBP by smoking
sbp    <- c(118, 132, 145, 128, 155, 120, 142, 138, 125, 160)
smoker <- c("no", "yes", "yes", "no", "yes", "no", "yes", "no", "no", "yes")
mean(sbp)                  # average
median(sbp)                # middle value
sd(sbp)                    # standard deviation
range(sbp)                 # minimum and maximum
tapply(sbp, smoker, mean)  # mean SBP by smoking status

# PART C -- reproducibility: set.seed() + a bootstrap 95% CI for the mean
set.seed(230)                                # lock in random draws
x <- rnorm(100, mean = 10, sd = 2)           # Normal(10, 2)
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))
quantile(boot_means, c(0.025, 0.975))        # 95% bootstrap CI
sessionInfo()                                # record package versions
```
Final Reflection
Epidemiology has been both a tool for improving population health and a tool for managing and exploiting populations. As a student entering this field, how do you think epidemiologists should navigate this tension? What responsibilities do researchers have to the communities they study? How might the history you have learned today shape how you evaluate epidemiological research going forward?
1. Hippocrates introduced the terms "epidemic" and "endemic" in his treatise:
2. John Graunt's 1662 contribution to epidemiology was significant because he:
3. The SIR model, developed by Kermack and McKendrick in 1927, is used to:
4. Semmelweis's handwashing intervention was initially rejected by the medical establishment primarily because:
5. The Framingham Heart Study, begun in 1948, is best described as a:
6. Actor-Network Theory (ANT) suggests that scientific advances are best understood as:
7. Hans Rosling's Factfulness argues that recognizing global health progress is important because:
8. Foucault's concept of "biopolitics" describes:
9. In Maladies of Empire, Jim Downs argues that epidemiological investigations in colonial settings:
10. The Tuskegee Syphilis Study involved:
11. Ian Mosby's research on nutritional experiments in Canadian residential schools revealed that:
12. The connection between the Black Belt's geology and contemporary health disparities illustrates:
13. The Belmont Report (1979) established which three core principles of research ethics?
14. From Foucault's perspective, epidemiological surveillance is an instrument of biopower because it:
15. A critical approach to the history of epidemiology suggests that researchers today should: