HSCI 230 — Lesson 13

Ecological and Group-Level Studies

Evaluating Epidemiological Research

Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University

Learning objectives for this lesson:

  • List the 3 major categories of variable used in ecologic models and describe their attributes
  • Describe the constructs of a linear model at the individual and group levels and constraints on estimating incidence rate ratios at the group level
  • Describe how within-group misclassification, group-level confounding, and group-level interaction can affect causal inferences
  • Describe the basis of the ecologic and atomistic fallacies
  • Identify scenarios where ecologic studies are less likely to produce cross-level inferential errors
  • Describe how to integrate individual-level studies with ecologic studies to prevent cross-level inferential errors

Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Section 1 of 5

Introduction & Rationale for Group-Level Studies

⏱ Estimated reading time: 45 minutes

14.1 What Are Ecologic Studies?

In an ecologic study, exposure, outcome, and confounders are all measured at the group level (e.g., townships, counties, nations), but the researcher wants to make inferences about individuals. The groups serve as cluster samples of the population.

Ecologic studies can be exploratory (no direct exposure measurement, looking for associations to guide future research) or analytic (exposure factor is measured and included in the analysis). Some studies are partial ecologic—combining some individual-level variables with group-level variables, which introduces unique inferential challenges.

Key Limitation

The primary limitation of ecologic studies is that we do not know the joint distribution of risk factors and disease within groups. This ignorance about within-group associations creates the potential for severe bias when inferring to the individual level.

14.1.1 Examples of Ecologic Studies

Example 29.1: Arsenic in Idaho Groundwater

County-level data on cancer incidence and arsenic levels in groundwater were examined. After adjusting for confounders, no significant relationship was found between county-level arsenic exposure and cancer incidence. This illustrates how group-level analyses may fail to detect individual-level associations.

Example 29.2: Bladder Cancer Across US States

Bladder cancer mortality rates across US states were examined in relation to state-level predictors: smoking prevalence, health insurance coverage, UV index, and water supply type. The ecological analysis identified associations that may or may not reflect individual-level causal mechanisms.

14.1.2 Rationale for Ecologic Studies

Despite their limitations, ecologic studies are sometimes the only practical approach:

Measurement Constraints at Individual Level

Individual-level measurement of some exposures is impractical or impossible. For example, measuring historical pollution levels or dietary intake for an entire population is expensive. Group-level aggregates (e.g., county-level average pollutant concentration, regional disease prevalence) can serve as proxies.

Exposure Homogeneity Within Groups

In some situations, exposure is relatively homogeneous within groups. For instance, all residents of a region receive water from the same supply, all schoolchildren in a district receive the same curriculum-based intervention, or all patients in a clinic receive the same standard of care.

Interest in Group-Level Effects

Sometimes the research question is fundamentally about group-level phenomena: Do communities with water fluoridation have lower dental caries rates? Do nations with higher vaccination coverage have lower measles incidence? The group itself is the unit of scientific interest.

Simplicity of Analysis

Ecologic analysis is often simpler and faster than acquiring and analyzing individual-level data across many groups. However, this simplicity may hide serious methodological problems and inferential errors.

Reflection

Think of a public health issue in your community. How might you design an ecologic study to examine it? What would be your unit of analysis (e.g., neighbourhood, city, province)? What group-level variables would you measure?

Knowledge Check: Section 1

1. What distinguishes an ecologic study from other observational study designs?

In ecologic studies, the unit of analysis is the group (e.g., county, nation), not the individual. Measurements are aggregated or summarized at the group level.

2. What is a "partial ecologic study"?

Partial ecologic studies mix individual- and group-level measurements, which introduces unique inferential challenges.

3. Which of the following is NOT a rationale for conducting ecologic studies?

Ecologic studies cannot establish individual-level causal mechanisms because the data are aggregated. They are useful when individual-level measurement is impractical or when group-level effects are of interest.

Section 2 of 5

Types of Ecologic Variables & The Linear Model

⏱ Estimated reading time: 45 minutes

14.2 Categories of Ecologic Variables

Three major categories of variables can be used in ecologic models, each with different attributes and interpretations:

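  • Aggregate variables: summaries of individual-level observations within each group (e.g., proportion exposed, mean BMI, disease rate).
  • Environmental variables: physical characteristics of the place where group members live or work (e.g., air pollution level). They have an analogue at the individual level, but individual exposure usually varies within the group.
  • Global variables: characteristics of the group itself that have no analogue at the individual level (e.g., population density, laws, organizational policies).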

14.2.1 The Linear Model in Ecologic Studies

Ecologic studies often use linear regression to model the relationship between group-level exposure and group-level outcome:

Ecologic Linear Model
Yj = β0 + β1X1j + β2X2j + εj

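Where Yj is the outcome rate in group j, X1j is the proportion of group j that is exposed, X2j is a group-level confounder, and εj is the error term. In the exposure-only model, β0 is the predicted rate in a group with no exposed members (X1 = 0) and β0 + β1 is the predicted rate in a fully exposed group (X1 = 1), so β1 estimates the group-level incidence rate difference (IDG). The group-level incidence rate ratio (IRG) is estimated as: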

Group-Level Incidence Rate Ratio (Eq. 29.1)
IRG = (β0 + β1) / β0 = 1 + β1/β0

A major limitation of this approach is that IRG requires extrapolation to groups with 0% and 100% exposure, which may extend far beyond the range of observed data. Additionally, different group sizes may require weighted regression for valid inference.
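As a minimal sketch of the calculation above, the following Python fits the exposure-only ecologic model by weighted least squares and computes IRG = 1 + β1/β0. The five groups, their rates, and their sizes are hypothetical, and weighting by group size is an assumption made here for illustration rather than a rule stated in the lesson.

```python
def weighted_ols(x, y, w):
    """Fit y = b0 + b1*x by weighted least squares; returns (b0, b1)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical group-level data (illustration only).
exposed_prop = [0.10, 0.20, 0.30, 0.40, 0.50]   # proportion exposed per group
rate = [0.012, 0.015, 0.015, 0.019, 0.020]      # outcome rate per group
group_size = [500, 1200, 800, 2000, 1500]       # assumed weights

b0, b1 = weighted_ols(exposed_prop, rate, group_size)
id_g = b1            # group-level incidence rate difference
ir_g = 1 + b1 / b0   # group-level incidence rate ratio

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, IDG = {id_g:.4f}, IRG = {ir_g:.2f}")
```

Observed exposure in these groups runs only from 10% to 50%, so both β0 (0% exposed) and β0 + β1 (100% exposed) are predictions outside the data, which is the extrapolation problem described above.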

14.2.2 Modelling Issues

Several issues arise when modelling ecologic data:

  • Correlation vs. regression: About 33% of ecologic studies use correlation coefficients instead of regression coefficients. Regression coefficients estimate the incidence rate difference, which correlation does not provide directly.
  • Standardized outcomes: Some studies use standardized mortality ratios (SMRs) rather than crude rates, which may introduce additional complexity.
  • Interaction terms: The form of interaction at the group level may differ from that at the individual level when a linear model is used at the group level and a logit model at the individual level.
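To illustrate the first point, here is a small sketch with two hypothetical sets of group rates that share the same regression slope (the same incidence rate difference) but have very different correlation coefficients. All numbers are invented for illustration; the scatter values were chosen so that they sum to zero and are uncorrelated with exposure, which leaves the slope unchanged.

```python
from math import sqrt

def slope_and_r(x, y):
    """Return the least-squares slope and the Pearson correlation of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sxx, sxy / sqrt(sxx * syy)

exposed_prop = [0.1, 0.2, 0.3, 0.4, 0.5]

# Hypothetical rates lying exactly on the line 0.01 + 0.02 * exposure.
rates_tight = [0.01 + 0.02 * p for p in exposed_prop]

# Same line plus scatter that is orthogonal to exposure (assumed values).
scatter = [0.003, -0.006, 0.006, -0.006, 0.003]
rates_noisy = [r + e for r, e in zip(rates_tight, scatter)]

slope_tight, r_tight = slope_and_r(exposed_prop, rates_tight)
slope_noisy, r_noisy = slope_and_r(exposed_prop, rates_noisy)

print(f"tight: slope = {slope_tight:.3f}, r = {r_tight:.2f}")
print(f"noisy: slope = {slope_noisy:.3f}, r = {r_noisy:.2f}")
```

Both data sets imply the same rate difference per unit of exposure, yet the correlation coefficient alone would suggest a much weaker relationship in the second one.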

Reflection

Consider the three types of ecologic variables. For a study on the relationship between income inequality and mental health outcomes across Canadian provinces, classify each: (a) provincial median income, (b) provincial mental health policy score, (c) average winter temperature.

Knowledge Check: Section 2

1. Which type of ecologic variable has NO analogue at the individual level?

Global variables (e.g., population density, laws, organizational policies) are characteristics of the group that cannot be meaningfully measured for an individual.

2. In an ecologic linear regression model Yj = β0 + β1X1j + εj, what does the group-level incidence rate ratio IRG estimate?

IRG = 1 + β1/β0, representing the ratio of the predicted rate in a fully exposed group to the rate in a fully unexposed group. This requires extrapolation beyond observed data ranges.

3. Why is using correlation coefficients rather than regression coefficients problematic in ecologic studies?

Regression coefficients provide an estimate of the incidence rate difference (IDG), which can be used to estimate the incidence rate ratio. Correlation coefficients do not provide this directly and are considered less informative.

Section 3 of 5

Inferential Errors & Sources of Ecologic Bias

⏱ Estimated reading time: 45 minutes

14.3 The Ecologic Fallacy

The ecologic fallacy is the error of assuming that a group-level association applies to individuals. A finding at the group level (e.g., exposure associated with a threefold increase in disease risk) does not necessarily hold for individuals. Robinson (1950) famously demonstrated this problem.

The group-level bias typically exaggerates the association away from the null, but can occasionally reverse the direction of association.

14.3.1 The Atomistic Fallacy

The atomistic fallacy is the opposite error: assuming individual-level findings apply at the group level. Populations have emergent properties not found in individuals. A classic example is herd immunity—a population-level phenomenon with no individual-level counterpart.

Key Distinction

The ecologic fallacy occurs when group-level findings are incorrectly applied to individuals. The atomistic fallacy occurs when individual-level findings are incorrectly applied to groups.

14.4 Three Sources of Ecologic Bias

14.4.1 Within-Group Misclassification (Bias)

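Non-differential exposure misclassification at the individual level biases group-level estimates AWAY from the null. This is the opposite of its usual effect in individual-level studies, where it biases toward the null. The resulting group-level incidence rate ratio is given by: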

Effect of Misclassification on IRG (Eq. 29.3)
IRG = 1 + (IR – 1) / (Se + Sp*IR – IR)

Where IR is the true individual-level incidence rate ratio, Se is sensitivity, and Sp is specificity. The example of a school CRD study (Example 29.4) demonstrated how misclassification at the individual level inflates group-level estimates.
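The formula can be evaluated directly. The sketch below codes Eq. 29.3 as written above and applies it to a few hypothetical combinations of sensitivity and specificity; the function name and the input values are illustrative assumptions, not figures from Example 29.4.

```python
def misclassified_irg(ir, se, sp):
    """Group-level IR implied by Eq. 29.3 for a true individual-level IR,
    exposure sensitivity se, and exposure specificity sp."""
    denom = se + sp * ir - ir
    if denom <= 0:
        # The predicted rate at 0% observed exposure is no longer positive,
        # so the ratio is undefined; this sketch simply refuses to compute it.
        raise ValueError("Se + Sp*IR - IR must be positive")
    return 1 + (ir - 1) / denom

true_ir = 2.0  # hypothetical true individual-level incidence rate ratio

perfect = misclassified_irg(true_ir, se=1.0, sp=1.0)
moderate = misclassified_irg(true_ir, se=0.9, sp=0.9)
poor = misclassified_irg(true_ir, se=0.8, sp=0.8)

for label, value in [("Se=Sp=1.0", perfect), ("Se=Sp=0.9", moderate), ("Se=Sp=0.8", poor)]:
    print(f"{label}: IRG = {value:.2f}")
```

With perfect classification the formula returns the true IR of 2; with Se = Sp = 0.9 it gives about 2.43, and with Se = Sp = 0.8 about 3.5, so the group-level estimate moves further from the null as measurement gets worse.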

14.4.2 Group-Level Confounding

Group-level confounding arises from differential distribution of individual-level risk factors across groups. Critically, even factors that are NOT confounders at the individual level can cause confounding at the group level.

Controlling for extraneous risk factors in ecologic analysis generally only removes part of the bias. Example 29.5 showed confounding that produces biased IRG even when there is no confounding at the individual level.
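A minimal numerical sketch of this mechanism follows. It assumes hypothetical groups in which exposure and a second risk factor are independent within every group (so there is no within-group confounding), with additive effects on the rate. Because the second factor happens to be more common in the more heavily exposed groups, the crude ecologic estimate is inflated. The rates and proportions are invented and are not those of Example 29.5.

```python
def ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical individual-level rates with additive effects (assumed).
BASE_RATE = 0.01         # unexposed, risk factor absent
EXPOSURE_EFFECT = 0.01   # rate difference due to exposure, so true IR = 2
COVARIATE_EFFECT = 0.02  # rate difference due to the other risk factor

exposed_prop = [0.1, 0.2, 0.3, 0.4, 0.5]     # proportion exposed per group
covariate_prop = [0.1, 0.3, 0.2, 0.5, 0.4]   # proportion with the risk factor

# Group rate under additive effects.
group_rate = [
    BASE_RATE + EXPOSURE_EFFECT * p + COVARIATE_EFFECT * q
    for p, q in zip(exposed_prop, covariate_prop)
]

true_ir = (BASE_RATE + EXPOSURE_EFFECT) / BASE_RATE
b0, b1 = ols(exposed_prop, group_rate)
crude_irg = 1 + b1 / b0

print(f"true individual-level IR = {true_ir:.2f}")
print(f"crude ecologic IRG       = {crude_irg:.2f}")
```

In this toy setting the crude IRG is well above the true IR of 2, purely because the other risk factor is distributed unevenly across groups.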

14.4.3 Effect Modification (Interaction) by Group

When the rate difference at the individual level varies across groups, non-linearity is introduced: the linear model at group level assumes additivity, but the logit model at individual level is inherently non-linear.

Example 29.6 is striking: effect modification by group completely reversed the direction of association. The true individual-level IR was 5.0, but the ecologic IRG was 0.67, making a harmful exposure appear protective at the group level.
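The same kind of reversal can be reproduced with a small hypothetical data set. In the sketch below the exposure is harmful within every group, but its rate difference is largest in the groups where exposure is least common, so the ecologic regression slopes downward. The numbers are invented for illustration and are not those of Example 29.6.

```python
def ols(x, y):
    """Fit y = b0 + b1*x by ordinary least squares; returns (b0, b1)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Hypothetical groups: (proportion exposed, rate in unexposed, rate in exposed).
groups = [
    (0.2, 0.01, 0.09),
    (0.4, 0.01, 0.04),
    (0.6, 0.01, 0.02),
]

# Within-group incidence rate ratios: all greater than 1 (exposure is harmful).
within_ir = [i1 / i0 for _, i0, i1 in groups]

# Overall rate in each group is the exposure-weighted mix of the two rates.
group_rate = [(1 - p) * i0 + p * i1 for p, i0, i1 in groups]

b0, b1 = ols([p for p, _, _ in groups], group_rate)
irg = 1 + b1 / b0

print("within-group IRs:", [round(ir, 1) for ir in within_ir])
print(f"ecologic IRG = {irg:.2f}")
```

Every within-group IR exceeds 1, yet the ecologic IRG falls below 1, so the exposure would look protective if the group-level result were read as an individual-level effect.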

14.4.4 When Cross-Level Bias Is Less Likely

Conditions Minimizing Ecologic Bias

Cross-level (ecologic) bias will NOT occur if:

  • The incidence rate difference within groups is uniform across groups, AND
  • There is no correlation between group-level exposure and the rate of the outcome in the unexposed

Ecologic bias is LESS likely when:

  • There is a large observed range of exposure across groups
  • There is small within-group variance of exposure (homogeneous groups)
  • Exposure is a strong risk factor varying in prevalence across groups
  • Distribution of extraneous risk factors is similar among groups (little group-level confounding)
  • Positive and negative health controls are included to strengthen the ecologic evidence

Reflection

A researcher finds that countries with higher per-capita chocolate consumption have more Nobel Prize winners. They conclude that eating chocolate makes individuals smarter. Identify the inferential error being made and explain why this conclusion is problematic. What confounders might explain the group-level association?

Knowledge Check: Section 3

1. What is the ecologic fallacy?

The ecologic fallacy occurs when associations observed at the group level are incorrectly assumed to hold at the individual level. Robinson (1950) famously demonstrated this error.

2. How does non-differential exposure misclassification at the individual level affect ecologic study estimates?

Unlike individual-level studies where non-differential misclassification biases toward the null, in ecologic studies it biases the group-level IR and ID away from the null—the opposite direction.

3. In Example 29.6, effect modification by group caused the ecologic IRG to be 0.67 when the true individual-level IR was 5.0. What does this demonstrate?

This is a striking example of how ecologic bias from effect modification can not only distort the magnitude but completely reverse the direction of an association, making a harmful exposure appear protective at the group level.

Section 4 of 5

Reducing Bias & Non-Ecologic Group Studies

⏱ Estimated reading time: 45 minutes

14.5 Minimizing Ecologic Bias

Ecologic bias is less of a problem when certain conditions are met (see Section 3 summary). Additionally, researchers can employ specific analytical strategies:

14.5.1 Analysing Ecologic Data

  • Multilevel modelling (MLM): Combines individual-level and group-level data to distinguish individual-level effects from contextual (group-level) effects. It also allows model assumptions to be checked and random effects to be investigated.
  • Two-phase design (Wakefield & Haneuse, 2008): Links individual-level data with ecologic data using outcome-dependent sampling within groups, reducing the need for complete individual-level information.
  • Prior information: Prior knowledge about within-area probabilities and contextual effects is important when making inferences from ecologic data.
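The idea of separating individual-level from contextual effects can be sketched without a full multilevel model. The example below is a simple within-between decomposition, not MLM itself (it has no random effects): it estimates the within-group slope from group-centred data and the between-group slope from group means. The records and the built-in slopes are hypothetical.

```python
def slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical individual records: (group id, exposure x, outcome y).
# Built so the within-group slope is +2 and the between-group slope is -1.
records = []
for group_id, x_mean in enumerate([1.0, 2.0, 3.0]):
    for dev in (-0.5, 0.0, 0.5):
        records.append((group_id, x_mean + dev, 10.0 + 2.0 * dev - 1.0 * x_mean))

def group_means(rows):
    """Return {group id: (mean x, mean y)}."""
    sums = {}
    for g, x, y in rows:
        sx, sy, n = sums.get(g, (0.0, 0.0, 0))
        sums[g] = (sx + x, sy + y, n + 1)
    return {g: (sx / n, sy / n) for g, (sx, sy, n) in sums.items()}

means = group_means(records)

# Between-group (ecologic) slope: regress group mean outcome on group mean exposure.
between_slope = slope([m[0] for m in means.values()], [m[1] for m in means.values()])

# Within-group slope: regress group-centred outcome on group-centred exposure.
within_slope = slope(
    [x - means[g][0] for g, x, _ in records],
    [y - means[g][1] for g, _, y in records],
)

print(f"within-group slope  = {within_slope:.2f}")
print(f"between-group slope = {between_slope:.2f}")
```

An analysis restricted to the group means would see only the negative between-group slope; having individual-level data alongside it exposes the positive within-group relationship.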

14.6 Non-Ecologic Group-Level Studies

Not all studies using group-level data are ecologic studies. A critical distinction:

The Key Difference

When variables are measured at the group level AND inferences remain at the group level, the study is NOT ecologic. The group itself is the scale of interest: the study asks how group-level characteristics (population density, policies, social environments) affect group-level outcomes.

Examples of non-ecologic group-level studies include:

  • Health promotion programs targeting communities, with outcomes measured at the community level
  • Vaccination campaigns evaluated by population-level coverage and population-level incidence
  • Organizational interventions in clinics or hospitals, with organization-level outcomes

14.6.1 The Question of Inference Level

Rose (2001) distinguished two key epidemiological questions:

  • "What is the etiology of a case?" This is an individual-level question, seeking to understand why a particular person became ill.
  • "What is the etiology of incidence?" This is a population-level question, seeking to understand why populations have different disease rates.

Both questions are important; the appropriate level of analysis depends on the research question. The atomistic fallacy arises when researchers reduce all phenomena to individual-level explanations, ignoring emergent group properties.

14.6.2 Quality of Current Ecologic Research

Dufault & Klar (2011) reviewed the reporting quality of ecologic studies and found concerning patterns:

  • Only 18% explicitly justified their choice of ecologic units
  • 97% of outcomes were aggregate in nature
  • 54% relied on fewer than 100 group-level observations
  • Only 42% adequately justified why an ecologic design was necessary
  • Most studies did not sufficiently inform readers about possible ecologic bias

Reflection

Consider a city that wants to evaluate whether its new bicycle-sharing program has reduced cardiovascular disease rates. Would an ecologic design or individual-level design be more appropriate? What are the trade-offs? How might you combine both approaches using multilevel modelling?

Knowledge Check: Section 4

1. Which of the following conditions makes ecologic bias LESS likely?

When the distribution of extraneous risk factors is similar across groups, there is little group-level confounding, which is one of the major sources of ecologic bias.

2. What is multilevel modelling (MLM) in the context of ecologic studies?

MLM integrates data from multiple levels (individual and group), helping to distinguish individual-level effects from group-level (contextual) effects and reducing the risk of ecologic fallacy.

3. When is a group-level study NOT considered an ecologic study?

Ecologic studies involve making inferences about individuals from group-level data. If the variables are measured at the group level and the inferences are also directed at the group level, this is a non-ecologic group-level study and does not pose the same cross-level inferential problems.

Section 5 of 5

Final Assessment

⏱ Estimated reading time: 20 minutes

Reflection

Reflecting on this lesson, describe a scenario from public health or your field of interest where an ecologic study design would be the most practical and informative approach. What safeguards would you implement to minimize the risk of the ecologic fallacy?

Lesson Summary: Ecological and Group-Level Studies

Section 1: Introduction to ecologic studies, their role as exploratory tools, and the rationale for group-level research including measurement constraints and interest in group-level effects.

Section 2: The three types of ecologic variables (aggregate, environmental, global) and the linear regression model used to estimate group-level associations.

Section 3: The ecologic and atomistic fallacies, and the three major sources of ecologic bias: within-group misclassification, group-level confounding, and effect modification by group.

Section 4: Strategies for minimizing ecologic bias, multilevel modelling, the two-phase design, and distinguishing ecologic from non-ecologic group-level studies.

Final Assessment: Ecological and Group-Level Studies (15 questions)

1. Ecologic studies differ from other observational designs primarily because:

The defining feature of ecologic studies is that measurements of exposure, outcome, and confounders are made at the group level.

2. A "partial ecologic study" is one where:

Partial ecologic studies combine individual-level and group-level measurements.

3. Which is an example of a global variable?

Population density is a characteristic of the group that has no meaningful analogue at the individual level.

4. Aggregate variables in ecologic studies are:

Aggregate (derived) variables are formed by summarizing individual observations (e.g., proportion exposed, mean BMI, disease rate).

5. In the ecologic linear model, the group-level incidence rate ratio IRG requires:

IRG = 1 + β1/β0 assumes groups exist with no exposure (X=0) and complete exposure (X=1), which is usually extrapolation beyond observed data.

6. The ecologic fallacy refers to:

The ecologic fallacy occurs when associations observed at the group level are assumed to apply at the individual level.

7. The atomistic fallacy is:

The atomistic fallacy is the opposite of the ecologic fallacy—it occurs when individual-level findings are incorrectly generalized to groups.

8. Non-differential exposure misclassification in ecologic studies biases estimates:

Unlike in individual-level studies, non-differential misclassification in ecologic studies biases the group-level association away from the null.

9. Group-level confounding in ecologic studies:

Even factors that are not confounders at the individual level can create confounding at the group level if their distribution varies across groups.

10. In Example 29.6 from the textbook, effect modification by group caused:

The individual-level IR was 5.0, but the ecologic IRG was 0.67—a complete reversal caused by group-level effect modification.

11. Ecologic bias is LESS likely when:

Ecologic bias is minimized when there is a large range of exposure across groups, small within-group variance, and similar distribution of confounders across groups.

12. Multilevel modelling (MLM) helps address ecologic bias by:

MLM integrates data from multiple levels, allowing researchers to separately estimate individual-level and group-level (contextual) effects.

13. A study measuring the effect of a city's water fluoridation policy on community-level dental health (with inferences remaining at the community level) is:

When variables are measured at the group level and inferences also remain at the group level, this is a non-ecologic group-level study.

14. According to Dufault and Klar (2011), what proportion of ecologic studies adequately justified the choice of ecologic design?

Only 42% of reviewed studies adequately and explicitly justified an ecologic analysis or why the design was necessary.

15. Rose (2001) distinguished between two key epidemiological questions. Which pair correctly represents them?

Rose distinguished between studying why individuals get sick (etiology of a case: individual level) and why populations have different disease rates (etiology of incidence: population level). The two questions call for different levels of analysis.

Congratulations!

You have successfully completed Lesson 13: Ecological and Group-Level Studies. You now understand the principles, strengths, and limitations of ecologic studies, including the three types of variables, sources of bias, the ecologic and atomistic fallacies, and strategies for reducing inferential errors through integration of individual and group-level data.