HSCI 341 — Lesson 6

Measures of Association

Fundamental Epidemiological Concepts and Approaches

Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University

Learning objectives for this lesson:

  • Calculate and interpret the risk ratio, incidence rate ratio, and odds ratio
  • Compute risk difference, attributable fraction (exposed), and population attributable measures
  • Understand when to use each measure of association
  • Correctly distinguish between strength of association and statistical significance
  • Understand the basis for hypothesis tests and confidence intervals

This course was developed by Kiffer G. Card, PhD, as a companion to Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Section 1

Introduction & Ratio Measures of Association

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Explain why measures of association are used in epidemiology.
  • Set up incidence risk and incidence rate data in 2×2 tables.
  • Calculate and interpret the risk ratio (RR), incidence rate ratio (IR), and odds ratio (OR).
  • Describe the relationships among RR, IR, and OR.

Why Measure Association?

Measures of association assess the magnitude of the relationship between an exposure (a potential cause) and a disease. Unlike measures of statistical significance, which are heavily dependent on sample size, measures of association indicate the strength of the effect — how much more (or less) likely disease is in exposed compared to non-exposed groups.

Strength vs. Significance

A measure of association tells you how strongly an exposure is linked to disease. A P-value tells you how probable data at least as extreme as those observed would be under the null hypothesis of no association. A strong association can be non-significant (small sample), and a weak association can be highly significant (large sample). Always report both.

Data Layout

Depending on study design, disease frequency can be expressed as incidence risk, incidence rate, prevalence, or odds. For risk data, the standard 2×2 table is:

                 Exposed    Non-exposed    Total
Diseased         a1         a0             m1
Non-diseased     b1         b0             m0
Total            n1         n0             n

For rate data, the denominator is person-time at risk rather than the number of individuals:

                       Exposed    Non-exposed    Total
Number of cases        a1         a0             m1
Person-time at risk    t1         t0             t

Three Ratio Measures of Association

  • Risk Ratio (RR): compares the risk of disease in exposed vs. non-exposed groups
  • Incidence Rate Ratio (IR): compares incidence rates, using person-time denominators
  • Odds Ratio (OR): compares the odds of disease; the only measure computable from case-control studies

Worked Example: Brazil Water Cistern Study

Diarrhea & Water Cistern Presence

                    Water Cistern    No Cistern    Total
Diarrhea Present    194              303           497
Diarrhea Absent     1,588            1,314         2,902
Total               1,782            1,617         3,399
  • RR = (194/1782) / (303/1617) = 0.109 / 0.187 = 0.58
  • OR = (194 × 1314) / (303 × 1588) = 0.53

Both measures indicate that having a water cistern is protective against diarrhea (values < 1). The RR of 0.58 means the risk is 42% lower among those with cisterns.
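The RR and OR arithmetic above can be checked with a short Python snippet (a minimal sketch using the 2×2 cell labels from the text):

```python
# Risk ratio and odds ratio for the cistern example, using the
# 2x2 cell labels from the text (a = diseased, b = non-diseased).
a1, a0 = 194, 303        # diarrhea present: cistern, no cistern
b1, b0 = 1588, 1314      # diarrhea absent: cistern, no cistern
n1, n0 = a1 + b1, a0 + b0

rr = (a1 / n1) / (a0 / n0)       # risk ratio
or_ = (a1 * b0) / (a0 * b1)      # odds ratio (cross-product form)
print(round(rr, 2), round(or_, 2))   # 0.58 0.53
```

The cross-product form of the OR is the same symmetry that makes it valid in case-control studies.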

Worked Example: Migraine Incidence Rates

Gender and Migraine (Ages 30–40)

                     Female    Male    Total
Cases of migraine    131       44      175
Person-months        250       236     486

IR = (131/250) / (44/236) = 0.524 / 0.186 = 2.81

The rate of migraine is 2.81 times higher in females than males aged 30–40.
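The same check for the rate data, dividing cases by person-time in each group:

```python
# Incidence rate ratio for the migraine example: cases per
# person-month in each group, then the ratio of the two rates.
cases_f, cases_m = 131, 44
pt_f, pt_m = 250, 236            # person-months at risk

rate_f = cases_f / pt_f          # ~0.524 cases per person-month
rate_m = cases_m / pt_m          # ~0.186
ir = rate_f / rate_m
print(round(ir, 2))              # 2.81
```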

Relationships Among RR, IR, and OR

In general, IR values are further from the null (1) than RR values, and OR values are even further away. This can be visualised on a number line:

0 --- OR --- IR --- RR --- 1 (null) --- RR --- IR --- OR --->

Figure 6.1 — General relationships among RR, IR, and OR. OR is always furthest from the null value of 1.

When is OR ≈ RR?

When the disease is rare (prevalence or incidence risk < 5%), OR approximates RR. This is because when a1 is small relative to n1, the denominator of the odds (b1) is approximately equal to n1, and similarly for the non-exposed group. In the cistern example, the overall risk was 14.6%, so OR (0.53) was more extreme than RR (0.58).

When is RR ≈ IR?

RR and IR will be close to each other if the exposure has a negligible impact on the total time at risk in the study population. This occurs when the disease is rare or when IR is close to the null value (IR ≈ 1).

OR as an Estimator of IR

OR is a good estimator of IR under certain conditions in case-control studies. If controls are selected using cumulative or risk-based sampling (all non-cases after cases have occurred), then OR estimates IR only if the disease is rare. If controls are selected using density sampling (a control selected from non-cases each time a case occurs), then OR is a direct estimate of IR regardless of disease rarity.

Key Takeaways

  • Measures of association quantify the strength of the exposure-disease relationship, unlike P-values which reflect sample size.
  • RR compares risks, IR compares incidence rates, and OR compares odds between exposed and non-exposed groups.
  • OR is the only measure that can be computed from case-control studies due to its symmetry property.
  • When disease is rare (<5%), OR ≈ RR. IR values are further from the null than RR, and OR values further still.

Knowledge Check — Section 1

1. The odds ratio (OR) is the only ratio measure of association applicable to case-control studies because:

The OR exhibits symmetry: (a1×b0)/(a0×b1) is the same regardless of whether you view it as odds of disease or odds of exposure. In case-control studies, the investigator sets the number of cases and controls, making RR incalculable, but OR remains valid.

2. A risk ratio of 0.58 for diarrhea in a cistern study indicates:

An RR of 0.58 means the risk in the exposed group is 58% of the risk in the non-exposed group, which is a 42% reduction (1 − 0.58 = 0.42). Since RR < 1, the exposure (cistern) is protective.

3. Under what condition does OR best approximate RR?

When disease is rare, the number of cases (a) is small relative to the total (n), so odds and risk become approximately equal. Under density sampling, OR estimates IR regardless of rarity.

✦ Pass the knowledge check with 100% to continue

Section 2

Measures of Effect in the Exposed Group

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Distinguish between “ratio” (relative) and “difference” (absolute) measures of association.
  • Calculate and interpret the risk difference (RD) and incidence rate difference (ID).
  • Calculate and interpret the attributable fraction in the exposed (AFe).
  • Explain the concept of vaccine efficacy as a special case of AFe.

Ratio vs. Difference Measures

The ratio measures from Section 1 (RR, IR, OR) tell us the relative strength of association, but they do not indicate the absolute number of cases attributable to the exposure. Difference (absolute effect) measures address this gap by computing how many additional cases occur because of the exposure.

Why Both Matter

Even when an exposure is very strongly associated with disease (high RR), if the exposure is rare in a population, it may contribute very few cases. Conversely, a relatively weak risk factor (modest RR) that is common can be responsible for many cases. Difference measures capture this “public health impact.”

Risk Difference (RD) — Attributable Risk

The risk difference (RD), also called attributable risk, is simply the risk in the exposed group minus the risk in the non-exposed group:

RD = p(D+|E+) − p(D+|E−) = (a1/n1) − (a0/n0) Eq 6.5

Similarly, the incidence rate difference (ID) is the difference between two incidence rates:

ID = (a1/t1) − (a0/t0) Eq 6.6

Interpretation of Difference Measures

  • RD or ID < 0 → Exposure is protective
  • RD or ID = 0 → No effect of exposure
  • RD or ID > 0 → Exposure is positively associated with disease

RD indicates the increase (or decrease) in the probability of disease in the exposed group, beyond the baseline risk. It tells you: “For every X exposed individuals, how many additional cases occur because of the exposure?”

Example: Smoking & Low Birth Weight

From a cohort of 5,000 women followed through pregnancy:

                       Smoker    Non-smoker    Total
Low birth weight       40        331           371
Normal birth weight    311       4,318         4,629
Total                  351       4,649         5,000
  • Risk in exposed: RE+ = 40/351 = 0.114
  • Risk in non-exposed: RE− = 331/4649 = 0.071
  • RD = 0.114 − 0.071 = 0.043

For every 100 women who smoked, approximately 4.3 additional low-birth-weight babies occurred because of smoking (assuming the relationship is causal).

Attributable Fraction in the Exposed (AFe)

The AFe expresses the proportion of disease in exposed individuals that is due to the exposure, assuming the relationship is causal. It can be viewed as the proportion of disease in the exposed group that would be avoided if the exposure were removed.

AFe = RD / p(D+|E+) = (RR − 1) / RR ≈ (OR − 1) / OR Eq 6.7

For harmful exposures (RR ≥ 1), AFe ranges from 0 (risks equal, RR = 1) toward 1 (all disease in the exposed group due to the exposure, RR → ∞). In case-control studies, AFe can be approximated by substituting OR for RR.

Worked Example: AFe for Smoking

From the smoking example above:

  • RR = 0.114 / 0.071 = 1.60
  • AFe = (1.60 − 1) / 1.60 = 0.60 / 1.60 = 0.375 (37.5%)

Among women who smoked, 37.5% of the low-birth-weight cases were attributable to smoking. Alternatively: 0.043 / 0.114 = 0.377 ≈ 37.7% (slight rounding difference).
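Eq 6.5 and Eq 6.7 applied to the smoking data, as a quick numeric check:

```python
# Risk difference (Eq 6.5) and attributable fraction in the
# exposed (Eq 6.7) for the smoking / low-birth-weight cohort.
a1, n1 = 40, 351         # low-birth-weight cases / total, smokers
a0, n0 = 331, 4649       # same for non-smokers

p1, p0 = a1 / n1, a0 / n0
rd = p1 - p0             # Eq 6.5: absolute excess risk
rr = p1 / p0
afe = (rr - 1) / rr      # Eq 6.7; equivalently rd / p1
print(round(rd, 3), round(rr, 2), round(afe, 3))   # 0.043 1.6 0.375
```

Working with unrounded intermediates gives 37.5% rather than the 37.7% obtained from the rounded figures.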

Vaccine Efficacy

Vaccine efficacy is a special form of AFe, where “not vaccinated” is the exposure (factor positive) and “vaccinated” is the comparison group. If 20% of unvaccinated individuals develop disease versus 5% of vaccinated individuals:

Vaccine Efficacy Calculation

  • RD = 0.20 − 0.05 = 0.15
  • AFe = 0.15 / 0.20 = 0.75 (75%)

The vaccine has prevented 75% of the cases of disease that would have occurred in the vaccinated group if the vaccine had not been used.
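The vaccine-efficacy arithmetic, with "unvaccinated" treated as the exposed group:

```python
# Vaccine efficacy as a special case of AFe: "unvaccinated" is
# the exposed group, "vaccinated" the comparison group.
risk_unvax, risk_vax = 0.20, 0.05
efficacy = (risk_unvax - risk_vax) / risk_unvax
print(round(efficacy, 2))        # 0.75
```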

AFe vs. Etiologic Fraction

The etiologic fraction is the proportion of cases in the exposed group for which exposure was a component of the sufficient cause. While AFe measures the excess fraction, the etiologic fraction can be higher because exposure may contribute to cases even when the baseline risk would have produced them eventually. In general, AFe provides a lower bound for the etiologic fraction.

Reflection

In a cohort study, a new environmental pollutant is found to have an RR of 3.0 for respiratory disease. The risk of respiratory disease in the non-exposed population is 2%. Calculate the RD and AFe. If 1,000 people are exposed, how many additional cases would you expect due to the exposure? Discuss why RD and AFe give different but complementary perspectives.


Key Takeaways

  • RD (attributable risk) measures the absolute increase in risk due to exposure; the null value is 0.
  • AFe = (RR − 1)/RR gives the proportion of disease in the exposed that is due to the exposure.
  • Vaccine efficacy is a special case of AFe where the “exposure” is being unvaccinated.
  • AFe provides a lower bound for the etiologic fraction.

Knowledge Check — Section 2

1. If the risk of disease is 12% in the exposed group and 4% in the non-exposed group, the risk difference (RD) is:

RD = 0.12 − 0.04 = 0.08 or 8%. This is the absolute increase in risk attributable to the exposure. (The RR would be 3.0, which is a ratio measure.)

2. A vaccine efficacy of 75% means:

Vaccine efficacy = AFe = (risk in unvaccinated − risk in vaccinated) / risk in unvaccinated. A value of 75% means 75% of cases were prevented by vaccination.

3. AFe = (RR − 1)/RR. If RR = 2.5, what is AFe?

AFe = (2.5 − 1) / 2.5 = 1.5 / 2.5 = 0.60. This means 60% of disease among the exposed is attributable to the exposure.

✦ Pass the knowledge check with 100% and complete the reflection to continue

Section 3

Population-Level Measures & Study Design

⏱ Estimated reading time: 12 minutes

Learning Objectives

  • Calculate and interpret the population attributable risk (PAR) and population attributable fraction (AFp).
  • Explain how the prevalence of exposure affects population-level measures.
  • Identify which measures of association can be computed from each study design.

From the Exposed Group to the Entire Population

While RD and AFe describe the effect of exposure among exposed individuals, public health decisions often require understanding the impact of an exposure on the entire population. Two key population-level measures address this:

Population Attributable Risk (PAR)

The PAR is the increase in overall population risk attributable to the exposure. It reflects both the strength of the association and the frequency of the exposure in the population.

PAR = p(D+) − p(D+|E−) = (m1/n) − (a0/n0) = RD × p(E+) Eq 6.8

Population Attributable Fraction (AFp)

The AFp indicates the proportion of disease in the entire population that is attributable to the exposure, and which would be avoided if the exposure were removed (assuming causation and no confounding).

AFp = PAR / p(D+) = p(E+)(RR − 1) / [p(E+)(RR − 1) + 1] Eq 6.9

Why Exposure Prevalence Matters

A strong risk factor (high RR) that is rare in the population will have a small AFp. A weaker risk factor (modest RR) that is common may have a large AFp. For example, intravenous drug use has a very high RR for HIV, but if it is rare in the population, eliminating it would prevent few total cases. A modestly elevated risk factor like poor diet, affecting millions, may account for more total cases.

Worked Example: Smoking & Low Birth Weight (Population Level)

From the cohort of 5,000 women (351 smokers, 4,649 non-smokers):

  • Overall risk: p(D+) = 371/5000 = 0.074
  • Risk in non-exposed: 331/4649 = 0.071
  • PAR = 0.074 − 0.071 = 0.003
  • AFp = 0.003 / 0.074 = 0.041 (4.1%)

Only 4.1% of all low-birth-weight babies in the population were attributable to smoking. The low AFp is because very few women (351/5000 = 7%) smoked during the 2nd trimester, despite the relatively strong association (RR = 1.60).
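The population-level arithmetic can be checked two equivalent ways, directly from Eq 6.8 and via Levin's formula in Eq 6.9:

```python
# PAR (Eq 6.8) and AFp (Eq 6.9) for the smoking example, computed
# both directly and via Levin's formula; the two forms agree exactly.
m1, n = 371, 5000        # all low-birth-weight cases, whole cohort
a0, n0 = 331, 4649       # cases and total among non-smokers
n1 = 351                 # total smokers

p_d = m1 / n             # overall risk p(D+)
p_unexp = a0 / n0        # p(D+|E-)
par = p_d - p_unexp      # Eq 6.8
afp = par / p_d          # Eq 6.9, first form

pe = n1 / n                          # exposure prevalence p(E+)
rr = ((m1 - a0) / n1) / p_unexp      # crude RR, ~1.60
afp_levin = pe * (rr - 1) / (pe * (rr - 1) + 1)  # Eq 6.9, second form
print(round(par, 3), round(afp, 3))  # 0.003 0.04
```

With unrounded intermediates the AFp is 4.0%; the 4.1% in the text comes from rounding PAR and p(D+) to three decimals first.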

Confounding and AFp

If confounding is present, adjusted estimates of RR should be used. The AFp can then be estimated using:

AFp = pd × (aRR − 1) / aRR Eq 6.10

where pd is the proportion of cases exposed to the risk factor, and aRR is the adjusted risk ratio. For multiple exposure categories, a summation formula is used.

Study Design and Measures of Association

Not all measures can be computed from all study designs. The following table summarises which measures are available:

Measure    Cross-sectional    Cohort    Case-control
RR         ✓                  ✓         —
IR         —                  ✓         —
OR         ✓                  ✓         ✓
RD         ✓                  ✓         —
AFe        ✓                  ✓         ✓ b
PAR        ✓                  ✓         ✓ a
AFp        ✓                  ✓         ✓ a,c

a Requires an independent estimate of p(D+) or p(E+).
b Estimated using OR.
c Requires OR and an independent estimate of p(E+|D+).

Reflection

Consider two risk factors for a disease: Factor A has RR = 5.0 and affects 2% of the population. Factor B has RR = 1.5 and affects 40% of the population. Calculate AFp for each factor using the formula AFp = p(E+)(RR − 1) / [p(E+)(RR − 1) + 1]. Which factor would you prioritise in a public health intervention, and why?


Key Takeaways

  • PAR = overall population risk − risk in unexposed; it reflects both strength and prevalence of exposure.
  • AFp = PAR / p(D+); it gives the proportion of all disease in the population attributable to the exposure.
  • A common risk factor with modest RR can have a larger AFp than a rare factor with high RR.
  • Different study designs support different measures: only OR is available from case-control studies.

Knowledge Check — Section 3

1. A risk factor has RR = 4.0 but affects only 1% of the population. The AFp is:

AFp = p(E+)(RR−1) / [p(E+)(RR−1)+1] = 0.01×3 / (0.01×3+1) = 0.03/1.03 = 0.029 or about 2.9%. Despite the strong association, the low prevalence of exposure limits the population impact.

2. Which measure cannot be computed directly from a case-control study?

RD requires actual disease risks in the exposed and non-exposed groups. In case-control studies, these risks cannot be computed because the investigator determines the ratio of cases to controls.

3. PAR differs from RD in that:

PAR = p(D+) − p(D+|E−). It is the overall population-level risk increase attributable to the exposure, incorporating both the strength of association and the prevalence of exposure. RD only measures the difference between exposed and non-exposed groups.

✦ Pass the knowledge check with 100% and complete the reflection to continue

Section 4

Hypothesis Testing & Confidence Intervals

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Explain the concepts of standard error, null hypothesis, and P-value.
  • Describe the four common test statistics for evaluating associations.
  • Interpret confidence intervals for measures of association.
  • Distinguish between statistical significance and the strength of association.

Standard Error

The standard error (SE) provides a measure of the precision of a point estimate — how much uncertainty exists in the estimate. For difference measures (RD, ID), the variance can be computed directly:

var(RD) = [(a1/n1)(1 − a1/n1)] / n1 + [(a0/n0)(1 − a0/n0)] / n0 Eq 6.13

For ratio measures (RR, IR, OR), the variance is computed on the log scale using Taylor series approximations:

var(ln RR) = 1/a1 − 1/n1 + 1/a0 − 1/n0 Eq 6.14
var(ln OR) = 1/a1 + 1/a0 + 1/b1 + 1/b0 Eq 6.15

Hypothesis Testing

Significance testing is based on specifying a null hypothesis about the population parameter. The null hypothesis typically states there is no association:

  • For difference measures (RD, ID): H0: θ = 0
  • For ratio measures (RR, IR, OR): H0: θ = 1

An alternative hypothesis can be 1-tailed or 2-tailed. In general, 2-tailed hypotheses are preferred because 1-tailed hypotheses are harder to justify.

Limitations of P-values

P-values are often dichotomised into “significant” or “non-significant” at α = 0.05, but this entails a huge loss of information. P-values of 0.049 and 0.051 lead to opposite conclusions despite being virtually identical. Always report the actual P-value and a confidence interval, which conveys both significance and precision.

Test Statistics

  • Pearson χ²: compares observed and expected cell counts; requires adequate expected values
  • Exact tests (e.g., Fisher's exact test): preferred for small samples
  • Wald statistic: based on the estimate divided by its standard error
  • Likelihood ratio test: compares likelihoods under the estimated and null parameters; generally superior in regression settings

Confidence Intervals

A confidence interval (CI) reflects the level of uncertainty in a point estimate. A 95% CI means that if the study were repeated many times under identical conditions, 95% of the computed CIs would contain the true parameter value.

Computing CIs

For difference measures, the CI is computed directly:

θ ± Zα × √var(θ) Eq 6.19

For ratio measures, the CI is computed on the log scale and then exponentiated:

θ × exp(± Zα × √var(ln θ)) Eq 6.21

The CI is symmetrical about lnθ but not about θ itself — this is why confidence intervals for ratio measures appear asymmetric.

Interpreting CIs for Measures of Association

  • For RR, IR, OR: if the 95% CI includes 1, the association is not statistically significant at α = 0.05.
  • For RD, ID: if the 95% CI includes 0, the association is not statistically significant.

However, this “surrogate significance test” is an under-use of the CI. The CI also shows the range of plausible effect sizes, which is far more informative than a binary significant/non-significant classification.

Example CIs from the Textbook

Measure          Point Estimate    95% CI
RD (smoking)     0.043             (0.009, 0.077)
RR (smoking)     1.601             (1.174, 2.182)
OR (smoking)     1.678             (1.154, 2.387)
ID (migraine)    0.338             (0.232, 0.443)
IR (migraine)    2.811             (1.983, 4.050)
None of the CIs for ratio measures include 1, and none for difference measures include 0, confirming statistical significance for all associations.
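As a check, Eq 6.13 and 6.14 combined with Eq 6.19 and 6.21 reproduce the smoking RD and RR intervals from the table (a minimal sketch, with Zα = 1.96 for a 95% CI):

```python
import math

# 95% CIs for the smoking RD and RR: symmetric on the natural scale
# for RD (Eq 6.13 + 6.19), symmetric on the log scale for RR
# (Eq 6.14 + 6.21), hence asymmetric after exponentiating.
a1, n1 = 40, 351
a0, n0 = 331, 4649
p1, p0 = a1 / n1, a0 / n0
z = 1.96

rd = p1 - p0
var_rd = p1 * (1 - p1) / n1 + p0 * (1 - p0) / n0     # Eq 6.13
lo_rd = rd - z * math.sqrt(var_rd)
hi_rd = rd + z * math.sqrt(var_rd)
print(round(lo_rd, 3), round(hi_rd, 3))   # 0.009 0.077

rr = p1 / p0
var_lnrr = 1/a1 - 1/n1 + 1/a0 - 1/n0                 # Eq 6.14
half = z * math.sqrt(var_lnrr)
lo_rr, hi_rr = rr * math.exp(-half), rr * math.exp(half)
print(round(lo_rr, 3), round(hi_rr, 3))   # 1.174 2.182
```

Note how the RR interval is wider above the point estimate (1.601) than below it, reflecting the log-scale construction.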

Reflection

A study reports an OR of 1.45 with a 95% CI of (0.92, 2.28). A second study reports an OR of 1.15 with a 95% CI of (1.02, 1.30). Compare these two findings in terms of: (a) strength of association, (b) statistical significance, and (c) precision. Which finding might be more concerning from a public health perspective, and why?


Key Takeaways

  • Standard errors quantify precision; they are computed differently for difference vs. ratio measures.
  • Hypothesis testing uses a null hypothesis (no effect) and a test statistic to generate a P-value.
  • Four common test statistics: Pearson χ², exact tests, Wald, and likelihood ratio tests.
  • Confidence intervals are more informative than P-values: they show the range of plausible effect sizes.
  • For ratio measures, the CI containing 1 (or 0 for differences) indicates non-significance at the corresponding α level.

Knowledge Check — Section 4

1. A 95% confidence interval for an odds ratio is (1.2, 3.8). This means:

Since the CI does not include 1 (the null value for ratio measures), the association is statistically significant. The interval 1.2 to 3.8 represents the range of plausible values for the true OR. Note: the CI is a property of the procedure, not a probability statement about the parameter.

2. Why are confidence intervals for ratio measures (like OR) asymmetric around the point estimate?

The variance of ratio measures is computed on the log scale, where the CI is symmetric around ln(θ). When exponentiated back to the original scale, the CI becomes asymmetric around θ.

3. Which test statistic is generally considered superior in regression settings?

Likelihood ratio tests are generally superior to Wald tests, especially in regression settings. They compare the likelihood of the data under the estimated parameters versus the null hypothesis parameters.

✦ Pass the knowledge check with 100% and complete the reflection to continue

Section 5

Final Review & Assessment

⏱ Estimated time: 20 minutes

Lesson Summary

In this lesson, you explored the key measures used to quantify the association between an exposure and a disease outcome. Let’s review:

Section 1: Ratio Measures of Association

Three ratio measures compare disease frequency between exposed and non-exposed groups: risk ratio (RR), incidence rate ratio (IR), and odds ratio (OR). RR and IR are computed from cohort studies; OR is the only measure applicable to case-control studies due to its symmetry. When disease is rare, OR approximates RR.

Section 2: Measures of Effect in the Exposed

Risk difference (RD) measures the absolute increase in risk due to exposure. The attributable fraction (AFe) gives the proportion of disease in exposed individuals attributable to the exposure. Vaccine efficacy is a special case of AFe. AFe provides a lower bound for the etiologic fraction.

Section 3: Population-Level Measures

PAR and AFp extend effect measures to the population level. They depend on both the strength of association and the prevalence of exposure. A common weak risk factor can have greater population impact than a rare strong one. Different study designs support different measures.

Section 4: Hypothesis Testing & Confidence Intervals

Standard errors quantify precision. Hypothesis tests (Pearson χ², exact tests, Wald, LRT) produce P-values. Confidence intervals are more informative than P-values, showing the range of plausible effect sizes. CIs for ratio measures are computed on the log scale, producing asymmetric intervals.

Reflection

A colleague presents findings from a case-control study showing OR = 2.3 (95% CI: 1.1, 4.8) for the association between a workplace chemical exposure and bladder cancer. She concludes the chemical “causes 2.3 times the risk of bladder cancer.” Evaluate this statement. What can and cannot be concluded from this study? Discuss the roles of strength of association, statistical significance, the rare disease assumption, and the distinction between OR and RR.


Final Assessment

Complete all 15 questions below with 100% accuracy to finish this lesson. You must also complete the reflection above before submitting.

Final Assessment — Measures of Association

1. Measures of association differ from measures of statistical significance in that they:

Measures of association (RR, OR, etc.) assess the strength of the relationship between exposure and disease. Statistical significance (P-values) is heavily influenced by sample size, not the magnitude of the effect.

2. In a cohort study, 150 of 2,000 exposed individuals and 75 of 2,000 non-exposed individuals develop the disease. What is the risk ratio?

RR = (150/2000) / (75/2000) = 0.075 / 0.0375 = 2.00. The risk of disease is twice as high in the exposed group.

3. The odds ratio can be calculated from a case-control study because:

OR = (a1×b0)/(a0×b1) is the same whether viewed as the ratio of disease odds or the ratio of exposure odds. Since case-control studies sample based on disease status, only the OR can be validly computed.

4. RD = 0.043 in the smoking and low-birth-weight example means:

RD is the absolute difference in risk: 0.114 − 0.071 = 0.043. This means 4.3 additional cases per 100 exposed women, above what would be expected based on the baseline risk.

5. If RR = 4.0, the attributable fraction in the exposed (AFe) is:

AFe = (RR−1)/RR = (4−1)/4 = 3/4 = 0.75. So 75% of disease in the exposed group is attributable to the exposure.

6. Vaccine efficacy of 80% indicates:

Vaccine efficacy = AFe = (risk unvaccinated − risk vaccinated)/risk unvaccinated. A value of 80% means the vaccine prevented 80% of expected cases.

7. A common risk factor with a modest RR may have a larger AFp than a rare risk factor with a high RR because:

AFp = p(E+)(RR−1)/[p(E+)(RR−1)+1]. Both the RR and the prevalence of exposure (p(E+)) contribute. A high p(E+) with a modest RR can produce a larger AFp than a low p(E+) with a high RR.

8. PAR is best described as:

PAR = p(D+) − p(D+|E−). It reflects the increase in disease risk in the entire population that is attributable to the exposure, incorporating both the strength of association and how common the exposure is.

9. The null value for the risk ratio (RR) is:

For ratio measures (RR, IR, OR), the null value is 1, meaning the risk (or rate or odds) is the same in both groups. For difference measures (RD, ID), the null value is 0.

10. A 95% CI for RR of (0.85, 1.32) suggests:

Since the 95% CI includes the null value of 1, we cannot reject H0 at the 0.05 level. The range of plausible values spans from a modest protective effect (0.85) to a modest risk increase (1.32).

11. Confidence intervals for OR are asymmetric around the point estimate because:

The CI is symmetric on the ln(OR) scale. When exponentiated back to the OR scale, the interval becomes asymmetric because the exponential function is non-linear.

12. In the formula var(ln OR) = 1/a1 + 1/a0 + 1/b1 + 1/b0, increasing all cell counts will:

Since the variance is a sum of reciprocals of cell counts, larger cell counts produce smaller reciprocals, reducing the overall variance. This leads to a narrower (more precise) confidence interval.

13. Which of the following measures CANNOT be estimated from a case-control study, even with external data?

The incidence rate ratio requires person-time data from a cohort study. Case-control studies do not follow participants over time, so IR cannot be computed. AFe and AFp can be approximated using OR with appropriate external data.

14. The Pearson χ² test is most appropriate when:

The Pearson χ² has an approximate χ² distribution provided all expected cell values are >1 and at least 75% (3 of 4 cells) have expected values >5. For small samples, exact tests (like Fisher’s) are preferred.

15. A study finds RR = 1.8 (P = 0.40). Which interpretation is most appropriate?

An RR of 1.8 suggests a meaningful increase in risk. However, the P-value of 0.40 indicates the result is not statistically significant, likely due to insufficient sample size. A non-significant P-value does not prove the null hypothesis — it means we lack sufficient evidence to reject it. The CI would be more informative here.

✦ Complete the final reflection above before submitting