HSCI 230 — Lesson 10

Case-Control Studies

Evaluating Epidemiological Research

Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University

Learning objectives for this lesson:

  • Describe the major design features of risk-based and rate-based case-control studies
  • Identify hypotheses and population types consistent with each design
  • Differentiate between primary-base and secondary-base case-control studies
  • Elaborate the principles used to select and define the case series
  • Explain the principal features for selecting controls in open and closed populations
  • Design and implement a valid case-control study to meet specific objectives

This course was developed by Kiffer G. Card, PhD, as a companion to Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Section 1

Introduction & The Study Base

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Describe the fundamental logic of the case-control study design.
  • Distinguish between primary-base and secondary-base case-control studies.
  • Explain the concept of nested case-control studies.
  • Identify when case-control designs are performed prospectively vs. retrospectively.

What Is a Case-Control Study?

The basis of the case-control study design is to select individuals who have newly developed the disease or outcome of interest (the cases) and, as a comparison, individuals who have not developed the disease at the time of selection (the controls). We then contrast the frequency of exposure factors in the cases with the frequency of exposure factors in the controls.

Key Distinction

A case-control study is not a comparison between a set of cases and a set of ‘healthy’ subjects. It is a comparison between a set of cases and a set of non-case subjects (people who have not developed the specific disease but may have other diseases) whose exposure to the factors of interest reflects the exposure in the source population.

The controls would have been included as cases if they had developed the outcome (disease) of interest. Most frequently, individual people are the units of interest, but the design also applies to aggregates of individuals.

[Figure: a source population gives rise to cases (diseased) and controls (non-cases); exposure frequency is contrasted between the two groups.]

Figure — The logic of case-control design: select cases and controls from the same source population, then compare their exposure histories.
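The contrast described above can be sketched in a few lines of Python. This is only an illustration: the counts and the helper name `exposure_proportion` are hypothetical and do not come from any study in this lesson.

```python
def exposure_proportion(exposed: int, total: int) -> float:
    """Proportion of subjects in a group who were exposed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return exposed / total


# Hypothetical counts, for illustration only.
cases_exposed, cases_total = 40, 100
controls_exposed, controls_total = 50, 200

p_cases = exposure_proportion(cases_exposed, cases_total)
p_controls = exposure_proportion(controls_exposed, controls_total)

print(f"Exposed among cases:    {p_cases:.2f}")
print(f"Exposed among controls: {p_controls:.2f}")
```

A higher exposure frequency among cases than among controls suggests an association between the exposure and the outcome; Section 3 formalises this contrast as an odds ratio.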

Usually, case-control studies are performed retrospectively, since the outcome (usually disease) has already occurred when the study begins. However, it is possible to conduct case-control studies prospectively; in these, the cases do not develop the outcome until after the study begins, so they are enrolled as they occur over time.

The Study Base

The study base is the population from which the cases and (possibly) the controls are obtained. The nature of the study base determines how controls should be selected.

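  • Primary base: a well-defined source population for which there is, or could be, an explicit listing of potential study subjects.
  • Secondary base: a source population that is one or more steps removed from the actual source population, such as people at a referral clinic, laboratory, or central registry.
  • Nested design: a case-control study carried out within a larger enumerated study (see Example 9.4), so that the sampling fractions of cases and controls can be obtained.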
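Because a nested design makes the sampling fractions of cases and controls available, disease frequency by exposure status can be back-calculated from the sampled counts. The sketch below shows the arithmetic for a closed (risk-based) setting. The function name `risk_by_exposure`, its parameters, and all counts are hypothetical and chosen only for illustration.

```python
def risk_by_exposure(a1, a0, b1, b0, sf_cases, sf_controls):
    """Estimate risk in exposed and non-exposed subjects from a nested sample.

    a1, a0: sampled exposed / non-exposed cases
    b1, b0: sampled exposed / non-exposed controls (non-cases)
    sf_cases, sf_controls: known sampling fractions, each in (0, 1]
    """
    if not (0 < sf_cases <= 1 and 0 < sf_controls <= 1):
        raise ValueError("sampling fractions must be in (0, 1]")
    # Scale the sampled counts back up to the source population.
    A1, A0 = a1 / sf_cases, a0 / sf_cases
    B1, B0 = b1 / sf_controls, b0 / sf_controls
    risk_exposed = A1 / (A1 + B1)
    risk_unexposed = A0 / (A0 + B0)
    return risk_exposed, risk_unexposed


# Hypothetical example: every case sampled, 10% of non-cases sampled.
r1, r0 = risk_by_exposure(a1=30, a0=20, b1=47, b0=98,
                          sf_cases=1.0, sf_controls=0.1)
print(f"Estimated risk, exposed:     {r1:.3f}")
print(f"Estimated risk, non-exposed: {r0:.3f}")
```

Without known sampling fractions, as in most other case-control studies, this back-calculation is not possible and only the odds ratio can be estimated.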

Key Examples

Example 9.1 — Prospective Risk-Based (Serum Estradiol & Breast Cancer)

Dorgan et al (2010) used serum samples from a secondary-base case-control study. A total of 6,915 women who were free of cancer donated blood between 1977 and 1989. Of the 6,720 women in extended follow-up, 1,751 were identified as deceased. For each of the 117 potential cases, 2 potential controls were matched on age (±2 years), date (±1 year), and menstrual cycle day (±2 days). This is a risk-based sampling strategy. Conditional logistic regression was used to evaluate the association.

Example 9.2 — Primary-Base (Salmonella Typhimurium Risk Factors)

Dore et al (2004) conducted a rate-based study in Alberta, British Columbia, and Saskatchewan, Canada (Dec 1999–Nov 2000). Eligible cases had diarrheal illness with S. Typhimurium from stool samples. Controls were matched 1:1 on age and province of residence, randomly selected from provincial health registries. Cases and controls were interviewed by telephone using a pre-tested, standardised questionnaire covering demographics, health history, medication use, travel history, and animal contact.

Example 9.3 — Secondary-Base (Hypercholesterolemia & Prostate Cancer)

Magura et al (2008) used a risk-based, secondary-base case-control design. Cases were men newly diagnosed with prostate cancer at Meritcare hospital between 2004 and 2006. Controls were identified from the primary-care database of the same hospital: men without cancer, aged 50–74, who had annual physicals and lipid profiles within a year. Exclusion criteria included other cancers and non-Caucasian race. The authors used a widely accepted definition of hypercholesterolemia (total cholesterol >5.17 mmol/l) and estimated odds ratios using multiple logistic regression.

Example 9.4 — Nested Rate-Based (Gastroenteritis Risk Factors)

Rodrigo et al (2011) conducted a community-based (primary-base), nested, rate-based case-control study within a larger randomised controlled trial in South Australia. 300 households maintained weekly health diaries. The outcome — highly credible gastroenteritis (HCG) — was defined as 2+ loose stools, 2+ vomiting episodes, or combinations with abdominal pain/nausea in 24 hours. Controls were matched to cases by study week. Logistic regression was used, allowing for familial clustering and repeated observations.

Key Takeaways

  • Case-control studies select subjects based on disease status and look backward at exposure.
  • The study base can be a primary base (enumerable population) or secondary base (clinic/registry).
  • Nested designs allow estimation of disease frequency by exposure — a unique advantage.
  • Controls should represent the exposure experience of the source population that gave rise to the cases.

Reflection

Consider a disease that is of interest to you. Would a primary-base or secondary-base case-control study be more feasible? What would be the advantages and trade-offs of each approach for your specific research question?

Knowledge Check — Section 1

1. In a case-control study, what do we compare between cases and controls?

Correct answer: C. The core logic of a case-control study is to compare the frequency of exposure factors in cases with the frequency of exposure factors in controls, to assess whether the exposure is associated with the outcome.

2. What distinguishes a primary study base from a secondary study base?

Correct answer: B. A primary study base is a well-defined source population for which there is, or could be, an explicit listing of potential study subjects. A secondary base is one or more steps removed from the actual source population.

3. What unique advantage does a nested case-control study provide?

Correct answer: D. Because the sampling fractions of cases and controls can be obtained in a nested design, it is possible to estimate the frequency of disease by exposure status — a feature absent in almost all other types of case-control studies.

4. Case-control studies are most commonly performed:

Correct answer: A. Usually case-control studies are performed retrospectively since the outcome has already occurred when the study begins. However, prospective case-control studies are also possible.
Section 2

The Case Series & Principles of Control Selection

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Describe the key elements in selecting and defining the case series.
  • Discuss the importance of diagnostic criteria and case ascertainment.
  • Articulate the four major principles of control selection.
  • Compare different sources of controls and their strengths and limitations.

The Case Series (Section 9.3)

Key elements in selecting the case series include: specifying the disease (including diagnostic criteria), identifying the source(s) of the cases, deciding whether only incident or both incident and prevalent cases are to be included, and estimating the required number of cases and total sample size.

Incident vs. Prevalent Cases

There is virtually unanimous agreement that, when possible, only incident cases should be used. There are specific circumstances where prevalent cases may be justified, but this would be the exception, not the rule. Usually, only the first occurrence of the outcome in each study subject is included (Examples 9.1 and 9.3); however, multiple occurrences of the same disease can be included (Example 9.4).

Where Do Cases Come From?

Primary-base cases come from a specific registry that contains virtually all cases for a defined population (e.g., provincial or state disease registries). Sampling or taking a census of cases directly from the primary source population avoids a number of potential selection biases, but may be more difficult to implement and more costly.

Primary-base designs are moderately common because provincial or state records allow complete enumeration of people and their health events.

Secondary-base cases are obtained from a physician’s clinic, one or more hospitals, or registries. A major challenge is to conceptualise the actual source population from which the cases arose. A common solution is to select controls from records at the same source (e.g., the same hospital; see Example 9.3).

Every effort should be made to obtain complete case ascertainment. In secondary-base studies, the set of cases from a tertiary care facility could become increasingly different from cases in the broader source population.

Diagnostic Criteria

The diagnostic criteria for a subject to become a case should include specific, well-defined manifestational (i.e., clinical) signs where appropriate and, when possible, clearly documented diagnostic criteria (e.g., laboratory test results) that can be applied to all study subjects in a uniform manner. In some instances, it might be desirable to subdivide the case series into subgroups based on differences in disease characteristics.

Principles of Control Selection (Section 9.4)

The selection of appropriate controls is often one of the most difficult aspects of a case-control design. The key guideline is that controls should be representative of the exposure experience in the population which gave rise to the cases.

The Four Major Principles

Wacholder et al (1992a; 1992b; 1992c) provide the classic discussions of control selection. The major principles are:

  • Same study base
  • Closed population rule
  • Open population rule
  • Eligibility period

Sources of Controls

| Source | Strengths | Limitations |
| Population controls | Representative of source population | Low response rates; recall bias; less motivated |
| Hospital controls | Accessible; cooperative; similar recall ability | Exposure may be related to hospitalisation |
| Friend controls | Similar recall; willing to participate | Over-matching; biased estimates (Bunin et al, 2011) |
| Neighbourhood controls | Similar socioeconomic background | If neighbourhood related to exposure, causes bias |
| Random digit dialling (RDD) | Population-representative sampling | Business vs. home phone issues; declining response rates |
| Partner controls | Shared environment; cooperative | Age-sex distribution differs; over-matching on exposures |

Key Takeaways

  • Incident cases are strongly preferred over prevalent cases.
  • Cases can come from primary bases (registries) or secondary bases (clinics/hospitals).
  • Controls must represent the exposure experience of the source population.
  • The four key principles: same study base, closed/open population rules, and temporal eligibility.

Reflection

Imagine you are studying whether a specific dietary factor is associated with colorectal cancer. You plan to recruit cases from a hospital. What type of control group would you select (hospital, population, friend, etc.) and why? What biases might arise from your choice?

Knowledge Check — Section 2

1. In case-control studies, which type of cases should preferably be used?

Correct answer: A. There is virtually unanimous agreement that when possible, only incident cases should be used. This avoids the biases that arise from studying prevalent cases.

2. The key guideline for valid control selection is that controls should be:

Correct answer: D. Controls should be representative of the exposure experience in the population which gave rise to the cases. They should be subjects who would have been included as cases if they had developed the outcome.

3. What is a major limitation of using hospital controls?

Correct answer: B. Hospital controls always pose the problem of whether their exposure is unrelated to the disease leading to their hospitalisation. If the exposure is related to hospitalisation, this can bias the measure of association.

4. Using friend controls in a case-control study can lead to:

Correct answer: C. Bunin et al (2011) found that using friend controls was convenient but led to potentially biased estimates of association because of over-matching on shared exposures and lifestyle factors.
Section 3

Controls in Risk-Based & Rate-Based Designs

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Describe the data layout and sampling approach for risk-based case-control studies.
  • Derive and interpret the odds ratio (OR) in a risk-based design (Eq 9.1).
  • Describe the data layout and incidence density sampling for rate-based case-control studies.
  • Explain why the OR estimates the risk ratio in risk-based designs and the rate ratio in rate-based designs.

Risk-Based Case-Control Designs (Section 9.5)

The traditional approach to case-control studies has been the risk-based (cumulative incidence) design. Controls are selected from among the people who had not become cases by the end of the study period. A subject can be selected as a control only once.

Design Requirements

This design is appropriate if the population is closed and is most informative if the risk period for the outcome has ended before subject selection begins. It fits situations such as outbreaks from infectious or toxic agents where the risk period is short and essentially all cases have occurred within the defined study period.

2×2 Table: Risk-Based Case-Control Design

The closed-source population can be categorised with respect to exposure and outcome (upper-case = population, lower-case = sample):

| | Exposed | Non-exposed | Total |
| Cases | a1 | a0 | m1 |
| Controls (Non-cases) | b1 | b0 | m0 |

The cases (M1) are those that arose during the study period, while the controls (M0) are those that remained free of the outcome. Usually all or most cases are included (the sampling fraction, sf, among cases approaches 1). We select controls independently of exposure status, so the sampling fractions in the two exposure groups should be equal: sf = b1 / B1 ≈ b0 / B0, where B1 and B0 denote the exposed and non-exposed non-cases in the source population.

The measure of association in risk-based designs is the odds ratio (OR):

Eq 9.1 OR = (a1 / a0) ÷ (b1 / b0) = (a1 × b0) / (b1 × a0)
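As a minimal sketch, Eq 9.1 can be computed directly from the four cells of the 2×2 table. The function name `odds_ratio` and the example counts are hypothetical; the argument names follow the table notation above.

```python
def odds_ratio(a1: int, a0: int, b1: int, b0: int) -> float:
    """Cross-product odds ratio (Eq 9.1).

    a1, a0: exposed / non-exposed cases
    b1, b0: exposed / non-exposed controls
    """
    if min(a1, a0, b1, b0) < 0:
        raise ValueError("cell counts cannot be negative")
    if b1 == 0 or a0 == 0:
        raise ZeroDivisionError("OR is undefined when b1 or a0 is zero")
    return (a1 * b0) / (b1 * a0)


# Hypothetical counts: 40 of 100 cases exposed, 50 of 200 controls exposed.
example_or = odds_ratio(a1=40, a0=60, b1=50, b0=150)
print(f"OR = {example_or:.2f}")  # (40 x 150) / (50 x 60) = 2.00
```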

What Does the OR Estimate?

The OR is a valid measure of association in its own right. It also estimates the ratio of risks (RR) if the outcome is relatively infrequent (e.g., <5%) in the source population. Whether the OR approximates the RR or rate ratio depends on the study design and assumptions about the source population (Knol et al, 2008).
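The effect of outcome frequency on how well the OR approximates the RR can be seen with a little arithmetic. The sketch below assumes a hypothetical closed population in which the true risk ratio is fixed at 2 and only the baseline risk changes; it computes the population odds ratio, which is what the case-control OR estimates when controls are sampled independently of exposure. The function name and the baseline risks are illustrative assumptions.

```python
def rr_and_or(risk_exposed: float, risk_unexposed: float):
    """Return (risk ratio, odds ratio) for two risks strictly between 0 and 1."""
    for risk in (risk_exposed, risk_unexposed):
        if not 0 < risk < 1:
            raise ValueError("risks must be strictly between 0 and 1")
    rr = risk_exposed / risk_unexposed
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return rr, odds_exposed / odds_unexposed


# Hypothetical baseline risks; the exposed risk is always twice the baseline.
results = {}
for baseline in (0.01, 0.05, 0.20):
    rr, or_ = rr_and_or(2 * baseline, baseline)
    results[baseline] = (rr, or_)
    print(f"baseline risk {baseline:.2f}: RR = {rr:.2f}, OR = {or_:.2f}")
```

In this illustration the OR stays close to the RR of 2 when the baseline risk is 1% and moves noticeably further from it at 20%, which is why the rarity of the outcome matters in risk-based designs.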

Rate-Based Case-Control Designs (Section 9.6)

Because the populations we study are often open, the case-control designs for these populations should use a rate-based approach (incidence density sampling), which ensures that the time-at-risk is taken into account when control subjects are selected.

2×2 Table: Rate-Based Case-Control Design

| | Exposed | Non-exposed | Total |
| Cases | A1 | A0 | M1 |
| Person-time at risk | T1 | T0 | T |

Recall that in a cohort study, the two rates of interest would be:

Eq 9.2 I1 = A1 / T1      and      I0 = A0 / T0

In a rate-based case-control study, we select controls using a sampling rate (sr) that is equal in exposed and non-exposed populations:

Eq 9.3 sr = b1 / T1 ≈ b0 / T0

Therefore, the ratio of exposed to unexposed controls equals the ratio of the cumulative exposed and unexposed subject times:

Eq 9.4 b1 / b0 ≈ T1 / T0

This means the OR from the case-control data estimates the incidence rate ratio (IR) in the source population:

Eq 9.5 (a1/b1) / (a0/b0) ≈ (A1/T1) / (A0/T0)

Key Advantage of Rate-Based Design

In this design, the OR estimates the IR (from a cohort study) and no assumption about rarity of outcome is necessary for a valid estimate. This is a major advantage over risk-based designs where the rare disease assumption is needed for the OR to approximate the RR.
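Eqs 9.2 to 9.5 can be checked numerically. The sketch below uses hypothetical case counts, person-time, and sampling rate, assumes that all cases are included (a1 = A1, a0 = A0), and uses the expected number of controls in each exposure group; in a real sample the control counts would vary around these values. The function name `rate_based_or` is an illustrative assumption.

```python
def rate_based_or(A1, A0, T1, T0, sr):
    """Compare the cohort rate ratio with the case-control OR (Eqs 9.2-9.5).

    A1, A0: exposed / non-exposed cases in the source population
    T1, T0: exposed / non-exposed person-time at risk
    sr:     control sampling rate per unit of person-time (same in both groups)
    """
    I1, I0 = A1 / T1, A0 / T0        # Eq 9.2: incidence rates
    b1, b0 = sr * T1, sr * T0        # Eq 9.3: expected numbers of controls
    a1, a0 = A1, A0                  # assumption: every case is included
    sample_or = (a1 / b1) / (a0 / b0)  # left-hand side of Eq 9.5
    return I1 / I0, sample_or


# Hypothetical values: person-years of follow-up and 0.05 controls per person-year.
rate_ratio, sample_or = rate_based_or(A1=60, A0=40, T1=2000.0, T0=4000.0, sr=0.05)
print(f"Rate ratio = {rate_ratio:.2f}, case-control OR = {sample_or:.2f}")
```

The two quantities agree here even though the outcome is not rare, because the controls reflect the distribution of person-time rather than the distribution of non-cases at the end of follow-up.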

Incidence Density Sampling

The most common method of obtaining controls is by selecting a specified number of non-cases from the risk set, matched time-wise to the occurrence of each case. This is called incidence density sampling. At each time a subject develops the outcome, we choose b controls from the non-case subjects that exist in the source population at that point. Key features:

  • We do not need to know the time-at-risk for potential controls.
  • We do not need to assume the population is stable.
  • The number of controls per case can vary.
  • Subjects initially identified as controls can subsequently become cases (and vice versa in rate-based designs).
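The sampling procedure described above can be sketched as follows. This is a simplified illustration, not the procedure used in any cited study: the `Subject` fields, the helper names, and the six subjects are all hypothetical, and real studies would usually add matching criteria and handle ties in event times.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Subject:
    subject_id: int
    entry_time: float                    # enters the source population
    exit_time: float                     # leaves it (or the study ends)
    event_time: Optional[float] = None   # time of outcome, if it occurs


def at_risk(s: Subject, t: float) -> bool:
    """True if s is in the source population and outcome-free at time t."""
    in_population = s.entry_time <= t <= s.exit_time
    outcome_free = s.event_time is None or s.event_time > t
    return in_population and outcome_free


def incidence_density_sample(
    subjects: List[Subject], controls_per_case: int, rng: random.Random
) -> List[Tuple[Subject, List[Subject]]]:
    """For each case, draw controls from the risk set at the case's event time."""
    cases = sorted(
        (s for s in subjects if s.event_time is not None),
        key=lambda s: s.event_time,
    )
    sampled = []
    for case in cases:
        risk_set = [s for s in subjects
                    if s.subject_id != case.subject_id
                    and at_risk(s, case.event_time)]
        k = min(controls_per_case, len(risk_set))  # risk set may be small
        sampled.append((case, rng.sample(risk_set, k)))
    return sampled


# Hypothetical open population: subjects enter and leave at different times.
subjects = [
    Subject(1, 0, 10, event_time=3),
    Subject(2, 0, 10, event_time=7),
    Subject(3, 0, 10),
    Subject(4, 2, 10),
    Subject(5, 0, 5),
    Subject(6, 4, 10),
]
pairs = incidence_density_sample(subjects, controls_per_case=2,
                                 rng=random.Random(0))
for case, controls in pairs:
    ids = sorted(c.subject_id for c in controls)
    print(f"case {case.subject_id} at t={case.event_time}: controls {ids}")
```

Note that subject 2 is eligible as a control for the case at t=3 and later becomes a case at t=7, and that nothing in the procedure requires the person-time of potential controls to be known in advance.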

Key Takeaways

  • Risk-based designs use closed populations; the OR estimates the RR when the outcome is rare (Eq 9.1).
  • Rate-based designs use open populations and incidence density sampling (Eqs 9.2–9.5).
  • In rate-based designs, the OR directly estimates the IR with no rarity assumption needed.
  • Incidence density sampling matches controls to cases by time of occurrence.

Reflection

Why is the distinction between risk-based and rate-based case-control designs important for interpreting the odds ratio? In what situations would you recommend a rate-based design over a risk-based design, and how would this affect control selection?

Knowledge Check — Section 3

1. In a risk-based case-control study, controls are selected from:

Correct answer: B. In a risk-based (cumulative incidence) design, controls are selected from among the people that did not become cases by the end of the study period. The population must be closed.

2. The odds ratio in a risk-based case-control study estimates the risk ratio when:

Correct answer: C. The OR estimates the ratio of risks (RR) when the outcome is relatively infrequent (e.g., <5%) in the source population. This is known as the rare disease assumption.

3. What is the key advantage of the rate-based OR over the risk-based OR?

Correct answer: D. In rate-based designs, the OR estimates the IR (incidence rate ratio from a cohort study) without requiring an assumption about the rarity of the outcome.

4. In incidence density sampling, at each time a case occurs we select controls from:

Correct answer: A. In incidence density sampling, controls are selected from the risk set of non-case subjects at the time each case occurs. Subjects initially identified as controls can subsequently become cases.
Section 4

Comparability, Analysis & Reporting

⏱ Estimated reading time: 15 minutes

Learning Objectives

  • Discuss the number of controls per case and the use of multiple control groups.
  • Describe exposure and covariate assessment in case-control studies.
  • Explain the three approaches to keeping cases and controls comparable.
  • Describe the analysis of case-control data and STROBE reporting guidelines.

Number of Controls per Case (Section 9.8)

Most studies use a 1:1 case-control ratio; however, other than being statistically efficient, there is nothing magical about this ratio. If the information on covariates and exposure is already recorded (i.e., exposure data is ‘free’), one might use all qualifying non-cases as controls to avoid sampling issues.

Practical Guidelines

When the number of cases is small, the precision of association measures can be improved by selecting more than one control per case. There are formal approaches for deciding the optimal number, but usually the benefit of increasing the number of controls per case is small; often 3–4 controls per case is the practical maximum.
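The diminishing return from extra controls can be illustrated with a commonly cited rule of thumb that is not taken from this lesson: when the exposure has little or no effect, a study with r controls per case has roughly r / (r + 1) of the statistical efficiency of a study with an unlimited number of controls. The sketch below tabulates that approximation; the function name is hypothetical.

```python
def relative_efficiency(r: int) -> float:
    """Approximate efficiency of r controls per case relative to unlimited controls.

    Rule-of-thumb approximation r / (r + 1); an assumption for illustration,
    most accurate when the true association is weak.
    """
    if r <= 0:
        raise ValueError("r must be a positive number of controls per case")
    return r / (r + 1)


efficiencies = {r: relative_efficiency(r) for r in range(1, 7)}
for r, eff in efficiencies.items():
    print(f"{r} control(s) per case: about {eff:.0%} of maximum efficiency")
```

Under this approximation, moving from 1 to 4 controls per case raises relative efficiency from about 50% to about 80%, while each further control adds only a few percentage points, which is consistent with the practical maximum of 3–4 noted above.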

Number of Control Groups (Section 9.9)

Some researchers use multiple control groups to balance a perceived bias with one specific control group (Examples 9.5 and 9.6). However, this should be clearly defined, as it adds complexity and can be difficult to interpret if the different control groups produce different results.

Example 9.5 — Secondary-Base Study with Population Controls

Abubakar et al (2007) studied Crohn’s disease risk factors from 9 hospitals in England using both hospital-derived and community controls. The a priori design was matched with 104 cases. For community controls, 2 general practitioners per Crohn’s patient were randomly selected, matched by age (±1 year) and gender. The authors noted that the choice of control group had little impact on their results.

Example 9.6 — Primary-Care and Population-Based Controls

Brenner et al (2010) evaluated lung cancer risk factors in never-smokers in Toronto. They used both population-based controls (randomly sampled from property tax files, n=425) and hospital-based controls (from a family medicine clinic, n=523). Unconditional logistic regression models were used. A separate analysis based on 156 non-smoking cases with 466 non-smoking controls confirmed the main findings.

Exposure & Covariate Assessment (Section 9.10)

Most case-control studies are retrospective, so a concise, workable definition of ‘exposure’ (and also of confounders) is needed when implementing the study design. When ascertaining exposure status and information on confounders, it is preferable to obtain the greatest accuracy possible using the same process for both cases and controls.

General Rules for Exposure Assessment

When possible, have data collectors blinded to case status. As a general rule, the exposure status of cases should be the exposure category that existed at the time of outcome occurrence. For controls, their exposure status reflects their exposure situation at the time of their selection.

Keeping Cases and Controls Comparable (Section 9.11)

To obtain unbiased estimates, covariates related to both the outcome and the exposure should have a similar distribution in cases and controls. Three approaches can be used:

  • Exclusion / inclusion
  • Matching
  • Analytic control (multivariable techniques)

Analysis of Case-Control Data (Section 9.12)

The data format and analysis for both risk-based and rate-based designs proceed in a similar manner. In a 2×2 table:

| | Exposed | Non-exposed | Total |
| Cases | a1 | a0 | m1 |
| Controls | b1 | b0 | m0 |

Remember that we cannot directly estimate disease frequency (unless the study is nested) because the m1:m0 ratio was fixed by the sampling design. Chapter 6 outlines the analysis including hypothesis testing, estimating the odds ratio, and developing confidence intervals.
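As an illustration of the kind of analysis referred to above, the sketch below computes the odds ratio with an approximate 95% confidence interval using the log-based (Woolf) method, one common large-sample approach. It is not necessarily the method presented in Chapter 6, it applies to unmatched data only, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_ci(a1, a0, b1, b0, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    a1, a0: exposed / non-exposed cases
    b1, b0: exposed / non-exposed controls
    z:      standard normal quantile (1.96 for an approximate 95% interval)
    """
    if min(a1, a0, b1, b0) <= 0:
        raise ValueError("all four cells must be positive for this approximation")
    or_ = (a1 * b0) / (b1 * a0)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / a1 + 1 / a0 + 1 / b1 + 1 / b0)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


# Hypothetical counts, for illustration only.
or_, lower, upper = odds_ratio_ci(a1=40, a0=60, b1=50, b0=150)
print(f"OR = {or_:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

For matched designs, an analysis that respects the matching (such as the conditional logistic regression used in Example 9.1) would be needed instead.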

With risk-based designs and sampling of controls at the end of the follow-up period, the odds ratio estimates the risk ratio if the frequency of disease in the source population is low (e.g., below 10%), and censoring is unrelated to exposure.

If concurrent sampling (incidence density sampling) is used, the odds ratio estimates the rate ratio in both closed and open populations. For validity, stability of exposure is needed in the closed population but not in the open population.

When controls are selected from an open population without concurrent sampling, the odds ratio estimates the rate ratio only if the population is stable; otherwise, it is just the odds ratio. If matching is used to select controls but is ignored in the analysis, the impact depends on the extent of exposure changes during the study period (Knol et al, 2008).

Reporting Guidelines (Section 9.13)

Vandenbroucke et al (2007) described the key elements of case-control studies that should be reported (STROBE). The complete listing is in Table 7.3; items specific to case-control studies are included in Table 9.1.

Table 9.1 — STROBE Items Specific to Case-Control Studies

Methods:

  • Item 6a: Give the eligibility criteria, and the sources and methods of case ascertainment and control selection. Give the rationale for the choice of cases and controls.
  • Item 6b: For matched studies, give matching criteria and the number of controls per case.
  • Item 12: If applicable, explain how matching of cases and controls was addressed.

Results:

  • Item 15: Report numbers in each exposure category, or summary measures of exposure.

Key Takeaways

  • 3–4 controls per case is usually the practical maximum for improving precision.
  • Multiple control groups add complexity; the general experience is that more than one control group has limited value.
  • Exposure assessment should use the same process for cases and controls, with blinding when possible.
  • Comparability is achieved through exclusion, matching, or analytic control (multivariable techniques).
  • What the OR estimates (RR or IR) depends on the study design and sampling approach.

Reflection

A colleague presents a case-control study with an odds ratio of 2.5 and asks: “Does this mean exposed people have 2.5 times the risk?” How would you respond? Consider the study design (risk-based vs. rate-based), the rarity of the outcome, and what the OR actually estimates under different conditions.

Knowledge Check — Section 4

1. What is the practical maximum number of controls per case in most case-control studies?

Correct answer: C. While increasing the number of controls per case improves precision, the benefit diminishes; often 3–4 controls per case is the practical maximum.

2. What approach to preventing confounding is ‘most often relied upon’ in case-control studies?

Correct answer: B. When there are numerous potential confounders, matching is often impractical. Analytic control using multivariable techniques is the approach most often relied upon, sometimes combined with restricted sampling.

3. When concurrent (incidence density) sampling is used, the OR estimates:

Correct answer: A. When incidence density sampling is used, the OR estimates the rate ratio in both closed and open populations, without requiring the rare disease assumption.

4. According to STROBE guidelines for case-control studies, which of the following should be reported?

Correct answer: D. STROBE Item 6a specifies that investigators should give the eligibility criteria, sources and methods of case ascertainment and control selection, and give the rationale for the choice of cases and controls.
Section 5

Final Review & Assessment

⏱ Estimated time: 20 minutes

Lesson Summary

In this lesson, you have explored the design, implementation, analysis, and reporting of case-control studies. You have learned to distinguish between primary-base and secondary-base designs, understand the principles of control selection, compare risk-based and rate-based sampling approaches, and apply STROBE reporting guidelines.

Core Concepts Reviewed

Section 1: Case-control study logic, primary vs. secondary study base, nested designs, prospective vs. retrospective designs.

Section 2: Case series selection, incident vs. prevalent cases, diagnostic criteria, four principles of control selection, sources of controls.

Section 3: Risk-based designs and the OR (Eq 9.1), rate-based designs and incidence density sampling (Eqs 9.2–9.5), what the OR estimates.

Section 4: Number of controls, multiple control groups, exposure assessment, comparability (exclusion, matching, analytic control), analysis and STROBE reporting.

Final Reflection

Design a brief case-control study proposal for a health question of your choice. Specify: (1) the research question, (2) whether you would use a primary or secondary study base and why, (3) how you would define and identify cases, (4) how you would select controls and from what source, (5) whether a risk-based or rate-based design is more appropriate, and (6) how you would ensure comparability.

Final Assessment

This assessment covers all sections of Lesson 10. You must score 100% to complete the lesson. Review the feedback after each attempt.

Final Assessment — Lesson 10: Case-Control Studies (15 Questions)

1. The fundamental logic of a case-control study is to:

Correct answer: C. Case-control studies select cases (diseased) and controls (non-diseased) and then compare the frequency of exposure factors between the two groups to assess association.

2. A case-control study is NOT a comparison between cases and:

Correct answer: B. A case-control study is not a comparison between cases and ‘healthy’ subjects. Controls are non-case subjects who may have other diseases, whose exposure should reflect the exposure in the source population.

3. A secondary study base refers to a source population that is:

Correct answer: A. A secondary study base is one or more steps removed from the actual source population, such as people at a referral clinic, laboratory, or central registry.

4. A unique advantage of a nested case-control study is that it can:

Correct answer: D. In nested designs, sampling fractions of cases and controls can be obtained, allowing estimation of disease frequency by exposure status — a feature absent in most other case-control designs.

5. Why are incident cases preferred over prevalent cases?

Correct answer: B. There is virtually unanimous agreement that incident cases should be used when possible, as prevalent cases can introduce biases related to duration of disease, survival, and changes in exposure over time.

6. According to the principles of control selection, controls should:

Correct answer: C. The key guideline is that controls should be representative of the exposure experience in the population which gave rise to the cases. They should be people who would have become cases had they developed the disease.

7. The odds ratio in a risk-based case-control study (Eq 9.1) is calculated as:

Correct answer: A. The OR is the cross-product ratio: (a1 × b0) / (b1 × a0), which equals the ratio of the odds of exposure in cases to the odds of exposure in controls.

8. In rate-based case-control designs, the OR estimates the incidence rate ratio because:

Correct answer: D. In rate-based designs, controls are sampled using a sampling rate (sr) that is equal in exposed and non-exposed populations, so the ratio of exposed to unexposed controls reflects the ratio of person-time at risk (Eqs 9.3–9.4).

9. What is incidence density sampling?

Correct answer: B. Incidence density sampling selects a specified number of non-cases from the risk set, matched time-wise, to the occurrence of each case. Controls initially identified can subsequently become cases.

10. What is the practical maximum number of controls per case before benefits diminish?

Correct answer: C. While more controls improve precision, the benefit of increasing the number diminishes quickly; 3–4 controls per case is typically the practical maximum.

11. Why might using friend controls lead to biased estimates?

Correct answer: A. Bunin et al (2011) found that friend controls led to biased estimates of association because of over-matching — friends tend to share similar lifestyle factors and exposures as the cases.

12. In Example 9.3, the secondary-base case-control study of prostate cancer used controls from:

Correct answer: D. Magura et al (2008) identified controls from the primary-care database of the same hospital — men without cancer, aged 50–74, who had annual physicals and lipid profile tests.

13. When ascertaining exposure in case-control studies, what is recommended?

Correct answer: B. The process of ascertaining exposure history should have comparable accuracy in both groups. Using the same data collection methods and, when possible, having data collectors blinded to case status reduces information bias.

14. The general experience regarding multiple control groups is that:

Correct answer: C. Pomp et al (2010) note that the general experience is that the value of more than one control group is very limited. If different control groups produce different results, it can be difficult to determine which is correct.

15. According to STROBE, which item is specific to case-control study reporting?

Correct answer: A. STROBE Item 6a for case-control studies specifies giving the eligibility criteria, sources and methods of case ascertainment and control selection, and providing the rationale for the choice of cases and controls.
