HSCI 410 — Lesson 6

Modelling Ordinal & Multinomial Data

Exploratory Data Analysis For Epidemiology

Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University

Learning objectives for this lesson:

  • Select an appropriate model (multinomial, proportional-odds, adjacent-category, or continuation-ratio) based on study objectives and data
  • Fit all of the models listed above
  • Evaluate the assumptions on which each model is based
  • Interpret OR estimates from each model
  • Compute predicted probabilities from each model

This course was developed by Kiffer G. Card, PhD, as a companion to Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Reference

Glossary — Key Terms, People & Concepts

📚 Reference page — available throughout the lesson

This glossary collects the key concepts, people, and ideas you will meet in this lesson. Use it as a reference while you work through the material, or as a review before assessments.

Key Concepts & Ideas
Ordinal Outcome: A categorical outcome with a natural order but unequal or unknown spacing between levels (e.g., mild/moderate/severe; Likert scales). Logical target for cumulative-logit models.
Nominal (Multinomial) Outcome: A categorical outcome with three or more unordered categories (e.g., disease subtype A/B/C). Modeled with multinomial logistic regression.
Baseline / Reference Category: The category of a categorical outcome (or predictor) that other levels are compared against. Choice affects coefficient interpretation but not overall fit.
Cumulative Probability: The probability that the outcome falls at or below a particular ordinal level. Cumulative-logit models work on logits of these cumulative probabilities.
Latent-Variable Formulation: A motivation for ordinal models in which an unobserved continuous variable is sliced into ordered categories by thresholds. Justifies the proportional-odds structure.
Methods & Statistical Concepts
Proportional Odds Assumption: In cumulative-logit models, the assumption that the effect of each predictor is the same across all cut-points of the outcome. Tested with a Brant or score test.
Cumulative Logit Model: An ordinal regression that models log-odds of cumulative probabilities: logit(P(Y ≤ j)) = αj − Xβ, with one intercept αj per cut-point. Yields a single β per predictor under proportional odds.
Partial Proportional Odds: A relaxation of proportional odds in which some predictors have category-specific coefficients while others remain constrained. Useful when proportional odds fails for only a few variables.
Brant Test: A formal test of the proportional-odds assumption that compares fitted coefficients across binary logit splits at each cut-point.
Adjacent-Category Logit: An ordinal model that compares each category to its neighbour, rather than cumulating. An alternative to proportional odds for ordered outcomes.
Continuation-Ratio Model: An ordinal model that compares each category against all lower categories combined (the form used in Equation 17.4; some texts use the mirror-image comparison against all higher categories). Useful when categories represent a sequential process.
Multinomial Logit: A model for nominal outcomes that fits k−1 separate logits, each comparing a non-reference category against the baseline. Yields category-specific coefficients.
IIA (Independence of Irrelevant Alternatives): An assumption of the multinomial logit: relative odds between any two categories don't depend on which other categories are available. Tested with the Hausman-McFadden test.
Probit Link: An alternative link function (inverse normal CDF) used in some ordinal and binary models. Gives similar results to logit but uses standard-normal thresholds.
VGAM / polr: R packages/functions for ordinal regression. MASS::polr fits proportional-odds models; VGAM::vglm handles a broader family including partial-PO and continuation-ratio.
Key People
Peter McCullagh (1952– ): Irish statistician who introduced the cumulative-logit (proportional-odds) model in his 1980 paper, providing the foundation for modern ordinal regression.
Alan Agresti (1947– ): American statistician whose textbook Categorical Data Analysis is the standard reference on ordinal, nominal, and contingency-table methods.
Rollin Brant: Statistician who developed the eponymous test (1990) for the proportional-odds assumption.
Section 1

Introduction & Overview of Models

⏱ Estimated time: 20 minutes

Introduction and Overview

Lesson 5 covered logistic regression for binary outcomes. Lesson 6 extends the framework to outcomes with more than two categories — nominal (no natural order) or ordinal (ordered). The four content sections walk through the major models in order: an overview of the four-model toolkit (Section 1), the multinomial logistic model for nominal outcomes (Section 2), the proportional-odds model for ordinal outcomes including how to test the proportional-odds assumption (Section 3), and finally adjacent-category and continuation-ratio models as alternatives when proportional-odds doesn't hold (Section 4). For an epidemiology-focused overview that maps all four models onto the same dataset, see Ananth & Kleinbaum (1997); the open-access encyclopedia entries on Ordinal regression (Wikipedia, 2026) and Multinomial logistic regression (Wikipedia, 2026) summarise the same toolkit at an undergraduate level.

Learning Objectives

  • Distinguish nominal from ordinal outcomes and explain why each calls for a different modelling strategy.
  • Map the four logits used by multinomial, proportional-odds, adjacent-category, and continuation-ratio models.
  • Use the Apgar-score example to anticipate how each model will partition the same outcome.
  • Choose an initial model based on the structure and ordering of the outcome categories.

When Outcomes Have More Than Two Categories

In many epidemiological studies the outcome variable has more than two categories. These outcomes fall into two broad types: nominal data, where the categories have no natural ordering (e.g., type of disease, preferred clinic), and ordinal data, where the categories are ordered (e.g., pain severity: none, mild, moderate, severe).

The choice of model depends on whether the outcome is nominal or ordinal. Nominal data require multinomial logistic regression or log-linear models. Ordinal data can be analysed with the same multinomial model (ignoring the ordering), but more efficient approaches exploit the ordering: proportional-odds, adjacent-category, and continuation-ratio models (McCullagh, 1980; Ananth & Kleinbaum, 1997).


The Apgar Score Example

Throughout Chapter 17, the authors use Apgar scores as a running example. Apgar scores (measured at birth) are recoded into four ordinal categories. The research question is whether the number of prenatal visits is associated with Apgar score category.

Apgar Category | Code | Prenatal Visits < 6 | Prenatal Visits ≥ 6 | Total
1–6 (Low) | 0 | 47 | 25 | 72
7 | 1 | 48 | 42 | 90
8 | 2 | 59 | 72 | 131
9–10 (High) | 3 | 134 | 227 | 361
Total | | 288 | 366 | 654

Overview of the Four Models

Each of the four models for multi-category outcomes uses a different formulation of the logit (log-odds). Understanding the logit structure is the key to understanding each model.

Multinomial Logistic Regression (Eq 17.1)

Compares each outcome category to a baseline category. For J categories, the model estimates J−1 sets of coefficients. Each set describes how predictors relate to the log-odds of being in category j versus the baseline.

Equation 17.1
ln[p(Y = j) / p(Y = 1)] = β0(j) + β1(j)X

No assumptions about ordering are made, so this model is appropriate for both nominal and ordinal outcomes (though it is less efficient for ordinal data).

Proportional-Odds Model (Eq 17.2)

Based on cumulative probabilities. The logit compares the probability of being at or above category j versus below it. A single coefficient per predictor applies at every cutpoint—the proportional-odds assumption.

Equation 17.2
ln[p(Y ≥ j) / p(Y < j)] = β0(j) + β1X

This is the most common ordinal logistic model and is more parsimonious than the multinomial model (McCullagh, 1980).

Adjacent-Category Model (Eq 17.3)

Compares each category to the adjacent (next lower) category. This model is a constrained version of the multinomial model where the coefficient for categories n levels apart equals n times the coefficient for adjacent categories.

Equation 17.3
ln[p(Y = j) / p(Y = j−1)] = β0(j) + β1X

Like the proportional-odds model, it estimates a single β1 per predictor.

Continuation-Ratio Model (Eq 17.4)

Compares each category to all lower categories combined. This model is especially appropriate when the outcome represents sequential stages that must be “passed through” to reach higher levels (e.g., number of attempts to achieve certification).

Equation 17.4
ln[p(Y = j) / p(Y < j)] = β0(j) + β1X

The model can be fit as a series of separate binary logistic regressions with appropriately recoded outcome variables.
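
As a rough illustration of how the four logits partition the same outcome, the sketch below computes the crude (unadjusted) odds ratio for ≥6 versus <6 prenatal visits under each formulation, using the counts from the Apgar table as printed in this section. The names apgar and crude_or are made up for this example, category 3 is assumed as the multinomial baseline, and these crude values should not be expected to match model-based estimates quoted elsewhere in the lesson.

```r
# Counts from the Apgar table: rows are category codes 0-3,
# columns are prenatal visits (<6, >=6).
apgar <- matrix(c( 47,  25,
                   48,  42,
                   59,  72,
                  134, 227),
                ncol = 2, byrow = TRUE,
                dimnames = list(code = as.character(0:3),
                                visits = c("lt6", "ge6")))

# Crude OR (>=6 vs <6 visits) for being in `num` codes rather than `den` codes.
crude_or <- function(num, den) {
  n_num <- colSums(apgar[as.character(num), , drop = FALSE])
  n_den <- colSums(apgar[as.character(den), , drop = FALSE])
  odds  <- n_num / n_den
  unname(odds["ge6"] / odds["lt6"])
}

# Multinomial logit: each category vs the baseline (assumed here: code 3)
or_multinomial  <- sapply(0:2, function(j) crude_or(j, 3))
# Proportional-odds logit: Y >= j vs Y < j
or_cumulative   <- sapply(1:3, function(j) crude_or(j:3, 0:(j - 1)))
# Adjacent-category logit: Y = j vs Y = j - 1
or_adjacent     <- sapply(1:3, function(j) crude_or(j, j - 1))
# Continuation-ratio logit: Y = j vs Y < j
or_continuation <- sapply(1:3, function(j) crude_or(j, 0:(j - 1)))

round(rbind(or_multinomial, or_cumulative, or_adjacent, or_continuation), 2)
```

Each row of the result holds three ORs, one per comparison. The proportional-odds, adjacent-category, and constrained continuation-ratio models each replace their row with a single common value, which is the constraint each model imposes.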

Knowledge Check — Section 1

1. Nominal outcome data differ from ordinal outcome data in that:

Nominal data have categories with no inherent ranking or ordering (e.g., type of disease), while ordinal data have categories that can be meaningfully ordered (e.g., disease severity). The number of categories is not related to the distinction.

2. How many sets of coefficients does a multinomial logistic model estimate for J outcome categories?

A multinomial logistic model compares each of the J−1 non-baseline categories to the baseline category. Each comparison requires its own set of coefficients, yielding J−1 sets in total.

3. Which model assumes the effect of a predictor is the same across all cutpoints?

The proportional-odds model estimates a single coefficient per predictor that applies at every cutpoint of the outcome. This is the “proportional odds” assumption—the OR is the same regardless of where you divide the outcome categories.

✎ Reflection

Think of an ordinal outcome variable from your own field of study. What are the categories, and which of the four models introduced here do you think would be most appropriate? Why?

Model answer: Pick a real ordinal outcome (e.g., self-rated health: poor, fair, good, very good, excellent). The most appropriate default is the proportional-odds (cumulative logit) model, because it preserves the ordering and gives interpretable cumulative ORs. Fall back to partial proportional odds if Brant's test flags violations, and to multinomial logistic regression if the ordinal structure is implausible (e.g., the ‘fair’ category mixes truly intermediate health with mistranslated responses). The continuation-ratio model is appropriate if the outcome represents progression through stages (e.g., disease progression from healthy → mild → moderate → severe, with each stage as a hurdle), but not for cross-sectional self-rated health, where all categories are observed simultaneously.
Section 2

Multinomial Logistic Regression

⏱ Estimated time: 25 minutes

Introduction and Overview

Section 1 mapped the four-model landscape. Section 2 walks into the first one in detail: the multinomial logistic model. This is the most general option — it works on any categorical outcome, ordered or not — but you pay for that generality with a coefficient for every comparison and many more parameters to interpret.

Learning Objectives

  • Set up a multinomial logistic model as J−1 simultaneous binary logits against a chosen baseline.
  • Interpret exponentiated coefficients as relative risk ratios for non-baseline versus baseline categories.
  • Compute predicted probabilities for every outcome category from the joint set of linear predictors.
  • Recognise the independence-of-irrelevant-alternatives (IIA) assumption and its implications.

The Multinomial Logistic Model

The multinomial logistic model simultaneously fits J−1 separate logistic models, each comparing one category to a chosen baseline. All parameters are estimated jointly, so the model accounts for the correlation among the comparisons.

Predicted Probabilities

The predicted probability for each outcome category is computed from the set of linear predictors. Let Xβ(j) denote the linear predictor for category j.

Equation 17.5 — Probability for the baseline category
p(Y = 0) = 1 / [1 + exp(Xβ(1)) + exp(Xβ(2)) + exp(Xβ(3))]
Equation 17.6 — Probability for category j
p(Y = j) = exp(Xβ(j)) / [1 + exp(Xβ(1)) + exp(Xβ(2)) + exp(Xβ(3))]

These probabilities always sum to 1 across all categories. Each predicted probability depends on all sets of coefficients, not just the coefficients for that category.
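
Equations 17.5 and 17.6 can be sketched in a few lines of R. The function name multinom_probs and the linear-predictor values are hypothetical and chosen only for illustration; they are not estimates from the Apgar data.

```r
# Equations 17.5-17.6: probabilities for a four-category outcome (codes 0-3)
# from the three linear predictors X*beta(1), X*beta(2), X*beta(3).
multinom_probs <- function(eta) {
  denom <- 1 + sum(exp(eta))            # shared denominator
  p <- c(1, exp(eta)) / denom           # baseline first, then categories 1..3
  names(p) <- paste0("Y=", seq_along(p) - 1)
  p
}

eta_example <- c(-0.5, 0.2, 1.1)        # hypothetical values, not fitted
p <- multinom_probs(eta_example)
p
sum(p)                                  # equals 1 up to rounding error
```

Changing any one element of eta_example changes all four probabilities, because every category shares the same denominator.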

Interpreting Odds Ratios

Exponentiated coefficients from a multinomial model are technically ratios of relative risks (RRR), not true odds ratios. For a one-unit increase in the predictor, exp(β(j)) is the factor by which the ratio p(Y = j) / p(Y = baseline) is multiplied.

Because the multinomial model estimates separate coefficients for each comparison, the effects can differ across categories. For ordinal outcomes, you would typically expect a gradient—more pronounced effects for the categories furthest from the baseline.

Predicted Probabilities

Predicted probabilities from a multinomial model vary by the values of all predictors. To communicate results, it is often useful to compute predicted probabilities at specific covariate patterns (e.g., prenatal visits < 6 vs ≥ 6) and present them in a table or graph.

Testing significance can be done with Wald tests (for individual coefficients) or likelihood-ratio tests (for overall effects). Because the multinomial model has J−1 coefficients per predictor, an overall LRT that tests all J−1 simultaneously is generally preferred over examining individual coefficients.
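
A minimal sketch of both kinds of test, assuming the nnet package and using the Apgar cross-tabulation from Section 1 as printed, with prenatal visits as the only predictor. The data-frame layout and object names are choices made for this example. Because the model is unadjusted, its estimates are crude and are not expected to reproduce figures quoted elsewhere in this lesson.

```r
library(nnet)   # multinom(); a recommended package shipped with R

# Apgar table in long format, one row per cell, cell count as weight.
d <- data.frame(
  apgar  = factor(rep(0:3, times = 2)),
  visits = factor(rep(c("lt6", "ge6"), each = 4), levels = c("lt6", "ge6")),
  n      = c(47, 48, 59, 134,  25, 42, 72, 227)
)
d$apgar <- relevel(d$apgar, ref = "3")   # category 3 (9-10) as baseline

fit0 <- multinom(apgar ~ 1,      weights = n, data = d, trace = FALSE)
fit1 <- multinom(apgar ~ visits, weights = n, data = d, trace = FALSE)

# Wald z statistics, one per non-baseline category
s <- summary(fit1)
wald_z <- s$coefficients[, "visitsge6"] / s$standard.errors[, "visitsge6"]

# Overall LRT of all J-1 = 3 visit coefficients at once
lr_stat <- as.numeric(2 * (logLik(fit1) - logLik(fit0)))
lr_df   <- attr(logLik(fit1), "df") - attr(logLik(fit0), "df")
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

exp(coef(fit1)[, "visitsge6"])           # crude relative-risk ratios
c(LR = lr_stat, df = lr_df, p = lr_p)
```

The three Wald statistics answer three separate questions, while the single LRT asks whether prenatal visits matter for the outcome as a whole.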

Independence of Irrelevant Alternatives (IIA)

The multinomial logistic model assumes IIA: the odds of choosing one category over another are independent of what other categories are available. If this assumption is violated, adding or removing a category would change the odds between the remaining categories.

Two tests are available: the Hausman-McFadden test and the Small-Hsiao test. However, these tests often give conflicting results, and IIA violations are primarily a concern for nominal data where alternatives are genuinely substitutable (e.g., choosing a mode of transport). For ordinal data, IIA is rarely a practical concern (Wikipedia, 2026).
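
The IIA property is built into the multinomial logit formula, which the following sketch illustrates for a transport-mode choice. The alternatives and their linear-predictor values are entirely hypothetical, and softmax_with_baseline is a helper written for this example.

```r
# Hypothetical linear predictors for three non-baseline alternatives;
# the baseline alternative (car) has linear predictor 0.
eta <- c(bus = 0.4, train = -0.3, bike = 1.0)

softmax_with_baseline <- function(eta) {
  e <- c(car = 1, exp(eta))
  e / sum(e)
}

p_all  <- softmax_with_baseline(eta)                     # all four alternatives
p_drop <- softmax_with_baseline(eta[c("bus", "train")])  # bike removed

# Odds of bus vs train are identical in both choice sets
p_all["bus"]  / p_all["train"]
p_drop["bus"] / p_drop["train"]
```

Removing an alternative rescales every remaining probability by the same factor, so the ratio between any two of them is unchanged. IIA is violated when real behaviour does not follow this pattern.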

📋 Example: Apgar Scores and Prenatal Visits

In the Apgar score example, the multinomial model (with category 3 [9–10] as baseline) produced the following key results for prenatal visits (≥6 vs <6):

  • Category 0 vs 3: OR = 0.24 — those with ≥6 visits have 76% lower relative risk of a low Apgar score
  • Category 1 vs 3: OR = 0.65 — 35% lower relative risk of Apgar = 7
  • Category 2 vs 3: OR = 0.72 — 28% lower relative risk of Apgar = 8

The gradient (0.24 → 0.65 → 0.72) shows the strongest effect for the lowest Apgar category, as expected for an ordinal outcome.

Note on IIA Tests

The Hausman-McFadden and Small-Hsiao tests for IIA often give conflicting results and are not always reliable. In practice, IIA is mainly a concern for nominal (unordered) outcomes with genuinely substitutable alternatives. For ordinal outcomes, it is rarely problematic. Regression diagnostics can be performed by fitting ordinary logistic models for pairs of categories and using standard diagnostic techniques.

Alternative-Specific Data

In some situations, predictors may vary across alternatives rather than (or in addition to) varying across observations. For example, in a study of clinic choice, the distance to each clinic varies by alternative. Special formulations of the multinomial model (conditional logit or mixed logit) accommodate such alternative-specific data.

Knowledge Check — Section 2

1. In multinomial logistic regression, the exponentiated coefficients represent:

The exponentiated coefficients in a multinomial logistic model are ratios of relative risks (RRR), not true odds ratios. Each RRR compares the probability of being in a given category relative to the baseline for a one-unit change in the predictor.

2. The IIA assumption states that:

The Independence of Irrelevant Alternatives (IIA) assumption states that the relative odds of choosing one category over another do not depend on what other categories are available. Violation of IIA means that adding or removing a category would change the estimated odds between remaining categories.

3. How are predicted probabilities computed from a multinomial model?

The predicted probability for category j is exp(Xβ(j)) divided by [1 + Σ exp(Xβ(k))] for all non-baseline categories k. For the baseline category, the probability is 1 divided by the same denominator. This ensures all probabilities sum to 1.

✎ Reflection

Consider the Apgar score example. Why do you think the OR for the lowest Apgar category (0.24) is more extreme than for the middle categories? What does this gradient tell us about prenatal care and birth outcomes?

Model answer: For ≥6 versus <6 prenatal visits, the lowest Apgar category (1–6) has an OR of 0.24 relative to the highest category (9–10). The more extreme OR at the lowest category plausibly reflects two mechanisms: (a) a severity gradient, in which the most distressed neonates have the most concentrated risk factors (substance exposure, late prenatal care, undiagnosed conditions), all of which prenatal care could address; and (b) preventability, in which severe outcomes have more identifiable causes and more leverage from intervention. The gradient (smaller OR at the lowest Apgar category, OR closer to 1 at the middle categories) suggests that prenatal care does the most good where the marginal value of detection is highest: it prevents the catastrophic outcomes more efficiently than it prevents intermediate ones. This is the classical “high-risk targeting” pattern in public-health interventions.
Section 3

Proportional-Odds Model

⏱ Estimated time: 25 minutes

Introduction and Overview

Section 2 used multinomial logistic regression to fit a model that ignores any ordering in the outcome categories. Section 3 takes the more parsimonious route: when the categories are ordered, the proportional-odds model uses far fewer parameters by assuming the effect of each predictor is the same across all category cut-points. The trade-off is that you have to verify the proportional-odds assumption holds.

Learning Objectives

  • Express the proportional-odds (cumulative logit) model in terms of an underlying latent continuous variable and cutpoints.
  • Interpret a single odds ratio as the effect of a predictor across every dichotomisation of the ordinal outcome.
  • Test the proportional-odds assumption using the Brant (Wald) test or an approximate likelihood-ratio test against a less constrained model.
  • Diagnose what to do when proportional-odds clearly fails for a given predictor.

The Most Common Ordinal Model

The proportional-odds model (also called the cumulative logit model or ordinal logistic regression) is the most widely used model for ordinal outcomes. It was formalised by McCullagh (1980) and is based on the idea of an underlying continuous latent variable that is divided into the observed ordinal categories by a series of cutpoints (Wikipedia, 2026).

Equation 17.7 — Latent Variable
Si = β1X1i + β2X2i + … + βkXki + εi

The latent variable Si is divided by cutpoints (τ1, τ2, …, τJ−1) into J observed categories. If Si falls between τj−1 and τj, the observation is classified into category j.
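
The slicing of a latent variable into ordered categories can be sketched with simulated data. Everything below is hypothetical: the binary predictor, the coefficient, the cutpoints, and the choice of a logistic error term (the choice that corresponds to a logit link).

```r
set.seed(410)                     # arbitrary seed, for reproducibility
n    <- 1000
x    <- rbinom(n, 1, 0.5)         # hypothetical binary predictor
beta <- 0.8                       # hypothetical effect on the latent scale
tau  <- c(-1, 0, 1.5)             # hypothetical cutpoints tau_1..tau_3

S <- beta * x + rlogis(n)         # latent variable with logistic error
# Slice S at the cutpoints into four ordered categories coded 0-3
Y <- cut(S, breaks = c(-Inf, tau, Inf), labels = 0:3, ordered_result = TRUE)

table(x, Y)                       # observed ordinal categories by x
```

Only Y and x would be available in a real study; S and the cutpoints are never observed and are estimated indirectly through the model.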

The Proportional-Odds Logit

The model takes the form of a cumulative logit: logit(p(Y ≥ j)) = β0j + βX. The key feature is that the intercept varies across cutpoints (giving parallel lines on a logit scale) but the slope coefficients are the same for every cutpoint. This means a single OR summarises the effect of each predictor across all levels of the outcome (McCullagh, 1980).

Equation 17.9 — Predicted Probability from Latent Variable
p(Y = j) = p(S ≤ τj) − p(S ≤ τj−1)
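
Equation 17.9 can be sketched as follows, assuming a logistic error term so that p(S ≤ τj) = plogis(τj − Xβ). The cutpoints and coefficient are hypothetical, and po_probs and odds_ge are helper names invented for this example.

```r
# P(Y = j) from Equation 17.9, assuming a logistic error term so that
# P(S <= tau_j) = plogis(tau_j - eta), where eta is the linear predictor.
po_probs <- function(eta, tau) {
  cum <- c(0, plogis(tau - eta), 1)   # cumulative probabilities, padded with 0 and 1
  diff(cum)                           # successive differences give P(Y = j)
}

tau  <- c(-1, 0, 1.5)                 # hypothetical cutpoints
beta <- 0.8                           # hypothetical log-odds ratio
p0 <- po_probs(eta = 0,    tau = tau) # predictor = 0
p1 <- po_probs(eta = beta, tau = tau) # predictor = 1

# Odds of being at or above each cutpoint, and their ratio between groups
odds_ge <- function(p) rev(cumsum(rev(p)))[-1] / cumsum(p)[-length(p)]
odds_ge(p1) / odds_ge(p0)             # the same value, exp(beta), at every cutpoint
```

The category probabilities differ between the two groups in a complicated way, yet the ratio of cumulative odds is identical at all three cutpoints. That constancy is the proportional-odds property.
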
📋 Example: Apgar Scores — Proportional-Odds Model

In the Apgar score example, the proportional-odds model yields an OR of 1.59 for prenatal visits (≥6 vs <6). This means that individuals with 6 or more prenatal visits have 1.59 times the odds of being at or above any given Apgar category, compared to those with fewer visits. This single OR applies at every cutpoint (0 vs 1+, 0–1 vs 2+, and 0–2 vs 3).

R Activity — Multinomial and proportional-odds models in R

The course dataset phaa_survey_clean.csv has both an unordered multi-category variable (region — 5 Lower-Mainland regions) and an ordered one (education — 5 levels). The full annotated script is in r-activities/HSCI_410_Lesson_6_Ordinal_and_Multinomial_Models.R.

library(nnet);  library(MASS);  library(brant)
phaa <- read.csv("phaa_survey_clean.csv", stringsAsFactors = FALSE)
phaa$region    <- relevel(factor(phaa$region), ref = "Vancouver")
phaa$education <- factor(phaa$education,
  levels = c("Less than high school", "High school", "Some college",
             "Bachelor's", "Graduate degree"),
  ordered = TRUE)

# 1. MULTINOMIAL: nominal outcome (region)
mn <- multinom(region ~ age + gender + smoker, data = phaa, trace = FALSE)
exp(coef(mn))                                           # relative-risk ratios
z <- summary(mn)$coefficients / summary(mn)$standard.errors
round((1 - pnorm(abs(z))) * 2, 3)                          # p-values

# 2. PROPORTIONAL-ODDS: ordered outcome (education)
po <- polr(education ~ age + gender + smoker, data = phaa, Hess = TRUE)
exp(cbind(OR = coef(po), confint(po)))                  # OR + 95% CI

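# 3. Brant test: is the proportional-odds assumption defensible?
brant(po)

# 4. Multinomial fit of the same ordered outcome, for the AIC comparison
#    (assumed definition of mn_edu, which the reflection questions refer to)
mn_edu <- multinom(education ~ age + gender + smoker, data = phaa, trace = FALSE)
AIC(po, mn_edu)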

Reading the output. The relative-risk ratios from multinom() compare each region to the reference (Vancouver). For polr(), a single OR per predictor applies at every cut-point of education (each split into “at or above” versus “below” a given level); that is the proportional-odds assumption that brant() tests. If the Brant overall p-value is < .05, fall back to the multinomial fit (or to a partial-PO model in the stretch).

R Reflect on what you just ran

Use the questions below to interpret the output you produced. Look at your console / plot before answering.

1. From exp(coef(mn)) and the matching p-value matrix, pick one non-reference region (anything but Vancouver) and one predictor (e.g., smokerYes). Report the relative-risk ratio, and state in one sentence what it says about the relative likelihood of living in that region versus Vancouver.

Model answer: Picking smokerYes in the Fraser-Valley vs. Vancouver contrast as an example: the relative-risk ratio (RRR) might be around 1.45 (95% CI varies). Interpretation: the ratio of the probability of living in the Fraser Valley to the probability of living in Vancouver is 45% higher for smokers than for non-smokers, controlling for the other covariates in the model. RRR > 1 means the predictor increases the relative likelihood of that category over the reference category.

2. From exp(cbind(OR = coef(po), confint(po))), report the OR for age and its 95% CI. Under the proportional-odds assumption, translate this OR into a sentence about being at-or-above any given education level.

Model answer: The OR for age in the proportional-odds model is typically around 1.02–1.04 (95% CI clearly excluding 1) per year of age. Interpretation under proportional odds: a one-year increase in age is associated with a 2–4% increase in the odds of being at-or-above any given education level. Because the OR is constant across cut-points (the proportional-odds assumption), this single OR summarises the age effect on each cumulative comparison, e.g., the odds of being “Some college or higher” vs. “HS or less,” or “Bachelor's or higher” vs. “Some college or less.”

3. Look at the brant(po) output. What is the overall p-value, and do any individual predictors flag a violation (p < .05)? Based on this, would you keep po or fall back to mn_edu (compare AICs)?

Model answer: brant() typically returns an overall p-value around 0.01–0.05; if the global test is significant, some individual predictors will usually also show p < .05, indicating violation of the proportional-odds assumption. If violations are flagged: (a) compare the AIC of po vs. mn_edu (the multinomial alternative); (b) if mn_edu has a substantially lower AIC, prefer the multinomial; (c) alternatively, fit a partial proportional-odds model that relaxes the assumption only for the violating predictors. When the assumption is violated and the categories are ordered, the partial proportional-odds model is often the cleanest practical move: its cut-point-specific coefficients retain the ordering of the outcome while allowing heterogeneous effects.

Testing the Proportional-Odds Assumption

The proportional-odds assumption is that the effect of each predictor is the same at every cutpoint. If it is violated, a single OR can misrepresent the effect at some cutpoints, so the assumption should be tested.

Testing Proportional Odds

Three main approaches exist:

  • Approximate LRT: Compare the log-likelihoods of the proportional-odds model and the multinomial model. A significant difference suggests the proportional-odds assumption is violated.
  • Wolfe-Gould approximate LRT: Based on J−1 separate binary logistic models at each cutpoint. Sum the log-likelihoods and compare to the proportional-odds model.
  • Brant (Wald) test: Provides both an overall test and individual tests for each predictor, showing which specific variables violate the assumption (Brant, 1990).
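
The first approach in the list above (the approximate LRT) can be sketched as follows, assuming the MASS and nnet packages and the Apgar cross-tabulation from Section 1 as printed, with prenatal visits as the only predictor. Object names are choices made for this example, and the result is a crude illustration rather than a reproduction of any analysis reported in the lesson.

```r
library(MASS)   # polr()
library(nnet)   # multinom()

# Apgar table in long format, one row per cell, cell count as weight.
d <- data.frame(
  apgar  = factor(rep(0:3, times = 2), ordered = TRUE),
  visits = factor(rep(c("lt6", "ge6"), each = 4), levels = c("lt6", "ge6")),
  n      = c(47, 48, 59, 134,  25, 42, 72, 227)
)

po <- polr(apgar ~ visits, weights = n, data = d)                      # 1 slope
mn <- multinom(apgar ~ visits, weights = n, data = d, trace = FALSE)   # 3 slopes

# Approximate LRT: twice the difference in log-likelihoods,
# with df equal to the difference in the number of parameters.
lr_stat <- as.numeric(2 * (logLik(mn) - logLik(po)))
lr_df   <- attr(logLik(mn), "df") - attr(logLik(po), "df")
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)
c(LR = lr_stat, df = lr_df, p = lr_p)
```

With one predictor and four outcome categories the comparison has 3 − 1 = 2 degrees of freedom. A small p-value would suggest that a single OR does not describe all three cutpoints adequately.
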
Generalised Ordinal Logistic Regression

If the proportional-odds assumption is violated, a generalised ordinal logistic regression model allows separate coefficients at each cutpoint. This model is equivalent to fitting J−1 separate binary logistic regressions simultaneously. It is more flexible but less parsimonious than the proportional-odds model (Williams, 2006).
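
The idea of cutpoint-specific coefficients can be sketched with base R by fitting one binary logistic regression per cutpoint. This uses the Apgar cross-tabulation from Section 1 as printed, with prenatal visits as the only predictor, and fits the three regressions separately rather than simultaneously, so it is an informal approximation to a generalised ordinal model and not a substitute for one.

```r
# Apgar table in long format, one row per cell, cell count as weight.
d <- data.frame(
  code   = rep(0:3, times = 2),
  visits = factor(rep(c("lt6", "ge6"), each = 4), levels = c("lt6", "ge6")),
  n      = c(47, 48, 59, 134,  25, 42, 72, 227)
)

# One binary logistic regression per cutpoint: Y >= j vs Y < j
cut_or <- sapply(1:3, function(j) {
  fit <- glm(as.integer(code >= j) ~ visits,
             family = binomial, weights = n, data = d)
  exp(coef(fit)[["visitsge6"]])
})
names(cut_or) <- paste0("Y>=", 1:3)
round(cut_or, 2)
```

If the three ORs are close to one another, a single proportional-odds OR is a reasonable summary. If they diverge, the cutpoint-specific version carries information that the constrained model would hide.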

Partial Proportional-Odds Model

A compromise approach is the partial proportional-odds model, which relaxes the proportional-odds assumption for selected predictors only (those that fail the Brant test) while maintaining it for the rest (Peterson & Harrell, 1990; Williams, 2006). This provides a good balance between flexibility and parsimony. Other alternatives include the stereotype logistic model and the heterogeneous choice logistic model.

⚠ The Proportional-Odds Assumption in Practice

The proportional-odds assumption is often violated in practice, especially with many predictors or when the outcome categories represent very different phenomena. Always test this assumption before reporting results from a proportional-odds model (Brant, 1990). If violated, consider a partial proportional-odds model or generalised ordinal logistic regression (Peterson & Harrell, 1990; Williams, 2006).

Brant Test Results Example

The Brant test provides both an overall test and predictor-specific tests. Here is an example of how results might be presented:

Predictor | χ² | df | P-value | Assumption Holds?
Prenatal visits | 2.14 | 2 | 0.343 | Yes
Maternal age | 8.92 | 2 | 0.012 | No
Parity | 1.03 | 2 | 0.598 | Yes
Overall | 12.45 | 6 | 0.053 | Borderline

In this example, only maternal age violates the assumption. A partial proportional-odds model that allows maternal age to have different effects at each cutpoint (while constraining prenatal visits and parity) would be appropriate.
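
The P-values in the example table follow directly from each χ² statistic and its degrees of freedom, as this short base R check shows. The data frame simply re-enters the illustrative figures from the table; brant_tab is a name chosen for this example.

```r
# Illustrative Brant results from the table (chi-square statistic and df)
brant_tab <- data.frame(
  predictor = c("Prenatal visits", "Maternal age", "Parity", "Overall"),
  chisq     = c(2.14, 8.92, 1.03, 12.45),
  df        = c(2, 2, 2, 6)
)

# Upper-tail chi-square probability for each row
brant_tab$p    <- pchisq(brant_tab$chisq, df = brant_tab$df, lower.tail = FALSE)
brant_tab$flag <- brant_tab$p < 0.05   # TRUE where proportional odds is rejected
brant_tab
```

With four outcome categories there are three cutpoints, so each predictor contributes two degrees of freedom: one coefficient is estimated under proportional odds where three would be estimated without it.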

Regression Diagnostics

Regression diagnostics for the proportional-odds model can be conducted by fitting binary logistic models at each cutpoint and applying the diagnostic techniques from Chapter 16 (residual analysis, influence measures, goodness-of-fit tests).

Knowledge Check — Section 3

1. The proportional-odds model assumes:

The proportional-odds (or parallel lines) assumption states that the coefficients for each predictor are the same regardless of which cutpoint is used to dichotomise the outcome. This means the OR is the same at every cutpoint.

2. If the proportional-odds assumption is violated for some but not all predictors, which model can be used?

The partial proportional-odds model relaxes the proportional-odds assumption for specific predictors (those that violate it) while maintaining the constraint for the rest. This provides a balance between the fully constrained proportional-odds model and the unconstrained multinomial model.

3. The latent variable in a proportional-odds model represents:

The proportional-odds model assumes an underlying continuous latent variable (Si) that is divided into the observed ordinal categories by cutpoints (τ). The latent variable represents the true continuous quantity that we observe only in categorised form.

✎ Reflection

Why do you think the proportional-odds assumption is so often violated in practice? Can you think of a scenario in your own research where you would expect the effect of a predictor to differ across cutpoints?

Model answer: The proportional-odds assumption requires that the effect of every predictor is the same across all cut-points of the ordinal outcome, a strong assumption that is rarely exactly true in practice. It is often violated because of: (a) floor/ceiling effects, where predictors have stronger effects at extreme categories than at middle ones; (b) non-linearity in the latent variable, where the underlying construct varies non-linearly and the cut-point ORs therefore differ; (c) category-specific mechanisms, where the cause of moving from severe to moderate is different from the cause of moving from mild to none. Example: income's effect on educational attainment may be very strong at high levels (deciding between PhD vs. Master's) and weak at low levels (deciding between any-HS vs. dropout, where structural barriers dominate). Pre-specify Brant tests and have a fallback model (partial proportional odds, generalised ordinal, or multinomial) in the protocol.
Section 4

Adjacent-Category & Continuation-Ratio Models

⏱ Estimated time: 20 minutes

Introduction and Overview

The proportional-odds model is the workhorse for ordered outcomes, but its proportional-odds assumption is genuinely restrictive and frequently fails in real data. Section 4 closes the lesson with two alternatives that build the logit differently: the adjacent-category model and the continuation-ratio model. Each replaces the cumulative comparison with its own, and each suits a different kind of ordering and a different research question.

Learning Objectives

  • Specify the adjacent-category model and test its constraint against an unconstrained multinomial fit.
  • Specify the continuation-ratio model and recognise the sequential-stage outcomes it suits.
  • Fit a continuation-ratio model as a series of binary logistic regressions on recoded data.
  • Choose between proportional-odds, adjacent-category, and continuation-ratio models based on the question and the data.

Adjacent-Category Model

The adjacent-category model compares the probability of being in category j versus category j−1 (the next lower category). It is a constrained version of the multinomial logistic model: the constraint is that the coefficient for categories n levels apart equals n times the coefficient for adjacent categories.

Like the proportional-odds model, the adjacent-category model estimates a single β1 per predictor, making it more parsimonious than the unconstrained multinomial model. The validity of this constraint can be tested by comparing the adjacent-category model to the unconstrained multinomial model using a likelihood-ratio test (LRT) (Ananth & Kleinbaum, 1997).
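
The constraint can be checked numerically. The sketch below builds category probabilities from Equation 17.3 using hypothetical intercepts and a hypothetical β1, then recovers the log odds ratios against category 0. The function name ac_probs is invented for this example.

```r
# Adjacent-category probabilities for categories 0..3 (Equation 17.3):
# log[p(Y = j) / p(Y = j - 1)] = alpha[j] + beta * x, for j = 1, 2, 3.
ac_probs <- function(x, alpha, beta) {
  log_vs_0 <- c(0, cumsum(alpha + beta * x))  # log[p(Y = j) / p(Y = 0)]
  exp(log_vs_0) / sum(exp(log_vs_0))
}

alpha <- c(0.3, 0.1, 0.9)   # hypothetical intercepts
beta  <- 0.25               # hypothetical adjacent-category log-odds ratio

p0 <- ac_probs(0, alpha, beta)   # predictor = 0
p1 <- ac_probs(1, alpha, beta)   # predictor = 1

# Log odds ratio for category j vs category 0, j = 1, 2, 3
log((p1[-1] / p1[1]) / (p0[-1] / p0[1]))   # beta, 2 * beta, 3 * beta
```

An unconstrained multinomial model would estimate these three log odds ratios freely. The adjacent-category model forces them onto the line β1, 2β1, 3β1, which is the restriction the LRT evaluates.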

📋 Example: Apgar Scores — Adjacent-Category Model

For the Apgar score data, the LRT comparing the adjacent-category model to the unconstrained multinomial model yielded χ² = 6.76, df = 5, P = 0.239. Since this is not significant, the adjacent-category model is a valid simplification of the multinomial model for these data.

Continuation-Ratio Model

The continuation-ratio model compares the probability of being in category j versus all lower categories combined. It is particularly useful when the outcome represents sequential stages that must be “passed through” to reach higher levels (Ananth & Kleinbaum, 1997).

When Is the Continuation-Ratio Model Appropriate?

The continuation-ratio model is ideal for outcomes where each level must be reached before the next can be attained. Examples include: number of attempts to pass an exam, stages of disease progression where remission must occur before relapse, or sequential rounds of a selection process. It is NOT appropriate when movements between categories are not sequential (e.g., Apgar scores, where a baby does not “pass through” each score level).

Fitting the Continuation-Ratio Model

The continuation-ratio model can be fit as a series of separate binary logistic regressions with a recoded outcome variable. For each comparison:

  • Y = 1 for the level of interest
  • Y = 0 for all lower levels
  • Observations at higher levels are excluded (treated as missing)

Consider an example with 4 categories representing the number of attempts to gain admission to medical school (1, 2, 3, 4+):

Original Category    Y1 (1 vs 0)    Y2 (2 vs 0–1)    Y3 (3 vs 0–2)
0 (1 attempt)        0              0                0
1 (2 attempts)       1              0                0
2 (3 attempts)       excluded       1                0
3 (4+ attempts)      excluded       excluded         1
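
The recoding rules listed above can be sketched in a few lines of Python. The function name continuation_ratio_recode is hypothetical, categories are assumed to be coded 0 to 3 as in the table, and None stands in for an excluded (missing) observation.

```python
def continuation_ratio_recode(category, n_categories=4):
    """Return [Y1, ..., Y_(J-1)] for one observation; None marks exclusion."""
    indicators = []
    for level in range(1, n_categories):
        if category == level:
            indicators.append(1)      # the level of interest
        elif category < level:
            indicators.append(0)      # all lower levels
        else:
            indicators.append(None)   # higher levels are excluded
    return indicators

# Print the recoded indicators for each original category.
for category in range(4):
    print(category, continuation_ratio_recode(category))
```

Each recoded indicator would then serve as the outcome in its own binary logistic regression, using only the observations where it is not missing.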

You can fit either a constrained version (a common OR across levels) or an unconstrained version (a separate OR at each level). The constrained version is more parsimonious, and because it is nested within the unconstrained version, the two can be compared using a likelihood-ratio test.
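
Because each binary regression estimates a conditional probability, P(Y = j | Y ≤ j), the probabilities of the individual categories can be recovered by working down from the highest level. The sketch below assumes the coding used above (categories 0 to 3); the function name and the three conditional probabilities are hypothetical values chosen only for illustration, not results from any dataset.

```python
def category_probabilities(conditional):
    """conditional[j-1] = P(Y = j | Y <= j) for j = 1..J-1.

    Returns [P(Y = 0), ..., P(Y = J-1)].
    """
    probs = [0.0] * (len(conditional) + 1)
    remaining = 1.0  # probability of being at or below the current level
    for level in range(len(conditional), 0, -1):
        probs[level] = remaining * conditional[level - 1]
        remaining *= 1.0 - conditional[level - 1]
    probs[0] = remaining
    return probs

# Hypothetical conditional probabilities from the Y1, Y2 and Y3 regressions.
hypothetical = [0.40, 0.25, 0.10]
probs = category_probabilities(hypothetical)
print([round(p, 3) for p in probs], round(sum(probs), 3))
```

The four probabilities sum to 1 by construction, which is a useful check on any hand calculation.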

When to Use the Adjacent-Category Model

The adjacent-category model is appropriate when the comparison of interest is between neighbouring categories of an ordinal outcome. It is a natural choice when you believe the effect of a predictor operates by shifting individuals one category at a time. The model can be validated by comparing it to the unconstrained multinomial model via LRT.

When to Use the Continuation-Ratio Model

The continuation-ratio model is most appropriate when the outcome represents sequential stages that must be passed through in order. Each category must be reached before the next can be attained. Examples include: successive attempts at an exam, sequential rounds of treatment, or stages of career advancement. If categories can be reached without passing through lower levels, this model is not appropriate.

Comparing Models with LRT

When one model is a constrained (nested) version of another, the likelihood-ratio test can be used to compare them. The test statistic is −2 × (ln L_constrained − ln L_unconstrained), which follows a χ² distribution with degrees of freedom equal to the difference in the number of parameters. A significant result suggests the constraint is not valid and the more complex model is needed.

Decision Guide: Choosing Among the Four Models

Step 1: Is the outcome nominal or ordinal? If nominal, use multinomial logistic regression.
Step 2: If ordinal, does the outcome represent sequential stages? If yes, consider the continuation-ratio model.
Step 3: If not sequential, fit the proportional-odds model and test the assumption. If it holds, use proportional-odds.
Step 4: If the proportional-odds assumption fails, consider the adjacent-category model, partial proportional-odds, or generalised ordinal logistic regression.
Step 5: Compare nested models using LRT to select the most parsimonious adequate model.
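
The first four steps of the guide can be sketched as a small function. The function name, argument names, and returned labels are hypothetical; the sketch encodes only the branching logic and does not replace the subject-matter judgement each step requires or the nested-model comparison in Step 5.

```python
def suggest_model(is_ordinal, is_sequential=False, proportional_odds_holds=None):
    """Walk through Steps 1 to 4 of the decision guide."""
    # Step 1: nominal outcomes go straight to the multinomial model.
    if not is_ordinal:
        return "multinomial logistic regression"
    # Step 2: ordinal outcomes made up of sequential stages.
    if is_sequential:
        return "continuation-ratio model"
    # Step 3: otherwise fit proportional-odds and test the assumption.
    if proportional_odds_holds is None:
        return "fit the proportional-odds model and test its assumption"
    if proportional_odds_holds:
        return "proportional-odds model"
    # Step 4: the assumption failed.
    return "adjacent-category, partial proportional-odds, or generalised ordinal model"

print(suggest_model(is_ordinal=True, is_sequential=False, proportional_odds_holds=False))
```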

Knowledge Check — Section 4

1. In the adjacent-category model, the coefficient for categories n levels apart is:

The adjacent-category model is a constrained multinomial model. The key constraint is that the log-odds (coefficient) for categories n levels apart equals n times the log-odds for adjacent categories. This is what makes it more parsimonious than the unconstrained multinomial model.

2. The continuation-ratio model is most appropriate when:

The continuation-ratio model is designed for outcomes where each category represents a stage that must be reached before moving to the next. Each comparison asks: given that you reached at least this level, what are the odds of reaching the next level? This is inappropriate when categories can be reached without passing through lower levels.

3. If an LRT comparing the adjacent-category model to the multinomial model is NOT significant, this suggests:

A non-significant LRT means the constrained (adjacent-category) model does not fit significantly worse than the unconstrained (multinomial) model. Therefore, the simpler adjacent-category model is a valid simplification and should be preferred on grounds of parsimony.

✎ Reflection

Can you think of an example from public health or epidemiology where a continuation-ratio model would be more appropriate than a proportional-odds model? What makes the outcome sequential in your example?

Model answer: Example: TB disease progression (latent infection → primary TB → chronic TB → death). Each stage is a hurdle: you cannot get to stage 3 without progressing through stages 1 and 2. A continuation-ratio model estimates the probability of progressing past each stage conditional on having reached it; predictors can have different effects on each transition (e.g., HIV co-infection might strongly increase risk of progression from latent to active disease but have less effect on progression from active to chronic). The proportional-odds model wouldn't capture this because it assumes a single underlying latent severity with uniform predictor effects across cut-points; the continuation-ratio model explicitly models the sequential hurdle structure. Other examples: cancer staging (in-situ → local → regional → distant), pregnancy outcomes (no pregnancy → early loss → mid-trimester loss → preterm birth → term).
Final Assessment

Lesson 6 — Final Assessment

15 questions • 100% required to pass

Bringing It All Together

This lesson extended the binary toolkit of Lesson 5 to outcomes with three or more categories. We started with the four-model landscape and the Apgar-score example, then walked through each model in turn: the multinomial logit (general but parameter-hungry), the proportional-odds model (parsimonious for ordinal data when its assumption holds), and the adjacent-category and continuation-ratio models (alternatives when proportional-odds fails or the categories are sequential).

The recurring theme is that the choice of logit determines what each coefficient means. A single estimand — the effect of prenatal visits on Apgar score — takes a different shape under each model, and the right shape depends on the structure of the outcome and the question being asked. This is the same lesson you will see again with rate, time-to-event, and clustered outcomes in the lessons that follow.

The final assessment asks you to recognise which model fits a given outcome and to interpret its coefficients without sliding back into the binary-logistic vocabulary by reflex.

Key Takeaways from Lesson 6

  • Nominal vs ordinal outcomes call for different families of models — ordering is information you can either use (parsimony) or ignore (generality).
  • Multinomial logistic regression fits J−1 simultaneous logits against a baseline; exponentiated coefficients are interpreted as relative risk ratios (RRRs).
  • The proportional-odds model assumes a single coefficient applies across every cumulative cutpoint; this assumption must be tested, not assumed.
  • The adjacent-category model is a constrained multinomial that compares each category with its neighbour and produces one coefficient per predictor.
  • The continuation-ratio model is appropriate for sequential-stage outcomes and can be fit as a series of binary logistic regressions on recoded data.
  • Always start by sketching the logit your model uses; the choice of comparison set is what makes any of these models “ordinal” or “nominal”.

✎ Final Reflection

Now that you have completed all four sections, summarise the key differences among the four models for multi-category outcomes. When would you choose each one, and what assumptions would you need to verify?

Model answer:

  • Multinomial logistic: outcomes with more than two unordered categories (e.g., choice of region, ethnic group). Verify: no logical ordering exists; observations are independent.
  • Proportional odds (cumulative logit): ordered categories where each predictor has the same effect (a common OR) at every cut-point. Verify: the proportional-odds (parallel-regression) assumption, for example with the Brant test.
  • Partial proportional odds / generalised ordinal: ordered categories where the assumption fails for some predictors. Verify: residual heterogeneity patterns.
  • Adjacent-category: ordered categories where the comparison of interest is between neighbouring categories. Verify: an LRT against the unconstrained multinomial model.
  • Continuation-ratio: ordered categories that represent sequential transitions (each stage is a hurdle). Verify: subject-matter knowledge that progression is sequential rather than simultaneous.

Choice depends on (a) ordinal vs. nominal structure of the outcome, (b) whether the proportional-odds assumption is met, and (c) whether the data-generating process is sequential or cross-sectional. Always run residual diagnostics, compare AIC across candidates, and report the rationale for model choice rather than defaulting to whichever software fits first.
Final Assessment — Lesson 6 (15 Questions)

1. What type of outcome data has categories with no natural ordering?

Nominal data have categories with no inherent ordering or ranking. Examples include type of disease, ethnicity, or mode of transport. Ordinal data, by contrast, have a meaningful ordering among categories.

2. How many sets of coefficients does a multinomial model with 4 outcome categories estimate?

A multinomial logistic model with J = 4 categories estimates J−1 = 3 sets of coefficients, each comparing one non-baseline category to the baseline category.

3. In a proportional-odds model, the OR for a predictor represents:

The proportional-odds assumption means that a single OR applies at every cutpoint. Regardless of where you dichotomise the ordinal outcome, the estimated OR for each predictor is the same.

4. The logit in a proportional-odds model is based on:

The proportional-odds model uses a cumulative logit: the log-odds of being at or above category j versus being below it. This cumulative formulation is what allows a single coefficient to apply at every cutpoint.

5. The IIA assumption in multinomial logistic regression means:

The Independence of Irrelevant Alternatives (IIA) assumption states that the relative probability (odds) of any two outcomes does not change when other alternatives are added to or removed from the choice set.

6. If the proportional-odds assumption is clearly violated, a good alternative is:

The generalised ordinal logistic regression model allows different coefficients at each cutpoint, relaxing the proportional-odds assumption entirely. It is equivalent to fitting J−1 binary logistic models simultaneously and is a flexible alternative when the assumption is violated.

7. In a multinomial model, exponentiated coefficients are technically:

Exponentiated coefficients from a multinomial logistic model are relative risk ratios (RRRs). Each one compares the probability of being in a given category relative to the probability of being in the baseline category, but they are not true odds ratios as in standard binary logistic regression.

8. The Brant test evaluates:

The Brant test is a Wald-type test that evaluates the proportional-odds assumption. It provides both an overall test and individual tests for each predictor, identifying which specific variables violate the assumption of equal effects across cutpoints.

9. The adjacent-category model is a constrained version of:

The adjacent-category model is a constrained version of the multinomial logistic model. The constraint is that the coefficient for categories n levels apart equals n times the coefficient for adjacent categories. This can be tested by comparing the two models with a likelihood-ratio test.

10. Continuation-ratio models are best suited for:

Continuation-ratio models are designed for outcomes where each level must be “passed through” to reach higher levels. The model asks: given that you reached at least level j, what are the odds of reaching level j+1? This is ideal for sequential or staged processes.

11. The latent variable in a proportional-odds model has cutpoints (τ) that:

The cutpoints (τ) divide the underlying continuous latent variable into the observed ordinal categories. An observation is classified into category j if its latent variable score falls between τ_{j−1} and τ_j. These cutpoints are estimated from the data along with the regression coefficients.

12. To compare nested ordinal models, you should use:

The likelihood ratio test (LRT) is the standard approach for comparing nested models. The test statistic is −2 times the difference in log-likelihoods, which follows a chi-squared distribution with degrees of freedom equal to the difference in the number of parameters.

13. If the LRT comparing proportional-odds to multinomial models is significant:

A significant LRT means the constrained (proportional-odds) model fits significantly worse than the unconstrained (multinomial) model. This suggests the proportional-odds constraint is too restrictive and the assumption may be violated. You should consider a more flexible model.

14. In a continuation-ratio model, observations at higher levels than the one being modelled are:

In a continuation-ratio model, each binary regression compares level j (Y = 1) to all lower levels (Y = 0). Observations at levels higher than j are excluded from that particular comparison because they have already “passed through” the level being modelled.

15. Which model is most parsimonious for ordinal data when the proportional-odds assumption holds?

When the proportional-odds assumption holds, the proportional-odds model is the most parsimonious because it estimates only one coefficient per predictor (compared to J−1 per predictor in the multinomial or generalised ordinal models). Parsimony leads to more precise estimates and more powerful tests.

🏆 Congratulations!

Lesson 7 — Count and Rate Data — takes the GLM family to its next member: outcomes that are counts (number of events). Poisson and negative binomial regression are the standard tools, and the offset terms you'll meet there are how rate data become tractable in the same framework.

You have successfully completed Lesson 6: Modelling Ordinal & Multinomial Data.
