Modelling Ordinal & Multinomial Data

Exploratory Data Analysis For Epidemiology

Learning objectives for this lesson:

Select an appropriate model (multinomial, proportional-odds, adjacent-category, or continuation-ratio) based on study objectives and data
Fit all of the models listed above
Evaluate the assumptions on which each model is based
Interpret OR estimates from each model
Compute predicted probabilities from each model

This course was developed by Dr. Kiffer G. Card, Faculty of Health Sciences, Simon Fraser University, based on Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.

Reference

Glossary: Key Terms, People & Concepts

📚 Reference page, available throughout the lesson

This glossary collects the key concepts, people, and ideas you will meet in this lesson. Use it as a reference while you work through the material, or as a review before assessments.

Key Concepts & Ideas

Ordinal Outcome A categorical outcome with a natural order but unequal or unknown spacing between levels (e.g., mild/moderate/severe; Likert scales). Logical target for cumulative-logit models.

Nominal (Multinomial) Outcome A categorical outcome with three or more unordered categories (e.g., disease subtype A/B/C). Modeled with multinomial logistic regression.

Baseline / Reference Category The category of a categorical outcome (or predictor) that other levels are compared against. Choice affects coefficient interpretation but not overall fit.

Cumulative Probability The probability that the outcome falls at or below a particular ordinal level. Cumulative-logit models work on logits of these cumulative probabilities.

Latent-Variable Formulation A motivation for ordinal models in which an unobserved continuous variable is sliced into ordered categories by thresholds. Justifies the proportional-odds structure.

Methods & Statistical Concepts

Proportional Odds Assumption In cumulative-logit models, the assumption that the effect of each predictor is the same across all cut-points of the outcome. Tested with a Brant or score test.

Cumulative Logit Model An ordinal regression that models log-odds of cumulative probabilities: logit(P(Y ≥ j)) = αⱼ + Xβ. Yields a single β per predictor under proportional odds.

Partial Proportional Odds A relaxation of proportional odds in which some predictors have category-specific coefficients while others remain constrained. Useful when proportional odds fails for only a few variables.

Brant Test A formal test of the proportional-odds assumption that compares fitted coefficients across binary logit splits at each cut-point.

Adjacent-Category Logit An ordinal model that compares each category to the next, rather than cumulating. An alternative to proportional odds for ordered outcomes.

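Continuation-Ratio Model An ordinal model that compares each category against all lower categories combined, the form used in this lesson (some texts instead compare each category against all higher categories). Useful when categories represent a sequential process.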

Multinomial Logit A model for nominal outcomes that fits k−1 separate logits, each comparing a non-reference category against the baseline. Yields category-specific coefficients.

IIA (Independence of Irrelevant Alternatives) An assumption of the multinomial logit: relative odds between any two categories don't depend on which other categories are available. Tested with the Hausman-McFadden test.

Probit Link An alternative link function (inverse normal CDF) used in some ordinal and binary models. Gives similar results to logit but uses standard-normal thresholds.

VGAM / polr R packages/functions for ordinal regression. MASS::polr fits proportional-odds models; VGAM::vglm handles a broader family including partial-PO and continuation-ratio.

Key People

Peter McCullagh (1952– ) Irish statistician who introduced the cumulative-logit (proportional-odds) model in his 1980 paper, providing the foundation for modern ordinal regression.

Alan Agresti (1947– ) American statistician whose textbook Categorical Data Analysis is the standard reference on ordinal, nominal, and contingency-table methods.

Rollin Brant Statistician who developed the eponymous test (1990) for the proportional-odds assumption.

Section 1

Introduction & Overview of Models

⏱ Estimated time: 20 minutes

Lesson 6

Modelling Ordinal & Multinomial Data

Four ways to handle an outcome that has more than two categories, ordered or not.

Nominal versus ordinal, the Apgar example, and the four logit formulations.

Two types

Nominal vs ordinal outcomes

Nominal

Categories with no meaningful rank. Modelled with multinomial logistic regression, which makes no ordering assumption.

Ordinal

Categories with a meaningful rank but unknown, unequal spacing. Ordinal models exploit the ordering for parsimony.

The ordering distinction drives model choice. Using a nominal model on ordinal data wastes information; imposing order on nominal data distorts inference.

Running example

Apgar scores and prenatal visits

Apgar Category	Code	<6 Visits	≥6 Visits	Total
1–6 (Low)	0	47	25	72
7	1	48	42	90
8	2	59	72	131
9–10 (High)	3	134	227	361

Each model in this lesson will ask a different question about these same 654 births.

Four logits

What each model compares

Multinomial logit

Each category vs. a chosen baseline. Category-specific coefficients. Works on nominal and ordinal data.

Proportional-odds (cumulative logit)

Probability of being at or above level j vs. below it. Single coefficient per predictor.

Adjacent-category logit

Category j vs. the category immediately below. Single constrained coefficient per predictor.

Continuation-ratio logit

Category j vs. all categories below j combined. Best for sequential-stage outcomes.

The four equations

Logit formulations side by side

Multinomial (Eq 17.1)

\[ \ln\frac{\color{#0B7B6B}{P(Y=j)}}{\color{#C2410C}{P(Y=1)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1^{(j)}} X \]

P(Y=j): probability of category j; P(Y=1): probability of the baseline category; β₀⁽ʲ⁾: category-specific intercept; β₁⁽ʲ⁾: category-specific slope

Proportional-odds (Eq 17.2)

\[ \ln\frac{\color{#0B7B6B}{P(Y \ge j)}}{\color{#C2410C}{P(Y < j)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

P(Y≥j): at or above j; P(Y<j): below j; β₀⁽ʲ⁾: cutpoint intercept; β₁: shared slope

Adjacent-category (Eq 17.3)

\[ \ln\frac{\color{#0B7B6B}{P(Y=j)}}{\color{#C2410C}{P(Y=j-1)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

P(Y=j): category j; P(Y=j−1): category just below; β₀⁽ʲ⁾: category intercept; β₁: common slope

Continuation-ratio (Eq 17.4)

\[ \ln\frac{\color{#0B7B6B}{P(Y=j)}}{\color{#C2410C}{P(Y < j)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

P(Y=j): reaching stage j; P(Y<j): stopping earlier; β₀⁽ʲ⁾: stage intercept; β₁: common slope

Carry forward

What to take into the next section

Nominal vs ordinal is the first decision. Ordering is information you can use or discard.
All four models are logistic models. They differ in what each logit compares.
Ordinal models gain parsimony by sharing one slope coefficient across cutpoints, but that constraint must be tested.

Introduction and Overview

An earlier lesson covered logistic regression for binary outcomes. This lesson extends the framework to outcomes with more than two categories, whether nominal (no natural order) or ordinal (ordered). The four content sections cover, in order: an overview of the four-model toolkit (this section); the multinomial logistic model for nominal outcomes; the proportional-odds model for ordinal outcomes, including how to test the proportional-odds assumption; and the adjacent-category and continuation-ratio models as alternatives when the proportional-odds assumption does not hold. For an epidemiology-focused overview that maps all four models onto the same dataset, see Ananth & Kleinbaum (1997); the open-access encyclopedia entries on ordinal regression and multinomial logistic regression summarise the same toolkit at an undergraduate level.

Learning Objectives

Distinguish nominal from ordinal outcomes and explain why each calls for a different modelling strategy.
Map the four logits used by multinomial, proportional-odds, adjacent-category, and continuation-ratio models.
Use the Apgar-score example to anticipate how each model will partition the same outcome.
Choose an initial model based on the structure and ordering of the outcome categories.

When Outcomes Have More Than Two Categories

In many epidemiological studies the outcome variable has more than two categories. These outcomes fall into two broad types: nominal data, where the categories have no natural ordering (e.g., type of disease, preferred clinic), and ordinal data, where the categories are ordered (e.g., pain severity: none, mild, moderate, severe).

The choice of model depends on whether the outcome is nominal or ordinal. Nominal data require multinomial logistic regression or log-linear models. Ordinal data can be analysed with the same multinomial model (ignoring the ordering), but more efficient approaches exploit the ordering: proportional-odds, adjacent-category, and continuation-ratio models (McCullagh, 1980; Ananth & Kleinbaum, 1997).

The Apgar Score Example

Throughout Chapter 17, the authors use Apgar scores as a running example. Apgar scores (measured at birth) are recoded into four ordinal categories. The research question is whether the number of prenatal visits is associated with Apgar score category.

Apgar Category	Code	Prenatal Visits < 6	Prenatal Visits ≥ 6	Total
1–6 (Low)	0	47	25	72
7	1	48	42	90
8	2	59	72	131
9–10 (High)	3	134	227	361
Total		288	366	654
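To work with these counts directly, the table can be entered as a matrix. The sketch below is illustrative only: the object names `apgar_tab` and `col_props` are assumptions introduced here, not part of the course script.

```r
# Apgar category (rows) by prenatal-visit group (columns); counts copied from the table above
apgar_tab <- matrix(
  c(47, 25,
    48, 42,
    59, 72,
    134, 227),
  ncol = 2, byrow = TRUE,
  dimnames = list(category = c("0", "1", "2", "3"),
                  visits   = c("<6", ">=6"))
)

addmargins(apgar_tab)                            # adds the row and column totals

# Distribution of Apgar category within each visit group (each column sums to 1)
col_props <- prop.table(apgar_tab, margin = 2)
round(col_props, 3)
```

Comparing the two columns of `col_props` gives a first, model-free look at whether the Apgar distribution shifts with the number of prenatal visits.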

Overview of the Four Models

Each of the four models for multi-category outcomes uses a different formulation of the logit (log-odds). Understanding the logit structure is the key to understanding each model.

Multinomial Logistic Regression (Eq 17.1)

Compares each outcome category to a baseline category. For J categories, the model estimates J−1 sets of coefficients. Each set describes how predictors relate to the log-odds of being in category j versus the baseline.

Multinomial logit (Eq 17.1)

\[ \ln\!\frac{\color{#0B7B6B}{p(Y = j)}}{\color{#C2410C}{p(Y = 1)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1^{(j)}} X \]

The log of the odds of category j versus the baseline category is a category-specific intercept plus a category-specific slope times the predictor. Each non-baseline category gets its own coefficients.

No assumptions about ordering are made, so this model is appropriate for both nominal and ordinal outcomes (though it is less efficient for ordinal data).
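As a rough illustration of what the multinomial logits compare, the crude (unadjusted) odds ratios for each category versus a baseline can be computed straight from the Apgar table. This sketch assumes the lowest category (code 0) is the baseline; with a single binary predictor, the exponentiated slopes of a multinomial logit reproduce these crude values. The names `apgar_tab`, `odds_vs_base`, and `or_vs_base` are illustrative.

```r
apgar_tab <- matrix(
  c(47, 25, 48, 42, 59, 72, 134, 227),
  ncol = 2, byrow = TRUE,
  dimnames = list(category = c("0", "1", "2", "3"),
                  visits   = c("<6", ">=6"))
)

base <- "0"   # assumption: lowest Apgar category as the baseline

# Odds of category j versus the baseline, separately within each visit group
odds_vs_base <- sweep(apgar_tab, 2, apgar_tab[base, ], "/")

# Crude OR (>=6 vs <6 visits) for each category versus the baseline
or_vs_base <- odds_vs_base[, ">=6"] / odds_vs_base[, "<6"]
round(or_vs_base[-1], 3)   # one OR per non-baseline category
```

Each non-baseline category gets its own odds ratio, which is the sense in which the multinomial model has category-specific coefficients.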

Proportional-Odds Model (Eq 17.2)

Based on cumulative probabilities. The logit compares the probability of being at or above category j versus below it. A single coefficient per predictor applies at every cutpoint; this is the proportional-odds assumption.

Proportional-odds (cumulative) logit (Eq 17.2)

\[ \ln\!\frac{\color{#0B7B6B}{p(Y \ge j)}}{\color{#C2410C}{p(Y < j)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

The log-odds of being at or above category j versus below it uses a cutpoint-specific intercept but a single slope shared across cutpoints, which is the proportional-odds assumption.

This is the most common ordinal logistic model and is more parsimonious than the multinomial model (McCullagh, 1980).

Adjacent-Category Model (Eq 17.3)

Compares each category to the adjacent (next lower) category. This model is a constrained version of the multinomial model where the coefficient for categories n levels apart equals n times the coefficient for adjacent categories.

Adjacent-category logit (Eq 17.3)

\[ \ln\!\frac{\color{#0B7B6B}{p(Y = j)}}{\color{#C2410C}{p(Y = j-1)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

The log-odds compare each category with the one just below it, using a category intercept and a common slope.

Like the proportional-odds model, it estimates a single β₁ per predictor.
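To make the adjacent-category comparison concrete, the crude odds ratios for each pair of neighbouring Apgar categories can be computed from the table. This is a hedged sketch using unadjusted counts (the names `apgar_tab` and `or_adjacent` are illustrative); the fitted model would replace the three values with one common odds ratio.

```r
apgar_tab <- matrix(
  c(47, 25, 48, 42, 59, 72, 134, 227),
  ncol = 2, byrow = TRUE,
  dimnames = list(category = c("0", "1", "2", "3"),
                  visits   = c("<6", ">=6"))
)

# Odds of category j versus category j-1 within each visit group
odds_ge6 <- apgar_tab[-1, ">=6"] / apgar_tab[-4, ">=6"]
odds_lt6 <- apgar_tab[-1, "<6"]  / apgar_tab[-4, "<6"]

or_adjacent <- odds_ge6 / odds_lt6
names(or_adjacent) <- c("1 vs 0", "2 vs 1", "3 vs 2")
round(or_adjacent, 3)

# The adjacent ORs multiply (telescope) to the crude OR for category 3 vs category 0
prod(or_adjacent)
```

The telescoping product is the unconstrained counterpart of the model's constraint: if every adjacent step shared one odds ratio, categories n steps apart would differ by that odds ratio raised to the power n.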

Continuation-Ratio Model (Eq 17.4)

Compares each category to all lower categories combined. This model is especially appropriate when the outcome represents sequential stages that must be “passed through” to reach higher levels (e.g., number of attempts to achieve certification).

Continuation-ratio logit (Eq 17.4)

\[ \ln\!\frac{\color{#0B7B6B}{p(Y = j)}}{\color{#C2410C}{p(Y < j)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

The log-odds compare reaching category j with stopping below it, useful when the categories represent sequential stages.

Can be fit as a series of separate binary logistic regressions with appropriately recoded outcome variables.

✎ Reflection

Think of an ordinal outcome variable from your own field of study. What are the categories, and which of the four models introduced here do you think would be most appropriate? Why?

Model answer: Pick a real ordinal outcome (e.g., self-rated health: poor, fair, good, very good, excellent). The proportional-odds (cumulative logit) model is the natural default because it preserves the ordering and gives interpretable cumulative ORs. Fall back to partial proportional odds if Brant's test flags violations, and to multinomial logistic regression if the ordinal structure is implausible (e.g., the ‘fair’ category mixes truly intermediate health with mistranslated responses). The continuation-ratio model is appropriate if the outcome represents progression through stages (e.g., disease progression from healthy → mild → moderate → severe, with each stage as a hurdle), but not for cross-sectional self-rated health, where all categories are observed simultaneously.

Section 3

Proportional-Odds Model

⏱ Estimated time: 25 minutes

The workhorse for ordered outcomes. One coefficient per predictor, one test to verify it.

The motivation

A latent continuous variable divided by cutpoints

Eq 17.7: latent variable

\[ \color{#0B7B6B}{S_i} = \color{#1D4ED8}{\beta_1 X_{1i}} + \color{#1D4ED8}{\beta_2 X_{2i}} + \cdots + \color{#BE185D}{\varepsilon_i} \]

Sᵢ: latent score for person i; βⱼXⱼᵢ: predictor contributions; εᵢ: error term

Predictors shift the whole distribution left or right; the cutpoints stay fixed.

The equation

Cumulative logit: one slope across all cutpoints

Proportional-odds model

\[ \text{logit}\,\color{#0B7B6B}{P(Y \ge j)} = \color{#6D28D9}{\beta_{0j}} + \color{#1D4ED8}{\beta_1} X \]

P(Y≥j): cumulative probability; β₀ⱼ: cutpoint intercept; β₁: shared slope

The intercept varies by cutpoint j, but β₁ is the same everywhere. One odds ratio per predictor summarises the effect across the entire ordinal outcome.

Individuals with 6 or more prenatal visits have 1.59 times the odds of being at or above any given Apgar category. (Apgar example, Chapter 17)

Testing the assumption

The Brant test

The Brant test fits binary logistic models at each cutpoint and asks: do the coefficients differ significantly across cutpoints?

Predictor	χ²	df	P-value	Holds?
Prenatal visits	2.14	2	0.343	Yes
Maternal age	8.92	2	0.012	No
Parity	1.03	2	0.598	Yes
Overall	12.45	6	0.053	Borderline

Only maternal age violates the assumption here. A partial proportional-odds model would constrain all predictors except maternal age.

When the assumption fails

Remedies: partial and generalised models

Partial proportional-odds

Relaxes the constraint only for predictors that fail the Brant test. A good balance between fit and parsimony.

Generalised ordinal logit

Allows separate coefficients at each cutpoint. Equivalent to fitting J−1 binary logistic regressions simultaneously.

Fall back to multinomial

If all predictors violate proportional odds, the unconstrained multinomial model is the honest fallback.

Carry forward

What to take into the next section

The proportional-odds model gives one OR per predictor, valid at every cumulative cutpoint, but only if the assumption holds.
The Brant test checks whether coefficients differ across cutpoints. Predictor-specific results guide partial models.
When the assumption fails, options include the partial proportional-odds model, the generalised ordinal logit, and the multinomial fallback.

Introduction and Overview

An earlier section used multinomial logistic regression to fit a model that ignores any ordering in the outcome categories. This section takes the more parsimonious route: when the categories are ordered, the proportional-odds model uses far fewer parameters by assuming the effect of each predictor is the same across all category cut-points. The trade-off is that you have to verify the proportional-odds assumption holds.

Learning Objectives

Express the proportional-odds (cumulative logit) model in terms of an underlying latent continuous variable and cutpoints.
Interpret a single odds ratio as the effect of a predictor across every dichotomisation of the ordinal outcome.
Test the proportional-odds assumption using the Brant (Wald) test or a score test, or by comparing nested models.
Diagnose what to do when proportional-odds clearly fails for a given predictor.

The Most Common Ordinal Model

The proportional-odds model (also called the cumulative logit model or ordinal logistic regression) is the most widely used model for ordinal outcomes. It was formalised by McCullagh (1980) and is based on the idea of an underlying continuous latent variable that is divided into the observed ordinal categories by a series of cutpoints. This is the same structure as the ordered logit model.

Equation 17.7: latent variable

\[ \color{#0B7B6B}{S_i} = \color{#1D4ED8}{\beta_1 X_{1i}} + \color{#1D4ED8}{\beta_2 X_{2i}} + \cdots + \color{#1D4ED8}{\beta_k X_{ki}} + \color{#BE185D}{\varepsilon_i} \]

An unobserved latent score is a linear combination of the predictors plus an error term; observed ordinal categories are bands of this score.

The latent variable Sᵢ is divided by cutpoints (τ₁, τ₂, …, τ_{J−1}) into J observed categories. If Sᵢ falls between τ_{j−1} and τ_j, the observation is classified into category j.

The Proportional-Odds Logit

The model takes the form of a cumulative logit: logit(p(Y ≥ j)) = β₀ⱼ + βX. The key feature is that the intercept varies across cutpoints (giving parallel lines on a logit scale) but the slope coefficients are the same for every cutpoint. This means a single OR summarises the effect of each predictor across all levels of the outcome (McCullagh, 1980).

Proportional odds in plain words. The predictor multiplies the odds by the same amount wherever you split the ordered outcome. If six or more prenatal visits multiply the odds of scoring above the lowest Apgar category by 1.59, they multiply the odds of reaching the top category by that same 1.59. You get a single odds ratio, and it applies at every cutpoint. That economy is the reason to reach for the model, and the proportional-odds assumption is the one thing you have to check before trusting it.

Equation 17.9: predicted probability from the latent variable

\[ \color{#0B7B6B}{p(Y = j)} = p(\color{#1D4ED8}{S} \le \color{#C2410C}{\tau_j}) - p(\color{#1D4ED8}{S} \le \color{#C2410C}{\tau_{j-1}}) \]

The probability of category j is the chance the latent score falls between two adjacent cutpoints.
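The latent-variable view can be checked numerically. The sketch below assumes a standard logistic error term (the distribution implied by the logit link) and uses hypothetical cutpoints; only the value 1.59 is borrowed from the lesson's Apgar example. The function names `category_probs` and `odds_above` are introduced here for illustration.

```r
tau  <- c(-1, 0, 1.5)    # hypothetical cutpoints tau_1 < tau_2 < tau_3
beta <- log(1.59)        # log odds ratio echoing the Apgar example

# Eq 17.9: category probabilities as differences of cumulative probabilities
category_probs <- function(eta, tau) {
  cum <- plogis(tau - eta)   # p(S <= tau_j) when the error is standard logistic
  diff(c(0, cum, 1))         # one probability per category; they sum to 1
}

p0 <- category_probs(eta = 0,    tau = tau)   # predictor X = 0
p1 <- category_probs(eta = beta, tau = tau)   # predictor X = 1

# Odds of falling above each cutpoint: P(Y > j) / P(Y <= j)
odds_above <- function(p) rev(cumsum(rev(p)))[-1] / cumsum(p)[-length(p)]

round(rbind(X0 = p0, X1 = p1), 3)
odds_above(p1) / odds_above(p0)   # the same odds ratio at every cutpoint
```

Shifting the linear predictor moves probability mass toward the higher categories, yet the odds ratio is identical at all three cutpoints. That constancy is the proportional-odds property.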

Figure: The latent-variable view. A single bell-shaped latent distribution is divided into four bands (None, Mild, Moderate, Severe) by three cutpoints τ. One continuous, unobserved score underlies the ordinal outcome, and the probability of a category is the area of the distribution between its two cutpoints.

📋 Example: Apgar scores, proportional-odds model

In the Apgar score example, the proportional-odds model yields an OR of 1.59 for prenatal visits (≥6 vs <6). This means that individuals with 6 or more prenatal visits have 1.59 times the odds of being at or above any given Apgar category, compared to those with fewer visits. This single OR applies at every cutpoint (0 vs 1+, 0–1 vs 2+, and 0–2 vs 3).
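For comparison, the crude odds ratio at each of the three cutpoints can be computed by collapsing the Apgar table. This is a hedged sketch on unadjusted counts: the values need not match the model-based OR of 1.59 quoted above, which comes from the source text's fitted model. The names `apgar_tab` and `cum_or` are illustrative.

```r
apgar_tab <- matrix(
  c(47, 25, 48, 42, 59, 72, 134, 227),
  ncol = 2, byrow = TRUE,
  dimnames = list(category = c("0", "1", "2", "3"),
                  visits   = c("<6", ">=6"))
)

# For cutpoint j, rows (j + 1):4 hold codes at or above j and rows 1:j hold codes below j
cum_or <- sapply(1:3, function(j) {
  above <- colSums(apgar_tab[(j + 1):4, , drop = FALSE])
  below <- colSums(apgar_tab[1:j, , drop = FALSE])
  (above[">=6"] / below[">=6"]) / (above["<6"] / below["<6"])
})
names(cum_or) <- c("0 vs 1+", "0-1 vs 2+", "0-2 vs 3")
round(cum_or, 2)
```

The proportional-odds model summarises these three cutpoint-specific odds ratios with a single value; how far they spread apart is an informal preview of what a formal test of the assumption examines.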

R Activity: multinomial and proportional-odds models in R

The course dataset phaa_survey_clean.csv has both an unordered multi-category variable (region, 5 Lower-Mainland regions) and an ordered one (education, 5 levels). The full annotated script is in r-activities/HSCI_410_Lesson_6_Ordinal_and_Multinomial_Models.R.

library(nnet);  library(MASS);  library(brant)
phaa <- read.csv("phaa_survey_clean.csv", stringsAsFactors = FALSE)
phaa$region    <- relevel(factor(phaa$region), ref = "Vancouver")
phaa$education <- factor(phaa$education,
  levels = c("Less than high school", "High school", "Some college",
             "Bachelor's", "Graduate degree"),
  ordered = TRUE)

# 1. MULTINOMIAL: nominal outcome (region)
mn <- multinom(region ~ age + gender + smoker, data = phaa, trace = FALSE)
exp(coef(mn))                                           # relative-risk ratios
z <- summary(mn)$coefficients / summary(mn)$standard.errors
round((1 - pnorm(abs(z))) * 2, 3)                          # p-values

# 2. PROPORTIONAL-ODDS: ordered outcome (education)
po <- polr(education ~ age + gender + smoker, data = phaa, Hess = TRUE)
exp(cbind(OR = coef(po), confint(po)))                  # OR + 95% CI

# 3. Brant test: is the proportional-odds assumption defensible?
brant(po)

# 4. Multinomial fit of the SAME ordered outcome, the fall-back to compare
mn_edu <- multinom(education ~ age + gender + smoker, data = phaa, trace = FALSE)
AIC(po, mn_edu)                                        # lower AIC = preferred balance of fit and complexity

Reading the output. The relative-risk ratios from multinom() compare each region to the reference (Vancouver). For polr(), a single OR per predictor applies at every cumulative split of the education levels (each level or higher versus all lower levels); that is the proportional-odds assumption that brant() tests. If the Brant overall p-value is < .05, compare AIC(po, mn_edu) and prefer the multinomial fit when its AIC is clearly lower, or, as a stretch goal, fit a partial proportional-odds model.
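Predicted probabilities are one of the lesson objectives, so here is a minimal sketch of obtaining them from a polr() fit. Because phaa_survey_clean.csv is not reproduced here, the sketch assumes the Apgar counts as a stand-in, entered as a weighted data frame. The names `counts`, `po_fit`, `new_groups`, and `pred` are illustrative, and this single-predictor fit need not reproduce the OR of 1.59 reported in the lesson.

```r
library(MASS)   # polr()

# Apgar counts in long form: one row per category-by-visit cell, with cell count n
counts <- data.frame(
  apgar  = factor(rep(0:3, times = 2), levels = 0:3, ordered = TRUE),
  visits = rep(c(0, 1), each = 4),            # 0 = fewer than 6 visits, 1 = 6 or more
  n      = c(47, 48, 59, 134, 25, 42, 72, 227)
)

po_fit <- polr(apgar ~ visits, data = counts, weights = n, Hess = TRUE)
or_visits <- exp(coef(po_fit))[["visits"]]    # single cumulative OR for visits

# Predicted probability of each Apgar category for the two visit groups
new_groups <- data.frame(visits = c(0, 1))
pred <- predict(po_fit, newdata = new_groups, type = "probs")
rownames(pred) <- c("<6 visits", ">=6 visits")
round(pred, 3)
```

Each row of `pred` sums to 1. The same predict(..., type = "probs") call works on the `po` object from the activity once a `newdata` frame with age, gender, and smoker columns is supplied.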

Reflect on what you just ran

Use the questions below to interpret the output you produced. Look at your console / plot before answering.

1. From exp(coef(mn)) and the matching p-value matrix, pick one non-reference region (anything but Vancouver) and one predictor (e.g., smokerYes). Report the relative-risk ratio, and state in one sentence what it says about the relative likelihood of living in that region versus Vancouver.

Model answer: Taking smokerYes in the Fraser Valley vs. Vancouver contrast as an example, the relative-risk ratio (RRR) might be around 1.45 (95% CI varies). Interpretation: among smokers, the odds of living in the Fraser Valley rather than Vancouver are about 45% higher than among non-smokers, controlling for the other covariates in the model. An RRR > 1 means the predictor increases the likelihood of that category relative to the reference category.

2. From exp(cbind(OR = coef(po), confint(po))), report the OR for age and its 95% CI. Under the proportional-odds assumption, translate this OR into a sentence about being at-or-above any given education level.

Model answer: The OR for age in the proportional-odds model is typically around 1.02–1.04 (95% CI clearly excluding 1) per year of age. Interpretation under proportional odds: a one-year increase in age is associated with a 2–4% increase in the odds of being at-or-above any given education level. Because the OR is constant across cut-points (the proportional-odds assumption), this single OR summarises the age effect on each cumulative comparison, for example the odds of being “Some college or higher” vs. “HS or less,” or “Bachelor's or higher” vs. “Some college or less.”

3. Look at the brant(po) output. What is the overall p-value, and do any individual predictors flag a violation (p < .05)? Based on this, would you keep po or fall back to mn_edu (compare AICs)?

Model answer: brant() typically returns an overall p-value around 0.01–0.05; if the overall test is significant, one or more individual predictors will usually also show p < .05, indicating where the proportional-odds assumption is violated. If violations are flagged: (a) compare the AIC of po vs. mn_edu (the multinomial alternative); (b) if mn_edu has a substantially lower AIC, prefer the multinomial model; (c) alternatively, fit a partial proportional-odds model that relaxes the assumption only for the violating predictors. When the assumption is violated and the categories are ordered, the partial proportional-odds model is usually the cleanest practical choice: its cutpoint-specific slopes let the flagged predictors act differently at each cutpoint while the outcome's ordering is preserved.

Testing the Proportional-Odds Assumption

The proportional-odds assumption is that the effect of each predictor is the same at every cutpoint. If violated, the model may give misleading results. Several tests are available:

Testing Proportional Odds

Three main approaches exist:

Approximate LRT: Compare the log-likelihoods of the proportional-odds model and the multinomial model. A significant difference suggests the proportional-odds assumption is violated.
Wolfe-Gould approximate LRT: Based on J−1 separate binary logistic models at each cutpoint. Sum the log-likelihoods and compare to the proportional-odds model.
Brant (Wald) test: Provides both an overall test and individual tests for each predictor, showing which specific variables violate the assumption (Brant, 1990).
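The first approach, the approximate LRT, can be sketched with the Apgar counts and a single predictor. This is an illustration under stated assumptions rather than the book's analysis: the data frame `counts` and the object names are introduced here, and with only one binary predictor the multinomial model has two more slope parameters than the proportional-odds model.

```r
library(MASS)   # polr()
library(nnet)   # multinom()

counts <- data.frame(
  apgar  = factor(rep(0:3, times = 2), levels = 0:3, ordered = TRUE),
  visits = rep(c(0, 1), each = 4),            # 0 = fewer than 6 visits, 1 = 6 or more
  n      = c(47, 48, 59, 134, 25, 42, 72, 227)
)

po_fit <- polr(apgar ~ visits, data = counts, weights = n)
mn_fit <- multinom(apgar ~ visits, data = counts, weights = n, trace = FALSE)

# Twice the log-likelihood difference, referred to a chi-square distribution
lrt_stat <- as.numeric(2 * (logLik(mn_fit) - logLik(po_fit)))
lrt_df   <- attr(logLik(mn_fit), "df") - attr(logLik(po_fit), "df")
lrt_p    <- pchisq(lrt_stat, df = lrt_df, lower.tail = FALSE)

c(statistic = lrt_stat, df = lrt_df, p = lrt_p)
```

A small p-value would suggest that the extra cutpoint-specific slopes of the multinomial model are needed, that is, that a single proportional-odds slope is too restrictive for these data.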

Generalised Ordinal Logistic Regression

If the proportional-odds assumption is violated, a generalised ordinal logistic regression model allows separate coefficients at each cutpoint. This model is equivalent to fitting J−1 separate binary logistic regressions simultaneously. It is more flexible but less parsimonious than the proportional-odds model (Williams, 2006).

Partial Proportional-Odds Model

A compromise approach is the partial proportional-odds model, which relaxes the proportional-odds assumption for selected predictors only (those that fail the Brant test) while maintaining it for the rest (Peterson & Harrell, 1990; Williams, 2006). This provides a good balance between flexibility and parsimony. Other alternatives include the stereotype logistic model and the heterogeneous choice logistic model.

⚠ The Proportional-Odds Assumption in Practice

The proportional-odds assumption is often violated in practice, especially with many predictors or when the outcome categories represent very different phenomena. Always test this assumption before reporting results from a proportional-odds model (Brant, 1990). If violated, consider a partial proportional-odds model or generalised ordinal logistic regression (Peterson & Harrell, 1990; Williams, 2006).

Brant Test Results Example

The Brant test provides both an overall test and predictor-specific tests. Here is an example of how results might be presented:

Predictor	χ²	df	P-value	Assumption Holds?
Prenatal visits	2.14	2	0.343	Yes
Maternal age	8.92	2	0.012	No
Parity	1.03	2	0.598	Yes
Overall	12.45	6	0.053	Borderline

In this example, only maternal age violates the assumption. A partial proportional-odds model that allows maternal age to have different effects at each cutpoint (while constraining prenatal visits and parity) would be appropriate.
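The p-values in the table follow directly from the χ² statistics and their degrees of freedom. A short sketch that reproduces them (the data frame `brant_tab` simply re-enters the illustrative values shown above; the column name `p_below_05` is an assumption made here):

```r
brant_tab <- data.frame(
  predictor = c("Prenatal visits", "Maternal age", "Parity", "Overall"),
  chisq     = c(2.14, 8.92, 1.03, 12.45),
  df        = c(2, 2, 2, 6)
)

# Upper-tail chi-square probability for each test
brant_tab$p <- pchisq(brant_tab$chisq, df = brant_tab$df, lower.tail = FALSE)
brant_tab$p_below_05 <- brant_tab$p < 0.05   # TRUE flags evidence against proportional odds

transform(brant_tab, p = round(p, 3))
```

Only maternal age falls below 0.05, while the overall test sits just above it, which is why the table labels the overall result as borderline.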

Regression Diagnostics

Regression diagnostics for the proportional-odds model can be conducted by fitting binary logistic models at each cutpoint and applying the diagnostic techniques from Chapter 16 (residual analysis, influence measures, goodness-of-fit tests).

✎ Reflection

Why do you think the proportional-odds assumption is so often violated in practice? Can you think of a scenario in your own research where you would expect the effect of a predictor to differ across cutpoints?

Model answer: The proportional-odds assumption requires that the effect of every predictor is the same across all cut-points of the ordinal outcome, a strong assumption rarely true in practice. It is often violated because: (a) floor/ceiling effects: predictors may have stronger effects at extreme categories than middle ones; (b) non-linearity in the latent variable: if the underlying construct varies non-linearly, the cut-point ORs differ; (c) category-specific mechanisms: the cause of moving from severe to moderate is different from the cause of moving from mild to none. Example: income's effect on educational attainment may be very strong at high levels (deciding between PhD vs. Master's) and weak at low levels (deciding between any-HS vs. dropout, where structural barriers dominate). Pre-specify Brant tests and have a fallback model (partial proportional odds, generalised ordinal, or multinomial) in the protocol.

Section 4

Adjacent-Category & Continuation-Ratio Models

⏱ Estimated time: 20 minutes

Two alternatives for when proportional odds does not hold or the ordering is sequential.

Adjacent-category model

Comparing each level to its immediate neighbour

Adjacent-category logit (Eq 17.3)

\[ \ln\frac{\color{#0B7B6B}{P(Y=j)}}{\color{#C2410C}{P(Y=j-1)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

P(Y=j): category j; P(Y=j−1): category just below; β₀⁽ʲ⁾: category intercept; β₁: common slope

The coefficient for categories n levels apart equals n times the coefficient for adjacent categories. This constraint is tested by a likelihood-ratio test against the unconstrained multinomial.

Apgar data result

Likelihood-ratio test: χ² = 6.76, df = 5, P = 0.239. The adjacent-category model is a valid simplification of the multinomial for these data.

Continuation-ratio model

For sequential-stage outcomes

Continuation-ratio logit (Eq 17.4)

\[ \ln\frac{\color{#0B7B6B}{P(Y=j)}}{\color{#C2410C}{P(Y < j)}} = \color{#6D28D9}{\beta_0^{(j)}} + \color{#1D4ED8}{\beta_1} X \]

P(Y=j): reaching stage j; P(Y<j): stopping earlier; β₀⁽ʲ⁾: stage intercept; β₁: common slope

Appropriate for

Outcomes where each level is a stage that must be reached before the next: disease progression, exam attempts, career stages.

Not appropriate for

Cross-sectional ordinal outcomes like Apgar scores, where categories are observed simultaneously, not passed through.

Fitting the model

Three binary regressions on recoded data

Original	Y₁ (1 vs 0)	Y₂ (2 vs 0–1)	Y₃ (3 vs 0–2)
0 (stage 0)	0	0	0
1 (stage 1)	1	0	0
2 (stage 2)	–	1	0
3 (stage 3)	–	–	1

Each regression excludes higher-stage observations. A constrained version imposing equal odds ratios can be compared to the unconstrained version via likelihood-ratio test.
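The recoding in the table can be sketched in R as follows. The Apgar counts are used purely to have numbers to run (the lesson notes that Apgar categories are not truly sequential), and all object names (`counts`, `recode_stage`, `stacked`, and so on) are assumptions introduced for this illustration.

```r
counts <- data.frame(
  y      = rep(0:3, times = 2),               # outcome code (stage)
  visits = rep(c(0, 1), each = 4),            # 0 = fewer than 6 visits, 1 = 6 or more
  n      = c(47, 48, 59, 134, 25, 42, 72, 227)
)

# Recode for stage j: keep y <= j (higher stages are the dashes in the table)
recode_stage <- function(j) {
  d <- counts[counts$y <= j, ]
  d$yj    <- as.integer(d$y == j)             # 1 = category j, 0 = any lower category
  d$stage <- j
  d
}

# Unconstrained version: three separate binary logistic regressions
stage_fits <- lapply(1:3, function(j) {
  glm(yj ~ visits, family = binomial, data = recode_stage(j), weights = n)
})
stage_or <- sapply(stage_fits, function(f) exp(coef(f)[["visits"]]))
names(stage_or) <- c("1 vs 0", "2 vs 0-1", "3 vs 0-2")
round(stage_or, 3)

# Constrained version: stack the recoded data and force one common slope
stacked <- do.call(rbind, lapply(1:3, recode_stage))
constrained   <- glm(yj ~ factor(stage) + visits, family = binomial,
                     data = stacked, weights = n)
unconstrained <- glm(yj ~ factor(stage) * visits, family = binomial,
                     data = stacked, weights = n)

lrt <- anova(constrained, unconstrained, test = "Chisq")   # LRT of the equal-OR constraint
lrt
```

Stacking works because the continuation-ratio likelihood factorises into independent binomial pieces, one per stage, so a single logistic regression with stage-specific intercepts fits all three logits at once.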

Decision guide

Choosing among the four models

Lesson recap

The logit formulation defines the coefficient

Adjacent-category

Single OR for moving one ordinal step. Valid when the unconstrained multinomial is not significantly better.

Continuation-ratio

Stage-specific ORs for advancing past each hurdle. Best for sequential processes; fit as binary logits on recoded data.

The reflection and knowledge check are just below. Spend time on the model-selection scenarios; that is where these distinctions become concrete.

Introduction and Overview

The proportional-odds model is the workhorse for ordered outcomes, but its proportional-odds assumption is genuinely restrictive and frequently fails in real data. This section closes the lesson with two alternatives that replace that assumption with a different structure: the adjacent-category model and the continuation-ratio model. Each is appropriate for a different kind of ordering and a different research question.

Learning Objectives

Specify the adjacent-category model and test its constraint against an unconstrained multinomial fit.
Specify the continuation-ratio model and recognise the sequential-stage outcomes it suits.
Fit a continuation-ratio model as a series of binary logistic regressions on recoded data.
Choose among proportional-odds, adjacent-category, and continuation-ratio models based on the question and the data.

Adjacent-Category Model

The adjacent-category model compares the probability of being in category j versus category j−1 (the next lower category). It is a constrained version of the multinomial logistic model: the constraint is that the coefficient for categories n levels apart equals n times the coefficient for adjacent categories.
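The constraint can be written out as a short sketch, using the same notation as the lesson's other logit formulas (the summation index k is introduced here only for illustration). The adjacent-category logit is

\[ \ln\frac{P(Y=j)}{P(Y=j-1)} = \beta_0^{(j)} + \beta_1 X \]

Adding n consecutive adjacent-category logits gives the comparison of categories n levels apart, because the intermediate probabilities cancel:

\[ \ln\frac{P(Y=j)}{P(Y=j-n)} = \sum_{k=j-n+1}^{j} \beta_0^{(k)} + n\,\beta_1 X \]

The corresponding odds ratio is therefore \(e^{n\beta_1}\), the adjacent-category OR raised to the power n.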

Like the proportional-odds model, the adjacent-category model estimates a single β₁ per predictor, making it more parsimonious than the unconstrained multinomial model. The validity of this constraint can be tested by comparing the adjacent-category model to the unconstrained multinomial model using a likelihood-ratio test (LRT) (Ananth & Kleinbaum, 1997).

📋 Example: Apgar scores, adjacent-category model

For the Apgar score data, the LRT comparing the adjacent-category model to the unconstrained multinomial model yielded χ² = 6.76, df = 5, P = 0.239. Since this is not significant, the adjacent-category model is a valid simplification of the multinomial model for these data.
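The reported P-value can be checked directly from the χ² statistic and its degrees of freedom. The sketch below uses only the Python standard library; the helper name `chi2_sf` is an illustrative choice rather than part of the course materials, and it computes the upper-tail probability from a series expansion of the incomplete gamma function.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees of freedom.

    Computed as 1 minus the regularized lower incomplete gamma function,
    evaluated by its power-series expansion (adequate for moderate x).
    """
    if x <= 0:
        return 1.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)  # next term: multiply by z / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - lower

# LRT reported for the Apgar data: chi-square = 6.76 on 5 degrees of freedom
p_value = chi2_sf(6.76, 5)
print(round(p_value, 3))  # about 0.239, in line with the reported P
```

With a statistical package available, a library routine for the χ² survival function would normally replace the hand-written helper.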

Continuation-Ratio Model

The continuation-ratio model compares the probability of being in category j versus all lower categories combined. It is particularly useful when the outcome represents sequential stages that must be “passed through” to reach higher levels (Ananth & Kleinbaum, 1997).

When Is the Continuation-Ratio Model Appropriate?

The continuation-ratio model is ideal for outcomes where each level must be reached before the next can be attained. Examples include: number of attempts to pass an exam, stages of disease progression where remission must occur before relapse, or sequential rounds of a selection process. It is NOT appropriate when movements between categories are not sequential (e.g., Apgar scores, where a baby does not “pass through” each score level).

Fitting the Continuation-Ratio Model

The continuation-ratio model can be fit as a series of separate binary logistic regressions with a recoded outcome variable. For each comparison:

Y = 1 for the level of interest
Y = 0 for all lower levels
Observations at higher levels are excluded (treated as missing)

Consider an example with four categories representing the number of attempts to gain admission to medical school (1, 2, 3, or 4+ attempts, coded 0 to 3):

Original Category	Y₁ (1 vs 0)	Y₂ (2 vs 0–1)	Y₃ (3 vs 0–2)
0 (1 attempt)	0	0	0
1 (2 attempts)	1	0	0
2 (3 attempts)	–	1	0
3 (4+ attempts)	–	–	1
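The recoding in the table can be sketched as a small function. This is a minimal illustration, assuming the original outcome is coded 0 to 3 as above; the function name `recode_continuation_ratio` is hypothetical, and `None` stands in for the "–" (excluded) cells.

```python
def recode_continuation_ratio(y, n_levels=4):
    """Return the binary outcomes [Y1, ..., Y_{n_levels-1}] for an original category y.

    For comparison j: 1 if y equals j, 0 if y is below j,
    and None (excluded from that regression) if y is above j.
    """
    indicators = []
    for j in range(1, n_levels):
        if y == j:
            indicators.append(1)
        elif y < j:
            indicators.append(0)
        else:
            indicators.append(None)
    return indicators

# Reproduce the rows of the recoding table
for y in range(4):
    print(y, recode_continuation_ratio(y))
```

Each column of the result then serves as the outcome for one binary logistic regression, fitted only to the rows where that column is not `None`.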

You can fit either a constrained version (equal ORs across levels) or an unconstrained version (separate ORs at each level). The constrained version is more parsimonious, and a likelihood-ratio test comparing the two indicates whether the equal-OR constraint is tenable.
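To see how the separate binary regressions combine into predicted probabilities, note that regression j estimates the probability of being at level j given that the outcome is at level j or lower. The sketch below assumes the constrained form (stage-specific intercepts, one common slope) and uses made-up coefficient values purely for illustration; they are not estimates from any data in this lesson, and the function names are hypothetical.

```python
import math

def expit(eta):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-eta))

def continuation_ratio_probs(intercepts, slope, x):
    """Category probabilities P(Y=0), ..., P(Y=J) from continuation-ratio logits.

    intercepts[j-1] and slope define logit P(Y=j | Y<=j) for j = 1..J.
    """
    thetas = [expit(b0 + slope * x) for b0 in intercepts]
    n_top = len(thetas)
    probs = [0.0] * (n_top + 1)
    remaining = 1.0  # P(Y <= j), starting at the top level where it equals 1
    for j in range(n_top, 0, -1):
        probs[j] = thetas[j - 1] * remaining   # P(Y=j) = P(Y=j | Y<=j) * P(Y<=j)
        remaining *= 1.0 - thetas[j - 1]       # P(Y<=j-1)
    probs[0] = remaining
    return probs

# Hypothetical coefficients, for illustration only
probs = continuation_ratio_probs(intercepts=[-0.5, -1.0, -1.5], slope=0.4, x=1.0)
print([round(p, 3) for p in probs], round(sum(probs), 6))
```

Working from the highest level downward guarantees that the probabilities sum to 1, which is a useful check on any fitted set of coefficients.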

When to Use the Adjacent-Category Model

The adjacent-category model is appropriate when the comparison of interest is between neighbouring categories of an ordinal outcome. It is a natural choice when you believe the effect of a predictor operates by shifting individuals one category at a time. The model can be validated by comparing it to the unconstrained multinomial model via LRT.

When to Use the Continuation-Ratio Model

The continuation-ratio model is most appropriate when the outcome represents sequential stages that must be passed through in order. Each category must be reached before the next can be attained. Examples include: successive attempts at an exam, sequential rounds of treatment, or stages of career advancement. If categories can be reached without passing through lower levels, this model is not appropriate.

Comparing Models with LRT

When one model is a constrained (nested) version of another, the likelihood-ratio test can be used to compare them. The test statistic is \(-2(\ln L_{\text{constrained}} - \ln L_{\text{unconstrained}})\), which, if the constraint holds, approximately follows a χ² distribution with degrees of freedom equal to the difference in the number of parameters. A significant result suggests the constraint is not valid and the more complex model is needed.

Decision Guide: Choosing Among the Four Models

Step 1: Is the outcome nominal or ordinal? If nominal, use multinomial logistic regression.
Step 2: If ordinal, does the outcome represent sequential stages? If yes, consider the continuation-ratio model.
Step 3: If not sequential, fit the proportional-odds model and test the assumption. If it holds, use proportional-odds.
Step 4: If the proportional-odds assumption fails, consider the adjacent-category model, partial proportional-odds, or generalised ordinal logistic regression.
Step 5: Compare nested models using LRT to select the most parsimonious adequate model.
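The first four steps can be read as a simple branching rule. The sketch below encodes them as a function; the function name, argument names, and returned strings are illustrative choices, and Step 5 (comparing nested candidates by LRT) still has to be carried out on the fitted models themselves.

```python
def suggest_model(outcome_is_ordinal, stages_are_sequential=False,
                  proportional_odds_holds=None):
    """Follow Steps 1-4 of the decision guide and return a suggested starting point.

    proportional_odds_holds is None until the assumption has been tested.
    """
    if not outcome_is_ordinal:
        return "multinomial logistic regression"           # Step 1
    if stages_are_sequential:
        return "continuation-ratio model"                  # Step 2
    if proportional_odds_holds is None:
        return "fit proportional-odds model and test the assumption"  # Step 3
    if proportional_odds_holds:
        return "proportional-odds model"                   # Step 3
    return ("adjacent-category, partial proportional-odds, "
            "or generalised ordinal logistic regression")  # Step 4

# Example: an ordinal, non-sequential outcome where the assumption was rejected
print(suggest_model(True, stages_are_sequential=False, proportional_odds_holds=False))
```

The rule is only a guide: the research question can justify a different choice even where the branching points elsewhere.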

✎ Reflection

Can you think of an example from public health or epidemiology where a continuation-ratio model would be more appropriate than a proportional-odds model? What makes the outcome sequential in your example?

Model answer

Example: TB disease progression, from latent infection → primary TB → chronic TB → death. Each stage is a hurdle: you cannot get to stage 3 without progressing through stages 1 and 2. A continuation-ratio model estimates the probability of progressing past each stage conditional on having reached it; predictors can have different effects on each transition (e.g., HIV co-infection might strongly increase risk of progression from latent to active disease but have less effect on progression from active to chronic). The proportional-odds model wouldn't capture this because it assumes a single underlying latent severity with uniform predictor effects across cut-points; the continuation-ratio model explicitly models the sequential hurdle structure.

Other examples: cancer staging (in-situ → local → regional → distant), pregnancy outcomes (no pregnancy → early loss → mid-trimester loss → preterm birth → term).

HSCI 410 · Lesson 6
