Modelling Ordinal & Multinomial Data
Exploratory Data Analysis For Epidemiology
Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University
Learning objectives for this lesson:
- Select an appropriate model (multinomial, proportional-odds, adjacent-category, or continuation-ratio) based on study objectives and data
- Fit all of the models listed above
- Evaluate the assumptions on which each model is based
- Interpret OR estimates from each model
- Compute predicted probabilities from each model
This course was developed by Kiffer G. Card, PhD, as a companion to Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.
Introduction & Overview of Models
When Outcomes Have More Than Two Categories
In many epidemiological studies the outcome variable has more than two categories. These outcomes fall into two broad types: nominal data, where the categories have no natural ordering (e.g., type of disease, preferred clinic), and ordinal data, where the categories are ordered (e.g., pain severity: none, mild, moderate, severe).
The choice of model depends on whether the outcome is nominal or ordinal. Nominal data require multinomial logistic regression or log-linear models. Ordinal data can be analysed with the same multinomial model (ignoring the ordering), but more efficient approaches exploit the ordering: proportional-odds, adjacent-category, and continuation-ratio models.
The Apgar Score Example
Throughout Chapter 17, the authors use Apgar scores as a running example. Apgar scores (measured at birth) are recoded into four ordinal categories. The research question is whether the number of prenatal visits is associated with Apgar score category.
| Apgar Category | Code | Prenatal Visits < 6 | Prenatal Visits ≥ 6 | Total |
|---|---|---|---|---|
| 1–6 (Low) | 0 | 47 | 25 | 72 |
| 7 | 1 | 48 | 42 | 90 |
| 8 | 2 | 59 | 72 | 131 |
| 9–10 (High) | 3 | 134 | 227 | 361 |
| Total | | 288 | 366 | 654 |
Overview of the Four Models
Each of the four models for multi-category outcomes uses a different formulation of the logit (log-odds). Understanding the logit structure is the key to understanding each model.
The multinomial logistic model compares each outcome category to a baseline category. For J categories, it estimates J−1 sets of coefficients, and each set describes how the predictors relate to the log-odds of being in category j versus the baseline.
No assumptions about ordering are made, so this model is appropriate for both nominal and ordinal outcomes (though it is less efficient for ordinal data).
The proportional-odds model is based on cumulative probabilities. Its logit compares the probability of being at or above category j versus below it. A single coefficient per predictor applies at every cutpoint; this is the proportional-odds assumption.
This is the most common ordinal logistic model and is more parsimonious than the multinomial model.
The adjacent-category model compares each category to the adjacent (next lower) category. It is a constrained version of the multinomial model in which the coefficient for categories n levels apart equals n times the coefficient for adjacent categories.
Like the proportional-odds model, it estimates a single β1 per predictor.
The continuation-ratio model compares each category to all lower categories combined. It is especially appropriate when the outcome represents sequential stages that must be “passed through” to reach higher levels (e.g., number of attempts to achieve certification).
It can be fit as a series of separate binary logistic regressions with appropriately recoded outcome variables.
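The four logit formulations described above can be made concrete with observed counts. The following is a minimal sketch, not the authors' code: it computes empirical (unmodelled) logits from the "Prenatal Visits < 6" column of the Apgar table, and the function names are illustrative.

```python
import math

# Counts from the Apgar table, column "Prenatal Visits < 6",
# ordered from category 0 (Apgar 1-6) to category 3 (Apgar 9-10).
counts = [47, 48, 59, 134]


def baseline_logits(n, baseline):
    """Multinomial: log-odds of each category versus a chosen baseline."""
    return {j: math.log(n[j] / n[baseline]) for j in range(len(n)) if j != baseline}


def cumulative_logits(n):
    """Proportional-odds: log-odds of category j or higher versus below j."""
    return {j: math.log(sum(n[j:]) / sum(n[:j])) for j in range(1, len(n))}


def adjacent_logits(n):
    """Adjacent-category: log-odds of category j versus category j-1."""
    return {j: math.log(n[j] / n[j - 1]) for j in range(1, len(n))}


def continuation_logits(n):
    """Continuation-ratio: log-odds of category j versus all lower categories."""
    return {j: math.log(n[j] / sum(n[:j])) for j in range(1, len(n))}


for name, logits in [
    ("multinomial (baseline 3)", baseline_logits(counts, baseline=3)),
    ("cumulative", cumulative_logits(counts)),
    ("adjacent-category", adjacent_logits(counts)),
    ("continuation-ratio", continuation_logits(counts)),
]:
    print(name, {j: round(v, 3) for j, v in logits.items()})
```

Two of the formulations coincide at the extremes: at the lowest cutpoint the adjacent-category and continuation-ratio logits compare the same pair of categories, and at the highest cutpoint the cumulative and continuation-ratio logits are identical.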
✔ Check Your Understanding
1. Nominal outcome data differ from ordinal outcome data in that:
2. How many sets of coefficients does a multinomial logistic model estimate for J outcome categories?
3. Which model assumes the effect of a predictor is the same across all cutpoints?
✎ Reflection
Think of an ordinal outcome variable from your own field of study. What are the categories, and which of the four models introduced here do you think would be most appropriate? Why?
Multinomial Logistic Regression
The Multinomial Logistic Model
The multinomial logistic model simultaneously fits J−1 separate logistic models, each comparing one category to a chosen baseline. All parameters are estimated jointly, so the model accounts for the correlation among the comparisons.
Predicted Probabilities
The predicted probability for each outcome category is computed from the set of linear predictors. Let Xβ(j) denote the linear predictor for category j.
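For reference, the standard baseline-category form of these probabilities can be written as follows. This is the usual textbook expression rather than a formula quoted from the lesson; it assumes the coefficients of the baseline category b are fixed at zero, and the superscript notation is illustrative.

```latex
p(Y = j \mid X) = \frac{\exp\!\left(X\beta^{(j)}\right)}{\sum_{k} \exp\!\left(X\beta^{(k)}\right)},
\qquad \beta^{(b)} = 0
\quad\Longrightarrow\quad
p(Y = b \mid X) = \frac{1}{1 + \sum_{k \neq b} \exp\!\left(X\beta^{(k)}\right)}
```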
These probabilities always sum to 1 across all categories. Each predicted probability depends on all sets of coefficients, not just the coefficients for that category.
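A short sketch of this calculation, under the usual convention that the baseline category has a linear predictor of zero. The linear-predictor values below are hypothetical, chosen only for illustration, and the function name is not from any library.

```python
import math


def multinomial_probabilities(linear_predictors, baseline):
    """Turn the J-1 linear predictors into J category probabilities.

    linear_predictors maps each non-baseline category to its value of Xβ(j);
    the baseline category's linear predictor is fixed at 0.
    """
    terms = {baseline: 1.0}  # exp(0) for the baseline category
    for category, xb in linear_predictors.items():
        terms[category] = math.exp(xb)
    denominator = sum(terms.values())
    return {category: value / denominator for category, value in terms.items()}


# Hypothetical linear predictors for categories 0-2, with category 3 as baseline.
probs = multinomial_probabilities({0: -1.0, 1: -0.9, 2: -0.6}, baseline=3)

# Changing only the linear predictor for category 2 changes every probability.
shifted = multinomial_probabilities({0: -1.0, 1: -0.9, 2: 0.5}, baseline=3)

print({j: round(p, 3) for j, p in sorted(probs.items())})
print({j: round(p, 3) for j, p in sorted(shifted.items())})
```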
Interpreting Odds Ratios
Exponentiated coefficients from a multinomial model are technically relative risk ratios (RRRs), not true odds ratios. Each exp(β(j)) is the factor by which the ratio p(Y = j) / p(Y = baseline) is multiplied for a one-unit increase in the predictor.
Because the multinomial model estimates separate coefficients for each comparison, the effects can differ across categories. For ordinal outcomes, you would typically expect a gradient—more pronounced effects for the categories furthest from the baseline.
Presenting Predicted Probabilities and Testing Significance
Predicted probabilities from a multinomial model vary by the values of all predictors. To communicate results, it is often useful to compute predicted probabilities at specific covariate patterns (e.g., prenatal visits < 6 vs ≥ 6) and present them in a table or graph.
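As an illustration, consider a multinomial model containing only the binary prenatal-visits predictor (an assumption; the lesson's fitted model may include other covariates). Such a model is saturated, so its predicted probabilities equal the observed column proportions in the Apgar table, which can be tabulated directly:

```python
# Counts from the Apgar table: list positions are Apgar categories 0-3,
# keys are the two prenatal-visit groups.
counts = {
    "<6": [47, 48, 59, 134],
    ">=6": [25, 42, 72, 227],
}


def predicted_probabilities(group_counts):
    """Observed proportions, which equal the fitted probabilities of a
    multinomial model whose only predictor is this binary grouping."""
    total = sum(group_counts)
    return [n / total for n in group_counts]


table = {group: predicted_probabilities(n) for group, n in counts.items()}

for group, probs in table.items():
    print(group, [round(p, 3) for p in probs])
```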
Testing significance can be done with Wald tests (for individual coefficients) or likelihood-ratio tests (for overall effects). Because the multinomial model has J−1 coefficients per predictor, an overall LRT that tests all J−1 simultaneously is generally preferred over examining individual coefficients.
Independence of Irrelevant Alternatives (IIA)
The multinomial logistic model assumes IIA: the odds of choosing one category over another are independent of what other categories are available. If this assumption is violated, adding or removing a category would change the odds between the remaining categories.
Two tests are available: the Hausman-McFadden test and the Small-Hsiao test. However, these tests often give conflicting results, and IIA violations are primarily a concern for nominal data where alternatives are genuinely substitutable (e.g., choosing a mode of transport). For ordinal data, IIA is rarely a practical concern.
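The IIA property is built into the functional form of the multinomial logit, which a small numerical sketch can show. The linear-predictor values and transport modes below are hypothetical. The sketch demonstrates what the model implies, not whether real data satisfy it; that is what the tests above try to assess.

```python
import math


def probabilities(linear_predictors):
    """Multinomial-logit probabilities from a dict of linear predictors."""
    total = sum(math.exp(v) for v in linear_predictors.values())
    return {k: math.exp(v) / total for k, v in linear_predictors.items()}


# Hypothetical linear predictors for three transport modes (car is the baseline at 0).
full = {"car": 0.0, "bus": -0.4, "train": 0.3}
reduced = {k: v for k, v in full.items() if k != "train"}  # train removed

p_full = probabilities(full)
p_reduced = probabilities(reduced)

# Under the model, the bus-versus-car odds do not depend on whether train is available.
odds_full = p_full["bus"] / p_full["car"]
odds_reduced = p_reduced["bus"] / p_reduced["car"]
print(round(odds_full, 4), round(odds_reduced, 4))
```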
In the Apgar score example, the multinomial model (with category 3 [9–10] as baseline) produced the following key results for prenatal visits (≥6 vs <6):
- Category 0 vs 3: OR = 0.24. Among those with ≥6 visits, the risk of a low Apgar score (1–6) relative to a score of 9–10 is 76% lower.
- Category 1 vs 3: OR = 0.65. The risk of Apgar = 7 relative to a score of 9–10 is 35% lower.
- Category 2 vs 3: OR = 0.72. The risk of Apgar = 8 relative to a score of 9–10 is 28% lower.
The gradient (0.24 → 0.65 → 0.72) shows the strongest effect for the lowest Apgar category, as expected for an ordinal outcome.
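The percentage statements above follow from the ratios by simple arithmetic: a ratio r corresponds to a change of (r − 1) × 100%. A small sketch using the three estimates reported in the lesson (the helper name is illustrative):

```python
# Exponentiated coefficients reported above for prenatal visits (>=6 vs <6).
reported = {"0 vs 3": 0.24, "1 vs 3": 0.65, "2 vs 3": 0.72}


def percent_change(ratio):
    """Percentage change implied by a ratio; negative values are reductions."""
    return (ratio - 1.0) * 100.0


changes = {comparison: round(percent_change(r)) for comparison, r in reported.items()}
print(changes)
```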
Regression diagnostics for the multinomial model can be performed by fitting ordinary logistic models for pairs of categories and applying standard diagnostic techniques.
Alternative-Specific Data
In some situations, predictors may vary across alternatives rather than (or in addition to) varying across observations. For example, in a study of clinic choice, the distance to each clinic varies by alternative. Special formulations of the multinomial model (conditional logit or mixed logit) accommodate such alternative-specific data.
✔ Check Your Understanding
1. In multinomial logistic regression, the exponentiated coefficients represent:
2. The IIA assumption states that:
3. How are predicted probabilities computed from a multinomial model?
✎ Reflection
Consider the Apgar score example. Why do you think the OR for the lowest Apgar category (0.24) is more extreme than for the middle categories? What does this gradient tell us about prenatal care and birth outcomes?
Proportional-Odds Model
The Most Common Ordinal Model
The proportional-odds model (also called the cumulative logit model or ordinal logistic regression) is the most widely used model for ordinal outcomes. It is based on the idea of an underlying continuous latent variable that is divided into the observed ordinal categories by a series of cutpoints.
The latent variable S_i is divided by cutpoints (τ_1, τ_2, …, τ_(J−1)) into J observed categories. If S_i falls between τ_(j−1) and τ_j, the observation is classified into category j.
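The cutpoint rule can be sketched in a few lines. The cutpoint values are hypothetical, and assigning a latent value that falls exactly on a cutpoint to the lower category is an assumption made here for definiteness.

```python
from bisect import bisect_left


def classify(latent_value, cutpoints):
    """Return the category (1..J) for a latent value, given sorted cutpoints.

    A value at or below the first cutpoint is category 1; a value above the
    last cutpoint is category J. Values exactly on a cutpoint go to the lower
    category (an assumed convention).
    """
    return bisect_left(cutpoints, latent_value) + 1


cutpoints = [-1.0, 0.0, 1.5]  # hypothetical cutpoints, giving J = 4 categories
categories = [classify(s, cutpoints) for s in (-2.0, -0.5, 0.7, 2.0)]
print(categories)
```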
The Proportional-Odds Logit
The model takes the form of a cumulative logit: logit(p(Y ≥ j)) = β0j + βX. The key feature is that the intercept varies across cutpoints (giving parallel lines on a logit scale) but the slope coefficients are the same for every cutpoint. This means a single OR summarises the effect of each predictor across all levels of the outcome.
In the Apgar score example, the proportional-odds model yields an OR of 1.59 for prenatal visits (≥6 vs <6). This means that individuals with 6 or more prenatal visits have 1.59 times the odds of being at or above any given Apgar category, compared to those with fewer visits. This single OR applies at every cutpoint (0 vs 1+, 0–1 vs 2+, and 0–2 vs 3).
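To see what a single OR at every cutpoint means numerically, the sketch below uses the reported OR of 1.59 together with hypothetical intercepts (the lesson does not report the fitted intercepts). It converts the cumulative logits into category probabilities and confirms that the cumulative odds ratio is identical at each cutpoint.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(intercepts, beta, x):
    """Probabilities for categories 0..J-1 under logit(p(Y >= j)) = b0j + beta * x.

    intercepts holds b0j for cutpoints j = 1..J-1 and must decrease with j.
    """
    at_or_above = [1.0] + [expit(b0 + beta * x) for b0 in intercepts] + [0.0]
    return [at_or_above[j] - at_or_above[j + 1] for j in range(len(at_or_above) - 1)]


def cumulative_odds(intercepts, beta, x):
    """Odds of being at or above each cutpoint."""
    return [math.exp(b0 + beta * x) for b0 in intercepts]


beta = math.log(1.59)          # log of the OR reported for prenatal visits
intercepts = [1.6, 0.7, -0.1]  # hypothetical values, for illustration only

probs_fewer = category_probabilities(intercepts, beta, x=0)  # < 6 visits
probs_more = category_probabilities(intercepts, beta, x=1)   # >= 6 visits

odds_ratios = [
    more / fewer
    for more, fewer in zip(cumulative_odds(intercepts, beta, 1),
                           cumulative_odds(intercepts, beta, 0))
]
print([round(r, 2) for r in odds_ratios])
```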
Testing the Proportional-Odds Assumption
The proportional-odds assumption is that the effect of each predictor is the same at every cutpoint. If violated, the model may give misleading results. Three main approaches are available for testing it:
- Approximate LRT: Compare the log-likelihoods of the proportional-odds model and the multinomial model. A significant difference suggests the proportional-odds assumption is violated.
- Wolfe-Gould approximate LRT: Based on J−1 separate binary logistic models, one at each cutpoint. Sum their log-likelihoods and compare the total to the log-likelihood of the proportional-odds model.
- Brant (Wald) test: Provides both an overall test and individual tests for each predictor, showing which specific variables violate the assumption.
If the proportional-odds assumption is violated, a generalised ordinal logistic regression model allows separate coefficients at each cutpoint. This model is equivalent to fitting J−1 separate binary logistic regressions simultaneously. It is more flexible but less parsimonious than the proportional-odds model.
A compromise approach is the partial proportional-odds model, which relaxes the proportional-odds assumption for selected predictors only (those that fail the Brant test) while maintaining it for the rest. This provides a good balance between flexibility and parsimony. Other alternatives include the stereotype logistic model and the heterogeneous choice logistic model.
The proportional-odds assumption is often violated in practice, especially with many predictors or when the outcome categories represent very different phenomena. Always test this assumption before reporting results from a proportional-odds model. If violated, consider a partial proportional-odds model or generalised ordinal logistic regression.
Brant Test Results Example
The Brant test provides both an overall test and predictor-specific tests. Here is an example of how results might be presented:
| Predictor | χ² | df | P-value | Assumption Holds? |
|---|---|---|---|---|
| Prenatal visits | 2.14 | 2 | 0.343 | Yes |
| Maternal age | 8.92 | 2 | 0.012 | No |
| Parity | 1.03 | 2 | 0.598 | Yes |
| Overall | 12.45 | 6 | 0.053 | Borderline |
In this example, only maternal age violates the assumption. A partial proportional-odds model that allows maternal age to have different effects at each cutpoint (while constraining prenatal visits and parity) would be appropriate.
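The P-values in the example table can be checked from the χ² statistics and degrees of freedom. The sketch below uses only the standard library and a closed-form upper-tail probability that holds for even degrees of freedom, which covers every row of this table; the function name is illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square statistic for even df (closed form)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# (chi-square, df) pairs copied from the example table.
brant = {
    "Prenatal visits": (2.14, 2),
    "Maternal age": (8.92, 2),
    "Parity": (1.03, 2),
    "Overall": (12.45, 6),
}

p_values = {name: round(chi2_sf_even_df(x, df), 3) for name, (x, df) in brant.items()}
print(p_values)
```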
Regression Diagnostics
Regression diagnostics for the proportional-odds model can be conducted by fitting binary logistic models at each cutpoint and applying the diagnostic techniques from Chapter 16 (residual analysis, influence measures, goodness-of-fit tests).
✔ Check Your Understanding
1. The proportional-odds model assumes:
2. If the proportional-odds assumption is violated for some but not all predictors, which model can be used?
3. The latent variable in a proportional-odds model represents:
✎ Reflection
Why do you think the proportional-odds assumption is so often violated in practice? Can you think of a scenario in your own research where you would expect the effect of a predictor to differ across cutpoints?
Adjacent-Category & Continuation-Ratio Models
Adjacent-Category Model
The adjacent-category model compares the probability of being in category j versus category j−1 (the next lower category). It is a constrained version of the multinomial logistic model: the constraint is that the coefficient for categories n levels apart equals n times the coefficient for adjacent categories.
Like the proportional-odds model, the adjacent-category model estimates a single β1 per predictor, making it more parsimonious than the unconstrained multinomial model. The validity of this constraint can be tested by comparing the adjacent-category model to the unconstrained multinomial model using a likelihood-ratio test (LRT).
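The "n times the coefficient" constraint follows directly from the adjacent-category logit. A short derivation, writing the category-specific intercepts as β_{0j} (notation assumed here for illustration):

```latex
\ln\frac{p(Y=j)}{p(Y=j-1)} = \beta_{0j} + \beta_1 X
\quad\Longrightarrow\quad
\ln\frac{p(Y=j)}{p(Y=j-n)}
  = \sum_{k=j-n+1}^{j} \ln\frac{p(Y=k)}{p(Y=k-1)}
  = \sum_{k=j-n+1}^{j} \beta_{0k} + n\,\beta_1 X
```

The ratio comparing categories n levels apart is therefore exp(nβ1), the adjacent-category OR raised to the power n.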
For the Apgar score data, the LRT comparing the adjacent-category model to the unconstrained multinomial model yielded χ² = 6.76, df = 5, P = 0.239. Since this is not significant, the adjacent-category model is a valid simplification of the multinomial model for these data.
Continuation-Ratio Model
The continuation-ratio model compares the probability of being in category j versus all lower categories combined. It is particularly useful when the outcome represents sequential stages that must be “passed through” to reach higher levels.
The continuation-ratio model is ideal for outcomes where each level must be reached before the next can be attained. Examples include: number of attempts to pass an exam, stages of disease progression where remission must occur before relapse, or sequential rounds of a selection process. It is NOT appropriate when movements between categories are not sequential (e.g., Apgar scores, where a baby does not “pass through” each score level).
Fitting the Continuation-Ratio Model
The continuation-ratio model can be fit as a series of separate binary logistic regressions with a recoded outcome variable. For each comparison:
- Y = 1 for the level of interest
- Y = 0 for all lower levels
- Observations at higher levels are excluded (treated as missing)
Consider an example with 4 categories representing the number of attempts to gain admission to medical school (1, 2, 3, 4+):
| Original Category | Y1 (1 vs 0) | Y2 (2 vs 0–1) | Y3 (3 vs 0–2) |
|---|---|---|---|
| 0 (1 attempt) | 0 | 0 | 0 |
| 1 (2 attempts) | 1 | 0 | 0 |
| 2 (3 attempts) | — | 1 | 0 |
| 3 (4+ attempts) | — | — | 1 |
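The recoding rule in the table can be sketched as a small helper. The function name is illustrative, and `None` stands in for an excluded (missing) observation.

```python
def continuation_ratio_recode(category, n_categories=4):
    """Return [Y1, ..., Y(J-1)] for an outcome coded 0..J-1.

    For each level: 1 if the observation is at that level, 0 if it is at a
    lower level, and None (excluded) if it is at a higher level.
    """
    recoded = []
    for level in range(1, n_categories):
        if category == level:
            recoded.append(1)
        elif category < level:
            recoded.append(0)
        else:
            recoded.append(None)  # higher than the level of interest: excluded
    return recoded


for category in range(4):
    print(category, continuation_ratio_recode(category))
```

Each recoded column can then serve as the outcome of its own binary logistic regression, with the `None` rows dropped.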
You can fit either a constrained version (equal ORs across levels) or an unconstrained version (separate ORs at each level). The constrained version is more parsimonious, and a likelihood-ratio test comparing the two indicates whether the constraint is tenable.
The adjacent-category model is appropriate when the comparison of interest is between neighbouring categories of an ordinal outcome. It is a natural choice when you believe the effect of a predictor operates by shifting individuals one category at a time. The model can be validated by comparing it to the unconstrained multinomial model via LRT.
The continuation-ratio model is most appropriate when the outcome represents sequential stages that must be passed through in order. Each category must be reached before the next can be attained. Examples include: successive attempts at an exam, sequential rounds of treatment, or stages of career advancement. If categories can be reached without passing through lower levels, this model is not appropriate.
When one model is a constrained (nested) version of another, the likelihood-ratio test can be used to compare them. The test statistic is −2(ln L_constrained − ln L_unconstrained), which follows a χ² distribution with degrees of freedom equal to the difference in the number of parameters. A significant result suggests the constraint is not valid and the more complex model is needed.
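A minimal standard-library sketch of this test follows. The log-likelihood values are hypothetical, chosen so that the statistic matches the χ² = 6.76 (df = 5) reported earlier for the adjacent-category versus multinomial comparison; the lesson does not report the underlying log-likelihoods. The χ² tail probability uses closed forms valid for positive integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail chi-square probability for a positive integer df."""
    if df < 1 or x < 0:
        raise ValueError("need df >= 1 and x >= 0")
    half = x / 2.0
    if df % 2 == 0:
        return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
    # Odd df: complementary error function plus a finite series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for i in range((df - 1) // 2):
        total += term
        term *= x / (2 * i + 3)
    return total


def likelihood_ratio_test(lnl_constrained, lnl_unconstrained, df):
    """Return the LRT statistic and its P-value."""
    statistic = -2.0 * (lnl_constrained - lnl_unconstrained)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods that differ by 3.38, giving a statistic of 6.76.
statistic, p_value = likelihood_ratio_test(-700.00, -696.62, df=5)
print(round(statistic, 2), round(p_value, 3))
```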
Step 1: Is the outcome nominal or ordinal? If nominal, use multinomial logistic regression.
Step 2: If ordinal, does the outcome represent sequential stages? If yes, consider the continuation-ratio model.
Step 3: If not sequential, fit the proportional-odds model and test the assumption. If it holds, use proportional-odds.
Step 4: If the proportional-odds assumption fails, consider the adjacent-category model, partial proportional-odds, or generalised ordinal logistic regression.
Step 5: Compare nested models using LRT to select the most parsimonious adequate model.
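The steps above can be summarised as a small decision helper. This is only a sketch of the lesson's flow; the function and argument names are hypothetical, and the final choice among candidates still rests on the nested-model comparisons in Step 5.

```python
def suggest_models(ordinal, sequential=False, proportional_odds_holds=None):
    """Return candidate models following Steps 1-4 of the selection guide.

    proportional_odds_holds is None until the assumption has been tested.
    """
    if not ordinal:
        return ["multinomial logistic regression"]            # Step 1
    if sequential:
        return ["continuation-ratio model"]                   # Step 2
    if proportional_odds_holds is None:
        return ["proportional-odds model (test the assumption next)"]  # Step 3
    if proportional_odds_holds:
        return ["proportional-odds model"]                    # Step 3
    return [                                                  # Step 4
        "adjacent-category model",
        "partial proportional-odds model",
        "generalised ordinal logistic regression",
    ]


# Example: an ordinal, non-sequential outcome where the assumption failed.
candidates = suggest_models(ordinal=True, sequential=False, proportional_odds_holds=False)
print(candidates)
```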
✔ Check Your Understanding
1. In the adjacent-category model, the coefficient for categories n levels apart is:
2. The continuation-ratio model is most appropriate when:
3. If an LRT comparing the adjacent-category model to the multinomial model is NOT significant, this suggests:
✎ Reflection
Can you think of an example from public health or epidemiology where a continuation-ratio model would be more appropriate than a proportional-odds model? What makes the outcome sequential in your example?
Lesson 5 — Final Assessment
This assessment covers all sections of Lesson 5. You must answer all 15 questions correctly to complete the lesson. Read each question carefully and review the feedback for any incorrect answers before retrying.
✎ Final Reflection
Now that you have completed all four sections, summarise the key differences among the four models for multi-category outcomes. When would you choose each one, and what assumptions would you need to verify?
✔ Final Assessment
1. What type of outcome data has categories with no natural ordering?
2. How many sets of coefficients does a multinomial model with 4 outcome categories estimate?
3. In a proportional-odds model, the OR for a predictor represents:
4. The logit in a proportional-odds model is based on:
5. The IIA assumption in multinomial logistic regression means:
6. If the proportional-odds assumption is clearly violated, a good alternative is:
7. In a multinomial model, exponentiated coefficients are technically:
8. The Brant test evaluates:
9. The adjacent-category model is a constrained version of:
10. Continuation-ratio models are best suited for:
11. The latent variable in a proportional-odds model has cutpoints (τ) that:
12. To compare nested ordinal models, you should use:
13. If the LRT comparing proportional-odds to multinomial models is significant:
14. In a continuation-ratio model, observations at higher levels than the one being modelled are:
15. Which model is most parsimonious for ordinal data when the proportional-odds assumption holds?