# Lesson 11 — Analytic Induction, QCA, and Decision Models (v3 expanded)

*Companion-podcast transcript • Sarah & Kiffer*  
*~4801 words • ~26.7 min audio*

---

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today is Lesson eleven, Analytic Induction, Q C A, and Decision Models. And this is a lesson with a particular flavor compared to the rest of the course. Most of what we've done has been variable-oriented in a loose sense. Looking at codes, themes, and concepts across transcripts. This lesson takes a different turn. All three methods today are case-oriented in a strong, explicit sense. The unit of analysis is the case. The analytic move is comparison across cases. And the goal is defensible causal or quasi-causal inference from a small number of cases.

**Sarah:** Right. And this is the part of the qualitative toolkit that does work the standard epidemiology methods don't do well. Statistical regression assumes you have enough cases for the central limit theorem to do its work, and that the causal structure of the world is best approximated by linear, additive, probabilistic relationships. For many public-health questions that assumption is fine. For questions about decision processes, configurations of conditions, and outcomes that depend on combinations of factors rather than net effects, it's wrong.

**Kiffer:** And the three methods today, analytic induction, qualitative comparative analysis, and ethnographic decision modeling, are the methodologically defensible alternatives. Let's go through them in order, then talk about how this all lands in the week eleven milestone.

**Sarah:** Okay. Analytic induction first. The oldest of the three.

**Kiffer:** Right. Formulated by the Polish-American sociologist Florian Znaniecki in nineteen thirty-four in his book The Method of Sociology. Demonstrated by Alfred Lindesmith on opiate addiction in nineteen forty-seven, and applied by Donald Cressey to financial trust violation in nineteen fifty-three. Analytic induction is the ancestor of all the case-oriented qualitative-causal methods that follow.

**Sarah:** And its central claim is methodologically aggressive. A defensible qualitative hypothesis must account for every case in the dataset, and a single counter-case is enough to force revision.

**Kiffer:** Yeah. Let's walk through the procedure. Six steps. First, define the phenomenon to be explained tightly enough that you can decide whether any given case is or is not an instance. Loneliness is too loose. Sustained loneliness lasting at least six months that the participant explicitly names as such is tight enough. Second, formulate a hypothesis naming the conditions under which the phenomenon occurs. Third, examine one case. Does the hypothesis fit?

**Sarah:** Fourth, if it fits, move to another case. The fit cases don't confirm the hypothesis. They merely fail to disconfirm it. The work of the method is in the disconfirmations.

**Kiffer:** Fifth, when a non-fit case is found, do one of two things. Either revise the hypothesis to accommodate the new case, or redefine the phenomenon to exclude it. Both are legitimate moves but they are not equivalent. Revising the hypothesis broadens explanatory scope. Redefining the phenomenon narrows it. And sixth, continue until all cases in the dataset fit the hypothesis under the current definition. The terminal state is a definition-plus-hypothesis pair that explains one hundred percent of the cases.

**Sarah:** And the procedure produces, in principle, a statement of necessary and sufficient conditions. Whenever the conditions hold, the phenomenon occurs. And whenever the phenomenon occurs, the conditions hold. That's a much stronger logical claim than probabilistic association.

**Kiffer:** Right. Most quantitative methods aim only at probabilistic association. Analytic induction aims at necessary-and-sufficient relationships. That's what makes it methodologically aggressive.

**Sarah:** Compare it to grounded theory, which we covered earlier. Both are comparison-driven.

**Kiffer:** Yeah. Grounded theory's comparisons feed concept development. Analytic induction's comparisons test a propositional hypothesis. In grounded theory, the negative case is a tool for refining a category. In analytic induction, the negative case is the engine of the entire method. The case that doesn't fit is the case that does the analytic work, because it forces revision.

**Sarah:** Let's walk through the Lindesmith example, because it's the classical demonstration.

**Kiffer:** Lindesmith began with the prevailing hypothesis of the nineteen forties. Opiate addiction was a function of the pleasurable euphoria opiates produce. Addicts continued using because they sought the high. He interviewed addicts, a fully qualitative dataset by today's standards, and found cases that didn't fit. Some people who experienced significant euphoria from opiates didn't become addicted. Some hospitalized patients who received heavy opiate doses for medical reasons didn't become addicted even when they experienced withdrawal symptoms.

**Sarah:** So he revised the hypothesis.

**Kiffer:** Right. His new claim was that addiction develops when a person uses opiates, experiences withdrawal, recognizes the withdrawal as caused by the absence of the drug, and uses the drug specifically to relieve withdrawal. The hospitalized patients who didn't become addicted hadn't recognized the withdrawal-drug connection, because they didn't know what they were receiving. The cognitive recognition step, the conscious linking of withdrawal to the absence of the substance, was, Lindesmith argued, the necessary and sufficient condition.

**Sarah:** And Cressey's work on embezzlers is the parallel demonstration.

**Kiffer:** Yeah. Cressey interviewed one hundred thirty-three incarcerated trust violators in three prisons. Through repeated revisions, he arrived at a three-part necessary-and-sufficient condition. Trust violation occurs when the person has a non-shareable financial problem, has knowledge of how trust violation could solve the problem, and can rationalize the violation as something other than trust violation. The rationalization step was what made the work famous in criminology. Cases that lacked an available rationalization did not proceed to trust violation even when problems and knowledge were present.

**Sarah:** Let me try to make this concrete on our loneliness dataset. We can walk a small analytic induction through it. The phenomenon to be explained is professional help-seeking for loneliness. Defined as the participant describes consulting a clinician, therapist, or counselor explicitly because of loneliness or its symptoms.

**Kiffer:** Right. Initial hypothesis. Loneliness leads to professional help-seeking. Read across the twenty transcripts. Maya, twenty-two, doesn't seek professional help despite reporting significant loneliness. Counter-case at case one. Revision. Loneliness leads to professional help-seeking when it reaches a clinical threshold, such as sleep disruption or functional impairment. Diana, an early-career professional with documented insomnia, doesn't seek help. She explicitly describes not wanting to be the kind of person who pays someone to listen to them. Counter-case at case four.

**Sarah:** Revision. Loneliness leads to professional help-seeking when it reaches a clinical threshold AND the participant doesn't hold strong cultural prohibitions against professional help.

**Kiffer:** Then Marcus, an older man who's lost his wife, fits the revised hypothesis. He seeks help through his church and his G P, and gets referred to grief counseling. But then Margaret, a working-class single mother with severe loneliness and no cultural prohibitions, doesn't seek help because she can't afford counseling and the wait-list for publicly funded services is nine months. Counter-case at case twelve.

**Sarah:** Revision. Professional help-seeking requires clinical threshold AND cultural permission AND permeable structural gates. Cost, wait-list, geography, language.

**Kiffer:** Continue across remaining transcripts. The hypothesis withstands the next several but fails on Amira, the Syrian refugee, who meets all three conditions but doesn't seek help because she doesn't trust that the system will treat her confidentially. She fears immigration consequences. Add a fourth condition. The participant trusts the system not to inflict secondary harms.

**Sarah:** So you end up with a four-part conjunctive hypothesis that fits every case in your dataset. That's how analytic induction proceeds. The conditions are jointly necessary and sufficient for the phenomenon as you've come to define it.
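
The counter-case check in that walk-through can be sketched in a few lines of Python. This is a minimal illustration, not the course's procedure: the True/False codings are assumptions read off the five participants described above (values the conversation never states, such as Maya's cultural permission, are filled in arbitrarily), and the names `CASES`, `predicts`, and `counter_cases` are hypothetical.

```python
# Illustrative codings only: assumptions based on the five participants
# described in the walk-through, not real study data.
CASES = {
    "Maya":     {"threshold": False, "permission": True,  "gates": True,  "trust": True,  "sought_help": False},
    "Diana":    {"threshold": True,  "permission": False, "gates": True,  "trust": True,  "sought_help": False},
    "Marcus":   {"threshold": True,  "permission": True,  "gates": True,  "trust": True,  "sought_help": True},
    "Margaret": {"threshold": True,  "permission": True,  "gates": False, "trust": True,  "sought_help": False},
    "Amira":    {"threshold": True,  "permission": True,  "gates": True,  "trust": False, "sought_help": False},
}


def predicts(hypothesis, case):
    """A conjunctive hypothesis predicts the outcome only if every condition holds."""
    return all(case[condition] for condition in hypothesis)


def counter_cases(hypothesis, cases, outcome="sought_help"):
    """Return the names of cases where the prediction and the observed outcome disagree."""
    return [name for name, case in cases.items()
            if predicts(hypothesis, case) != case[outcome]]


three_part = ["threshold", "permission", "gates"]
four_part = three_part + ["trust"]

print(counter_cases(three_part, CASES))  # ['Amira']
print(counter_cases(four_part, CASES))   # []
```

Because the check compares prediction and outcome in both directions, it only means something when the dataset includes people who did not seek help as well as people who did.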

**Kiffer:** Right. But you can also redefine the phenomenon at any point. For example, restricting it to participants for whom professional help is even imaginable as an option would let you drop the trust condition. Either move is defensible. Both should be documented in the audit trail.

**Sarah:** Now the critique. W S Robinson in nineteen fifty-one wrote The Logical Structure of Analytic Induction, which is the most influential critique.

**Kiffer:** Yeah. Robinson argued that analytic induction doesn't actually produce what it claims to produce, for two reasons. First, analytic induction only examines cases in which the phenomenon occurs. Lindesmith interviewed addicts. He didn't interview a comparison group of non-addicts. Cressey interviewed trust violators. He didn't interview a matched sample of people in similar positions who didn't violate trust. Without negative cases, cases where the conditions are present but the phenomenon doesn't occur, you can't establish sufficiency. You can only establish necessity. The method, as practiced, gives you necessary conditions but not sufficient ones.

**Sarah:** And the second critique.

**Kiffer:** The revision step is unconstrained. When you find a counter-case, you can always revise the hypothesis or redefine the phenomenon to accommodate it. There's no logical limit on how baroque the resulting formula can become. A sufficiently committed analyst can always tune the hypothesis to fit any finite case-set. The resulting statement is therefore not a universal generalization but a description of this particular case-set.

**Sarah:** And the contemporary response.

**Kiffer:** Twofold. Treat analytic induction as a heuristic for hypothesis generation, not as a logically deductive proof procedure. The hypothesis you produce is a candidate for further testing, ideally including negative-case sampling that addresses Robinson's first objection. And pre-register the hypothesis revision rules to constrain the second concern.

**Sarah:** And the loneliness capstone dataset actually has both kinds of cases. Participants who do and don't seek professional help. So if students do analytic induction for week eleven, they're not running a Robinson-vulnerable instance-only procedure. They're sampling for variation on both the outcome and the conditions.

**Kiffer:** Right. Document that move in the memo. Because it's a real methodological advantage.

**Sarah:** Okay. Move us to Q C A. Qualitative comparative analysis. Charles Ragin's method.

**Kiffer:** Q C A was developed by Charles Ragin in The Comparative Method in nineteen eighty-seven and elaborated in Fuzzy-Set Social Science in two thousand and in Redesigning Social Inquiry in two thousand eight. The central insight is that many real-world causal stories are not about net effects of single variables, which is the regression idiom, but about combinations of conditions that are jointly sufficient for an outcome. A risk factor that's irrelevant on its own can be essential in combination with another. A protective factor that works in one configuration can be neutralized in another.

**Sarah:** And Q C A gives you a disciplined, Boolean-algebra-based way to find and report those configurations.

**Kiffer:** Right. To motivate it, contrast with logistic regression. You run a logistic regression of a binary outcome on three predictors. You get three coefficients, each representing the net effect of one variable holding the others constant. The model assumes the effects are additive on the log-odds scale. It treats every case as a draw from a population with that net-effect structure.

**Sarah:** And many causal structures in public health don't look like that.

**Kiffer:** Right. Consider obesity policy. A school nutrition program might reduce childhood obesity only when the local food environment supports it, AND parental income is above a threshold, AND the school has stable administrative leadership. The program alone has no effect. The food environment alone has no effect. Income alone has no effect. The three together produce the outcome. A regression with three main effects and three two-way interactions and one three-way interaction can in principle capture this, but only with enough cases and only if you remembered to put the interactions in.

**Sarah:** And Q C A approaches the same problem differently.

**Kiffer:** Each case is represented as a configuration. A vector of present-absent, one-zero values on the conditions. The analyst tabulates how cases distribute across configurations and asks, which configurations consistently produce the outcome? The answer is a Boolean expression. Food environment AND income AND leadership implies outcome reduction. There's no main effect of any single condition. There's one sufficient configuration. Q C A finds it. Regression can't, without prior knowledge of the interaction structure.

**Sarah:** And Ragin names four features of causal structures that Q C A handles well and regression handles poorly. Conjunctural causation, equifinality, causal asymmetry, and limited diversity.

**Kiffer:** Let's walk through them quickly. Conjunctural causation. The outcome depends on combinations of conditions, not on single variables. Equifinality. There's more than one combination of conditions sufficient for the outcome. Multiple paths. Causal asymmetry. The conditions producing the presence of the outcome are different from the conditions producing its absence. The negation is not just the inverse. And limited diversity. There aren't enough cases to populate every theoretically possible configuration, so the method must explicitly distinguish what the data show from what they can't show.

**Sarah:** The fourth, limited diversity, is what makes Q C A usable with small-N datasets where regression isn't.

**Kiffer:** Right. The fundamental data object in Q C A is the truth table. A row for every theoretically possible configuration of conditions, with a column indicating how many cases of each kind exist and whether they produce the outcome. With K binary conditions there are two-to-the-K rows. Four conditions, sixteen rows. Five conditions, thirty-two rows. Most empirical truth tables are sparse. Many rows have zero cases, reflecting that the social world doesn't produce every combination.

**Sarah:** Let me walk through a toy example on the loneliness dataset. Code each of the twenty transcripts on four binary conditions. B for bereaved, lost a primary attachment figure in the past five years. L for lives alone. I for immigrant. C for has a current caregiving role. And one binary outcome. Y for describes loneliness as existential rather than situational.

**Kiffer:** Right. Each transcript is coded as a single row of zeros and ones. The sixteen possible configurations are summarized in a truth table that shows, for each configuration, how many cases exist and whether the outcome occurs. The rows with zero cases are logical remainders. Theoretically possible configurations that don't appear in the data.

**Sarah:** And logical remainders aren't the same as rows with cases that produced no outcome. They're rows we can't speak to.
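
A rough Python sketch of how such a truth table might be tabulated, assuming each transcript has already been coded as a tuple of zeros and ones. The `coded_cases` rows are invented for illustration, and `truth_table` is a hypothetical helper, not a function from any Q C A package.

```python
from collections import defaultdict
from itertools import product

CONDITIONS = ("B", "L", "I", "C")  # bereaved, lives alone, immigrant, caregiving

# Hypothetical codings, one tuple (B, L, I, C, Y) per transcript. Not real data.
coded_cases = [
    (1, 1, 0, 0, 1),
    (1, 1, 0, 0, 1),
    (1, 1, 1, 0, 1),
    (0, 1, 1, 0, 1),
    (0, 1, 0, 1, 0),
    (0, 0, 0, 1, 0),
    (1, 0, 0, 1, 0),
]


def truth_table(cases, k):
    """Tabulate all 2**k configurations, including rows with no cases."""
    tally = defaultdict(lambda: [0, 0])  # configuration -> [n_cases, n_with_outcome]
    for *config, y in cases:
        tally[tuple(config)][0] += 1
        tally[tuple(config)][1] += y
    rows = []
    for config in product((0, 1), repeat=k):
        n, n_y = tally.get(config, (0, 0))
        if n == 0:
            status = "remainder"  # no cases: a row the data cannot speak to
        elif n_y == n:
            status = "Y"
        elif n_y == 0:
            status = "not-Y"
        else:
            status = "mixed"      # cases in this row disagree on the outcome
        rows.append((config, n, n_y, status))
    return rows


table = truth_table(coded_cases, len(CONDITIONS))
print(len(table))  # 16 rows for four binary conditions
for config, n, n_y, status in table:
    if n:
        print(config, n, n_y, status)
```

The sketch keeps remainder rows distinct from observed rows with no outcome, so the two are never confused downstream.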

**Kiffer:** Right. Ragin's Q C A distinguishes three solutions. The conservative solution uses no remainders at all. The parsimonious solution treats remainders as freely available for simplification. And the intermediate solution allows simplification only with remainders consistent with theoretical expectations. The three solutions are reported alongside one another.

**Sarah:** And then Boolean minimization is the engine that turns the truth table into a compact expression.

**Kiffer:** Right. The procedure works through pairwise comparison and the application of one logical rule. If two configurations differ on exactly one condition and produce the same outcome, that condition is irrelevant to the outcome in the presence of the other shared conditions, and the two configurations can be combined into a single shorter expression. This is the Quine-McCluskey algorithm in disguise. The same algorithm electrical engineers use to minimize digital logic circuits.

**Sarah:** And the output is something like, existential loneliness occurs when, paren, one is bereaved, lives alone, and has no caregiving role, close paren, OR, paren, one lives alone, is an immigrant, and has no caregiving role, close paren. Two paths, and both require the absence of a current caregiving role.

**Kiffer:** Right. Three things to notice about that result. First, it's a Boolean expression, not a regression coefficient. Second, it identifies multiple sufficient paths. That's equifinality, the feature that distinguishes Q C A most sharply from regression. And third, it distinguishes present from absent for each condition. The absence of caregiving is itself a part of the sufficient configuration, not a missing variable.
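
The pairwise-combination rule can be sketched as follows. This is a simplified illustration under stated assumptions: the three outcome-positive configurations are hypothetical, chosen so that the result has the two-path shape just described. Unobserved rows are not used, which matches the conservative solution. The code finds prime implicants only and skips the final cover-selection step of the full Quine-McCluskey algorithm. The function names `merge`, `prime_implicants`, and `label` are invented here.

```python
from itertools import combinations

CONDITIONS = "BLIC"  # bereaved, lives alone, immigrant, caregiving

# Hypothetical configurations (as strings over B, L, I, C) that show the outcome.
positive_rows = ["1100", "1110", "0110"]


def merge(a, b):
    """Combine two terms that differ on exactly one condition; otherwise return None."""
    diff = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    if len(diff) == 1 and a[diff[0]] != "-" and b[diff[0]] != "-":
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]  # "-" marks a condition that dropped out
    return None


def prime_implicants(terms):
    """Repeat pairwise merging until no two terms can be combined."""
    terms = set(terms)
    primes = set()
    while terms:
        merged, used = set(), set()
        for a, b in combinations(sorted(terms), 2):
            m = merge(a, b)
            if m is not None:
                merged.add(m)
                used.update((a, b))
        primes |= terms - used  # terms that could not be shortened further
        terms = merged
    return primes


def label(term):
    """Render a term such as '11-0' as 'B*L*~C' (~ means the condition is absent)."""
    parts = []
    for name, value in zip(CONDITIONS, term):
        if value == "1":
            parts.append(name)
        elif value == "0":
            parts.append("~" + name)
    return "*".join(parts)


solution = " + ".join(sorted(label(t) for t in prime_implicants(positive_rows)))
print(solution)  # B*L*~C + L*I*~C
```

In the output, the immigrant condition drops out of the first path and bereavement drops out of the second, while the absence of caregiving stays in both.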

**Sarah:** Q C A separates the analysis of necessary conditions from the analysis of sufficient conditions, and uses different metrics for each.

**Kiffer:** Right. Necessity asks, is condition X present in every case where Y occurs? Sufficiency asks, does every case where X is present produce Y? In set-theoretic terms, necessity is set-superset. The set of cases with X contains the set of cases with Y. Sufficiency is set-subset. The set of cases with X is contained in the set of cases with Y.
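
In Python, the two set relations map directly onto the subset operator. The case identifiers below are made up for illustration, and the function names are hypothetical.

```python
# Hypothetical case identifiers, for illustration only.
cases_with_x = {"c1", "c2", "c3", "c4"}  # cases where condition X is present
cases_with_y = {"c1", "c2"}              # cases where outcome Y occurs


def is_necessary(x_cases, y_cases):
    """X is necessary for Y when every Y case is also an X case (Y is a subset of X)."""
    return y_cases <= x_cases


def is_sufficient(x_cases, y_cases):
    """X is sufficient for Y when every X case is also a Y case (X is a subset of Y)."""
    return x_cases <= y_cases


print(is_necessary(cases_with_x, cases_with_y))   # True
print(is_sufficient(cases_with_x, cases_with_y))  # False
```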

**Sarah:** And the metrics Q C A uses are consistency and coverage. Walk me through them.

**Kiffer:** Consistency is the fraction of cases with the configuration that produce the outcome. Analogous to positive predictive value in epidemiology. Coverage is the fraction of cases with the outcome that are explained by the configuration. Analogous to sensitivity. The conventional thresholds in Ragin are consistency at point eight zero or above for accepting a sufficiency claim, and consistency at point nine zero for necessity. For necessity, the consistency fraction runs the other way. It's the fraction of cases with the outcome that show the condition.
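
For crisp sets, both sufficiency metrics reduce to simple counts. A minimal sketch with invented case data follows. `consistency_coverage` is a hypothetical helper and assumes at least one case has the configuration and at least one has the outcome.

```python
def consistency_coverage(cases):
    """cases: list of (has_configuration, has_outcome) boolean pairs.

    Returns (consistency, coverage) for the claim that the configuration
    is sufficient for the outcome.
    """
    both = sum(1 for x, y in cases if x and y)
    n_config = sum(1 for x, _ in cases if x)
    n_outcome = sum(1 for _, y in cases if y)
    return both / n_config, both / n_outcome


# Hypothetical data: 5 cases show the configuration, 4 of them with the outcome;
# 2 further cases show the outcome without the configuration.
example = ([(True, True)] * 4 + [(True, False)]
           + [(False, True)] * 2 + [(False, False)] * 3)

consistency, coverage = consistency_coverage(example)
print(round(consistency, 2), round(coverage, 2))  # 0.8 0.67
```

Here the configuration just meets the point eight zero sufficiency threshold while explaining two thirds of the outcome cases.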

**Sarah:** Then there's crisp-set Q C A versus fuzzy-set Q C A.

**Kiffer:** Crisp-set Q C A codes each case as one or zero on each condition. That's fine when the conditions are dichotomous in the world. Alive-dead, vaccinated-unvaccinated, bereaved within five years or not. Fuzzy-set Q C A generalizes the binary coding to a continuous degree-of-membership score between zero and one. A case with income of forty-five thousand dollars might score point four on the high-income set. A case at two hundred thousand scores point nine five. The Boolean operations have set-theoretic analogues for fuzzy sets. Minimum for AND, maximum for OR, one-minus-X for NOT. The logic still applies, but the consistency and coverage metrics are computed from membership scores rather than from case counts.
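
A small sketch of the fuzzy-set operations and metrics. The consistency and coverage formulas are the sum-of-minimums definitions commonly used in the fuzzy-set literature, which the conversation does not spell out, so treat them as an assumption about what "computed from membership scores" means. The membership scores are invented, and all function names are hypothetical.

```python
def fuzzy_and(*memberships):
    """Membership in an intersection is the minimum of the component scores."""
    return min(memberships)


def fuzzy_or(*memberships):
    """Membership in a union is the maximum of the component scores."""
    return max(memberships)


def fuzzy_not(membership):
    """Membership in the negated set is one minus the score."""
    return 1 - membership


def fuzzy_consistency(xs, ys):
    """Degree to which membership in X is a subset of membership in Y."""
    return sum(min(x, y) for x, y in zip(xs, ys)) / sum(xs)


def fuzzy_coverage(xs, ys):
    """Share of total outcome membership accounted for by X."""
    return sum(min(x, y) for x, y in zip(xs, ys)) / sum(ys)


# Hypothetical membership scores for three cases.
high_income = [0.4, 0.95, 0.2]  # condition X
outcome = [0.6, 0.9, 0.1]       # outcome Y

print(fuzzy_and(0.4, 0.95), fuzzy_or(0.4, 0.95))
print(fuzzy_consistency(high_income, outcome))
print(fuzzy_coverage(high_income, outcome))
```

When every score is exactly zero or one, these formulas give the same fractions as the crisp-set counts.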

**Sarah:** And for a twenty-transcript capstone, crisp-set Q C A is the appropriate starting point.

**Kiffer:** Right. Twenty transcripts is small enough that the dichotomies are defensible and fuzzy-set calibration would introduce more measurement uncertainty than it removes. The R package called Q C A by Adrian Dusa is the most mature implementation. Free, well documented, produces outputs publishable in Sociological Methods and Research, Implementation Science, or Social Science and Medicine.

**Sarah:** Let me ask the question students always ask. How does Q C A relate to the regression-based causal inference apparatus they've learned in earlier epidemiology coursework?

**Kiffer:** Good question. Q C A does causal inference, but in a different mode. Where standard epidemiology asks, what is the effect of X on Y holding Z constant, Q C A asks, what configurations of conditions, say X, Z, and W, are sufficient for Y? The first question presumes that the causal structure is additive in expectation and that the effects are estimable. The second presumes that the causal structure is conjunctural and that the effects of variables depend on their configurations. Neither is universally right. Both are tools that should be in the methodologically omnivorous public-health researcher's kit.

**Sarah:** When does Q C A outperform regression?

**Kiffer:** Small N, ten to fifty cases. The regression coefficients have too little statistical power to be meaningful, but the truth table can still be built and minimized, with empty rows handled explicitly as logical remainders. Strong conjunctural causation. When the effect of X depends critically on Z, a regression with interactions can in principle catch it, but only with prior specification. Q C A finds it without prior specification. Equifinality. When multiple distinct configurations produce the same outcome, Q C A reports all of them. Regression collapses them into an average effect. And asymmetric causation. When the conditions producing the presence of the outcome differ from those producing its absence, Q C A can be run separately on Y and not-Y. Regression assumes the same coefficients apply to both.

**Sarah:** And when does regression outperform Q C A?

**Kiffer:** Large N with continuous outcomes and a known causal structure. Regression's efficiency is unbeatable. Effect-size estimation. Q C A gives sufficiency claims. It doesn't give effect sizes that can be aggregated across studies in a meta-analysis. And counterfactual reasoning over single variables. The potential-outcomes framework operates at the level of individual variables. Q C A at the level of configurations.

**Sarah:** And the framing point for the capstone methods section.

**Kiffer:** Right. If you choose Q C A for week eleven, the methods section should explicitly state that you're doing case-oriented configurational analysis rather than variable-oriented effect estimation. Otherwise readers will read your Q C A results as regression-with-interactions-and-a-tiny-sample and dismiss them. The right framing is, Q C A is the appropriate method when the causal structure is conjunctural and the N is too small for regression. Both conditions hold here. Cite Ragin two thousand eight and Schneider and Wagemann twenty twelve.

**Sarah:** Okay. The third method. Ethnographic decision modeling. Christina Gladwin.

**Kiffer:** Right. Gladwin's nineteen eighty-nine book Ethnographic Decision Tree Modeling. She took the cognitive-anthropology tradition and operationalized it for individual-level decisions. The foundational claim is that human decisions are not the inscrutable outputs of black-box psychology but the products of articulable choice rules that can be elicited, formalized, and tested for predictive accuracy on out-of-sample cases.

**Sarah:** And the appeal for public-health researchers is concrete.

**Kiffer:** Yeah. Decision-tree models predict behavior at the individual level with the kind of mechanistic specificity that variable-oriented regression can't match. The trees are interpretable in a way logistic-regression coefficients aren't. They say, in essentially plain language, what people do under what conditions. When the goal is intervention design, the tree is more useful than the regression because it identifies the specific decision nodes where an intervention could plausibly change the outcome.

**Sarah:** Walk us through Gladwin's procedure. Three phases.

**Kiffer:** Phase one, elicitation. Begin with semi-structured individual interviews. The interview centers on hypothetical cases. What would you do if? The cases are designed to probe specific decision dimensions one at a time, holding others constant, varying the dimension of interest, and asking the participant to articulate the threshold at which their answer changes. For example, suppose you have had a cough for three weeks and it's interfering with your sleep. Would you see a doctor? Suppose the nearest clinic is ninety minutes away and you don't have a car. Would you go? Suppose transportation is fine but you don't have a family doctor.

**Sarah:** Phase two, formalization.

**Kiffer:** Cross-case patterning identifies the recurring rules. If most participants say they'll seek care when cough duration is at least two weeks AND transportation is available AND a clinical relationship exists, this is a candidate rule. The analyst formalizes the rule as a sequence of yes-no questions, with branches leading to terminal classifications. Seek care, do not seek care, seek alternative. The most discriminating question goes at the root. Within each branch, the next-most-discriminating question follows. The tree continues until every case in the building set is classified.

**Sarah:** Phase three, testing.

**Kiffer:** The model is tested on a held-out set of cases. Either new participants interviewed about new hypotheticals, or real cases observed in the field. The performance metric is the proportion of cases the tree classifies correctly. Gladwin's convention is that a defensible tree should classify at least eighty to eighty-five percent of out-of-sample cases correctly. Below that, the model is revised.

**Sarah:** And out-of-sample testing is critical. A decision tree built to fit a small set of cases will always achieve high in-sample accuracy. You can keep adding branches until every case is uniquely classified. The methodological discipline is to hold out cases the tree didn't see during construction.

**Kiffer:** Right. Predictive accuracy on those held-out cases is the operational test of whether the tree captures decision rules or merely overfits noise. The iteration continues until the out-of-sample accuracy threshold is met.

**Sarah:** Let's apply it to the loneliness corpus. The focused question is, do participants describe reaching out to their existing personal network when their loneliness becomes severe, or do they describe withdrawing further?

**Kiffer:** Right. Two terminal classifications. Reach out, and withdraw. Reading through the transcripts, several candidate decision dimensions emerge. Does the participant have an existing relationship they describe as close or safe? Has the participant tried reaching out before and felt rebuffed or burdensome? Does the participant frame loneliness as something other people would understand, or as something they'd judge? Is the participant currently in a life-stage where reaching out is socially normal? Does the participant have language to name the loneliness?

**Sarah:** A first-pass tree on twenty transcripts, using twelve for building and eight for testing. Question one. Does the participant have at least one relationship they describe as close or safe? No leads to withdraw. Yes leads to question two. Question two. Has the participant tried reaching out before and felt rebuffed? Yes leads to withdraw. No leads to question three. And so on.

**Kiffer:** Right. Apply the tree to the eight held-out transcripts and count correct classifications. If it gets seven out of eight, eighty-eight percent. Meets Gladwin's threshold. If it gets five out of eight, sixty-three percent. Needs revision. The revision examines the three misclassifications, identifies what feature of those cases the tree is missing, and adds or modifies a branch.
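
The hold-out test can be sketched as below. Only the first two questions of the first-pass tree are encoded. The conversation continues to a third question that is not spelled out, so stopping at "reach out" after question two is a simplifying assumption. The eight held-out cases are invented, and `classify`, `accuracy`, and the field names are hypothetical.

```python
def classify(case):
    """First two questions of the first-pass tree.

    ASSUMPTION: the tree described in the conversation goes on to a third
    question; this sketch stops early and returns "reach out" instead.
    """
    if not case["close_relationship"]:  # Question one
        return "withdraw"
    if case["rebuffed_before"]:         # Question two
        return "withdraw"
    return "reach out"


# Hypothetical held-out transcripts: (close_relationship, rebuffed_before, observed).
held_out = [
    {"close_relationship": c, "rebuffed_before": r, "observed": o}
    for c, r, o in [
        (True,  False, "reach out"),
        (True,  False, "reach out"),
        (False, False, "withdraw"),
        (False, True,  "withdraw"),
        (True,  True,  "withdraw"),
        (True,  True,  "withdraw"),
        (True,  False, "withdraw"),  # the tree predicts "reach out" here
        (False, False, "withdraw"),
    ]
]


def accuracy(cases):
    """Proportion of cases whose observed outcome matches the tree's classification."""
    correct = sum(classify(case) == case["observed"] for case in cases)
    return correct / len(cases)


misclassified = [i for i, case in enumerate(held_out)
                 if classify(case) != case["observed"]]

print(accuracy(held_out))  # 0.875
print(misclassified)       # [6]
```

The list of misclassified cases is the starting point for revision, since those are the transcripts to reread for the feature the tree is missing.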

**Sarah:** And the intervention-design payoff is what makes decision-tree modeling continue to have a place in implementation science.

**Kiffer:** Yeah. The tree identifies where to intervene. A logistic regression of vaccine acceptance on predictors tells you that trust-in-clinician is correlated with acceptance. The regression coefficient doesn't tell you how to translate that into program design. A decision tree tells you that for parents who lack a stable clinician relationship, the decision is determined upstream by something else. Cost, friend recommendations, online content. And the intervention should target that upstream node. The tree is mechanistic in a way the regression isn't.

**Sarah:** And decision-tree modeling has been applied broadly. Treatment-seeking, vaccination, cancer screening, medication adherence, contraceptive choice, help-seeking for mental health. The common feature is a sequential decision with a small number of decision nodes.

**Kiffer:** Right. And for the loneliness capstone, decision-tree modeling is a natural fit for the question of whether participants reach out or withdraw, because that's a decision the participants explicitly describe making.

**Sarah:** Let's land the practical piece. The week eleven milestone gives students a choice. Option A is a Q C A truth table with Boolean minimization. Option B is a Gladwin-style decision tree. Both applied to a focused outcome in the corpus, with a seven hundred to nine hundred word interpretive memo.

**Kiffer:** Right. And there are principled grounds for choosing. Choose Q C A if you want to identify sufficient combinations of conditions for an outcome that's a stable case-level attribute. For example, what combinations of bereavement, living arrangements, immigration, and caregiving role produce existential versus situational loneliness? Choose a decision tree if you want to describe the rule-following structure of a decision the participant explicitly describes making. Reach out versus withdraw, for example.

**Sarah:** And if students can't decide between the two.

**Kiffer:** Write down the outcome and one paragraph naming whether it's an attribute or a decision. The exercise usually resolves which option fits. If still uncertain, choose Q C A. It produces a more compact deliverable and the Boolean-minimization output is more readily understood by an epidemiology audience.

**Sarah:** Okay. Let's pull this together. Seven takeaways.

**Kiffer:** Sure. First, all three methods today are case-oriented in a strong sense. The unit is the case. The analytic move is comparison across cases. The goal is defensible causal or quasi-causal inference from a small number of cases. These are the methods that do work standard regression doesn't do well.

**Sarah:** Second, analytic induction, from Znaniecki in nineteen thirty-four through Lindesmith and Cressey, targets necessary-and-sufficient conditions. A single counter-case forces revision of either the hypothesis or the phenomenon definition. The negative case is the engine.

**Kiffer:** Third, Robinson's nineteen fifty-one critique is that instance-only sampling can establish necessity but not sufficiency, and that the revision step is unconstrained. The contemporary response is to sample for non-instances as well as instances, to pre-register revision rules, and to treat analytic induction as hypothesis-generation rather than logical proof.

**Sarah:** Fourth, Q C A, from Ragin in nineteen eighty-seven, handles conjunctural causation, equifinality, causal asymmetry, and limited diversity. The truth table is the fundamental data object. Boolean minimization produces a compact sufficiency formula. Crisp-set is appropriate for small-N qualitative datasets. Fuzzy-set generalizes to continuous degree-of-membership.

**Kiffer:** Fifth, Q C A separates necessity from sufficiency and uses consistency and coverage as its evidentiary metrics. Consistency at point eight zero or above for sufficiency. Point nine zero for necessity. These map roughly to positive predictive value and sensitivity in epidemiology terms.

**Sarah:** Sixth, ethnographic decision modeling, from Gladwin in nineteen eighty-nine, builds decision trees through elicitation, formalization, and out-of-sample testing. A defensible tree classifies at least eighty to eighty-five percent of held-out cases. The tree is mechanistic and intervention-design-relevant in ways regression coefficients aren't.

**Kiffer:** And seventh, the methodological framing for the capstone methods section matters. If you choose Q C A or analytic induction, state explicitly that you're doing case-oriented configurational analysis or instance-by-instance hypothesis testing, not variable-oriented effect estimation. Cite Ragin two thousand eight and Schneider and Wagemann twenty twelve for Q C A. Cite Lindesmith, Cressey, and Bernard, Wutich, and Ryan for analytic induction. Cite Gladwin for decision trees.

**Sarah:** And one meta-point. The three methods together expand what counts as a qualitative result. From thematic description to formal causal-claim analysis. The Boolean expression a Q C A produces, or the decision tree a Gladwin analysis produces, is recognizable to an epidemiology audience as a result in a way a code list isn't. That's a real expansion of the qualitative researcher's communicative range.

**Kiffer:** Yeah. And it matters for grant writing and for talking to colleagues in adjacent disciplines. The same loneliness dataset can produce a thematic finding, a content-analytic distribution, a grounded-theory model, a narrative typology, a discourse-analytic reading, AND a Q C A sufficiency formula. Methodological omnivory at its best.

**Sarah:** That's a good place to land. Next time, our final lesson. Computational text and large language model analysis. Where we extend everything we've done to the scale of millions of documents and confront the methodological status of L L M-based coders in qualitative research.

**Kiffer:** Right. And we'll look back across the whole arc of the course at that point.

**Sarah:** Thanks for joining us today.

**Kiffer:** Take care everyone. One more to go.
