# Lesson 8 — Content Analysis (v3 expanded)

*Companion-podcast transcript • Sarah & Kiffer*  
*~5000 words • ~27.8 min audio*

---

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today we're on Lesson eight of our qualitative methods course, Content Analysis. And this is probably the most consequential pivot in this course methodologically, because content analysis is where the qualitative and quantitative traditions stop being separate enterprises and start being two halves of the same method.

**Sarah:** Right. Last week we walked through grounded theory and matrix displays, which are deeply qualitative engines for systematic comparison. Today we're going to keep that comparison spirit but make a specific operational move that pulls the entire apparatus of descriptive and inferential statistics into the qualitative analyst's toolkit.

**Kiffer:** Exactly. The single move is treating codes as variables. Once you make that move, a chi-squared test on a contingency table of codes by subgroup becomes a legitimate analytic step, and a Krippendorff's alpha statistic becomes the reliability standard. We'll spend our time on history first, then design, then the inferential half, then how this lands in the week eight capstone milestone.

**Sarah:** And before we dive in, I want to flag that this lesson is methodologically charged in a way some students don't expect. There are qualitative researchers who view content analysis with suspicion because of how easily it can be done badly, and there are quantitative researchers who view it with suspicion because they don't believe qualitative codes are stable enough to be variables. Both critiques have force, and the way to answer them is the procedural rigor we'll walk through today.

**Kiffer:** That's a really good framing. The history of content analysis is littered with frequency tables built on inconsistent coding of poorly specified units sampled from non-representative corpora. The remedy is procedural. You commit, in writing, to a sampling rule, a unit-definition rule, a coding scheme, and a reliability target before you begin coding. You then execute the procedure transparently and let the reader audit it.

**Sarah:** Okay. Let's start with the history, because content analysis is the oldest systematic method for analyzing text in the social sciences and the lineage really matters.

**Kiffer:** It does. The modern history starts with Harold Lasswell, the political scientist who published Propaganda Technique in the World War in nineteen twenty-seven. Lasswell's question wasn't what does this propaganda mean to its readers, but what categories of appeal does it use and in what proportion? The methodological commitment was to systematic counting of explicit features rather than impressionistic close reading. The payoff was that two analysts working independently could produce comparable numbers.

**Sarah:** And then during the Second World War, Lasswell led the Experimental Division for the Study of War-Time Communications at the U S Library of Congress. That's where the method got its first major public-policy validation.

**Kiffer:** Right. They analyzed German, Italian, and Japanese propaganda in volumes that no individual reader could have made sense of impressionistically. By systematically coding explicit features (the frequency of mentions of specific Allied generals, the proportion of broadcast time devoted to economic versus military themes, the rise and fall of named enemies week by week), the Division produced intelligence assessments that helped guide both counter-propaganda efforts and broader policy decisions. The work was content analysis as inference about a producer: from the text, back to the propagandist's strategic state of mind.

**Sarah:** It's a strange origin story for a method that now dominates health communications research, isn't it? Wartime intelligence.

**Kiffer:** Yeah, it really is. And I think it's worth dwelling on for a moment because the original Lasswellian move, inferring something about the source from systematic features of the text, is methodologically very ambitious. Most contemporary content analyses are doing something narrower. They're describing the text rather than reverse-engineering the producer's intent. But the procedural backbone, systematic counting of carefully defined categories, comes from Lasswell.

**Sarah:** And the post-war codification belongs to Bernard Berelson.

**Kiffer:** Yeah. His nineteen fifty-two book Content Analysis in Communication Research gave the field its first textbook and its most-cited definition. Content analysis is, quote, a research technique for the objective, systematic, and quantitative description of the manifest content of communication. Each of the four adjectives matters. Objective meant repeatable. Systematic meant the same procedure across the entire sample. Quantitative meant counting, not impression. Manifest meant the literal surface of the text, not what the analyst imagined the author meant.

**Sarah:** And Berelson's stance was, in effect, the high-modernist version of content analysis. Positivist, quantitative, and resolutely uninterested in latent meaning.

**Kiffer:** Exactly. And that's the position the field has been arguing with for the last seventy years. The next major recodification came from Klaus Krippendorff, whose book Content Analysis: An Introduction to Its Methodology, first published in nineteen eighty and now in its fourth edition in twenty eighteen, brought the method into the contemporary mainstream. Krippendorff's two most important moves. First, he redefined content analysis as a research technique for making replicable and valid inferences from texts to the contexts of their use. Broadening Berelson's objective description to inference about context made room for latent content. And second, he developed Krippendorff's alpha, which is now the field's reliability standard.

**Sarah:** And then in health research specifically, Hsieh and Shannon in two thousand five articulated a typology that's structured most qualitative-content-analytic work since. Conventional, directed, and summative content analysis.

**Kiffer:** Right. Conventional is inductive. Codes emerge from close reading of the corpus. Directed is deductive. Codes are drawn from theory or prior literature and applied to the data. Summative starts from word counts and works upward to latent meaning. Most contemporary work is hybrid, with an a priori scaffold plus emergent expansion.

**Sarah:** And the contemporary synthesis, the one our textbook adopts, is that the strict separation Berelson drew between qualitative and quantitative analysis was always more rhetorical than real, and that contemporary content analysis is best understood as a method that blends the two.

**Kiffer:** Right. The Bernard-Wutich-Ryan stance, which is our stance, is that you don't have to disavow either tradition. Qualitative judgment determines the coding scheme. Quantitative procedures determine how the codes are distributed and whether differences are reliable. Both halves are doing real work.

**Sarah:** Okay. Now the methodological hinge. Manifest versus latent content. This is the single most-asked question about content analysis.

**Kiffer:** Yeah. The textbook's answer, and the field's answer, is that the choice is yours but it must be explicit. Manifest content is the literal, surface-level content of the text. If your code is mentions of pets, a manifest coder counts every appearance of the words pet, dog, cat, Rufus, parrot. The advantage is that two manifest coders working from a clear word-list will produce nearly identical counts. The disadvantage is that manifest content misses everything the participant is doing with the text other than literal naming.

**Sarah:** And latent content is the underlying meaning that requires interpretation to surface.

**Kiffer:** Right. If your code is loneliness framed as the cost of having loved, a latent coder reads a passage and decides whether the participant is articulating the idea, regardless of whether the specific words cost or love appear. In our running loneliness corpus, Linda's account of her late husband Bill's empty chair is a latent expression of loneliness-as-residue-of-marriage. She never uses that phrase, but the meaning is unmistakable to a competent reader.

**Sarah:** Let me make this concrete with a comparison. Consider the question, how often do participants describe loneliness in terms that involve a piece of household furniture? The manifest version is, count every occurrence of the words chair, couch, sofa, bed, table across all twenty transcripts. Straightforward, replicable, and largely uninformative on its own.

**Kiffer:** Right. The latent version codes any passage in which a piece of furniture stands in for an absent person or a lost role. Linda's chair is the empty space Bill used to fill. Helen's armchair is the place where she doesn't have anyone to read with. The latent code furniture-as-trace-of-absent-other might apply to nine transcripts versus eleven for the manifest word-list, but the latent code is doing meaningful analytic work while the manifest list mostly isn't. In a real analysis you'd often want both. The manifest list as a fast first pass and the latent code as the substantive analytic move.
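
The manifest first pass described here can be sketched in a few lines of base R. This is a minimal illustration, assuming a made-up three-excerpt corpus (the participant IDs and sentences are invented, not course data) and the five-word furniture list from the example.

```r
# Hypothetical mini-corpus: three invented transcript excerpts, not course data.
transcripts <- c(
  P01 = "His chair is still by the window. I never moved the chair.",
  P02 = "I mostly sit on the couch with the radio on.",
  P03 = "My daughter calls on Sundays."
)

# Manifest word-list for the furniture question.
furniture <- c("chair", "couch", "sofa", "bed", "table")

# Whole-word, case-insensitive pattern: \b(chair|couch|sofa|bed|table)\b
# Note: as written this does not match plurals such as "chairs".
pattern <- paste0("\\b(", paste(furniture, collapse = "|"), ")\\b")

# Count word-list matches in each transcript.
hits <- vapply(
  transcripts,
  function(txt) {
    m <- gregexpr(pattern, txt, ignore.case = TRUE, perl = TRUE)[[1]]
    sum(m > 0)  # gregexpr returns -1 when nothing matches
  },
  integer(1)
)

hits                              # matches per transcript
n_with_furniture <- sum(hits > 0) # transcripts with at least one match
```

The count says nothing about whether the chair stands in for an absent person. That judgment is the latent code, and it still needs a human coder.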

**Sarah:** And the contemporary compromise is that content analysis routinely codes both, but the analyst must declare which is which and must demonstrate that latent codes can be applied reliably.

**Kiffer:** That's the move. And the reliability targets are different. Manifest coding with a clear word-list will typically hit Krippendorff's alpha of point eight zero or higher easily. Latent codes in the point six seven to point eight zero range are acceptable for tentative inference. At or above point eight zero, you can make definitive claims.

**Sarah:** Now the operational move that defines the method. Codes as variables. Walk me through why this is so transformative.

**Kiffer:** Yeah. In thematic analysis, which we did earlier in the course, a code is a label you attach to a passage. The code's job is to organize interpretation. In content analysis, the same code becomes a variable. A column in a data frame, with a value for every unit in the corpus. Once codes are variables, the entire apparatus of descriptive and inferential statistics is available.

**Sarah:** So consider the code loneliness-as-residue-of-marriage. In thematic analysis, it's a finding. This is one of the kinds of loneliness participants describe. In content analysis, it becomes a variable that takes the value one for every transcript in which the theme appears and zero elsewhere.

**Kiffer:** Right. Now you can ask, in what proportion of transcripts does the theme appear? Does that proportion differ between widowed and non-widowed participants? Is the difference larger than chance? Does the proportion grow over the course of the interview? Each of these is a statistical question that the codes-as-variables move makes available.

**Sarah:** And the textbook is explicit that this is what places content analysis in the hybrid position it occupies. It's not qualitative analysis with numbers attached, which is what Berelson thought it was, and it's not a quantitative method applied to text. It's a method in which qualitative judgment is required to define the variables and quantitative procedures are used to analyze them. Both halves are necessary.

**Kiffer:** Right. The way I sometimes put it to students is that the qualitative half does the work of saying what's worth counting, and the quantitative half does the work of checking whether the patterns are larger than noise. Take either half away and you've got a less defensible analysis.

**Sarah:** Quick question before we move on. When is content analysis the right tool? Not every qualitative question calls for it.

**Kiffer:** Good question. The distinctive payoff is in questions that involve distributional comparison. How often something appears, whether subgroups differ in their use of a code, whether something changes over time. Caregivers versus non-caregivers. Younger versus older. Pre-pandemic versus post-pandemic. Where the question is genuinely interpretive, what does this single passage mean, or how does this participant make sense of their experience, you want thematic analysis, which we did earlier, or the schema and narrative analysis we'll cover next time.

**Sarah:** And content analysis scales in a way other qualitative methods don't. Twenty transcripts is on the lower end. For corpora of fifty, two hundred, or five thousand documents, content analysis is one of the few options.

**Kiffer:** Right. And it's the right tool when you're writing for an audience that wants the kind of distributional evidence content analysis produces. Public-health journals routinely ask for how many rather than how rich. Where thematic analysis would be rejected as too impressionistic, content analysis can be persuasive.

**Sarah:** Okay. Let's move from what content analysis is to how you design one. The lesson breaks design into three decisions in order. Sampling, units, and reliability.

**Kiffer:** Right. And Krippendorff's most enduring methodological contribution after the alpha statistic is the distinction between three levels of unit in any content-analytic design. Sampling units are the chunks of text drawn from the population. In our twenty-transcript loneliness corpus, each transcript is a sampling unit. Recording units are what you actually code. The recording unit is where the variable takes its value. Could be the whole transcript, could be the paragraph, could be the sentence. And context units are the surrounding text the coder is permitted to consult when deciding how to code a recording unit.

**Sarah:** And the choice of recording unit shapes both what the analysis can show and how reliable it will be.

**Kiffer:** Yeah. A whole-transcript recording unit lets you say of the twenty participants, how many invoke the loneliness-as-residue-of-marriage code at any point. But it can't tell you about intensity or within-transcript distribution. A sentence-level recording unit can tell you what proportion of Linda's sentences contain spatial metaphors for absence, but the coding burden is enormous and reliability on latent codes drops. For a twenty-transcript capstone, the recommended choice is transcript-level recording units with paragraph-level context. It gives you a clean codes-by-participants matrix that supports chi-squared comparisons across subgroups.

**Sarah:** And then the coding scheme. The codebook. Three properties matter more in content analysis than they did in thematic analysis.

**Kiffer:** Right. First, exhaustive coverage. Every recording unit can be assigned at least one code or the explicit not-coded residual. Second, clear operational definitions. Every code has a brief substantive description, an inclusion rule, what counts, and an exclusion rule, what doesn't count. Mentions of loneliness is not a definition. Any statement in which the participant attributes loneliness to a specific event or relationship, excluding generic statements about loneliness in society at large, that's a definition you can apply reliably.

**Sarah:** Before we get to the third one, a side question. Should codes be mutually exclusive? Berelson said yes. The contemporary field says not necessarily.

**Kiffer:** Right. Berelson's classical position was that codes should be mutually exclusive on the grounds that overlapping codes make the resulting frequencies hard to interpret. Krippendorff and the contemporary field reject the mutual-exclusivity requirement. A passage can simultaneously instantiate loneliness-as-existential-fact and loneliness-coped-with-by-pet-companionship, and there's no good reason to force the coder to choose. The modern compromise is multi-coded passages are permitted if the codebook says so and the multi-coding is consistent.

**Sarah:** And third, the a priori versus emergent question.

**Kiffer:** Yeah. A priori codes come from the literature, your conceptual framework, your interview guide. Emergent codes come from the data. The contemporary hybrid workflow uses an a priori scaffold of five to seven codes you already have from your earlier coding work, then allows up to two emergent codes if the data demand a category that can't be accommodated. The discipline is to document each emergent code with the same care as the a priori ones.

**Sarah:** Now reliability, which is the load-bearing part of any content analysis. Walk me through Krippendorff's alpha.

**Kiffer:** Alpha has three properties that recommend it over the older Cohen's kappa. It accommodates any number of coders, not just two. It accommodates any level of measurement, nominal through ratio. And it handles missing data gracefully. The arithmetic compares the observed disagreement among coders to the disagreement expected by chance, and produces a coefficient that runs from one for perfect agreement, through zero for chance, to negative values for worse than chance.
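
For the simplest case (two coders, a binary present/absent code, no missing ratings), the arithmetic can be written out. The symbols and the ten-unit example below are illustrative assumptions, not figures from the lesson.

$$
\alpha = 1 - \frac{D_o}{D_e}, \qquad
D_o = \frac{2d}{n}, \qquad
D_e = \frac{2\,n_0\,n_1}{n(n-1)}
\quad\Longrightarrow\quad
\alpha = 1 - \frac{(n-1)\,d}{n_0\,n_1}
$$

Here $N$ is the number of units both coders rated, $n = 2N$ is the total number of ratings, $d$ is the number of units on which the coders disagree, and $n_0$ and $n_1$ are the counts of 0s and 1s pooled across both coders. With $N = 10$ units, $d = 2$ disagreements, and each coder marking the code present in four units ($n_1 = 8$, $n_0 = 12$):

$$
\alpha = 1 - \frac{19 \times 2}{12 \times 8} = 1 - \frac{38}{96} \approx 0.60
$$

Raw agreement is 80 percent, but alpha lands near 0.60 because the code is absent in most units, so a fair amount of that agreement is expected by chance.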

**Sarah:** And the conventional thresholds.

**Kiffer:** Alpha at point eight zero or above is strong agreement, acceptable for definitive claims. Point six seven to point eight zero is acceptable for tentative inference. Below point six seven is inadequate. The codebook needs revision. Some health-research applications adopt a stricter point seven zero floor. Whatever you adopt, declare it in your methods section before you compute it.

**Sarah:** And the procedure is to compute reliability on a ten to twenty percent subset of the corpus, with two coders working independently. The remaining eighty to ninety percent is then coded by one coder alone, with the assurance that the reliability of the system is documented.

**Kiffer:** Right. And when reliability comes back low, it's a diagnosis, not the end of the analysis. The cause is almost always one of four things. A code definition that's too vague. A code that covers too much and needs to be split. Insufficient coder training. Or, occasionally, a code that's genuinely contested in the corpus, which is itself a finding. Train, code a subset, compute alpha, revise the codebook, re-train, re-code. Two cycles is typical. Three is not unusual.

**Sarah:** Okay. Once you've got a coded matrix with acceptable reliability, the analysis begins. Let's walk through the inferential half. Frequency tables first.

**Kiffer:** Every content-analytic study starts with descriptive frequencies. A first-pass table shows each code, the count of transcripts in which it appears, and the percentage. Even before you do any comparison, the table is doing analytic work. In our worked loneliness corpus, the most prevalent code might be technology-as-double-edged, appearing in fourteen of twenty transcripts. That tells you ambivalence about phones and social media is a widespread feature of the loneliness narratives in this corpus, present in seventy percent of transcripts, though whether it holds regardless of who the participant is takes the subgroup comparison to check. The least prevalent might be cultural-untranslatability, appearing in four transcripts. And those four are predictable. The immigrant participants.
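
A first-pass frequency table of this kind might look like the following base R sketch. The 0/1 matrix is constructed to reproduce the counts mentioned in the episode (fourteen, four, and nine plus four for the shame code). Which participants carry which code is arbitrary here, and the column names are hypothetical.

```r
# Hypothetical codes-by-transcripts matrix: one row per transcript,
# one 0/1 column per code. Row placement of the 1s is arbitrary.
coded <- data.frame(
  participant                = sprintf("P%02d", 1:20),
  technology_double_edged    = rep(c(1, 0), times = c(14, 6)),
  shame_prevents_disclosure  = rep(c(1, 0), times = c(13, 7)),
  cultural_untranslatability = rep(c(1, 0), times = c(4, 16))
)

# Number of transcripts in which each code appears.
n <- colSums(coded[, -1])

# Frequency table: code, count, percentage of transcripts.
freq <- data.frame(
  code          = names(n),
  n_transcripts = unname(n),
  pct           = unname(100 * n / nrow(coded))
)

# Most prevalent code first.
freq <- freq[order(-freq$n_transcripts), ]
rownames(freq) <- NULL
freq
```

The same table can be produced with dplyr. Base R is used here only to keep the sketch dependency-free.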

**Sarah:** And then the comparison move. Cross-tabulation by subgroup.

**Kiffer:** Yeah, this is where content analysis earns its keep. Take the same code distribution and disaggregate it by caregiver status. Ten caregivers, ten non-caregivers. The shame-prevents-disclosure code might appear in nine of ten caregivers but only four of ten non-caregivers. The fading-at-the-edges code might show the reverse pattern, more common in non-caregivers. The technology code might be invariant by subgroup. These patterns are descriptive. The next move is to ask whether they're larger than chance.

**Sarah:** Which is the chi-squared test on codes by subgroup.

**Kiffer:** Right. Applied to a content-analytic frequency table, the chi-squared test asks, is the distribution of this code statistically different between the subgroups? The mechanics are familiar from any quantitative methods class. Here we apply them to qualitative codes treated as variables. For the shame-prevents-disclosure example, nine of ten caregivers versus four of ten non-caregivers, you get an uncorrected chi-squared of about five point five with one degree of freedom, p around point oh two. One thing to know: R's chisq dot test applies Yates' continuity correction to two-by-two tables by default, which brings the statistic down to about three point five and p up to about point oh six.
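
The shame-prevents-disclosure numbers can be checked directly in base R. This sketch uses the counts from the example (nine of ten versus four of ten); the dimension labels are just illustrative names.

```r
# 2x2 table from the running example.
tab <- matrix(
  c(9, 1,
    4, 6),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    group = c("caregiver", "non_caregiver"),
    shame_prevents_disclosure = c("present", "absent")
  )
)

# Pearson chi-squared without continuity correction.
# R warns that the approximation may be incorrect because some
# expected counts are below 5.
uncorrected <- chisq.test(tab, correct = FALSE)

# R's default for 2x2 tables applies Yates' continuity correction.
corrected <- chisq.test(tab)

uncorrected$statistic  # about 5.49
uncorrected$p.value    # about 0.019
corrected$statistic    # about 3.52
corrected$p.value      # about 0.061

# Expected counts under independence: 6.5 and 3.5 in each row.
uncorrected$expected
```

The gap between the corrected and uncorrected p-values is a sign that the sample is too small for the approximation to be comfortable.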

**Sarah:** And practically, the qualitative analyst's contribution comes back in here. The chi-squared test is the warrant for the claim that caregivers in this corpus systematically describe shame as a barrier to disclosing their loneliness. The substantive interpretation, that caregiving identity carries a moral demand of unselfish endurance and admitting loneliness conflicts with that demand, is the qualitative claim.

**Kiffer:** Right. And there's an important caveat for small qualitative corpora. Chi-squared assumes expected cell counts of at least five. In a twenty-transcript study with rare codes, that assumption often fails. The standard remedy is Fisher's exact test, which gives an exact p-value regardless of cell size. In R, fisher dot test is a drop-in replacement for chisq dot test. For a twenty-transcript corpus, Fisher's exact is almost always the more defensible choice.
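
As a minimal sketch of that swap, here is the same nine-of-ten versus four-of-ten table run through `fisher.test`, with an explicit check of the expected-count rule of thumb. The table labels are illustrative.

```r
# Same 2x2 table as the chi-squared example.
tab <- matrix(
  c(9, 1,
    4, 6),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    group = c("caregiver", "non_caregiver"),
    shame_prevents_disclosure = c("present", "absent")
  )
)

# Expected counts under independence: row total * column total / N.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
small_cells <- any(expected < 5)  # TRUE here: two cells expect 3.5

# Fisher's exact test; the default alternative is two-sided.
fisher <- fisher.test(tab)
fisher$p.value  # about 0.057
```

For this table the two-sided exact p-value comes out near 0.057, noticeably larger than the uncorrected chi-squared p of about 0.019. With cells this small, the exact test is the more conservative of the two.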

**Sarah:** And the inferential register matters here. A chi-squared test on a purposive sample of twenty transcripts is not warranting a population-level prevalence claim.

**Kiffer:** Right. It's answering a within-corpus question. Is the distributional difference between caregivers and non-caregivers in this corpus larger than would arise by chance, given the corpus size? That's a legitimate question even for small, non-probabilistic samples. One caution on the numbers. For the nine-of-ten versus four-of-ten example, the two-sided Fisher's exact p is about point oh six, not the point oh two the uncorrected chi-squared gave, so the exact test is the more conservative one here. The methods section should say something like, within this purposive corpus of twenty transcripts, the code appeared in nine of ten caregivers and four of ten non-caregivers, Fisher's exact p equals point oh five seven. The difference is suggestive and warrants further investigation in a larger study, but it cannot itself be generalized. That hedge is doing work.

**Sarah:** And there's also trend analysis, when the corpus has a time dimension.

**Kiffer:** Right. Coverage of a topic across years of newspaper articles. Posts on a forum across the pandemic. The classical example is Pool's nineteen fifty-two analysis of editorial coverage of the Soviet Union across decades. The contemporary example is computational content analyses of social-media discourse before, during, and after specific events. Our loneliness capstone is cross-sectional, so trend analysis isn't the dominant move, but you can do quasi-trend analysis if you've got participants who lived through specific events as adults versus as teenagers, for example. Or you can use it as a check on whether your codebook over-fit early transcripts by comparing the first ten and last ten interviews.

**Sarah:** Let's move to dictionary-based content analysis. This is the manifest extreme of the method.

**Kiffer:** Yeah. Dictionary-based content analysis is the variant in which a pre-specified word list is applied to the corpus by a computer, and the resulting counts are treated as content-analytic variables. The best-known dictionary is L I W C, the Linguistic Inquiry and Word Count dictionary, developed by James Pennebaker and colleagues from the early nineteen nineties. The current version is L I W C twenty-two. It's a curated set of about ninety categories. Positive emotion, negative emotion, anxiety, sadness, body, health, family, social processes, cognitive processes. Run it over a transcript and it returns the percentage of words in each category.

**Sarah:** And it's been used extensively in health-related text analysis. Predicting depression from writing samples, characterizing trauma narratives, predicting suicide risk from social-media posts.

**Kiffer:** Right. It's the methodological ancestor of contemporary sentiment-analysis systems and remains in active use. And then there are the lighter-weight sentiment dictionaries. Bing Liu's lexicon, the AFINN lexicon, the N R C Word-Emotion Association Lexicon. All three are accessible through the R tidytext package, though AFINN and N R C are downloaded on first use through the textdata package, and they can be applied to a transcript corpus in under fifty lines of code.
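
A dictionary pass with tidytext might look like the sketch below. It assumes the dplyr and tidytext packages are installed and uses the Bing lexicon because it ships with tidytext. The two transcript excerpts are invented.

```r
library(dplyr)
library(tidytext)

# Hypothetical excerpts, one row per participant.
transcripts <- data.frame(
  participant = c("P01", "P02"),
  text = c(
    "I feel lonely and sad most evenings, but the dog makes me happy.",
    "The house is quiet. I miss him and I worry about the winter."
  ),
  stringsAsFactors = FALSE
)

sentiment_counts <- transcripts %>%
  unnest_tokens(word, text) %>%                        # one lowercased word per row
  inner_join(get_sentiments("bing"), by = "word") %>%  # keep only lexicon words
  count(participant, sentiment, name = "n_words")      # tally by polarity

sentiment_counts
```

The output is a manifest count of positive and negative lexicon words per participant. Words outside the lexicon are silently dropped, which is one reason dictionary output should not be read as the whole analysis.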

**Sarah:** And the choice between dictionary-based and human-coded is not exclusive.

**Kiffer:** It really isn't. Most rigorous contemporary studies use both. Dictionary methods for the manifest, scale-sensitive, replicable counts. Human-coded latent analysis for the meaning-sensitive, ambiguity-tolerant, theory-engaged interpretation. The danger is treating dictionary output as the whole analysis. L I W C might say a transcript scores four point two percent on sadness, but it can't tell you that the sadness is bereavement-specific, that it co-occurs with relief, or that the participant is describing the sadness ironically. Only human coding can.

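**Sarah:** And there's a related computational move called keyness analysis, which you can run with the quanteda family of packages in R. In current versions the keyness function lives in the companion package, quanteda dot textstats.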

**Kiffer:** Right. Keyness is the manifest-content cousin of the chi-squared analysis on codes. It asks, which words appear disproportionately in one subcorpus compared to another? The classical implementation is a log-likelihood ratio test on word frequencies between two subcorpora. For the loneliness dataset, you might compare caregiver versus non-caregiver transcripts. Words like kids, mom, dad, appointment, exhausted will load on the caregiver side. Words like chair, walker, eyes, fading, alone, quiet load on the non-caregiver side. The keyness analysis is manifest content analysis at scale, and it triangulates with your latent codebook results.
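
A keyness comparison along these lines can be sketched with quanteda. This assumes quanteda version 3 or later with the companion package quanteda.textstats installed. The four short texts and the `role` variable are invented for illustration, so the resulting statistics mean nothing substantively.

```r
library(quanteda)
library(quanteda.textstats)

# Hypothetical excerpts: two caregivers (c1, c2), two non-caregivers (n1, n2).
texts <- c(
  c1 = "The kids had an appointment and mom was exhausted again.",
  c2 = "Dad missed his appointment so I am exhausted and the kids are waiting.",
  n1 = "The chair is quiet and I am alone with the walker.",
  n2 = "My eyes are fading and the house is quiet and I am alone."
)
corp <- corpus(
  texts,
  docvars = data.frame(
    role = c("caregiver", "caregiver", "non_caregiver", "non_caregiver")
  )
)

# Tokenize, drop punctuation and English stopwords, build a document-feature matrix.
toks  <- tokens(corp, remove_punct = TRUE)
toks  <- tokens_remove(toks, stopwords("en"))
dfmat <- dfm(toks)

# Collapse to one row per subgroup, then compare caregiver against the rest
# with the likelihood-ratio (G2) measure.
dfmat_grouped <- dfm_group(dfmat, groups = role)
key <- textstat_keyness(dfmat_grouped, target = "caregiver", measure = "lr")

head(key)
```

Features at the top of the table are over-represented in the target subcorpus and features at the bottom are over-represented in the reference. On a real corpus you would also check each feature's raw counts before interpreting it.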

**Sarah:** And computational content analysis, topic models, word embeddings, L L M-based coding, are the scalable extension of all this, which we'll get to in the final lesson.

**Kiffer:** Right. And the textbook's stance, and our stance in this course, is that computational content analysis is a powerful complement to human content analysis, not a replacement. Module twelve gives you the operational machinery. This module establishes the framework that machinery extends.

**Sarah:** Let's land the practical piece. The week eight milestone is a content-analytic frequency table from a coded subset with at least one defensible quantitative comparison.

**Kiffer:** Right. Four deliverables. The Taguette export from your coded transcripts. The frequency table as Excel or C S V. The R script that produces the chi-squared or Fisher's exact comparison. And a five-hundred-word interpretive memo that triangulates the quantitative comparison with the qualitative reading.

**Sarah:** And the memo is doing real work. It's not a recap. The lesson lists what the memo should contain. The codes used with brief operational definitions. The subgroup contrast you chose and why. The descriptive finding with the frequency table inline. The inferential test result, properly hedged for the small-sample context. A qualitative reading that explains the quantitative pattern. And one limitation specific to having promoted codes to variables.

**Kiffer:** Yeah. And the limitations point is interesting because it's specific to this hybrid move. The codes-as-variables move buys you statistical power and replicability. The cost is that you've collapsed within-passage complexity into a binary indicator. A passage that articulates the code subtly and a passage that articulates it bluntly both register as one. That's a real limitation, and the memo should name it.

**Sarah:** And one more practice the lesson recommends. Pre-register your prediction before running the analysis.

**Kiffer:** Right. Declare in writing what comparison you intend to test, which code you predict will differ most, and what the substantive reasoning is. It's basically the pre-registration logic of quantitative work carried over into the qualitative space. It protects you from p-hacking and it gives you a way to discuss findings that contradicted your expectation in the memo, which is often the most analytically valuable section.

**Sarah:** And the R workflow itself is pretty clean. Load the Taguette export, merge participant metadata, pivot from long to wide to get a codes-by-cases matrix, compute frequency tables with dplyr, visualize with ggplot, run Fisher's exact across all codes in a loop, and optionally run a quanteda keyness analysis on the underlying text.

**Kiffer:** Right. The pivot wider call is, in a literal data-shape sense, the operationalization of the codes-as-variables move. Long format, one row per passage-code pair, becomes wide format, one row per participant and one column per code. After that pivot, everything you'd do with survey data, you can do with these codes.
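
The long-to-wide step can be sketched as follows, assuming the dplyr and tidyr packages. The column names (`participant`, `code`, `caregiver`) and the rows are hypothetical stand-ins, not the actual layout of a Taguette export.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format export: one row per tagged passage.
long <- data.frame(
  participant = c("P01", "P01", "P01", "P02", "P03", "P03", "P04"),
  code        = c("shame", "shame", "technology", "technology",
                  "shame", "pets", "technology"),
  stringsAsFactors = FALSE
)

# Participant metadata, including P05, who has no coded passages.
meta <- data.frame(
  participant = c("P01", "P02", "P03", "P04", "P05"),
  caregiver   = c(TRUE, FALSE, TRUE, FALSE, FALSE)
)

# Long -> wide: one row per participant, one 0/1 column per code.
wide <- long %>%
  distinct(participant, code) %>%   # presence of a code, not passage counts
  mutate(present = 1L) %>%
  pivot_wider(names_from = code, values_from = present, values_fill = 0L)

# Join onto metadata so uncoded participants stay in the matrix as zeros.
matrix_df <- meta %>%
  left_join(wide, by = "participant") %>%
  mutate(across(-c(participant, caregiver), ~ replace_na(.x, 0L)))

matrix_df
```

Starting from the metadata rather than from the coded passages is a deliberate choice. A participant with no tagged passages would otherwise vanish from the matrix and shrink the denominator of every proportion.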

**Sarah:** Before we land the takeaways, can you say a word about what to do when you've got a code that just won't stabilize reliability-wise? The textbook lists four causes, but in practice students get stuck.

**Kiffer:** Yeah. The most common pattern I see is a code that's trying to do two jobs at once. Like a code called shame that's catching both the felt emotion of shame and the anticipated social judgment of being seen as lonely. Those are different things. The coder ends up making different calls on different days because the conceptual category is conflating two phenomena. The fix is to split. Have one code for felt shame and one for anticipated stigma. Most reliability problems turn out to be conceptual problems wearing a procedural disguise.

**Sarah:** And the other failure mode I see is a code that's too narrow. So narrow that it almost never applies, and when it does, two coders agree on roughly the same passages, but the alpha statistic is unstable because there are so few positive cases.

**Kiffer:** Right. Rare codes can produce misleading alphas in either direction. A code that applies twice in twenty transcripts can come out with very high alpha if both coders catch both instances, or very low alpha if they catch different ones. The remedy is to consolidate the rare code with related codes into a broader category, or to acknowledge in the methods that the code's reliability estimate is unstable due to base rates.

**Sarah:** And one practical note for students. The R package for computing Krippendorff's alpha is called irr, and the function inside it is kripp dot alpha. The procedure is to bring your two coders' ratings into a matrix with one row per coder and one column per unit, and pass it to the function along with the level of measurement. It's reliable and fast, and it gives you the coefficient. It doesn't report a confidence interval on its own, so for a bootstrap interval you either resample units yourself or use a package that implements one.

**Kiffer:** Right. And the confidence interval matters because, again, with small reliability subsets, the point estimate can move a lot. An alpha of point seven nine with a confidence interval from point six to point nine three is not in fact a point seven nine. It's a wide range. Reporting the interval is good practice.
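
A point estimate plus a rough interval might be computed as in the sketch below. It assumes the irr package is installed. The ten-unit ratings are invented, and the interval is a plain percentile bootstrap over units, which is one simple choice and not the specific bootstrap procedure Krippendorff describes.

```r
library(irr)

# Hypothetical reliability subset: 2 coders x 10 units, binary code (1 = present).
# kripp.alpha expects one row per coder and one column per unit.
ratings <- rbind(
  coder_a = c(1, 1, 0, 0, 1, 0, 0, 1, 0, 0),
  coder_b = c(1, 1, 0, 0, 0, 0, 0, 1, 0, 1)
)

alpha_hat <- kripp.alpha(ratings, method = "nominal")$value  # about 0.60

# One bootstrap replicate: resample units (columns) with replacement.
boot_once <- function(r) {
  idx <- sample(ncol(r), replace = TRUE)
  resampled <- r[, idx, drop = FALSE]
  # If every rating in the resample is identical, expected disagreement is
  # zero and alpha is undefined, so skip that replicate.
  if (length(unique(as.vector(resampled))) < 2) return(NA_real_)
  kripp.alpha(resampled, method = "nominal")$value
}

set.seed(2024)
boot_alpha <- replicate(2000, boot_once(ratings))

# Percentile interval over the replicates where alpha was defined.
ci <- quantile(boot_alpha, c(0.025, 0.975), na.rm = TRUE)
ci
```

With only ten units the interval will be wide, which is the reason to report it alongside the coefficient.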

**Sarah:** Okay. Let's pull this together. Seven takeaways.

**Kiffer:** Sure. First, content analysis has nearly a century of history. Lasswell established it for propaganda analysis in the nineteen twenties. Berelson codified it in nineteen fifty-two as systematic, quantitative description of manifest content. Krippendorff modernized it to accommodate latent meaning, inference, and rigorous reliability statistics.

**Sarah:** Second, manifest content is what the text literally says. Latent content is what it means. Most contemporary content analysis blends both, with the analyst declaring which is which and demonstrating reliability for the latent codes.

**Kiffer:** Third, the codes-as-variables move makes content analysis a hybrid method. Qualitative judgment defines the codes. Quantitative procedures analyze them. Both halves are necessary.

**Sarah:** Fourth, three unit-types structure any content-analytic design. Sampling units, what you draw from the population. Recording units, where the variable takes its value. Context units, the surrounding text the coder consults while coding.

**Kiffer:** Fifth, reliability is load-bearing. Krippendorff's alpha is the field standard. Alpha at point eight zero or above supports definitive claims. Point six seven to point eight zero is acceptable for tentative inference. Below point six seven, revise the codebook.

**Sarah:** Sixth, inferential tests on codes by subgroup matrices, chi-squared and Fisher's exact, are legitimate when the question is whether a within-corpus distributional difference is larger than chance. They're not legitimate as warrants for population-level prevalence claims. Hedge accordingly.

**Kiffer:** And seventh, dictionary-based content analysis like L I W C, keyness analysis with quanteda, and the computational extensions we'll cover later, are scalable complements to human coding, not replacements. The most rigorous studies use both halves together.

**Sarah:** And one more meta-point. Content analysis sits on the boundary between qualitative and quantitative research. Some methodological writers treat it as a qualitative method with numbers. Some treat it as a quantitative method applied to text. The right answer, the Bernard-Wutich-Ryan answer, is that the boundary was always more rhetorical than real. Content analysis is genuinely hybrid.

**Kiffer:** Right. And the productive stance for students coming out of this course is to become methodologically omnivorous. You can do credible qualitative work. You can read quantitative work. You can deploy content analysis without feeling like you've betrayed either side. What's not defensible is treating content analysis as something to be done apologetically, either as a watering down of qualitative work or as a half-hearted attempt at quantification.

**Sarah:** That's a good place to land. Next time we move in a very different direction. Schema and narrative analysis, which trades the breadth of cross-corpus counting for the depth of within-transcript cognitive structure.

**Kiffer:** Right. Counting across many transcripts is one mode of qualitative rigor. Tracing the cognitive architecture inside a single transcript is another. We'll see how schema analysis from cognitive anthropology and narrative analysis from sociolinguistics give us tools the content-analytic frame can't.

**Sarah:** Thanks for joining us today.

**Kiffer:** Thanks everyone. We'll see you in Lesson nine.