# Lesson 5 — Themes and Codebooks (v3 expanded)

*Companion-podcast transcript • Sarah & Kiffer*  
*~5,600 words • ~30 min audio*

---

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today is Lesson 5 of the qualitative methods arc. Themes and codebooks. We have arrived at the central act of analytic work on text. The first four lessons were the upstream apparatus. An operational definition of qualitative data analysis. A research question. A sampling logic. A data-collection procedure. Now we open the transcripts and ask: how do we find patterns systematically, and how do we record what we find in a form another analyst could follow?

**Sarah:** Two activities sit at the heart of analytic work on text. Finding themes and building a codebook. They are tightly coupled but not the same.

**Kiffer:** Right. Finding themes is the discovery phase. The search for what recurs, what surprises, what is missing, what coheres. Building a codebook is the codification phase. Turning what you found into operational rules that you and other analysts can apply consistently across the rest of the corpus.

**Sarah:** And the lesson covers chapters five and six of the textbook together because the two activities are interleaved in practice.

**Kiffer:** They are. You don't finish theme-finding and then start codebook-building. You loop between them.

**Sarah:** Okay, first big move. Vocabulary. Theme, code, category, concept. These get used interchangeably in published qualitative work, and the textbook tightens them up.

**Kiffer:** Worth getting this right. A theme is a recurring abstract idea you identify in the data. It's the analyst's product. "Loneliness as the cost of love" is a theme. "Spatial metaphor for absence" is a theme. Themes are typically expressed as short phrases rather than single words. They sit at a higher level of abstraction than the individual statements that supply evidence for them.

**Sarah:** And a code.

**Kiffer:** A code is the operational label you attach to a passage of text when you encounter an instance of a theme or sub-theme. Codes are the working units. They are what you mark up in Taguette or NVivo. A single theme may be supported by several codes. A single passage may receive multiple codes. The code is the marker. The theme is what the markers, when assembled, are about.

**Sarah:** A category.

**Kiffer:** A category is a grouping of related codes. In a hierarchical codebook, categories are parent nodes and codes are children. "Coping strategies" is a category that might contain codes for phoning a confidant, watching comfort television, going to a coffee shop, cooking food from home. The category organizes. The codes do the marking.

**Sarah:** And concept.

**Kiffer:** The most abstract of the four. A concept is a theoretically meaningful idea that may organize many themes. Liminality is a concept. Embodiment is a concept. Structural exclusion is a concept. Concepts are typically borrowed from theoretical literatures and used to organize themes into something a discipline can argue about.

**Sarah:** And worth flagging the levels of abstraction. The four terms aren't just synonyms with longer names.

**Kiffer:** Right. They sit at different heights. Code is lowest: the operational marker. Theme is mid: the recurring idea. Category is also mid: the organizing bucket. Concept is highest: the theoretical idea. In the loneliness corpus, "chair-absent-spouse" is a code applied to specific passages. "Spatial objects standing in for absent people" is a theme. "Material traces of relationship loss" is a category containing codes for chairs, sides of beds, photographs, mobility aids. And "embodied memory" or "material culture of grief" are the concepts that sit above the theme and connect it to broader theoretical literatures. The hierarchy is what lets your eventual paper do more than describe.

**Sarah:** And if you skip a level the paper feels thin.

**Kiffer:** Yeah. A paper that goes straight from coded passages to "themes" without elaborating categories and concepts reads as a list. A paper that does the work at all four levels reads as analysis.

**Sarah:** And the textbook is emphatic about one phrasing that should disappear from qualitative papers.

**Kiffer:** Themes do not emerge from data the way fossils emerge from rock. The analyst notices them, names them, decides which to keep, and decides where the boundaries are. The phrasing "themes emerged from the data" — which is ubiquitous in published qualitative papers — obscures the analytic work and is one of the textbook's pet peeves. A more honest phrasing is "we identified the following themes through inductive coding" or "the themes below were developed iteratively as we read across the twenty transcripts."

**Sarah:** Themes are found by the analyst. They don't emerge on their own.

**Kiffer:** That's the move. And that small linguistic shift is doing real methodological work. It restores the analyst as a visible agent in the production of findings.

**Sarah:** Okay. The most cited tool in this whole lesson is Ryan and Bernard's twelve techniques for finding themes. Two thousand three paper. Foundational because it disaggregated what had previously been described as "immersion" or "reading deeply" into a set of specifiable operations.

**Kiffer:** Right. The twelve techniques. Let me walk through them. Most projects use several in combination.

**Sarah:** Start with repetitions.

**Kiffer:** The most obvious technique. What words, phrases, ideas recur across transcripts? If multiple participants reach for the same word to describe something, that word is doing analytic work. In the loneliness corpus, the word "chair" appears in at least eight transcripts as a stand-in for an absent person. Linda's "Bill's chair." Frank's chair. Helen's mention of the chair her brother used to sit in. The word "tired" recurs across caregiver and bereaved participants in a way that goes beyond ordinary fatigue. It appears to mark a specific exhaustion of grieving as work.
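
A minimal Python sketch of the repetitions check, assuming transcripts are available as plain strings. The three snippets are invented placeholders, not quotes from the course corpus, and `document_frequency` is a hypothetical helper name. It counts how many transcripts contain a word at least once, which is the kind of figure behind a claim like "appears in at least eight transcripts."

```python
import re
from collections import Counter

# Hypothetical toy corpus: participant label -> transcript text.
# These snippets are invented placeholders, not quotes from the course corpus.
transcripts = {
    "P01": "I still keep his chair by the window. I am tired all the time.",
    "P02": "The chair is empty now. Some days I feel tired, some days fine.",
    "P03": "I walk the dog every morning and call my sister.",
}


def document_frequency(corpus):
    """Count, for each word, how many transcripts contain it at least once."""
    counts = Counter()
    for text in corpus.values():
        # set() so a word repeated inside one transcript is counted only once
        counts.update(set(re.findall(r"[a-z']+", text.lower())))
    return counts


df = document_frequency(transcripts)
print(df["chair"], df["tired"], df["dog"])
```

A raw count only flags candidates. Whether "chair" is standing in for an absent person still has to be read in context.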

**Sarah:** Second. Indigenous typologies. Emic terms.

**Kiffer:** What categories do participants themselves use? When a participant reaches for a non-English word, a slang term, a phrase that operates as a category in their world — the analyst should pay attention. Amira uses the Arabic word "wahda" to name something the English category of loneliness cannot hold. Aarav uses paired Sanskrit-Hindi terms that distinguish chosen solitude from involuntary aloneness. Marcus speaks of "code-switching" loneliness — an experience he names that has no neat one-word English equivalent. These emic categories are gifts. They often become themes that organize entire sections of an eventual paper.

**Sarah:** Third. Metaphors and analogies.

**Kiffer:** People describe abstract experiences, especially feelings, by reaching for concrete images that map onto them. Cataloguing the metaphors a corpus uses is a powerful theme-finding move. The loneliness corpus is dense with spatial metaphors of absence and erosion. Maya feels "hollow." Sarah describes "witness-less hours." Helen describes "fading at the edges." Frank uses imagery of slow disappearance. Maya again talks about feeling she could "disappear and nobody would notice." Linda talks about "walking around with that absence." Read together, the metaphors converge. Loneliness is repeatedly figured as a thinning or vanishing of the self.

**Sarah:** Fourth. Transitions.

**Kiffer:** What does the participant move to right after they say what they say? Transitions are turn-taking shifts and topic changes. They tell you what feels related in the participant's mind. In Linda's transcript, every passage about Bill's chair is followed by a passage about Rufus the dog. "I haven't moved it. The dog is what keeps me up in the morning." The transition from the empty chair to the dog suggests an analytic linkage. The dog appears to be functioning as the affective replacement for the person the empty chair stands for.

**Sarah:** Fifth. Similarities and differences. Constant comparison.

**Kiffer:** Read two passages side by side. What is the same? What is different? This is the constant-comparative move Glaser and Strauss made foundational to grounded theory. As a theme-finding technique it's most useful when you have already identified candidate themes and want to test whether they hold up across subgroups. In the corpus, the bereaved-spouse loneliness of Linda — sixty-seven, widow of three years — and Frank — eighty-one, widower of one year — share most features but differ on duration of mourning and the role of children. The comparison sharpens the theme rather than dissolving it.

**Sarah:** Sixth. Linguistic connectors.

**Kiffer:** Words like "because," "since," "as a result," "therefore," "that's why," "so." Causal connectives. They mark places where the participant is explaining a relationship between events or states. Searching for them is a fast way to find passages where causal accounts of loneliness appear. Linda's transcript contains "if you love deeply for a long time, you will, eventually, be lonely deeply for a long time. The two are connected." That explicit linkage of love and loneliness is the participant's own causal theory.
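
One way to run the connector search just described, as a hedged Python sketch. The connector list comes from the discussion above. The sample text is invented, and `causal_sentences` is a hypothetical helper name. The sentence splitter is deliberately naive.

```python
import re

# Connectors named in the lesson; "so" and "since" also have non-causal uses
CONNECTORS = ["because", "since", "as a result", "therefore", "that's why", "so"]

# Longer phrases first so that "as a result" is matched as a whole phrase
_pattern = re.compile(
    r"\b("
    + "|".join(re.escape(c) for c in sorted(CONNECTORS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def causal_sentences(text):
    """Return (connectors found, sentence) for each sentence with a connector."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    hits = []
    for sentence in sentences:
        found = [m.group(1).lower() for m in _pattern.finditer(sentence)]
        if found:
            hits.append((found, sentence))
    return hits


# Invented sample text, not a quote from the course corpus
sample = (
    "I stopped going to church because nobody sat with me. "
    "The house is quiet. That's why I keep the radio on."
)
hits = causal_sentences(sample)
for found, sentence in hits:
    print(found, "|", sentence)
```

The search only locates candidate passages. Each hit still needs reading, since "so" and "since" often carry no causal meaning.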

**Sarah:** Seventh, and one of the most powerful in my view. Missing data.

**Kiffer:** What's missing is sometimes the most analytically informative finding. The textbook is emphatic about this. In the loneliness corpus, three passages from Maya, Linda, and Helen each describe loneliness in embodied terms — hollow, weight, fading. None of the three uses psychiatric vocabulary. None mentions a therapist or a medication. None reaches for the word "depression," even though the descriptions are compatible with depressive symptomatology. The absence of clinical framing is itself a finding. These participants are describing loneliness as a normal embodied condition, not as a disorder. That has implications for how an intervention should be framed.

**Sarah:** Structured absence is a finding.

**Kiffer:** That's the methodological move. The eighth technique is theory-related material. The ninth is cutting and sorting — physically or digitally cutting passages from transcripts and grouping them by similarity. The tenth is word lists and key-words-in-context, the basic moves of computer-assisted text analysis. The eleventh is pawing — careful re-reading with marginal annotations. And the twelfth is metacoding — coding the codes themselves, looking for higher-order themes across categories.
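
For the tenth technique, a small key-words-in-context sketch in Python. It assumes plain-text transcripts and a simple word tokenizer. `kwic` is a hypothetical helper name and the sample sentence is invented for illustration.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = re.findall(r"[\w']+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


# Invented sample text, not a quote from the course corpus
sample = "I haven't moved the chair. The chair stays where it was."
rows = kwic(sample, "chair", width=2)
for left, word, right in rows:
    print(f"{left:>12} [{word}] {right}")
```

Lining up every occurrence with its neighbours makes it easier to see whether a recurring word is doing the same work across participants.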

**Sarah:** And the methodological discipline is that you typically use several in combination.

**Kiffer:** Yeah. Most defensible theme-finding work uses three or four of the twelve techniques explicitly, and the methods section names them. "We identified themes using a combination of repetitions, indigenous typologies, and missing-data analysis." That's a sentence that gives the reader real information about how the themes were developed.

**Sarah:** I want to dwell on the worked example for a second. Because three transcripts and four techniques already got you to the beginnings of a real theme.

**Kiffer:** Yeah. Repetitions surfaced that loneliness is embodied for these participants. Metaphors surfaced that loneliness is figured as a thinning of the self. Similarities-and-differences surfaced that the body registers absence as presence — different participants, same analytic abstraction. And missing data surfaced that none of them invokes psychiatric vocabulary. Four techniques applied to three passages already give you a theme — loneliness as embodied absence, narrated outside clinical vocabularies. That theme can now become one or several codes in your codebook. The techniques aren't doing magic. They're doing what an analytic eye does, made explicit and specifiable.

**Sarah:** Which is the whole methodological move. Make implicit interpretive work specifiable.

**Kiffer:** Right. That's what the textbook is offering. Twelve techniques that name what experienced qualitative researchers have always done implicitly, so that less experienced researchers can do it explicitly, and so the methods section can describe what was actually done.

**Sarah:** Okay, second big chunk. From themes to codes. Where do codes come from.

**Kiffer:** Three strategies. Inductive, deductive, hybrid. Inductive coding starts with the data. You read transcripts, mark passages that strike you, label the marks with provisional codes, refine the labels as you read more. After three or four transcripts you have a working set of maybe thirty to fifty codes. You consolidate, merge, and rename until you have a coherent codebook of eight to fifteen codes you can apply across the remaining transcripts. Inductive coding is the default for exploratory studies, for under-described phenomena, and for projects in the grounded-theory tradition.

**Sarah:** Its virtue.

**Kiffer:** Stays close to the participants' own categories. Avoids forcing the data into pre-existing analyst frames. Its risk is that without theoretical anchoring it can produce codebooks that are descriptive but not analytically interesting. Charmaz warns against "coding too close to the data."

**Sarah:** Deductive coding.

**Kiffer:** You bring a codebook to the data. Codes come from theory, prior literature, or a stakeholder framework. The Cacioppo-Patrick loneliness model. Weiss's social-emotional loneliness typology. You read each transcript and tag passages that instantiate the pre-specified codes. Codes that aren't present are recorded as absent. It's the right move when you're testing or extending an existing framework, working in confirmatory mode, or taking part in a multi-site collaboration that needs a shared coding scheme. The risk is that you miss what the framework wasn't designed to see.

**Sarah:** And hybrid.

**Kiffer:** The practical default. You begin with a small set of deductive codes drawn from theory or prior literature, which serves as a provisional codebook. You apply them to the first few transcripts and add new codes inductively as you read. The codebook grows from the bottom up while keeping its theoretical anchor. Fereday and Muir-Cochrane's two thousand six paper is the widely cited operationalization. The textbook endorses hybrid as the practical default for applied health research. It preserves inductive openness while keeping the deductive anchoring that makes results interpretable by quantitatively trained reviewers.

**Sarah:** Let me ask about the codebook size question. Because students often start with thirty codes and panic.

**Kiffer:** Yeah, that happens. Initial inductive coding does often produce thirty to fifty provisional codes. The consolidation work is real. You're looking for codes that are essentially the same thing under different labels, codes that should merge, codes that should split into more specific children, and codes that don't actually have enough instances to be analytically useful. The target is usually eight to fifteen codes you can apply consistently. A codebook of thirty codes is hard to apply consistently and hard to defend in a methods section. A codebook of three codes is probably not doing enough work.

**Sarah:** And the merging process is part of the analysis.

**Kiffer:** It is. Deciding that two codes are really the same thing requires you to articulate what they have in common, which is itself a piece of theoretical work. Deciding that one code should split into three requires you to articulate what the relevant variation is. The codebook is not just a filing system. It's a piece of your theory in progress.

**Sarah:** Now the architecture of a codebook entry. The textbook recommends seven elements per code.

**Kiffer:** All seven matter. Cutting any of them is the most common reason intercoder reliability later turns out to be low.

**Sarah:** Walk through them.

**Kiffer:** Code name. Short, mnemonic, unique. Use hyphens or underscores, not spaces. Brief definition. One sentence. Full definition. A paragraph. When to apply, what range of cases, how it relates to neighbouring codes. Inclusion criteria, bullet-pointed. What features must be present. Exclusion criteria, bullet-pointed. What looks similar but doesn't count. Positive example. A direct quote from the corpus that clearly fits. Negative example. A near-miss quote that doesn't fit.

**Sarah:** And many experienced researchers add an eighth.

**Kiffer:** A memo space attached to each code. Where the analyst records how the code evolved, why edge cases were resolved a particular way, what the code's relationship to neighbouring codes turned out to be. Memos are the audit trail for the codebook itself. We strongly recommend including a memo column in your capstone codebook. It's what your eventual methods section will be written from.

**Sarah:** Walk listeners through a worked example.

**Kiffer:** Sure. Code name, "somatic-absence," with a hyphen, not a space. Brief definition: participant describes loneliness as a bodily sensation that registers the absence of someone or something. Full definition is a paragraph distinguishing it from fatigue-grief, which is exhaustion specifically tied to grief work, and from illness-talk, which is the description of medical symptoms. When in doubt, apply somatic-absence if the participant uses a bodily metaphor and explicitly or implicitly links it to absence.

**Sarah:** Inclusion criteria.

**Kiffer:** Bodily reference present. The reference is figurative or interpretive, not a literal medical complaint. The reference is tied to the absence of a person, role, or part of life. Exclusion criteria: literal medical symptoms — apply illness-talk instead; mental-state descriptions without a body reference — apply affective-loneliness instead; bodily references not tied to absence.

**Sarah:** Positive example.

**Kiffer:** Linda. "It's more like a weight. A weight that I carry around. I'm walking around with that absence." Negative example: Helen. "My hip gave out two years ago." Literal medical complaint. Apply illness-talk.

**Sarah:** And the memo.

**Kiffer:** Added on iteration two after noticing the convergence of Maya's "hollow" and "ache," Linda's "weight," and Helen's "fading." Distinguished from affective-loneliness after a near-miss in Sarah's transcript — "witness-less hours" was tagged both ways; resolved by requiring a body reference for somatic-absence.
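
The worked entry above can be written down as a structured record. This is a sketch under assumptions, not a format the textbook prescribes: the class name `CodebookEntry`, its field names, and the `problems` check are all hypothetical. The field contents are taken from the worked example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodebookEntry:
    # The seven elements, plus the optional memo as an eighth
    name: str
    brief_definition: str
    full_definition: str
    inclusion: List[str]
    exclusion: List[str]
    positive_example: str
    negative_example: str
    memo: List[str] = field(default_factory=list)

    def problems(self):
        """Return a list of problems; an empty list means the entry is complete."""
        issues = []
        if any(ch.isspace() for ch in self.name):
            issues.append("code name contains whitespace")
        for attr in ("brief_definition", "full_definition",
                     "positive_example", "negative_example"):
            if not getattr(self, attr).strip():
                issues.append(f"{attr} is empty")
        if not self.inclusion:
            issues.append("no inclusion criteria")
        if not self.exclusion:
            issues.append("no exclusion criteria")
        return issues


entry = CodebookEntry(
    name="somatic-absence",
    brief_definition=("Participant describes loneliness as a bodily sensation "
                      "that registers the absence of someone or something."),
    full_definition=("Distinct from fatigue-grief (exhaustion tied to grief work) "
                     "and illness-talk (description of medical symptoms). Apply when "
                     "a bodily metaphor is explicitly or implicitly linked to absence."),
    inclusion=["Bodily reference present",
               "Reference is figurative or interpretive, not a medical complaint",
               "Reference is tied to the absence of a person, role, or part of life"],
    exclusion=["Literal medical symptoms (apply illness-talk)",
               "Mental-state description without a body reference (apply affective-loneliness)",
               "Bodily references not tied to absence"],
    positive_example="It's more like a weight. A weight that I carry around.",
    negative_example="My hip gave out two years ago.",
    memo=["Iteration two: body reference now required for somatic-absence."],
)
print(entry.problems())
```

Running `problems()` on every entry before handing the codebook to a second coder is a cheap way to catch a missing negative example.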

**Sarah:** Worth dwelling on the negative example feature. Because it's the one students most often skip.

**Kiffer:** And it's the one that does the most work for intercoder reliability. The positive example tells the second analyst what fits the code. The negative example tells them what looks similar but doesn't fit. Without the negative example, the boundary of the code is invisible, and the second analyst has to reconstruct it from inference. Negative examples are where you put the edge cases — the near-misses that taught you what the code does and does not include. They are the boundary of the code made visible.

**Sarah:** A practical move I've seen work.

**Kiffer:** Yeah, common one. When you encounter a near-miss during coding — a passage that tempted you to apply the code but you decided against — copy that passage straight into the negative example slot in the codebook, with a one-line note about why you decided against. That practice keeps your reasoning legible to your future self and to the second coder.

**Sarah:** And the codebook is a living document.

**Kiffer:** That's the methodological commitment. Bernard, Wutich, and Ryan are clear that a codebook is not built once and frozen. You will revise it. The standard expectation is that revisions are documented in an audit trail recording the date of revision, which codes changed, what they changed to, and the justification. When a codebook is revised midway through a project, the earlier transcripts must be re-coded under the new scheme, not left under the old one. Otherwise the coding becomes inconsistent across the corpus.
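
A possible shape for that audit trail, sketched in Python. Everything here is illustrative: the `Revision` record, the version numbering, the transcript labels, and the date are placeholders, and `needs_recoding` is a hypothetical helper that lists transcripts still coded under an older codebook version.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Revision:
    version: int          # codebook version this revision produces
    revised_on: date      # date of revision
    code: str             # which code changed
    change: str           # what it changed to
    justification: str    # why


audit_trail = [
    Revision(
        version=2,
        revised_on=date(2024, 1, 15),  # placeholder date
        code="somatic-absence",
        change="Body reference now required",
        justification="'witness-less hours' was tagged both ways",
    ),
]


def needs_recoding(coded_under, current_version):
    """List transcripts whose coding predates the current codebook version."""
    return sorted(t for t, v in coded_under.items() if v < current_version)


# Hypothetical record of which codebook version each transcript was coded under
coded_under = {"T01": 1, "T02": 2, "T03": 1}
current = max(r.version for r in audit_trail)
print(needs_recoding(coded_under, current))
```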

**Sarah:** Okay third big chunk. Coding mechanics.

**Kiffer:** Four to know. Hierarchical codes. Multiple codes per passage. Axial coding. In vivo codes. A codebook is rarely flat. Codes nest under broader codes, which nest under categories. The hierarchy is what makes a large codebook navigable and what lets you aggregate findings at different levels. You can report at the category level — "coping strategies appeared in all twenty transcripts" — or at the code level — "coping-pet appeared in six of twenty transcripts."

**Sarah:** A small but useful point about hierarchical codes. There's a temptation to over-engineer the hierarchy.

**Kiffer:** Yeah. Three levels is usually enough. Category at the top, code in the middle, and occasionally a sub-code under a code. Four or five levels deep gets unwieldy and the structure starts working against the analysis rather than for it. The hierarchy should be in service of the analytic argument, not an end in itself. A flat codebook with ten well-defined codes is often more useful than a nested codebook with thirty codes spread across five levels.

**Sarah:** And the software encourages over-engineering.

**Kiffer:** It does. Taguette and NVivo make it easy to build deep hierarchies, and the tooling can seduce you into doing it just because you can. The discipline is to use only the structure your analysis actually requires.
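
A sketch of reporting at two levels of a shallow hierarchy, assuming code applications have been exported as (transcript, code) pairs. The category and code names echo the examples above. The data and the helper names are invented.

```python
# One category with three child codes: a two-level hierarchy
hierarchy = {
    "coping-strategies": ["coping-pet", "coping-confidant", "coping-tv"],
}

# Invented (transcript, code) applications for illustration
applications = [
    ("T01", "coping-pet"),
    ("T01", "coping-tv"),
    ("T02", "coping-confidant"),
    ("T03", "somatic-absence"),
]

# Reverse lookup: code -> parent category
parent = {code: cat for cat, codes in hierarchy.items() for code in codes}


def transcripts_with_code(apps, code):
    """Transcripts in which a specific code was applied at least once."""
    return {t for t, c in apps if c == code}


def transcripts_with_category(apps, category):
    """Transcripts in which any child code of the category was applied."""
    return {t for t, c in apps if parent.get(c) == category}


print(len(transcripts_with_code(applications, "coping-pet")))
print(len(transcripts_with_category(applications, "coping-strategies")))
```

Two levels are enough to support both kinds of sentence in a findings section, which is the argument for keeping the hierarchy shallow.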

**Sarah:** Multiple codes per passage.

**Kiffer:** A single passage may instantiate more than one code. Normal and expected. Linda's chair passage simultaneously instantiates a specific code for chair-absent-spouse, a more general somatic-absence code, and the interpretive frame loneliness-as-cost-of-love. All three apply to overlapping text. Taguette and the major qualitative data analysis packages handle multi-coding natively.

**Sarah:** And the implication.

**Kiffer:** When you later count code occurrences, you're counting passage-code pairs, not unique passages. A transcript with eighty unique coded passages might have a hundred and thirty code applications because many passages got two or three codes. Both numbers are meaningful. Report the one that matches the claim you're making, and be clear which you're reporting.
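
The two counts can be kept apart with a few lines of Python. The sketch assumes an export of (transcript, passage id, code) triples. The rows are invented and use code names from the examples above.

```python
# Invented (transcript, passage id, code) triples for illustration
applications = [
    ("T01", 1, "chair-absent-spouse"),
    ("T01", 1, "somatic-absence"),
    ("T01", 1, "loneliness-as-cost-of-love"),
    ("T01", 2, "coping-pet"),
    ("T02", 1, "somatic-absence"),
]

# Every row is one passage-code pair
code_applications = len(applications)

# A passage is identified by (transcript, passage id), whatever codes it carries
unique_passages = len({(t, p) for t, p, _ in applications})

print(code_applications, unique_passages)
```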

**Sarah:** Axial coding.

**Kiffer:** The term comes from Strauss and Corbin's grounded-theory tradition. A second pass over the data, after initial coding, in which the analyst attends to the relationships between codes rather than to the codes themselves. Which codes co-occur? Which codes appear in sequence? Which codes seem to be causes of which others, in participants' own accounts? Which codes are mutually exclusive in practice? Axial coding turns a flat codebook into a model.
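
The co-occurrence question is the easiest of these to tabulate. A hedged sketch, again assuming (transcript, passage id, code) triples with invented data. `cooccurrence` is a hypothetical helper name, and a count like this is an aid to axial coding, not a replacement for reading the passages.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Invented (transcript, passage id, code) triples for illustration
applications = [
    ("T01", 1, "chair-absent-spouse"),
    ("T01", 1, "somatic-absence"),
    ("T01", 2, "coping-pet"),
    ("T02", 1, "somatic-absence"),
    ("T02", 1, "chair-absent-spouse"),
    ("T02", 2, "somatic-absence"),
]


def cooccurrence(apps):
    """Count how often each unordered pair of codes lands on the same passage."""
    by_passage = defaultdict(set)
    for transcript, passage, code in apps:
        by_passage[(transcript, passage)].add(code)
    pairs = Counter()
    for codes in by_passage.values():
        # sorted() gives each pair a stable (alphabetical) orientation
        pairs.update(combinations(sorted(codes), 2))
    return pairs


pairs = cooccurrence(applications)
for (a, b), n in pairs.most_common():
    print(a, "+", b, "=", n)
```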

**Sarah:** And in vivo codes.

**Kiffer:** A code whose name is taken verbatim from a participant's speech. The convention is to set in vivo codes in quotation marks in the codebook. In the loneliness corpus, defensible in vivo codes include "wahda" — Amira's word — "witness-less hours" — Sarah's phrase — "the cost of love" — Linda's interpretation — "code-switching loneliness" — Marcus's articulation — and "fading at the edges" — Helen's metaphor. In vivo codes do two analytic things at once. They keep the participant's voice in the codebook, protecting against analyst over-abstraction. And they give the eventual paper memorable language. Reviewers and readers remember "wahda" in a way they don't remember refugee-specific-loneliness.

**Sarah:** Most well-written qualitative findings sections have at least three or four section headers built from in vivo codes.

**Kiffer:** That's the convention. And it does real work in keeping the paper grounded.

**Sarah:** Okay, last big chunk. Intercoder reliability. When two analysts apply the same codebook, how much agreement should you expect, how do you measure it, and what does the resulting number mean.

**Kiffer:** Intercoder reliability is the operational answer to the replicability commitment from Lesson 1. Two analysts independently code the same passages using the same codebook. You compute the agreement. You decide whether the agreement is good enough. Three measures matter.

**Sarah:** Percent agreement first.

**Kiffer:** Most intuitive. Count the passages on which two coders agree, divide by the total number of passages coded, multiply by a hundred. If coders agree on eighty-five of a hundred passages, percent agreement is eighty-five percent. The problem with percent agreement is that it doesn't adjust for the agreement you'd expect by chance. If two coders are using a codebook with only two codes and one is used ninety percent of the time, two random coders would agree about eighty-two percent of the time by accident. An eighty-five percent score in that setting reflects only a few percentage points of real agreement above chance.
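
The arithmetic in that example can be checked directly. A minimal sketch, assuming both coders use the two codes with the same ninety-to-ten split and assign them independently. The function names and the code labels are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of passages (in percent) on which two coders chose the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty code lists")
    matches = sum(x == y for x, y in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


def chance_agreement(p_a, p_b):
    """Expected agreement if each coder assigns codes independently
    with the given marginal proportions."""
    return sum(p_a[code] * p_b.get(code, 0.0) for code in p_a)


# One code used ninety percent of the time, the other ten percent
marginals = {"code-1": 0.9, "code-2": 0.1}
expected = chance_agreement(marginals, marginals)  # 0.9*0.9 + 0.1*0.1
print(round(expected, 2))

# Tiny invented example of the percent-agreement calculation itself
print(percent_agreement(["x", "y", "x", "x"], ["x", "y", "y", "x"]))
```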

**Sarah:** Hence chance-corrected measures.

**Kiffer:** Cohen's kappa is the chance-corrected agreement measure used most often in qualitative health research. Cohen, nineteen-sixty. Kappa ranges from minus one — perfect disagreement — through zero — chance-level agreement — to plus one — perfect agreement. Landis and Koch proposed interpretive thresholds in nineteen seventy-seven that have become field standard. Below zero, poor. Zero to point two, slight. Point two to point four, fair. Point four to point six, moderate. Point six to point eight, substantial. Point eight to one, almost perfect. Most published applied health qualitative work reports kappas in the point six to point eight range, with point seven as a common minimum threshold.
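
A from-scratch sketch of Cohen's kappa for two coders who each assign one code per passage, with the Landis and Koch labels as read out above. The function names are hypothetical and the four-passage example is invented. The last lines apply the kappa formula to the earlier scenario of eighty-five percent observed agreement against eighty-two percent expected by chance.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Kappa = (observed - expected agreement) / (1 - expected agreement)."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("need two equal-length, non-empty code lists")
    n = len(coder_a)
    p_o = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    count_a, count_b = Counter(coder_a), Counter(coder_b)
    # Expected agreement from each coder's own marginal proportions
    p_e = sum(count_a[c] * count_b[c] for c in set(coder_a) | set(coder_b)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when only one code is ever used")
    return (p_o - p_e) / (1 - p_e)


def landis_koch(kappa):
    """Interpretive label following Landis and Koch (1977)."""
    if kappa < 0:
        return "poor"
    for upper, label in [(0.2, "slight"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "substantial")]:
        if kappa <= upper:
            return label
    return "almost perfect"


# Invented four-passage example: the coders disagree on the last passage
k = cohens_kappa(["a", "a", "b", "b"], ["a", "a", "b", "a"])
print(k, landis_koch(k))

# The earlier scenario: 85 percent observed, 82 percent expected by chance
example_kappa = (0.85 - 0.82) / (1 - 0.82)
print(round(example_kappa, 2), landis_koch(example_kappa))
```

The second result shows why percent agreement alone misleads: an impressive-looking eighty-five percent falls in the "slight" band once chance is removed.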

**Sarah:** And Krippendorff's alpha for more than two coders or ordinal codes.

**Kiffer:** Krippendorff's alpha extends chance-corrected agreement to any number of coders and to nominal, ordinal, interval, or ratio data. More flexible. Recommended when you have three or more coders or when your codes have ordinal structure. Implemented in the R package `irr`, which is in the toolchain you installed in Lesson 1.
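
For readers who want to see what the statistic is doing, here is a from-scratch Python sketch of Krippendorff's alpha for nominal codes only. It is an illustration under assumptions, not a substitute for the R implementation in the course toolchain. Each passage is a list of the codes assigned by whichever coders coded it, and passages with fewer than two codes are dropped as unpairable. The function name and data are hypothetical.

```python
from collections import Counter


def krippendorff_alpha_nominal(units):
    """Alpha = 1 - D_o / D_e for nominal data.

    units: one list per passage, holding the code from each coder who
    coded that passage (coders who skipped it are simply left out).
    """
    pairable = [u for u in units if len(u) >= 2]
    n = sum(len(u) for u in pairable)
    if n < 2:
        raise ValueError("need at least one passage coded by two coders")

    totals = Counter()
    observed = 0.0
    for unit in pairable:
        counts = Counter(unit)
        totals.update(counts)
        m = len(unit)
        # Ordered pairs of differing codes within the passage
        differing = m * m - sum(v * v for v in counts.values())
        observed += differing / (m - 1)

    d_o = observed / n
    # Ordered pairs of differing codes if all values were pooled
    d_e = (n * n - sum(v * v for v in totals.values())) / (n * (n - 1))
    if d_e == 0:
        raise ValueError("alpha is undefined when only one code is ever used")
    return 1.0 - d_o / d_e


# Invented example: two coders, four passages, one disagreement
units = [["a", "a"], ["a", "a"], ["b", "b"], ["b", "a"]]
alpha = krippendorff_alpha_nominal(units)
print(round(alpha, 3))
```

Ordinal, interval, and ratio versions use a different distance between codes, which this sketch does not cover.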

**Sarah:** One thing on intercoder reliability that's worth naming explicitly. The number alone doesn't tell you the work is good.

**Kiffer:** Right. A kappa of point eight on a codebook that's coding trivial features is not informative about the substance of the analysis. A kappa of point six on a codebook coding subtle interpretive features may be more methodologically defensible than a higher kappa on a simpler codebook. The number is meaningful only in the context of what's being coded. The methods section should give the reader enough information about the codebook to interpret the kappa, not just report the number.

**Sarah:** And the disagreement work is itself part of the analysis.

**Kiffer:** This is the part that gets undervalued. When two coders disagree, the most important methodological step is what happens next. The good practice is to discuss every disagreement, identify the source — codebook ambiguity, differing interpretation, edge-case the codebook didn't anticipate — and revise either the codebook or the application accordingly. The disagreements teach you where the analytic boundaries are unclear. Reporting a kappa without describing how disagreements were handled is missing the most analytically interesting part.

**Sarah:** And after the discussion, you re-compute.

**Kiffer:** Right. After codebook revision and re-coding, you re-compute the agreement statistic. The final reported kappa is the one after disagreements have been worked through, not the first raw number. And the methods section should be explicit that the number is post-discussion.

**Sarah:** And when intercoder reliability is the wrong measure.

**Kiffer:** Important caveat. For some methodological traditions — particularly reflexive thematic analysis in the Braun and Clarke sense, and some forms of constructivist grounded theory — intercoder reliability is not the appropriate test of analytic quality. The reasoning is that interpretation is co-constructed and the second coder's job is not to replicate the first coder but to challenge and extend the analysis. If you're working in those traditions, your methods section reports analytic dialogue between coders rather than a kappa.

**Sarah:** Let me ask one practical question. When does a researcher know they're done coding?

**Kiffer:** Two signs, which are related. First, you stop adding new codes. New transcripts don't reveal categories you haven't already named. That's code saturation, the concept we covered in Lesson 3. Second — and this is the deeper one — your existing codes are applying cleanly. Edge cases are rare. The codebook is doing the work you want it to do. When both are true, you're done with the coding phase, even if you're not yet done with analysis. There's still axial coding, theme synthesis, and the move from codes to interpretation, which is the heart of Lesson 6.

**Sarah:** And students often try to stop too early.

**Kiffer:** Or too late. Too early — you stop when you have a codebook that hasn't been stress-tested against the full corpus, and the eventual paper has thin coverage. Too late — you keep adding codes past the point of analytic usefulness, and the codebook gets unmanageable. The middle ground is what the saturation literature is trying to specify.
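
The first sign, no new codes appearing, can be tracked with a running tally. A small sketch assuming transcripts are listed in the order they were coded. The data is invented and `new_codes_per_transcript` is a hypothetical helper name. It says nothing about the second sign, which is a judgment about how cleanly the codes apply.

```python
def new_codes_per_transcript(coded):
    """For each transcript, in coding order, list codes not seen in earlier ones."""
    seen = set()
    result = []
    for transcript, codes in coded:
        fresh = set(codes) - seen
        result.append((transcript, sorted(fresh)))
        seen |= fresh
    return result


# Invented coding log: (transcript, codes applied), in the order coded
coded = [
    ("T01", ["somatic-absence", "coping-pet"]),
    ("T02", ["coping-pet", "illness-talk"]),
    ("T03", ["somatic-absence", "illness-talk"]),
]
tally = new_codes_per_transcript(coded)
for transcript, fresh in tally:
    print(transcript, len(fresh), fresh)
```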

**Sarah:** Let me try the synthesis. First takeaway. Theme, code, category, concept are not interchangeable. Theme is the recurring abstract idea. Code is the operational marker. Category is the parent group. Concept is the theoretical organizing idea. Getting the vocabulary right is methodological discipline.

**Kiffer:** Second. Themes are found by the analyst. They don't emerge on their own. The phrasing "themes emerged from the data" obscures the analytic work. The honest phrasing names the analyst as the agent and names the techniques used.

**Sarah:** Third. Ryan and Bernard's twelve techniques for finding themes are a specifiable toolkit. Repetitions. Indigenous typologies. Metaphors. Transitions. Similarities and differences. Linguistic connectors. Missing data. Theory-related material. Cutting and sorting. Word lists and key-words-in-context. Pawing. Metacoding. Most projects use several in combination, and the methods section should name them.

**Kiffer:** Fourth. Inductive, deductive, and hybrid coding are different strategies, and the hybrid is the practical default in applied health research. Theoretically anchored, inductively responsive.

**Sarah:** Fifth. A defensible codebook entry has seven elements — name, brief definition, full definition, inclusion criteria, exclusion criteria, positive example, negative example. Many researchers add an eighth element, a memo column, that becomes the audit trail and the source material for the methods section.

**Kiffer:** Sixth. Coding mechanics include hierarchical structure, multiple codes per passage, axial coding for relationships between codes, and in vivo codes for participants' own words. In vivo codes do real work in keeping a paper grounded.

**Sarah:** Seventh. Intercoder reliability is the most common operational test of the replicability commitment. Percent agreement is simple but flawed. Cohen's kappa is chance-corrected and field-standard. Krippendorff's alpha generalizes for more coders or ordinal codes. On the Landis and Koch thresholds, point six to point eight is the working range.

**Kiffer:** And eighth. For some methodological traditions — reflexive thematic analysis, constructivist grounded theory — intercoder reliability is the wrong measure. The second coder's job in those traditions is dialogue, not replication. Know which tradition you're working in, and choose the appropriate quality test.

**Sarah:** A callback to Lesson 4 worth making.

**Kiffer:** The level of transcription you're working with bounds the coding you can do. Intelligent verbatim transcripts support thematic and content analysis well. They don't support conversation analysis. The methods section needs to be coherent about that.

**Sarah:** Anything to carry into next lesson?

**Kiffer:** One thing. Coding is iterative. Your first pass through the codebook will not be your last. The discipline is to revise visibly, with an audit trail, and to re-code earlier transcripts under the revised scheme. The capstone milestone this week is a preliminary codebook. The final codebook you submit is the result of that preliminary codebook revised at least twice on the basis of what the rest of the corpus reveals.

**Sarah:** Next lesson, we step back and ask the higher-level question. Codes are not findings. Coded data are not analysis. The transition most students skip is the move from codes to interpretation. We'll cover the analytic frameworks — text-to-counts, text-to-themes, text-to-schemas, text-to-narratives, text-to-talk — and the move from a coded dataset to a conceptual model.

**Kiffer:** Before then, build a preliminary codebook on three to five transcripts. Include the seven required elements for each code. Try the hybrid strategy — two or three theoretically motivated codes from the interview guide, plus inductive codes from what you read.

**Sarah:** Thanks for listening. We'll see you in Lesson 6.

**Kiffer:** Take care of yourselves. See you in class.
