# Lesson 4 — Qualitative Data Collection (v3 expanded)

*Companion-podcast transcript • Sarah & Kiffer*  
*~5,500 words • ~30 min audio*

---

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today is Lesson 4 of the qualitative methods arc. Qualitative data collection. And this is the last of the upstream-design lessons before we turn to analysis proper. In Lesson 2 we did research questions and theory. In Lesson 3 we did sampling. Today is about how qualitative data actually get generated.

**Sarah:** Most students arrive at this lesson assuming that qualitative data collection means interviews. And interviews do dominate the chapter eventually. But the textbook opens with a deliberately broader frame.

**Kiffer:** Bernard, Wutich, and Ryan organize the landscape into three families. Indirect observation, where the researcher never interacts with the people whose behavior is being studied. Direct observation, where the researcher is present and watches what's happening. And elicitation, where the researcher actively asks people to produce data — through interviews, focus groups, free lists, written responses. The three families are not a hierarchy. They're tools that answer different kinds of questions, that cost different amounts of money and time, and that come with different ethical obligations.

**Sarah:** Let's walk through them in order. Indirect observation first.

**Kiffer:** Indirect data are data that already exist in the world. Left by past behavior, stored in institutional archives, or generated by other researchers for other purposes. The analyst harvests them without ever asking a participant a question. Eugene Webb and colleagues coined the term unobtrusive measures for this family in nineteen sixty-six. Their book Unobtrusive Measures is the canonical methodological statement.

**Sarah:** And there are two big methodological advantages.

**Kiffer:** Two big ones. First, indirect data are nonreactive. When you interview someone about their loneliness, they know they're being studied. What they say is shaped by the relationship, by the setting, by the desire to appear coherent, by the willingness to disclose. When you analyze a transit log showing how often someone uses public transit alone versus in company, the data have no such reactivity. The person was not performing for you. Reactivity is the qualitative-research equivalent of measurement bias, and indirect data are methodologically cleaner on that dimension.

**Sarah:** Second advantage.

**Kiffer:** Indirect data are typically longitudinal in ways that elicitation cannot match. An interview captures one moment. Maybe two if you do a follow-up. An archival record can stretch back decades. A behavior trace can accumulate over the entire life of the trace-generating activity. If you want to know how loneliness in a community has changed across thirty years, no interview design can give you that. But a content analysis of obituaries, of personal ads, of community-newspaper letters to the editor, of city-council minutes mentioning isolation can.

**Sarah:** And the cost of those advantages.

**Kiffer:** With indirect data, you cannot ask why. You can see the pattern. You cannot ask the person who produced it what it meant to them. That's why combining indirect observation with elicitation — the two-source design — is often the strongest move.

**Sarah:** Three sub-types of indirect data. Walk listeners through them.

**Kiffer:** Behavior traces, archival records, and secondary qualitative datasets. Behavior traces are physical or digital records left by past action. Webb and colleagues divided them into accretion measures — things that have built up, like graffiti, worn paths, garbage — and erosion measures — things that have worn down through use, like the worn linoleum in front of a popular museum exhibit or the most-thumbed page of a Bible in a hotel room. The distinction is more conceptual than practical. The analytic logic is the same.

**Sarah:** And in public health.

**Kiffer:** Behavior traces are useful for the questions respondents cannot or will not answer accurately. A study of harm-reduction service utilization in the Downtown Eastside used the contents of needle-exchange returns — the actual physical objects, not the user-reported counts — to estimate injection-event volumes. The traces were more reliable than self-report. A twenty-twenty study of children's playground use during the early COVID lockdowns used aerial photographs of trampled grass and ball-sport patterns to characterize informal play in periods when interviews and surveys were impossible.

**Sarah:** And for loneliness specifically.

**Kiffer:** A behavior-trace approach to loneliness might use library-borrowing records, grocery-store self-checkout queue patterns at nine on a Sunday — the time Maya in our dataset names as the loneliest — the contents of the "food for one" freezer section across socioeconomically different neighbourhoods, the timestamps of who turns on their porch light first at dusk. None of these is conclusive alone. Combined with the kinds of accounts in our transcripts, they give the qualitative claim a quantitative shadow that's more defensible to a policy audience.

**Sarah:** Second sub-type. Archival records.

**Kiffer:** The institutional records people, organizations, and governments generate as a byproduct of operating. Clinic charts, school-attendance registers, court transcripts, council minutes, child-welfare case files, ambulance dispatch logs. Plus historical documents — diaries, letters, newspapers, pamphlets, public-health reports from earlier eras. Plus media corpora — television news transcripts, podcast episodes, social-media posts, government information campaigns. Plus personal documents — published memoirs, blogs, online support-group threads.

**Sarah:** And the textbook is clear that archival work has historically been treated as a sub-discipline of history rather than as qualitative methodology.

**Kiffer:** Right. And they think that boundary has cost the social sciences. The methods of qualitative analysis we'll learn — theme identification, content analysis, narrative analysis, grounded theory — all apply to archival data. A clinic chart can be coded. A newspaper editorial can be theme-analyzed. A corpus of council minutes can be content-analyzed for keyness over time. The analytic discipline is exactly the same as for interview transcripts. The data just arrived without an interviewer in the picture.
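
*Show-notes sketch.* The keyness idea Kiffer mentions can be made concrete with a little arithmetic. The Python below is a minimal sketch, not something from the textbook. It assumes log-likelihood (G2) as the keyness statistic, which is one common choice among several, and every count in it is invented for illustration. The function name `log_likelihood_keyness` and the `decades` table are hypothetical.

```python
import math

def log_likelihood_keyness(freq_target, size_target, freq_ref, size_ref):
    """Log-likelihood (G2) for one term: target sub-corpus vs. reference corpus."""
    total = freq_target + freq_ref
    pooled = size_target + size_ref
    g2 = 0.0
    for observed, size in ((freq_target, size_target), (freq_ref, size_ref)):
        expected = size * total / pooled  # count expected if usage were uniform
        if observed > 0:                  # 0 * log(0) is treated as 0
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: (mentions of "isolation", total words) per decade of minutes.
decades = {"1990s": (12, 40_000), "2000s": (25, 42_000), "2010s": (61, 45_000)}

for decade, (hits, words) in decades.items():
    # Reference corpus = all the other decades pooled together.
    rest_hits = sum(h for d, (h, _) in decades.items() if d != decade)
    rest_words = sum(w for d, (_, w) in decades.items() if d != decade)
    g2 = log_likelihood_keyness(hits, words, rest_hits, rest_words)
    direction = "over" if hits / words > rest_hits / rest_words else "under"
    print(f"{decade}: G2 = {g2:.1f} ({direction}-used relative to other decades)")
```

A G2 above roughly 3.84 corresponds to p < 0.05 at one degree of freedom. The statistic only flags that a word is unusually frequent or rare in a period. Why it is so still has to come from reading the minutes.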

**Sarah:** Third sub-type. Secondary qualitative datasets.

**Kiffer:** Datasets generated by other researchers for other purposes that you reanalyze for your question. This is methodologically tricky because the data were collected with a different research question in mind. The interview guide that produced them was framed for a different analysis. The participants consented to that study, not yours. Secondary qualitative analysis has its own ethics literature, and the operative question is whether your reanalysis is consistent with the consent participants gave.

**Sarah:** And there's something worth saying upfront about the ethics of indirect work.

**Kiffer:** Yeah. Indirect data feel ethically clean because there's no participant in front of you to consent. That feeling is often wrong. The records were generated by real people who didn't consent to your study, and the people they describe may be identifiable even from de-identified records. That's why the consent question I just raised for secondary datasets matters so much. If participants consented to a one-time study of vaccine attitudes, repurposing their transcripts for a study of loneliness is a stretch the original consent may not cover. The Tri-Council Policy Statement, TCPS two, takes secondary use seriously.

**Sarah:** And in practice, REBs increasingly require explicit consent for secondary qualitative use.

**Kiffer:** They do. The lesson here is that "indirect" does not mean "non-human-subjects." It often does mean "no living person in the room." But the ethical considerations don't vanish. They migrate.


**Sarah:** And the textbook flags one more advantage of indirect data that I want to make sure we name. It can let you study things ethics can't otherwise reach.

**Kiffer:** Yeah. Some populations and some questions are not accessible to elicitation. You can't interview people who died alone. You can't easily interview people who are too cognitively impaired by late-stage dementia to consent. You can't interview people whose loneliness is so absolute that they would never agree to participate. Indirect data — coroner reports of solitary deaths, chart notes on cognitively impaired residents, internet traces of social withdrawal — can study what elicitation can never reach. That's not a methodological add-on. For some questions, indirect data are the only ethical access.

**Sarah:** Which makes the selection-bias problem worth naming.

**Kiffer:** Right. Indirect data come from systems that weren't built to study your question. The records exist only because someone else's process generated them. Whose chart gets a loneliness note depends on which clinicians are willing to write one, which patients are willing to be vulnerable in that visit, which institutional cultures encode emotional state in medical records. Selection bias in archival data is profound, and your methods section has to address what's not in the archive as carefully as what is.

**Sarah:** Okay, second family. Direct observation.

**Kiffer:** Direct observation is the family in which the researcher is present and watches what's happening. James Spradley laid down the canonical typology — five positions on a spectrum from complete observer at one end to complete participant at the other. In between are observer as participant, participant as observer, and the moderate participant. Each position trades access against ethical clarity.

**Sarah:** And the canonical anthropological move.

**Kiffer:** Spradley called it explicit awareness. The trained capacity to notice things that ordinary perception filters out. The skill is the qualitative equivalent of learning to read an electrocardiogram. The lines on the paper are the same, but the trained eye sees what the untrained eye misses.

**Sarah:** And he had specific exercises for building it.

**Kiffer:** A few. The everyday-object exercise — spend ten minutes describing in writing the contents of a clinic waiting room you've been in dozens of times. Most students struggle the first three minutes, then find a rhythm. The exercise reveals how little of the familiar setting was ever consciously registered. The naive-observer exercise — pretend you've just arrived from a culture that has never seen a Canadian hospital before. Describe what's happening as if you don't know what a stethoscope is, what a hospital gown is for, why people are wearing masks. The discipline is to make the familiar strange. That's the engine of ethnographic insight.

**Sarah:** And there's an embodied piece too.

**Kiffer:** Yes. The body-knowledge exercise. Notice the bodily sensations the setting produces in you. The smell, the temperature, the acoustics, the pace. Your own discomfort or ease. Ethnographic insight is partly embodied. What your body notices is data, even if it never makes it into the final report verbatim.

**Sarah:** And the lesson connects this back to reading transcripts.

**Kiffer:** Right. You won't do participant observation for the capstone, because the dataset is pre-collected. But the discipline of explicit awareness applies to reading transcripts as well. When Maya talks about the SkyTrain at nine on Sunday, an ethnographic reader doesn't pass over it as an everyday detail. They ask: what is the SkyTrain at that hour, what does it sound like, who's on it, what does the silence among passengers feel like? The trained ethnographic eye is the same eye you should be reading transcripts with.

**Sarah:** Two more direct-observation methods worth mentioning.

**Kiffer:** Continuous monitoring is the structured cousin of participant observation. The observer watches a specific behavior stream — not the whole setting — and records it systematically. The World Health Organization's "Five Moments" hand-hygiene observation protocol is the canonical example. The famous finding here is that when clinicians are asked how often they wash their hands between patients, they say roughly ninety percent of the time. When observed, the rate is closer to forty percent. The textbook's point is that some behaviors — especially high-frequency, low-salience, embodied behaviors — cannot be accurately self-reported.

**Sarah:** And spot observation.

**Kiffer:** The sampling-based cousin. Brief observations at randomly sampled times, recording who is doing what in that instant. Aggregated, the data yield a time-allocation profile. What proportion of a population's time, on average, is spent on each activity. The classical use is in cultural-anthropology economic research — how do subsistence-farming households allocate labour. The method translates directly to public-health questions about informal caregiving, sedentary behavior, child supervision.
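
*Show-notes sketch.* The aggregation step Kiffer describes, from instantaneous spot records to a time-allocation profile, is just counting and dividing. The Python below is a minimal illustration under stated assumptions: the activity labels, the observation window, and the helper names `sample_spot_times` and `time_allocation` are all hypothetical, and the records are invented.

```python
import random
from collections import Counter

def sample_spot_times(start_min, end_min, n, seed=0):
    """Draw n distinct random minutes of the day at which to make spot checks."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(start_min, end_min), n))

def time_allocation(observations):
    """Convert spot observations (one activity label each) into proportions."""
    counts = Counter(observations)
    total = sum(counts.values())
    return {activity: count / total for activity, count in counts.items()}

# Ten spot checks between 08:00 and 20:00, expressed as minutes after midnight.
check_times = sample_spot_times(8 * 60, 20 * 60, 10)

# Hypothetical records for one caregiver: what they were doing at each check.
spots = ["caregiving", "paid work", "caregiving", "rest", "household",
         "caregiving", "paid work", "rest", "caregiving", "household"]

profile = time_allocation(spots)
for activity, share in sorted(profile.items(), key=lambda kv: -kv[1]):
    print(f"{activity:<12}{share:.0%}")
```

The proportions are only as good as the sampling of times. If checks cluster in convenient hours, the profile describes those hours, not the person's day.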

**Sarah:** And there's the reactivity problem with direct observation.

**Kiffer:** The Hawthorne effect. When people know they're being watched, they often change their behavior, sometimes briefly and sometimes for the duration. The textbook recommendation is to acknowledge this explicitly, extend observation periods so the effect attenuates, and triangulate with indirect data.

**Sarah:** Okay. Third family, and the big one. Elicitation.

**Kiffer:** Bernard, Wutich, and Ryan spend most of chapter four on this family because it dominates contemporary qualitative health research. The family contains several quite different methods. Unstructured, semi-structured, structured interviews. Focus groups. Free lists, pile sorts, triads, ranking tasks. Open-ended survey items. Each has a different relationship between researcher control and participant freedom. Each is good for different kinds of questions.

**Sarah:** Start with unstructured interviewing.

**Kiffer:** The form closest to ordinary conversation. No fixed script. The interviewer has a topic in mind and follows the participant's lead, asking probes responsively as the conversation unfolds. Foundational in cultural anthropology. Spradley's The Ethnographic Interview, from nineteen seventy-nine, is the classic methodological treatment. Used heavily in long-term fieldwork. The strength is that the participant sets the agenda. They name the topics they think matter. They choose the words. They impose their categories before the researcher imposes theirs.

**Sarah:** And the weakness.

**Kiffer:** Sacrifices comparability across cases. If you and I both interview ten people unstructured, we've asked ten different sets of questions. Whatever we say about the pattern across cases rests on a fragile assumption that the cases are about the same thing. For most public-health qualitative work, where the researcher wants to compare across participants on identifiable dimensions, unstructured interviewing is the wrong tool. Or it's used only in early scoping.

**Sarah:** Second form. Semi-structured. And this is the method that produced our dataset.

**Kiffer:** Right. The workhorse of contemporary qualitative health research. Semi-structured sits between unstructured and structured. It uses a written guide that covers the topics the researcher wants to address, but treats the guide as a checklist rather than a script. The interviewer follows the participant's lead within each topic, asks probes responsively, skips items the participant has already covered, reorders items as needed.

**Sarah:** And the defining feature.

**Kiffer:** Guide but not script. The interviewer should be able to conduct the interview without looking at the page. The guide is internalized, not read. Why? Because reading questions aloud creates a school-test register that suppresses the discursive material the method is meant to elicit.

**Sarah:** The loneliness guide in our dataset is a worked example. Walk listeners through its design features.

**Kiffer:** The guide is organized around six conceptual domains. Defining and recognizing loneliness. Triggers and patterns. Meaning and identity. Responses and coping. Social world. Systems and policy. Each domain has a small number of main questions, each with two or three italicized probes. Three design features matter.

**Sarah:** First.

**Kiffer:** Open-ended main questions. "When I say the word loneliness, what comes to mind for you?" invites the participant to bring their own framing first.

**Sarah:** Second.

**Kiffer:** Targeted probes underneath. "Is loneliness the word you would use yourself, or would you use something different?" That's the probe that produced Amira's response about the Arabic word wahda. The probe opened a door the main question alone would have left closed.

**Sarah:** Third.

**Kiffer:** Domain order. The guide moves from definition to triggers to meaning to coping to social world to policy. The order is deliberate. Definition first, before researcher framings contaminate it. Policy last, when the participant is warmed up enough to step back from their own experience.

**Sarah:** And there's a whole sub-art on probes.

**Kiffer:** Probes are the secondary questions the interviewer uses to invite expansion, clarification, or specification. A probe is not a planned item. It's a responsive intervention. The textbook identifies several probe types worth naming.

**Sarah:** Walk us through them.

**Kiffer:** Silent probe. The interviewer says nothing after the participant's response, inviting them to continue. When Maya pauses after "the SkyTrain at nine on a Sunday," an experienced interviewer waits. Often the most analytically valuable material comes in the second beat.

**Sarah:** Echo probe.

**Kiffer:** The interviewer repeats a fragment of what the participant just said with a slight rising intonation, inviting expansion. Participant says "it feels like fading." Interviewer says "fading?" Helen's elaboration about fading at the edges comes from such a probe.

**Sarah:** Uh-huh probe.

**Kiffer:** Minimal verbal acknowledgement that doesn't introduce new content but signals attention. Used throughout the corpus. Tell-me-more probe is an explicit invitation to expand. Long-question probe is a longer, more elaborated question that gives a reticent participant more to attach to. And then there's the leading probe, which is the one to avoid.

**Sarah:** Why avoid it.

**Kiffer:** Because it suggests the answer the interviewer expects. Asking "So you'd say loneliness is mostly an emotional thing?" when the participant has said nothing of the kind closes off everything except agreement or disagreement with the frame. The textbook is explicit that leading probes contaminate the data.

**Sarah:** And there's a disciplinary point about when not to probe.

**Kiffer:** That's worth dwelling on. With Amira, the response to a probe about Canadian friends — "to explain everything is to live through it again" — signals that the probe touched material the participant did not want to elaborate on. The interviewer correctly did not push. The discipline of qualitative interviewing is partly the discipline of not probing. Recognizing when to step back, when silence is what the participant needs, and when the probe would extract material the participant would later regret giving.

**Sarah:** Quick callback to the earlier lessons. The probe types we just walked through are doing a piece of the systematic commitment from Lesson 1.

**Kiffer:** Yeah. Probing well across interviews is what makes the data comparable in the first place. A semi-structured study where every interview is improvised differently doesn't support cross-case analysis. A semi-structured study where the same kinds of probes appear in response to similar participant moves does. Even within the responsive flexibility of the method, there's discipline. And that discipline is partly what makes the corpus a corpus rather than a set of disparate conversations.

**Sarah:** And it's worth saying that this is where interviewer training matters.

**Kiffer:** Hugely. Two researchers who each interview ten participants without coordinated training will produce twenty quite different interviews. Two researchers who've done training together — joint review of practice interviews, calibration sessions, discussion of probe choices — will produce a more methodologically coherent corpus. The capstone dataset was generated by trained interviewers, and the methods section should acknowledge that. Interviewer training is part of the procedural transparency you owe a reader.

**Sarah:** Structured interviewing. The third form.

**Kiffer:** When the guide becomes the script. Every participant asked the same questions in the same order, no responsive probing. The form blurs into survey research at the structured end. The textbook treats structured interviewing as legitimate and underused in qualitative methods. A grounded-theory study that's matured to the point of needing to test specific propositions can benefit from a structured-interview phase, even if early phases were semi-structured.

**Sarah:** And then focus groups.

**Kiffer:** Focus groups are a qualitatively distinct method, not just "an interview with more people." The defining feature is that the data are produced by the group interaction. Participants respond to each other. They agree, disagree, build on, qualify, and contest one another's framings. The data are dialogical.

**Sarah:** Krueger's focus-groups handbook is the canonical methodological treatment.

**Kiffer:** Fifth edition twenty fifteen. The methodological move is to treat the group as the unit, not the individual. Five situations where focus groups outperform individual interviews. First, when you want to surface the range of views in a community — the group format brings out positions individuals might not articulate alone. Second, when the topic is best discussed collectively — community priorities, policy preferences, programme evaluation. Third, when you want to observe how categories are negotiated in real time — what becomes consensus, what gets contested, what is unspeakable. Fourth, when you have limited time and the question is breadth more than depth — six focus groups of eight people produce a dataset of forty-eight voices in roughly the same time as eight individual interviews. Fifth, when the population is reluctant to speak alone.

**Sarah:** And when they're the wrong choice.

**Kiffer:** When the topic is too sensitive for collective disclosure. Sexual behaviour, intimate-partner violence, suicidality, stigmatized illness. When the participants are in unequal power relationships with each other. Boss and subordinate. Parent and adolescent. When the population is too small to recruit groups. When the research question requires sustained individual narrative that group dynamics would interrupt.

**Sarah:** For loneliness specifically, the choice is non-obvious.

**Kiffer:** Right. Individual interviews allow the participant to disclose the intimate and stigmatizing dimensions of loneliness. Maya's comment "loneliness feels like admitting I'm failing at being twenty-two" would be hard to say in a group. Focus groups, on the other hand, would let you see how loneliness is talked about collectively — what the public vocabulary is, what's performable, what stays unsayable. A study using both methods would learn things neither alone could reveal.

**Sarah:** And moderator skills.

**Kiffer:** Focus-group moderating is a trained skill, not a transferable interviewing skill. The moderator must keep the discussion on topic without scripting it, ensure the quiet participants speak and the loud ones don't dominate, attend to non-verbal dynamics, and manage the ethics — including the impossibility of guaranteeing confidentiality from other participants. Krueger emphasizes that many experienced one-on-one interviewers struggle the first time they moderate a group.

**Sarah:** Let me push on one more thing about focus groups. Because there's a methodological subtlety that gets missed.

**Kiffer:** Sure. The subtlety is that the data are dialogical, but the analysis is often treated as if it were monological. What I mean is, lots of focus-group papers report findings as if each participant's quotes could be detached from the group conversation and analyzed individually. That misses what makes focus-group data interesting. The "data" in a focus group is the negotiation. When participant A says something and participant B builds on it and participant C contests it, the analytic unit is the sequence, not any one turn. A methods section that treats focus-group data as a bigger pile of interview quotes has lost the method's particular contribution. The better move is to analyze the interaction itself: what gets affirmed, what gets resisted, what gets silenced.

**Sarah:** And that's a different analytic mode than thematic analysis of interview data.

**Kiffer:** It is. Some of the discourse-analytic tools we'll meet later in the course are better suited to focus-group data than the standard thematic-analysis toolkit. Worth knowing if you ever run focus groups.

**Sarah:** Quick word on cultural-domain elicitation methods, because they'll come back later in the course.

**Kiffer:** A family of methods specifically designed to surface the structure of a cultural domain: how members of a community categorize, order, and relate concepts within a topic area. Four to learn now. Free listing: ask a sample to list "all the kinds of loneliness you can think of." How often an item is mentioned and how early it tends to appear in the lists together index cultural salience. Pile sorts: participants sort cards into piles of things that go together. Triad tests: three items at a time, which two are most similar. Ranking tasks: order a set along a specified dimension. These methods are powerful for studying structure, but the trade-off is that they are structurally rich and phenomenologically shallow. You learn how loneliness is categorized, not what it feels like.
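
*Show-notes sketch.* The free-listing claim, that frequency and list position together index salience, can be shown in a few lines. The Python below is a minimal sketch that assumes Smith's salience index as the way of combining the two. The transcript does not name a specific index, so treat that choice as an assumption. The three free lists are invented and the function name `smiths_salience` is hypothetical.

```python
from collections import defaultdict

def smiths_salience(free_lists):
    """Return {item: (frequency, mean_rank, salience)} across respondents' lists.

    An item at rank r in a list of length L earns (L - r + 1) / L from that
    respondent and 0 from respondents who did not mention it.
    """
    n = len(free_lists)
    mentions = defaultdict(int)
    rank_sum = defaultdict(int)
    weight_sum = defaultdict(float)
    for items in free_lists:
        length = len(items)
        seen = set()
        for rank, item in enumerate(items, start=1):
            if item in seen:          # count only a respondent's first mention
                continue
            seen.add(item)
            mentions[item] += 1
            rank_sum[item] += rank
            weight_sum[item] += (length - rank + 1) / length
    return {
        item: (mentions[item] / n, rank_sum[item] / mentions[item], weight_sum[item] / n)
        for item in mentions
    }

# Hypothetical free lists from three respondents, in the order items were named.
lists = [
    ["alone at night", "no one to call", "homesick"],
    ["no one to call", "alone at night"],
    ["homesick", "no one to call", "left out", "alone at night"],
]

result = smiths_salience(lists)
for item, (freq, mean_rank, salience) in sorted(result.items(), key=lambda kv: -kv[1][2]):
    print(f"{item:<16} freq={freq:.2f}  mean rank={mean_rank:.2f}  salience={salience:.2f}")
```

In this toy data "no one to call" comes out most salient because everyone lists it and lists it early, which is the intuition behind combining the two signals.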

**Sarah:** And the humble open-ended survey item.

**Kiffer:** The most underused qualitative-data source in public health. Often included for completeness rather than analyzed. A two-thousand-respondent survey with a forty-percent response rate to one open-ended item gives you eight hundred qualitative responses. That's a respectable corpus.

**Sarah:** Okay, last big chunk. Transcription and field notes. Both treated as analytic acts.

**Kiffer:** Right. Not neutral preparation. Transcription is the conversion of recorded speech into written text. The conversion seems mechanical on first encounter. Type what you hear. In practice, transcription requires hundreds of small interpretive decisions per transcript. Should "um" be transcribed? What about "you know" and "like" when they're filler? Should false starts be preserved or smoothed? Should pauses be timed? Should laughter be marked, and if so how? Should overlapping speech be shown with brackets? Each decision shapes what later analysis can see.

**Sarah:** Three levels of transcription.

**Kiffer:** Jefferson notation, intelligent verbatim, clean verbatim. Jefferson notation preserves every audible feature. Pause durations to the tenth of a second. Overlaps marked with brackets. In-breath and out-breath marked. Intonation contours, stress, latching, audible laughter inside words. The convention's whole point is to preserve everything. Used for conversation analysis, discourse analysis, the close study of interactional sequencing.

**Sarah:** Intelligent verbatim.

**Kiffer:** Preserves the participant's words including most filled pauses, false starts, repetitions, and laughter — marked in brackets as "laughs" or "laughter." Long pauses noted. Cleaned of typos and grammar errors that are clearly transcription artefacts. Most contemporary qualitative health research uses this level. Including thematic analysis, content analysis, grounded theory.

**Sarah:** And clean verbatim.

**Kiffer:** The substantive content in grammatical sentences. Filled pauses, false starts, repetitions, most non-fluent features removed. Speech is cleaned into readable prose. Used when the analytic interest is in propositional content, not the texture of speech.

**Sarah:** And our dataset.

**Kiffer:** Intelligent verbatim. You can tell because Maya's "nobody — nobody acknowledges" survives into the transcript, complete with the false start. A clean-verbatim transcript would have collapsed that to "nobody acknowledges." A Jefferson transcript would have added pause timing and intonation marks.

**Sarah:** And the methodological commitment is.

**Kiffer:** The level of transcription should be matched to the analytic intent. If you plan content analysis or thematic analysis, light transcription is sufficient. If you plan discourse or conversation analysis, you need much heavier transcription. If you don't yet know which method you'll use, err on the side of more detail. You can simplify a detailed transcript later. You cannot recover what was never written down.
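
*Show-notes sketch.* The "you can simplify later, you cannot recover" point can be demonstrated mechanically. The Python below is a rough sketch of collapsing an intelligent-verbatim line into clean verbatim. The regular expressions are simplifying assumptions, not a transcription standard, and the sample line is invented (it only echoes the shape of Maya's repetition). The function name `to_clean_verbatim` is hypothetical.

```python
import re

# Assumed conventions: bracketed annotations like [pause], fillers "um"/"uh",
# and an immediately repeated word separated by spaces, commas, or hyphens.
ANNOTATIONS = re.compile(r"\[[^\]]*\]\s*")
FILLERS = re.compile(r"\b(?:um+|uh+)\b,?\s*", re.IGNORECASE)
REPEATS = re.compile(r"\b(\w+)(?:\s*[-,]+\s*|\s+)\1\b", re.IGNORECASE)

def to_clean_verbatim(text):
    """Strip annotations, fillers, and immediate word repetitions from a line."""
    text = ANNOTATIONS.sub("", text)
    text = FILLERS.sub("", text)
    text = REPEATS.sub(r"\1", text)
    text = re.sub(r"\s{2,}", " ", text).strip()
    return text[:1].upper() + text[1:]  # re-capitalise after removing a leading filler

intelligent = "Um, nobody -- nobody acknowledges you [pause] on the train."
clean = to_clean_verbatim(intelligent)
print(clean)
```

The conversion runs in one direction only. Nothing in the cleaned line records that a filler, a repetition, or a pause was ever there, and a crude rule like the repetition pattern would also flatten legitimate doubles such as "that that", which is one reason cleaning is a judgment call and not a mechanical step.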

**Sarah:** One more thing on transcription that I want to make sure we name. The choices made before you saw the dataset have already shaped what you can do with it.

**Kiffer:** That's the point worth dwelling on. Because in your capstone, the transcription was done before you got the data. Someone decided what to preserve. Someone decided what to clean. Those choices are not visible to you in retrospect, but they bound what analyses you can perform. If the transcription was clean verbatim and the false starts are gone, you cannot do conversation analysis on these transcripts, full stop. The methodological lesson is to be honest in your methods section about the level of transcription you're working with and what that level enables and forecloses.

**Sarah:** And there's a related point about transcription as power.

**Kiffer:** Yeah. Whoever transcribed decided what counted as participant speech worth preserving. They decided which sighs were notation-worthy and which were not. They decided whether to mark the moment Amira's voice cracked. Those decisions reflect somebody's hearing of what mattered, and they happened before the analyst arrived. The discipline is to know it and to acknowledge it.

**Sarah:** And field notes.

**Kiffer:** Emerson, Fretz, and Shaw — Writing Ethnographic Fieldnotes — laid down the canonical progression. Jottings, expanded notes, analytic memos. Jottings are quick written records made in the field or immediately after — single words, phrases, sketches. Just enough to anchor memory. Expanded notes are written within twenty-four hours, usually the same day, turning jottings into descriptive prose. Analytic memos come later — your own developing interpretations, tied to specific field events. The discipline is the progression. Jotting alone is just a list. Memo without jotting is just speculation. The progression keeps the analysis honest.

**Sarah:** Let me try the synthesis. First takeaway. There are three families of qualitative data collection — indirect, direct, elicitation. Each answers different questions and comes with different costs and ethical obligations.

**Kiffer:** Second. Indirect data — behavior traces, archives, secondary qualitative datasets — are nonreactive and longitudinal in ways elicitation can't match. The cost is you can't ask why. Combining indirect with elicitation is often the strongest move.

**Sarah:** Third. Direct observation runs from complete observer to complete participant. The skill the method demands is explicit awareness, and that skill transfers to reading transcripts. The Hawthorne effect is the reactivity problem to acknowledge and design around.

**Kiffer:** Fourth. Elicitation includes unstructured, semi-structured, structured interviews, plus focus groups, cultural-domain methods, and open-ended survey items. Semi-structured is the workhorse of contemporary qualitative health research and the method that produced our dataset.

**Sarah:** Fifth. The art of the probe is the art of the semi-structured interview. Silent, echo, uh-huh, tell-me-more, long-question — each invites a different kind of expansion. Leading probes contaminate the data. And knowing when not to probe is part of the discipline.

**Kiffer:** Sixth. Focus groups outperform interviews when the question is about collective negotiation of categories, but they're the wrong choice for sensitive topics or unequal power relationships. Moderating a focus group is a distinct skill.

**Sarah:** Seventh. Transcription is an analytic act. The level of transcription — Jefferson, intelligent verbatim, clean verbatim — must be matched to the analytic intent. Err on the side of more detail; you can always simplify later.

**Kiffer:** And eighth. Field notes follow a progression — jottings, expanded notes, analytic memos. The progression is what keeps the analysis honest.

**Sarah:** Anything you want listeners to carry into next lesson?

**Kiffer:** One thing. The data collection has already happened for the capstone. But the methodological vocabulary you've just heard is what makes the dataset legible to your reader. Your methods section will describe a semi-structured interview design that produced an intelligent-verbatim corpus of twenty transcripts. That sentence does a lot of work. It tells the reader which analytic moves the data can support and which they can't.

**Sarah:** Next lesson, we turn to analysis. Themes and codebooks. The Ryan and Bernard twelve techniques for finding themes. The structured codebook. Coding mechanics. Intercoder reliability and when it's the wrong measure.

**Kiffer:** Before then, read six to eight transcripts straight through. Make a familiarisation log. Don't code yet. Just read and note what you're seeing.

**Sarah:** Thanks for listening. We'll see you in Lesson 5.

**Kiffer:** Take care of yourselves. See you in class.