# Lesson 4 — Questionnaire Design (v3 expanded)

*Companion-podcast transcript • Sarah & Kiffer* 
*~5300 words • ~28.7 min audio*

---

**Sarah:** Welcome back to Office Hours. I'm Sarah.

**Kiffer:** And I'm Kiffer. Today we're working through Lesson 4, Questionnaire Design. And I want to set this up carefully, because students sometimes underestimate this lesson. They think questionnaire design is something you do at the end, after you've sorted out the real methodological work.

**Sarah:** And the lesson pushes back hard on that view, doesn't it? It treats questionnaire design as a first-order methodological problem, not an afterthought.

**Kiffer:** Yeah, exactly. The argument is that the questionnaire is where measurement actually happens. It's where exposures, outcomes, and confounders enter your dataset as raw responses. If you wrote a beautiful sampling plan in Lesson 3 but then asked vague, leading, double-barrelled questions, the data won't measure what you think they're measuring. And no amount of clever statistical modelling later can rescue badly designed questions.

**Sarah:** Lesson 3 ended with the recognition that the analysis of a survey depends on its sampling design. Lesson 4 picks up the very next decision. Once you've decided who to sample, you have to decide what to ask them.

**Kiffer:** Exactly. Four sections. Section 1 is planning, including qualitative versus quantitative and the four ways you can administer a questionnaire. Section 2 is open versus closed question formats. Section 3 is wording, structure, and pre-testing. And Section 4 is response rates and data coding.

**Sarah:** Let's start with terminology. People use questionnaire and survey interchangeably, but they aren't the same thing.

**Kiffer:** A questionnaire is the instrument. The actual list of questions. The piece of paper or the web form. A survey is the broader observational study that uses a questionnaire to collect descriptive information about a population. So a questionnaire is a tool. A survey is a study.

**Sarah:** And the lesson notes this matters because it determines what you're evaluating. If you're criticizing a study, you might be criticizing the sampling design of the survey, or the wording of the questionnaire, or both. They're separable problems.

**Kiffer:** Okay, planning. The lesson lays out four steps and then dives into focus groups. First step. Define your objectives. What information does the study need? What are the key research questions? What variables are you going to measure?

**Sarah:** Second step. Consult stakeholders. Subject experts and the eventual end users of the data. If your data is going to be used by policymakers, the policymakers should be in the room when you decide what to ask. And the lesson highlights one specific structured method for this. The Delphi technique.

**Kiffer:** Quick definition, because the name sounds mysterious. The Delphi technique is a structured process for building consensus among a group of experts. You send each expert an anonymous questionnaire about the topic. You aggregate their responses. You send the results back and let them revise. You repeat across several rounds until the group converges.

**Sarah:** And the anonymity matters. Because in a face-to-face meeting, the loudest voice or the most senior person can dominate. Delphi removes that pressure. Each expert responds privately, sees the group, and decides whether to update.

**Kiffer:** Third step. Review existing instruments. Search for previously published questionnaires on your topic. Especially ones that have undergone formal validity assessment. Building on a validated instrument saves time and improves data quality, and it lets you compare your sample to the populations the instrument was tested in.

**Sarah:** And fourth step. Engage the target population. The people you're going to survey should themselves be consulted during planning. Not just because it's ethical, although it is, but because they know their own experience, their own language, and their own concerns better than any researcher does.

**Kiffer:** And the standard tool for engaging the target population is the focus group.

**Sarah:** Six to twelve people, drawn from the intended study population or end users of the information. An independent moderator runs the discussion to keep it on track and to make sure no single individual dominates. Often recorded for later review.

**Kiffer:** And what focus groups give you is qualitative texture. Attitudes. Concerns. The terminology people actually use when talking about the topic. Issues you didn't know mattered. They help you clarify objectives and define salient terms before you ever write a question.

**Sarah:** Then the lesson splits questionnaires into qualitative versus quantitative.

**Kiffer:** Qualitative questionnaires are sometimes called explorative questionnaires. They use mostly open-ended questions. They're for hypothesis generation, when you need to identify as many issues as possible. Often administered as interviews. Often audio or video recorded.

**Sarah:** Quantitative or structured questionnaires are for capturing specific information that can be coded and analyzed statistically. They use mostly closed-ended questions. They are the primary focus of questionnaire design in epidemiology, because epidemiology is mostly a quantitative discipline. But qualitative work absolutely belongs in the toolkit, especially during planning.

**Kiffer:** Then administration methods. Four options, each with real trade-offs.

**Sarah:** First, in-person interviews. Response rates are high. Cost is high. Geographic reach is limited because interviewers have to travel. You can use visual aids and you don't need the respondent to be literate. The biggest concern is interviewer bias, where the interviewer's tone, body language, or follow-up probing subtly changes the responses.

**Kiffer:** Second, telephone interviews. Response rates are moderate to high, though declining as people stop answering unfamiliar numbers. Cost is moderate. Interviewer bias is reduced compared to in-person, but not eliminated. No visual aids.

**Sarah:** Quick aside. The standard method for telephone surveys is Random Digit Dialling, sometimes abbreviated R D D. Random Digit Dialling means generating phone numbers at random within valid area code and prefix combinations rather than working from a list. The advantage is that it can reach unlisted numbers. The disadvantage is the falling response rate as cell phones, screening, and spam filters change how people answer.

**Kiffer:** Third, mailed questionnaires. Response rates are typically low to moderate. Cost is low. No interviewer bias. Wide reach. But you need the respondent to be literate, and you cannot verify who actually filled out the form.

**Sarah:** And fourth, internet or web-based surveys. Response rates are low to moderate. Cost is low. No interviewer bias. Wide reach. Full visual capability. Like mail, you need literacy and digital access, and you can't verify who completed the form.

**Kiffer:** And the choice depends on your population, your topic, and your budget. Sensitive topics often benefit from anonymous methods like mail or internet, because respondents are more honest when there's no one looking. Complex topics often benefit from in-person, because the interviewer can clarify.

**Sarah:** And here's a practical note. The lesson encourages researchers in Canada to re-use validated items from existing major instruments rather than drafting brand-new questions every time.

**Kiffer:** Two reasons. First, validated items have been tested. Their psychometric properties are known. Second, if your items match a national survey's items, you can directly compare your sample to national norms. Major analytic advantage.

**Sarah:** Let's name the big Canadian instruments. The Canadian Community Health Survey, abbreviated C C H S, is the flagship Statistics Canada survey on chronic conditions, mental health, substance use, food security, and access to care. They publish the full instrument and derived variables for every cycle, going back to two thousand and one.

**Kiffer:** And inside the C C H S you'll find specific validated scales worth knowing by name. The Patient Health Questionnaire 9, P H Q 9, is a nine-item depression screener. The K6 is a six-item screener for non-specific psychological distress, named after the psychiatric epidemiologist Ronald Kessler. The food security module is the Canadian adaptation of the United States household food security survey module.

**Sarah:** Then there's the Canadian Health Measures Survey, C H M S, structured around an in-clinic visit. Direct measurement of physical activity using accelerometers, dietary recall, biomarker collection. So if your study needs measured rather than self-reported data, the C H M S protocols are the place to start.

**Kiffer:** The Public Health Agency of Canada, P H A C, publishes indicator frameworks for mental health and chronic disease surveillance. These specify the exact question wording behind each national indicator, like the Mental Health Surveillance Indicator Framework.

**Sarah:** And in British Columbia specifically, two more sources. The B C Adolescent Health Survey, run by the McCreary Centre Society, is the standard for high-school-age population health data. And the B C Centre for Disease Control, B C C D C, runs behavioural surveillance instruments around sexually transmitted and blood-borne infections, substance use, and similar topics.

**Kiffer:** Bottom line. Before drafting brand-new items, check whether validated questions already exist. The lesson is blunt about this. Don't reinvent the wheel.

**Sarah:** Section 2. Designing questions. The lesson opens with the four cognitive steps a respondent goes through to answer any question.

**Kiffer:** This is borrowed from the cognitive psychology literature on survey response. The argument is that respondents go through four discrete cognitive steps, and each one can fail.

**Sarah:** First step. Understanding. Will the respondent understand the question? It needs to be in non-technical language at their reading level. If the question requires a glossary, you've already lost most of your respondents.

**Kiffer:** Second step. Retrieval. Will the respondent be able to recall or look up the answer? If you ask how many times they visited the doctor in the past year, can they actually remember, or are they going to make up a number that feels plausible?

**Sarah:** Third step. Judgment. If the question is subjective, like an opinion, can you make it less subjective, or give the respondent enough structure that their judgment is consistent?

**Kiffer:** And fourth step. Communication. Are the response options clear? Can the respondent express the answer they actually want to express?

**Sarah:** And the rule the lesson emphasizes is, after you draft any question, run it through those four checks. If any one fails, rewrite. Don't ship the question until all four steps work.

**Kiffer:** Now, open versus closed. Open questions place no restrictions on the response. Closed questions ask the respondent to select from pre-defined options.

**Sarah:** Open questions are more common in qualitative work. But they have specific roles in quantitative work too. The most common type is the fill-in-the-blank for numerical data. How many people live in this household. How many cigarettes did you smoke yesterday. The respondent writes a number.

**Kiffer:** And the lesson lays out an important principle. Where possible, capture numerical data as a continuous variable rather than as pre-defined ranges. Knowing someone is exactly forty-seven is more informative than knowing they fall in the forty to fifty-nine bracket. Continuous data can always be categorized later. You can't recover precise values from ranges.

**Sarah:** Exception. Sensitive data. For things like total household income, respondents may be much more willing to indicate a range than an exact number. You'll get higher item response with ranges, and the loss of precision is worth the gain in honesty.

**Kiffer:** Always specify units. Kilograms or pounds. Centimetres or inches. Don't assume the respondent will use the units you have in mind. And consider letting respondents choose their preferred unit.

**Sarah:** And consider attaching a comments section to closed questions. Respondents can elaborate, capture nuance you didn't anticipate, and surface issues you can analyze qualitatively even within a quantitative survey.

**Kiffer:** Then closed questions. Four types. Checklist, multiple choice, rating, and ranking.

**Sarah:** Checklist questions. Check all that apply. Which of the following symptoms have you experienced. Then a list with checkboxes. The options do not need to be mutually exclusive. They do not need to be jointly exhaustive. And in the database, each option is treated as a separate yes-or-no variable. So a checklist with eight options produces eight yes-or-no variables.

**Kiffer:** Multiple choice questions. Select one. The categories must be mutually exclusive, meaning no overlap, and jointly exhaustive, meaning they cover every possible answer. Adding an Other, please specify category at the end is the standard way to ensure exhaustiveness.

**Sarah:** And the lesson gives some practical caps. Limit choices to about five options for in-person and telephone interviews. Up to ten for mailed and internet questionnaires.

**Kiffer:** The reason for the difference is that in interviews, the respondent has to hold the options in working memory while you read them. After about five, they start forgetting the early ones. On paper or screen, they can re-read the list.

**Sarah:** And there's a known order effect. Respondents tend to favour items at the top of a list. The standard fix is to vary the order across different versions of the questionnaire. Half see one order, half see the reverse, so any order bias averages out.

**Kiffer:** Rating questions. The Likert scale is the most common form. Strongly agree, agree, neutral, disagree, strongly disagree. Named after Rensis Likert, an American psychologist who developed it in the nineteen thirties.

**Sarah:** Several design choices to make. First, how many points. Use a minimum of five to seven categories. Fewer than five and you lose so much information that you might as well have asked yes or no.

**Kiffer:** Second, neutral midpoint or forced choice. With an odd number of points, like five or seven, you have a midpoint that means neutral. With an even number, like four or six, you don't, so respondents have to commit one direction or the other. Forced choice is useful when you suspect respondents will park on neutral to avoid thinking.

**Sarah:** Third, always include a separate don't-know or not-applicable option, distinct from the rating itself. Because if you collapse don't-know with neutral, you can't distinguish two very different things. Someone who has thought carefully and feels neutral is not the same as someone who hasn't thought about it at all.

**Kiffer:** And one critical statistical caveat. Parametric statistics, like means and standard deviations, should only be computed on a Likert scale if there are at least five points and the points can reasonably be assumed to be equally spaced. Otherwise, treat the data as ordinal. Use medians and non-parametric tests.

**Sarah:** And the equally-spaced assumption is doing a lot of work in that sentence. The distance from strongly disagree to disagree may not equal the distance from disagree to neutral. If the spacing is uneven, the mean is meaningless.

**Kiffer:** Then the visual analogue scale, or V A S. The Visual Analogue Scale is a continuous version of the rating question. The respondent marks a point on a continuous line between two extremes. For example, a ten-centimetre line with terrible at one end and excellent at the other. You measure the distance from the left endpoint.

**Sarah:** V A S is widely used for pain measurement in clinical research, because pain has finer gradations than a five-point scale can capture.

**Kiffer:** And the fourth type. Ranking questions. Order these options by priority. Rank these treatment goals from most important to least important.

**Sarah:** Ranking is cognitively hard, especially with long lists. The respondent has to hold every option in mind at once and compare them all. It also has specific limitations.

**Kiffer:** First, rank intervals are unknown. The difference between rank one and rank two may not equal the difference between rank two and rank three. The ranks don't tell you.

**Sarah:** Second, the omitted-categories effect. If you leave out an option that the respondent would have ranked highly, it changes how they rank the remaining options. The ranks are about the choice set, not about absolute importance.

**Kiffer:** And third, decide in advance how to handle ties. Some respondents will assign tied ranks. You need a coding rule before the data come in.

**Sarah:** And one practical tip. For in-person interviews, physical cards that the respondent can sort on a table simplify ranking enormously. Card sorting lets the respondent move things around until the order feels right, instead of trying to hold ten options in their head.

**Kiffer:** Section 3. Wording, structure, and pre-testing. Section 1 was planning. Section 2 was format. Section 3 is the small choices that determine whether your respondents understand a question the same way you do.

**Sarah:** Wording first. Four rules to keep in mind.

**Kiffer:** Rule one. Keep questions short. Rarely exceed twenty words. Avoid abbreviations. Avoid jargon. Avoid technical terminology unless your respondents are technical experts. The lesson contrasts a poor wording, how many cases of acute gastrointestinal illness occurred during the time period, with a better wording, how many times did you have diarrhea during January.

**Sarah:** And the second version isn't just shorter. It's more concrete. A respondent knows what diarrhea is. They may not know what acute gastrointestinal illness is. And January is a specific reference point, not a vague time period.

**Kiffer:** Rule two. Be specific. Specify the time frame. Define the event. If you're asking about new cases of diarrhea, you might define a new case as starting after at least two days of normal bowel movements. Otherwise, respondents will use whatever definition is most salient to them.

**Sarah:** Rule three. Avoid double-barrelled questions. A double-barrelled question asks about two things at once. The lesson uses, do you think traveller's diarrhea is a serious problem that people should get vaccinated for. That's actually two questions. Is it serious. Should people get vaccinated. Split into two.

**Kiffer:** Rule four. Avoid leading questions. Leading questions suggest a desired answer. The lesson example is, should people have to suffer from regular bouts of diarrhea because of a bad water supply in this region. The word suffer, the framing of bad water supply, the rhetorical structure all push the respondent toward no. A neutral version would just ask, do you think the water supply in this region needs to change.

**Sarah:** These four rules sound obvious. But every published questionnaire has examples of all four problems. They're easier to spot in someone else's questionnaire than in your own. Which is why pre-testing matters so much.

**Kiffer:** Structure. A well-structured questionnaire guides the respondent through a logical flow.

**Sarah:** Introduction. The first thing the respondent reads. Explain the rationale. Why this study matters. How the data will be used. Confidentiality assurances. Estimated time required. Don't bury this. If the respondent isn't convinced in the first thirty seconds, they'll close the survey.

**Kiffer:** Opening questions. Start with easy, non-threatening items. Build the respondent's confidence. Don't open with a sensitive question about income or sexual behaviour.

**Sarah:** Instructions. Keep them clear and concise. Use bold to highlight key instructions. The lesson notes, ruefully, that people only read instructions when they think they need help. So write defensively. Make the question itself self-explanatory wherever possible.

**Kiffer:** Then within each section, the funnel approach. Start with broader, more general questions. Gradually become more specific. The funnel warms the respondent up to the topic before asking for detail.

**Sarah:** And questions should be grouped, either by subject, like illness, water supply, housing, or chronologically, like current health, past week, past month. Coherent sections help respondents stay in the right cognitive frame.

**Kiffer:** Plus, paired questions. Include pairs that capture essentially the same information at different points in the questionnaire. If the responses agree, you've got internal consistency. If they disagree, you've got a flag for inattentive responding.

**Sarah:** Form layout. For mailed and online questionnaires, visual appeal matters. Pre-code the responses, meaning print the numerical code next to each option so coding is fast and consistent. Aim for a professional appearance. And keep the optimal length under about a thousand words when possible.

**Kiffer:** Pre-testing. The lesson is firm. All questionnaires must be pre-tested before deployment. Pre-testing identifies confusing questions, ambiguous wording, layout problems, instruction failures, and helps estimate completion time.

**Sarah:** Four pre-testing methods. First, expert review. Have colleagues or subject-matter experts read the questionnaire. They catch obvious problems with clarity, logic, and coverage.

**Kiffer:** Second, field pre-test. Administer the questionnaire to a small sample of the actual target population, under real conditions. They'll find questions that are hard to understand, items they can't answer, response categories that don't fit their reality.

**Sarah:** Third, the think-aloud pre-test. The respondent narrates their thought process as they work through each question. They literally say out loud, this is asking about the past month, but I had two episodes, and I'm not sure if the second one counted. That direct insight into how questions are being interpreted is incredibly valuable.

**Kiffer:** And fourth, test-retest reliability. Re-administer the questionnaire to the same group of respondents after an appropriate interval. Long enough they don't recall their original answers, short enough that the underlying information hasn't changed. This assesses repeatability.

**Sarah:** And the test-retest only works if the questionnaire itself hasn't changed between administrations. If you modify the wording, you don't know whether differences come from poor repeatability, real change, or the modifications themselves. Freeze the instrument between rounds.

**Kiffer:** Then validation. Validation asks whether the questionnaire actually measures what it's supposed to measure.

**Sarah:** Comparison with a gold standard. Compare questionnaire responses with directly measured quantities. The classic example is comparing self-reported dietary intake from a food frequency questionnaire to actually measured intake, where someone weighs every bite.

**Kiffer:** Comparison with established methods. Compare your new instrument to a well-validated existing one. If they agree, your new instrument is probably measuring what the existing one measures. The International Committee of Medical Journal Editors, or I C M J E, recommends this kind of comparison. I C M J E is the body that publishes the uniform requirements for manuscripts submitted to biomedical journals.

**Sarah:** Repeated administration. When no external standard exists, assess repeatability through test-retest. And method comparison. Does the same questionnaire produce comparable results across administration methods? If mail and telephone disagree, the method itself is influencing the data.

**Kiffer:** And there's an interactive simulator in the lesson called the Reliability and Measurement Error simulator. It's worth running.

**Sarah:** It generates a hundred respondents, each with a true underlying attitude score. The questionnaire produces measurements that are corrupted by random noise. So measurement equals true score plus error.

**Kiffer:** And the simulator displays four things. First, the test-retest correlation between two measurements of the same respondents. If reliability is high, the correlation is close to one. If the noise is large, the correlation drops.

**Sarah:** Second, Cronbach's alpha. Named after Lee Cronbach, an American psychometrician. Cronbach's alpha is computed when the simulator generates five items measuring the same underlying construct. It tells you how internally consistent the items are.

**Kiffer:** Third, Cohen's kappa. Named after Jacob Cohen, an American statistician. Cohen's kappa is the agreement between two raters or two administrations, corrected for the agreement you'd expect by chance alone.

**Sarah:** And the fourth thing the simulator shows is the most important pedagogically. The attenuation of the regression coefficient as measurement error grows. In plain words, attenuation means that when you regress an outcome on a noisy version of the exposure, the slope is shrunk toward zero compared to the slope you'd get with a perfectly measured exposure. Random measurement error in the exposure biases your effect estimate downward.

**Kiffer:** And this is a profound point. A lot of students assume random noise just makes things uncertain. They expect bigger confidence intervals, but the same point estimate. The simulator shows the point estimate itself is biased toward the null. You can have a real underlying effect, and a noisy measurement will systematically underestimate it.

**Sarah:** So improving reliability isn't just a quality-of-life upgrade for the data. It's protecting your effect estimates from being biased toward zero. The simulator also shows that adding more items recovers reliability via what's called the Spearman-Brown formula. Multi-item scales beat single items.

**Kiffer:** Section 4. Response rates and data coding.

**Sarah:** First, a quick terminology note. The term response rate is technically a misnomer. It refers to the proportion of study subjects who complete the questionnaire. So it's a proportion, which in epidemiology we'd call a risk, not a rate. A rate has time in the denominator. A proportion doesn't. But the term response rate is so widely used that no one is going to change it.

**Kiffer:** Why does response rate matter? Because low response rates increase the risk of selection bias. If non-responders differ systematically from responders, your sample no longer represents the target population. If healthier people are more likely to respond, your sample looks healthier than the population.

**Sarah:** And the only way to know whether non-response is biasing your results is to know something about the non-responders. Which by definition you usually don't. So the safest move is to maximize response rates. Strategies fall into two categories. Design and administrative.

**Kiffer:** Design strategies first. Make objectives clear in the introduction so respondents understand why their participation matters. Use a professional layout that signals credibility. Pre-test thoroughly so the questionnaire isn't frustrating. Provide an estimate of time required. Minimize length, aiming for under a thousand words. And avoid sensitive questions unless absolutely necessary.

**Sarah:** Then administrative strategies. Follow-up contact. Reminders to non-responders. The lesson identifies this as one of the most effective strategies. People get busy and forget. A reminder pulls them back.

**Kiffer:** Incentives. Small payments for completion. And the lesson notes a counterintuitive finding. Larger payments don't necessarily produce proportionally higher response rates. Smaller payments are often as effective. The signal of being valued seems to matter more than the dollar amount.

**Sarah:** Including a pen with mailed questionnaires. This sounds bizarre, but it's been replicated. A small functional gift creates a sense of reciprocity. And the pen removes one barrier to completing right now.

**Kiffer:** Stamped first-class return envelope, not business reply. The actual stamp seems to matter, both because the respondent feels they shouldn't waste it and because first-class signals respect for their time.

**Sarah:** Advance notice. Tell people the questionnaire is coming before it arrives, so it doesn't look like junk mail. University sponsorship. Mentioning institutional affiliation increases perceived legitimacy.

**Kiffer:** Hand delivery or courier improves response rates over standard mail. Personalisation. Address the cover letter to the specific respondent by name. Generic to-the-resident addressing performs worse.

**Sarah:** And colour choices. Coloured ink, especially blue, and coloured paper rather than standard white have been shown to modestly improve response rates.

**Kiffer:** None of these on its own is a magic bullet. They're cumulative. A questionnaire that uses all of them together will have a substantially higher response rate than one that uses none. So do as many as your budget allows.

**Sarah:** Then data coding. The lesson is firm that you must plan coding before you administer the questionnaire. Three principles.

**Kiffer:** First, missing-value codes. Designate a single, unique value to represent missing data. Negative nine ninety-nine is the standard convention. The value must not be a legitimate answer to any question. And critically, never leave missing values blank.

**Sarah:** Why? Because if a cell is blank, you cannot distinguish between an item the respondent skipped and an item the coder accidentally missed during data entry. If respondents systematically skip a question, that's information about the question. If coders systematically miss a question, that's a data entry error. The unique missing code lets you tell them apart.

**Kiffer:** Second, consistency. Use the same coding convention throughout the questionnaire. For yes-or-no variables, always code zero equals no, one equals yes. Don't switch halfway through. Inconsistency is a fertile source of analysis errors.

**Sarah:** Third, code on the form, not during data entry. The coder fills in the numerical code next to each response on the paper questionnaire. Then a separate data entry clerk types the codes into the database. Don't combine those two steps.

**Kiffer:** Why? Because separating coding from entry creates an audit trail. You can go back to the paper form and check the code. Plus, use a distinctive ink colour for coding so the coder's marks are visually separable from the respondent's answers.

**Sarah:** Then data entry software. The lesson contrasts general spreadsheets with specialised database managers.

**Kiffer:** And it issues a sharp warning about spreadsheets. The big risk is that sorting individual columns can completely destroy data integrity. If you accidentally sort one column without also sorting the rest, the responses for one variable get separated from the responses for the other variables. The same row no longer represents the same respondent. The dataset is corrupted.

**Sarah:** Specialised database managers and statistical software with built-in data entry modules don't allow that. They enforce row integrity. They also support validation rules, like rejecting an age of two hundred or a date of February thirtieth. For larger studies, the specialised tools are much safer.

**Kiffer:** For a small project, a spreadsheet may be fine if you're careful. For a study with thousands of respondents and hundreds of variables, the risks are too high. Use a real database manager.

**Sarah:** Okay. Let me try to pull the takeaways together.

**Kiffer:** Yeah, please go for it.

**Sarah:** First takeaway. Questionnaire and survey are different things. A questionnaire is the instrument. A survey is the broader study. Effective planning starts well before the first question is written. Define objectives. Consult experts and end users, possibly through the Delphi technique. Review existing instruments. Engage the target population through focus groups.

**Kiffer:** Second. Reuse validated Canadian items where possible. C C H S, C H M S, P H A C indicator frameworks, the B C Adolescent Health Survey, B C C D C behavioural surveillance. Don't reinvent items that already have known psychometric properties and existing population benchmarks.

**Sarah:** Third. Choose your administration method deliberately. In-person, telephone, mail, internet. Each has trade-offs around response rate, cost, interviewer bias, geographic reach, visual capability, and literacy requirements.

**Kiffer:** Fourth. Respondents go through four cognitive steps when answering any question. Understanding, retrieval, judgment, communication. Run every question you draft through all four checks. If any one fails, rewrite.

**Sarah:** Fifth. Open versus closed isn't a binary. Open questions are best for exploration and for capturing continuous numerical data at full resolution. Closed questions, the four types being checklist, multiple choice, rating, and ranking, are easier to code and analyze. Each type has specific design rules. Five to seven points minimum on rating scales. Five options maximum for in-person and telephone, ten for mail and internet on multiple choice. Vary order across versions. Always include a separate don't-know option distinct from neutral. Parametric statistics on Likert data only with five-plus equally-spaced points.

**Kiffer:** Sixth. Wording rules are simple but routinely violated. Keep questions short, under twenty words. Be specific about time frame and event definition. Avoid double-barrelled questions. Avoid leading questions.

**Sarah:** Seventh. Structure matters. Introduction with rationale and confidentiality. Easy non-threatening opening questions. Funnel approach within each section. Group questions by subject or chronology. Pair questions for internal consistency checks. Aim for under a thousand words total.

**Kiffer:** Eighth. Pre-testing is mandatory, not optional. Expert review. Field pre-test. Think-aloud. Test-retest with the questionnaire frozen between rounds. Validation through gold standard, established methods, repeated administration, or method comparison.

**Sarah:** And the simulator from the lesson is worth running. The big lesson is that random measurement error in the exposure doesn't just inflate uncertainty. It biases your effect estimates toward the null. Improving reliability protects your point estimates.

**Kiffer:** Ninth. Response rate is technically a proportion, not a rate. Low response rates threaten validity through selection bias. Maximize response rates with both design strategies and administrative strategies. Follow-up reminders. Modest incentives. A pen with the mailed questionnaire. Stamped first-class return envelope. Advance notice. University sponsorship. Hand delivery. Personalisation. Colour choices. None alone is decisive. Stack them.

**Sarah:** Tenth. Coding principles. A unique missing-value code, like negative nine ninety-nine, never blank. Consistency throughout, like zero equals no, one equals yes. Code on the form before data entry, in a distinctive ink colour. And for anything beyond a small project, prefer specialised database managers over spreadsheets. Sorting a single column in a spreadsheet can destroy your data integrity in a way that's hard to recover.

**Kiffer:** And one practical recommendation. The reflections in this lesson are doing real work. The exercise where you identify a question that violates one of the four wording rules and rewrite it is Section 3 in miniature. The exercise where you imagine a questionnaire for water quality and diarrheal disease in a rural community with limited internet and varying literacy is the entire lesson in miniature. Sit with both. The assessment will assume you've done the applied thinking.

**Sarah:** And the through-line for the whole lesson is, every decision in questionnaire design has downstream consequences for what your data can support. The choice of items, the format, the wording, the structure, the pre-testing, the response rate, the coding. All of those are validity decisions. Done well, they preserve the validity work you started in Lesson 3. Done badly, no analysis can recover from them.

**Kiffer:** Next up is Lesson 5. Measures of Disease Frequency. Where we take the data your well-designed questionnaire produces and turn it into the standard quantitative outputs of epidemiology. Prevalence, incidence, mortality rates, and the conversions between them. The two-by-two table you've been preparing for since this material finally arrives in earnest.

**Sarah:** Take care, everyone.

**Kiffer:** See you there.
