Questionnaire Design
Fundamental Epidemiological Concepts and Approaches
Kiffer G. Card, PhD, Faculty of Health Sciences, Simon Fraser University
Learning objectives for this lesson:
- Plan a questionnaire with appropriate content
- Write well-crafted questions for a questionnaire
- Format the questionnaire for ease of administration and coding
- Pre-test the questionnaire to identify problems
- Administer the questionnaire to maximise response rate
- Code data from the questionnaire as a precursor to data entry
This course was developed by Kiffer G. Card, PhD, as a companion to Dohoo, I. R., Martin, S. W., & Stryhn, H. (2012). Methods in Epidemiologic Research. VER Inc.
Planning a Questionnaire
⏱ Estimated reading time: 12 minutes
Learning Objectives
- Define questionnaire and survey, and distinguish between them.
- Explain the key steps in planning a questionnaire.
- Compare qualitative and quantitative questionnaire types.
- Evaluate the advantages and disadvantages of different administration methods.
What Are Questionnaires and Surveys?
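Questionnaires are among the most commonly used tools for collecting data in epidemiological research. The terms questionnaire and survey are often used interchangeably, but they have distinct meanings: a questionnaire is the data-collection tool itself, whereas a survey is an observational study that uses a questionnaire to collect its data.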
Why Quality Matters
As a primary means of data collection, questionnaires play a significant role in the quality of epidemiological research. However, less attention is often paid to the data-collection process than to the analytical methods used later. A well-designed questionnaire is essential for producing reliable, valid data.
Study Objectives and Planning
Effective questionnaires require careful planning. The very first step is to establish the objectives and information requirements of the study. This involves consultation with subject experts, with the ultimate users of the information, and ideally with members of the population to be surveyed.
Clearly identify what information the study needs to collect. What are the key research questions? What variables need to be measured? Consulting with subject experts and stakeholders at this stage helps ensure all important areas are covered.
Engage with subject experts and the end users of the data. If the data will be used by policymakers, they should be part of the planning process. A structured process like the Delphi technique can help build consensus among diverse stakeholders.
Search for previously published questionnaires on the same topic. These are especially valuable if they have undergone formal validity assessments. Building on validated instruments can save time and improve the quality of your data collection.
Members of the population to be surveyed should also be consulted during the planning phase. Their input can reveal important issues, concerns, and language preferences that researchers might otherwise overlook.
Focus Groups
Focus groups are a valuable tool during questionnaire planning. They consist of 6 to 12 people drawn from the intended study population and/or end users of the information. An independent moderator guides discussion to ensure it stays on track and that no single individual dominates.
What Focus Groups Offer
Focus groups provide insight into attitudes, opinions, concerns, and experiences. They help clarify objectives, identify data requirements, define salient terminology and concepts, and surface research issues that need to be addressed. It is often helpful to record the discussion for later review.
Types of Questionnaires
Questionnaires can be broadly classified as qualitative or quantitative:
Qualitative (Explorative) Questionnaires
Qualitative questionnaires are sometimes called explorative questionnaires. They consist primarily of open questions that allow participants to express their views and thoughts freely. These are used during the hypothesis-generation phase of research when researchers need to identify as many issues as possible related to the topic. They are often administered through interviews and may be audio or video recorded.
Quantitative (Structured) Questionnaires
Quantitative, or structured, questionnaires are designed to capture specific information about study subjects and their environment using mostly closed questions. The responses can be easily coded and analysed statistically. These are the primary focus of questionnaire design in epidemiological research.
Methods of Administration
How a questionnaire is administered can have a substantial impact on both the response rate and data quality. Each method has its own strengths and limitations.
The four methods compared in this section are in-person interviews, telephone interviews, mailed questionnaires, and internet surveys.
Comparison of Administration Methods
| Feature | In-Person | Telephone | Internet | Mail |
|---|---|---|---|---|
| Response Rate | High | Moderate–High | Low–Moderate | Low–Moderate |
| Cost | High | Moderate | Low | Low |
| Interviewer Bias | Possible | Reduced | None | None |
| Geographic Reach | Limited | Moderate | Wide | Wide |
| Visual Aids | Yes | No | Limited | Yes |
| Literacy Required | No | No | Yes | Yes |
Reflection
Consider a research question in your area of interest. Which method of questionnaire administration would be most appropriate, and why? What trade-offs would you need to consider?
Key Takeaways
- Questionnaires are data-collection tools; surveys are observational studies that use them.
- Effective planning involves defining objectives, consulting stakeholders and focus groups, and reviewing existing instruments.
- Qualitative questionnaires explore issues; quantitative questionnaires capture structured data.
- Administration method significantly affects response rates, costs, and potential for bias.
Knowledge Check
1. What is the primary difference between a questionnaire and a survey?
2. Why are focus groups valuable during questionnaire planning?
3. Which administration method eliminates interviewer bias while maintaining the ability to use visual aids?
Designing Questions: Open & Closed Formats
⏱ Estimated reading time: 15 minutes
Learning Objectives
- Describe the four cognitive steps involved in responding to a question.
- Explain when and how to use open questions effectively.
- Distinguish among the four main types of closed questions.
- Identify the advantages and limitations of closed versus open question formats.
The Cognitive Process of Answering Questions
When a respondent encounters a question, they go through four distinct cognitive steps. Understanding these steps helps researchers design questions that produce accurate, reliable answers.
Four Steps of Responding
1. Understanding — Will the respondent understand the question? It must be clearly worded in non-technical language.
2. Retrieval — Will the respondent know or be able to recall the answer? If additional information is required, they might skip the question or fabricate a response.
3. Judgement — If the question involves a subjective decision (such as opinions or beliefs), is there a way to make it less subjective? Special care in question design is needed to capture such information reliably.
4. Communication — Are the possible responses clear, with an appropriate method of recording the answer?
Once you have drafted a question, ask yourself these four questions to evaluate it: Will the respondent understand it? Will they know the answer? If it requires judgement, can you make it more objective? Are the response options clear and appropriate?
Open Questions
Open questions (also called open-ended questions) place no restrictions on the type of response expected. They are more commonly used in qualitative research because the responses may not be easily standardised for statistical analysis. However, they can be valuable in quantitative research as well.
One common type of open question in quantitative research is the fill-in-the-blank question, used especially for capturing numerical data. For example: “How many people live in this household? ___”
Where possible, capture numerical data as a continuous variable rather than using pre-defined ranges. Knowing someone is exactly 47 years old is more informative than knowing they fall in the 40–59 age bracket. Continuous data can always be categorised later during analysis.
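The point that continuous data can be categorised later, but not the reverse, can be illustrated with a minimal Python sketch. The bracket boundaries and the function name `age_bracket` are hypothetical choices for illustration; only the 40–59 bracket comes from the text above.

```python
# Hypothetical age brackets for illustration; only 40-59 appears in the lesson.
AGE_BRACKETS = [(0, 19), (20, 39), (40, 59), (60, 120)]


def age_bracket(age):
    """Return the label of the bracket that contains an exact age in years."""
    for low, high in AGE_BRACKETS:
        if low <= age <= high:
            return f"{low}-{high}"
    raise ValueError(f"age {age} is outside every bracket")


# Exact ages as they would be captured by a fill-in-the-blank question.
exact_ages = [47, 23, 61, 40]

# Categorising at analysis time is a one-way step: brackets cannot be
# turned back into exact ages.
brackets = [age_bracket(a) for a in exact_ages]
print(brackets)  # ['40-59', '20-39', '60-120', '40-59']
```

Because the exact ages are stored, the brackets can be redefined at any time during analysis without going back to respondents.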
When seeking sensitive information (such as total family income), respondents may be more willing to indicate a category or range rather than an exact value. In these cases, using ranges is an acceptable compromise to encourage honest responses.
When capturing numerical data, always specify the units being used (kilograms, pounds, centimetres, etc.) and consider offering a choice of measurement scales to accommodate different respondents.
Sometimes a “comments” section can be attached to a closed question to allow respondents to elaborate or express their opinion. This combines the advantages of both question types and can capture unexpected information.
Closed Questions
In closed questions (also called closed-ended questions), the respondent selects from a pre-defined range of options. Closed questions are generally easier for respondents to answer and easier to code for data entry. However, they risk oversimplifying issues or forcing respondents into categories that do not fully represent their views.
Explore the four main types of closed questions:
Checklist Questions
In a checklist question, the respondent checks all options that apply. The options do not need to be mutually exclusive or jointly exhaustive. Each option is treated as a separate yes/no variable in the database.
Example: “Which of the following symptoms have you experienced? (Check all that apply)” — followed by a list of symptoms with checkboxes.
This is equivalent to having a series of individual yes/no questions for each category.
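As a rough sketch of how a checklist response might be stored as separate yes/no variables, the Python below turns the set of checked boxes into one 0/1 variable per option. The symptom list and the function name `code_checklist` are hypothetical and not taken from the lesson.

```python
# Hypothetical symptom list, used for illustration only.
SYMPTOMS = ["fever", "vomiting", "diarrhea", "headache"]


def code_checklist(checked):
    """Turn the set of checked boxes into one 0/1 variable per option."""
    unknown = set(checked) - set(SYMPTOMS)
    if unknown:
        raise ValueError(f"unknown options: {sorted(unknown)}")
    return {symptom: int(symptom in checked) for symptom in SYMPTOMS}


# A respondent who checked two of the four boxes.
record = code_checklist({"fever", "diarrhea"})
print(record)  # {'fever': 1, 'vomiting': 0, 'diarrhea': 1, 'headache': 0}
```

Each key in `record` would become its own column in the database, which is why the options do not need to be mutually exclusive.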
Two-Choice and Multiple-Choice Questions
Multiple-choice questions require categories that are mutually exclusive (no overlap) and jointly exhaustive (cover all possibilities). Adding an “Other (please specify)” category ensures exhaustiveness.
Limit choices to 5 options for in-person or telephone interviews and 10 for mailed or internet questionnaires. Respondents may favour items at the top of a list; varying the order of the options across different versions of the questionnaire can help mitigate this.
Example: “How do you draw water from the cistern? (Select one)” — with options like “With a bucket,” “With a manual pump,” “With an electric pump,” “Other (please specify).”
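One possible way to vary option order across questionnaire versions is sketched below in Python. Keeping "Other (please specify)" in the last position and using a seed as the version identifier are assumptions made here for illustration, not rules from the lesson; `make_version` is a hypothetical helper.

```python
import random

# Options from the cistern example above.
OPTIONS = ["With a bucket", "With a manual pump", "With an electric pump"]
OTHER = "Other (please specify)"


def make_version(seed):
    """Return the options in a shuffled order, keeping 'Other' last (assumed)."""
    rng = random.Random(seed)  # seeded so each version can be reproduced
    shuffled = OPTIONS[:]      # copy, so the master list is never reordered
    rng.shuffle(shuffled)
    return shuffled + [OTHER]


# Three questionnaire versions, identified by their seed.
versions = {seed: make_version(seed) for seed in range(3)}
for seed, options in versions.items():
    print(seed, options)
```

Recording which version each respondent received makes it possible to check later whether position in the list affected the answers.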
Rating Questions
Rating questions ask the respondent to assign a value on a pre-defined scale. The most common form is the Likert scale, where respondents indicate their level of agreement (e.g., strongly agree, agree, neutral, disagree, strongly disagree).
Key considerations:
- Use a minimum of 5 to 7 categories to avoid serious loss of information.
- Decide whether to include a neutral midpoint or use a “forced-choice” scale (even number of categories).
- Always include a “don’t know” or “not applicable” option to distinguish non-responses from true neutrality.
- Parametric statistics (mean, standard deviation) should only be computed if there are at least 5 points on the scale and the points can be assumed to be equally spaced.
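The considerations above can be illustrated with a short Python sketch in which "don't know" answers are kept apart from substantive ratings before a mean is computed. The 1 to 5 coding and the sentinel value `DONT_KNOW = 8` are assumptions chosen for illustration, and the responses are made-up data.

```python
from statistics import mean

# Assumed coding: 1 = strongly disagree ... 5 = strongly agree.
# DONT_KNOW is a hypothetical code, deliberately outside the 1-5 range.
DONT_KNOW = 8

# Made-up responses to a single 5-point Likert item.
responses = [4, 5, DONT_KNOW, 3, 4, DONT_KNOW, 2]

# Only substantive ratings enter the mean; "don't know" is not neutrality.
substantive = [r for r in responses if r != DONT_KNOW]
dont_know_share = responses.count(DONT_KNOW) / len(responses)

print(mean(substantive))          # 3.6
print(round(dont_know_share, 2))  # 0.29
```

If the "don't know" answers had been coded as the neutral midpoint (3), the mean would have been pulled toward the centre and the two kinds of response could no longer be told apart.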
A Visual Analogue Scale (VAS) is a special type of rating question where respondents mark a point on a continuous line between two extremes (e.g., “terrible” to “excellent”). VAS is widely used for pain measurement in clinical research.
Ranking Questions
Ranking questions ask respondents to order options by priority or importance. They can be difficult to complete, especially when the list is long, because all categories must be held in mind simultaneously.
Important limitations:
- Rank intervals are unknown — the difference between ranks 1 and 2 may not equal the difference between ranks 2 and 3.
- Respondents may assign “tied” ranks, and decisions about how to handle these must be made in advance.
- If categories are omitted, it may influence how respondents rank the remaining options.
- For in-person interviews, using physical cards that respondents can sort may simplify the ranking process.
Open vs. Closed: Key Trade-Off
Open questions capture richer, more nuanced data but are harder to analyse. Closed questions are efficient to code and analyse but may miss important details or force respondents into inappropriate categories. The best questionnaires often use a thoughtful combination of both.
Key Takeaways
- Four cognitive steps govern how respondents process and answer questions: understanding, retrieval, judgement, and communication.
- Open questions are best for exploration; prefer continuous numerical capture over pre-set ranges where possible.
- Closed questions (checklist, multiple-choice, rating, ranking) each have specific design rules to ensure valid data.
- Rating scales should have at least 5 points; always provide a “don’t know” option to distinguish missing data from neutrality.
Knowledge Check
1. Which cognitive step involves the respondent determining whether they have the information needed to answer?
2. Why should a researcher capture age as an exact number rather than a range?
3. What distinguishes a checklist question from a multiple-choice question?
Wording, Structure & Pre-Testing
⏱ Estimated reading time: 14 minutes
Learning Objectives
- Apply key principles for writing clear, unbiased questions.
- Describe how to structure a questionnaire for maximum effectiveness.
- Explain the importance and methods of pre-testing and validation.
Wording of Questions
The way a question is worded has a major impact on the validity of the results. Poorly worded questions can confuse respondents, introduce bias, or yield uninterpretable data. Here are the essential principles:
Questions should rarely exceed 20 words. Avoid abbreviations, jargon, and complex technical terminology. Always consider the respondent’s level of technical knowledge.
Poor: “How many cases of acute gastrointestinal illness occurred during the time period?”
Better: “How many times did you have diarrhea during January?”
Make questions as specific as possible. If asking about a health event, specify the time frame (e.g., “over the past month”) and clearly define the event. For example, define what constitutes a “new case” of diarrhea: an episode might count as new only if it follows at least 2 days of normal bowel movements.
A double-barrelled question asks about two things at once. For example: “Do you think traveller’s diarrhea is a serious problem that people should get vaccinated for?” is really asking two questions — one about the seriousness of the disease and one about vaccination. These should be separated into two distinct questions.
Leading questions suggest a desired answer and can bias responses. For example: “Should people have to suffer from regular bouts of diarrhea because of a bad water supply in this region?” implies that the answer should be “no.”
A more neutral alternative: “Do you think the water supply in this region needs to change?”
Structure of Questionnaires
A well-structured questionnaire guides the respondent through a logical flow and creates a positive experience that encourages completion.
Essential Structural Elements
Introduction: Begin with an explanation of the rationale and importance of the questionnaire, how the data will be used, and assurance of confidentiality. Include an estimate of how long it will take to complete.
Opening questions: Start with questions that build confidence — easy, non-threatening items that put the respondent at ease.
Instructions: Keep them clear and concise. Highlight key instructions using bold or other formatting. Remember that people only read instructions when they think they need help.
The Funnel Approach
Within each section of the questionnaire, questions should follow a funnel approach — starting with broader, more general questions and gradually becoming more specific and focused. This helps respondents warm up to the topic before being asked for detailed information.
Questions should be grouped in sections either by subject (illness, water supply, housing) or chronologically (current health, past week, past month, previous illness). Pairs of questions that capture essentially the same information can be included at different points to verify responses and check internal consistency.
Form Layout and Design
For mailed or online questionnaires, visual appeal and ease of completion are essential. Key layout considerations include the format of responses, the overall appearance, and the length of the questionnaire.
Pre-Testing Questionnaires
All questionnaires must be pre-tested before deployment. Pre-testing identifies confusing, ambiguous, or misleading questions, layout problems, and issues with instructions. It also serves to estimate completion time.
Expert Review
The first step is to have colleagues or subject-matter experts review the questionnaire. They can evaluate whether all important issues are covered and identify obvious problems with clarity or logic.
Field Pre-Test
Pre-testing on a small sample of the target population is essential. Have respondents complete the questionnaire as they would in the actual study. This reveals questions that are hard to understand, unanswerable, or that need additional response categories.
Think-Aloud Pre-Test
In a think-aloud pre-test, the respondent narrates their thought process as they work through each question. This provides direct insight into how questions are being interpreted and where confusion or difficulty arises.
Test–Retest Reliability
If feasible, re-administer the questionnaire to the same group of respondents after an appropriate time interval. The interval should be long enough that they do not recall their original answers, but short enough that the underlying information has not changed. This approach assesses the repeatability of the questions. Note: it is only valid if the questionnaire itself has not been modified between administrations.
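Repeatability from a test–retest exercise can be summarised in several ways. The Python sketch below computes simple percent agreement and Cohen's kappa for one yes/no question; the lesson does not name a specific statistic, so both are assumptions chosen here for illustration, and the answers are made-up data.

```python
from collections import Counter


def percent_agreement(first, second):
    """Proportion of respondents who gave the same answer both times."""
    if len(first) != len(second):
        raise ValueError("both rounds must cover the same respondents")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def cohen_kappa(first, second):
    """Agreement beyond chance; undefined if chance agreement equals 1."""
    n = len(first)
    observed = percent_agreement(first, second)
    counts_1, counts_2 = Counter(first), Counter(second)
    # Chance agreement from the marginal frequencies of each category.
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / (n * n)
    return (observed - expected) / (1 - expected)


# Made-up answers (0 = no, 1 = yes) from the same 8 respondents, in order.
first_round = [1, 1, 0, 1, 0, 0, 1, 0]
second_round = [1, 1, 0, 0, 0, 0, 1, 1]

print(percent_agreement(first_round, second_round))  # 0.75
print(cohen_kappa(first_round, second_round))        # 0.5
```

Kappa is lower than percent agreement here because half of the matching answers would be expected by chance alone with these marginal frequencies.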
Validation
Validation assesses whether the questionnaire actually measures what it is intended to measure. Several approaches can be used:
Validation Approaches
Comparison with gold standard: Compare questionnaire responses with directly measured quantities (e.g., comparing dietary intake from a food frequency questionnaire with actual measured intake).
Comparison with established methods: Compare results with those from a well-validated existing instrument.
Repeated administration: When no external standard exists, assess repeatability by administering the questionnaire multiple times to the same respondents.
Method comparison: Evaluate whether different administration methods (e.g., mail vs. telephone) produce comparable results.
Reflection
Find or write a survey question that violates one of the wording rules discussed above (double-barrelled, leading, too complex, or too vague). Explain the problem and rewrite the question to fix it.
Key Takeaways
- Questions should be short (rarely more than 20 words), specific, and free of double-barrelled or leading phrasing.
- Structure questionnaires with an introduction, easy opening questions, logical grouping, and a funnel approach within sections.
- Pre-testing (expert review, field testing, think-aloud, test–retest) is mandatory before deployment.
- Validation compares questionnaire responses against a gold standard, established methods, or through repeated administration.
Knowledge Check
1. What is a “double-barrelled” question?
2. What is the “funnel approach” to questionnaire structure?
3. Why is a test–retest evaluation only valid if the questionnaire has not been modified?
Response Rates & Data Coding
⏱ Estimated reading time: 12 minutes
Learning Objectives
- Identify strategies for maximising questionnaire response rates.
- Explain best practices for coding questionnaire data.
- Describe considerations for computer data entry and management.
Maximising Response Rates
Regardless of the type of questionnaire used, researchers must make efforts to maximise the response rate. Low response rates increase the risk of selection bias — if non-responders differ systematically from responders, the study results may not accurately represent the target population.
A Note on Terminology
The term “response rate” refers to the proportion of study subjects who complete the questionnaire. Despite common usage, this is technically a risk (a proportion), not a rate. However, the term “response rate” is widely used and understood in the research community.
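As a small worked example of the definition above, the "response rate" is simply completed questionnaires divided by the number of subjects approached. The counts below are hypothetical, and `response_proportion` is an illustrative helper name.

```python
def response_proportion(completed, sampled):
    """Completed questionnaires divided by subjects approached."""
    if sampled <= 0:
        raise ValueError("sampled must be positive")
    if not 0 <= completed <= sampled:
        raise ValueError("completed must lie between 0 and sampled")
    return completed / sampled


# Hypothetical counts: 440 questionnaires returned out of 2,000 sent.
print(response_proportion(440, 2000))  # 0.22
```

The result has no time dimension, which is why it is a proportion rather than a true rate.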
Strategies for Higher Response Rates
The following strategies have been shown to improve response rates. They fall into two categories: questionnaire design factors and administrative strategies.
Questionnaire Design Strategies
- Make objectives clear: Explain the purpose of the study clearly to participants so they understand why their participation matters.
- Professional layout: Ensure the questionnaire has a clear structure and professional appearance that signals credibility.
- Pre-test thoroughly: A well-tested questionnaire is easier and less frustrating to complete. Provide an estimate of the time required.
- Minimise length: Keep the questionnaire as short as possible — aim for under 1,000 words. Shorter questionnaires consistently achieve higher response rates.
- Avoid sensitive questions: Unless absolutely necessary, do not ask for sensitive information that might deter participation.
Administrative Strategies
- Follow-up contact: Send reminders and repeat deliveries to non-responders. This is one of the most effective strategies.
- Incentives: Offer small incentives for completion. Research shows that financial incentives help, though larger payments do not necessarily produce proportionally higher response rates.
- Include a pen: Including a pen with a mailed questionnaire has been shown to significantly increase response rates.
- Return envelope: Provide a stamped, first-class return envelope (not business-reply).
- Advance notice: Notify participants in advance that a questionnaire is coming.
- University sponsorship: Mentioning university or institutional affiliation can increase perceived legitimacy.
- Delivery method: Hand delivery or courier can improve response rates over standard mail.
- Personalise: Address the questionnaire and cover letter personally to the respondent.
- Paper and ink colour: Coloured ink (especially blue ink) and coloured paper may positively affect response rates compared to standard black on white.
Scenario: The Low-Response Study
A research team mails a 15-page questionnaire to 2,000 households. The questionnaire uses small print, complex medical terminology, and includes a business-reply envelope. After 6 weeks, only 22% of questionnaires have been returned.
What design and administrative changes would you recommend to improve the response rate?
Data Coding and Editing
Before administering any questionnaire, researchers must plan how responses will be coded and entered into a database. Good coding practices ensure data integrity and make analysis more efficient.
Key Coding Principles
Designate a single, unique value to represent missing data (e.g., -999). This value should not be a legitimate answer to any question. Never leave missing values blank — doing so makes it impossible to distinguish between items the respondent skipped and items the coder missed.
Be consistent throughout the questionnaire. For dichotomous (yes/no) variables, always use the same coding convention (e.g., 0 = no, 1 = yes). This avoids confusion during data entry and analysis.
Responses should be coded directly on the paper questionnaire or data-capture form. Do not combine coding and data entry into a single step — this increases the risk of errors. Use a distinctive ink colour for coding so it is easy to distinguish the coder’s marks from the respondent’s answers.
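The coding principles above can be sketched in Python as follows. The missing-data code -999 and the 0 = no, 1 = yes convention come from the text; the variable names, helper functions, and example record are hypothetical.

```python
MISSING = -999                 # single code reserved for missing data
YES_NO = {"no": 0, "yes": 1}   # one convention for every yes/no item


def code_yes_no(raw):
    """Code a yes/no answer; blank answers get the missing-data code."""
    if raw is None or raw.strip() == "":
        return MISSING
    # Any answer other than yes/no raises KeyError and needs manual review.
    return YES_NO[raw.strip().lower()]


def code_count(raw):
    """Code a non-negative count; blank answers get the missing-data code."""
    if raw is None or raw.strip() == "":
        return MISSING
    value = int(raw)
    if value < 0:
        raise ValueError("a count cannot be negative")
    return value


# Hypothetical raw answers transcribed from one paper questionnaire.
raw_record = {"boils_water": "Yes", "has_cistern": "", "household_size": "5"}

coded = {
    "boils_water": code_yes_no(raw_record["boils_water"]),
    "has_cistern": code_yes_no(raw_record["has_cistern"]),
    "household_size": code_count(raw_record["household_size"]),
}
print(coded)  # {'boils_water': 1, 'has_cistern': -999, 'household_size': 5}
```

Because -999 can never be a legitimate answer to these questions, it can be filtered out reliably before any statistics are computed.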
Data Entry Software
Computer data entry can be accomplished using general-purpose software such as spreadsheets or using specialised database managers. Each approach has advantages:
| Feature | Spreadsheets | Specialised Software |
|---|---|---|
| Ease of Setup | Easy | Moderate |
| Validation Rules | Limited | Extensive |
| Error Prevention | Low | High |
| Data Integrity Risk | High (column sorting can destroy records) | Low |
| Transfer to Statistics | Manual | Built-in |
Spreadsheet Warning
Use spreadsheets with caution for data entry. Sorting individual columns in a spreadsheet can completely destroy the data by separating a respondent’s answers across different rows. Specialised database managers or statistical software with built-in data-entry modules are generally safer choices for larger studies.
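The hazard described above can be reproduced in a few lines of Python, with lists standing in for spreadsheet columns. The respondent IDs and ages are made-up data for illustration.

```python
# Two "columns" of a spreadsheet: respondent IDs and their ages (made up).
ids = [1, 2, 3]
ages = [52, 34, 47]

# Hazard: sorting one column on its own breaks the link to the other.
sorted_ages = sorted(ages)
broken = list(zip(ids, sorted_ages))
print(broken)  # [(1, 34), (2, 47), (3, 52)]  respondent 1 is no longer 52

# Safer: keep each respondent's answers together and sort whole rows.
rows = list(zip(ids, ages))
safe = sorted(rows, key=lambda row: row[1])
print(safe)    # [(2, 34), (3, 47), (1, 52)]  each ID keeps its own age
```

Software that treats each respondent as a single record behaves like the second approach, which is why it is less prone to this kind of damage.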
Reflection
Think about a questionnaire you have completed as a respondent (a customer satisfaction survey, a health form, a course evaluation). What aspects of its design made it easy or difficult to complete? Would you change anything based on what you have learned in this lesson?
Key Takeaways
- Low response rates increase selection bias risk; multiple strategies (follow-up, incentives, shorter questionnaires) can help.
- Always use a unique code for missing values; never leave fields blank.
- Maintain consistent coding conventions throughout the questionnaire.
- Specialised software is safer than spreadsheets for data entry, especially for larger studies.
Knowledge Check
1. Why should missing values never be left blank in a database?
2. Which of the following is the MOST effective strategy for improving response rates?
3. What is the main risk of using spreadsheets for data entry?
Final Review & Assessment
⏱ Estimated time: 20 minutes
Lesson Summary
In this lesson you explored the full lifecycle of questionnaire design in epidemiological research — from initial planning through to data coding. Here is a summary of the key concepts covered:
Section 1: Planning a Questionnaire
Questionnaires are data-collection tools used across epidemiological research. Effective planning requires defining clear objectives, consulting stakeholders and focus groups, and reviewing existing instruments. Qualitative questionnaires explore topics openly, while quantitative questionnaires gather structured, codable data. The method of administration (in-person, telephone, mail, internet) significantly affects response rates and data quality.
Section 2: Designing Questions
Respondents process questions through four cognitive steps: understanding, retrieval, judgement, and communication. Open questions allow free-form responses and are best for exploration. Closed questions (checklist, multiple-choice, rating, ranking) are easier to code and analyse. When capturing numerical data, use continuous values where possible. Rating scales should have at least 5 points, and always include options for non-responses.
Section 3: Wording, Structure & Pre-Testing
Questions should be short, specific, and free from double-barrelled or leading phrasing. Structure questionnaires with an introduction, easy opening questions, and a funnel approach within each section. All questionnaires must be pre-tested through expert review, field testing, and ideally think-aloud and test–retest methods. Validation ensures the questionnaire measures what it intends to measure.
Section 4: Response Rates & Data Coding
Maximise response rates through clear objectives, professional design, follow-up contact, incentives, and appropriate length. For data coding, always use a unique code for missing values, maintain consistent coding conventions, and code directly on the form before data entry. Specialised database software is generally preferable to spreadsheets for data entry.
Final Reflection
Imagine you are planning a questionnaire to assess water quality and diarrheal disease in a rural community with limited internet access and varying literacy levels. Describe your approach: what administration method would you use, what types of questions would you include, how would you structure the questionnaire, and what steps would you take to maximise the response rate and ensure data quality?
Final Assessment
This assessment covers all material from Lesson 3. You must score 100% to complete the lesson. Review the feedback for any incorrect answers and try again.
1. What is the primary distinction between a questionnaire and a survey?
2. The Delphi technique is useful during questionnaire planning because it:
3. Which type of questionnaire is most appropriate during the hypothesis-generation phase of research?
4. What is a major disadvantage of mailed questionnaires?
5. The “retrieval” step in responding to a question refers to:
6. Why is it preferable to capture age as an exact number rather than a range?
7. In multiple-choice questions, the categories should be:
8. What is the minimum recommended number of points on a rating scale to avoid serious information loss?
9. Which of the following is a “leading” question?
10. The “funnel approach” to questionnaire structure involves:
11. In a think-aloud pre-test, the respondent:
12. Which validation approach compares questionnaire responses with directly measured quantities?
13. What is the primary reason low response rates are problematic in research?
14. Why should coding and data entry NOT be combined into a single step?
15. Which best summarises the overall message of this lesson?