“Measure twice. Cut once!” goes the old carpenter adage. Why? Because measuring accurately means you’ll get the outcomes you want!
Same in research. A consistent and accurate measurement will get you the outcomes you want to know. Whether an instrument measures something consistently is called reliability. Whether it measures accurately is called validity. So, before you use a tool, check for its reported reliability and validity.
Psychological researchers do not simply assume that their measures work. Instead, they conduct research to show that they work. If they cannot show that they work, they stop using them.
There are two distinct criteria by which researchers evaluate their measures: reliability and validity. Reliability is consistency across time (test-retest reliability), across items (internal consistency), and across researchers (interrater reliability). Validity is the extent to which the scores actually represent the variable they are intended to.
Validity is a judgment based on various types of evidence. The relevant evidence includes the measure’s reliability, whether it covers the construct of interest, and whether the scores it produces are correlated with other variables they are expected to be correlated with and not correlated with variables that are conceptually distinct.
The reliability and validity of a measure is not established by any single study but by the pattern of results across multiple studies. The assessment of reliability and validity is an ongoing process.
A pilot study is to research what a trial balloon is to politics.
In politics, a trial balloon is communicating a law or policy idea via media to see how the intended audience reacts to it. A trial balloon does notanswer the question, “Would this policy (or law) work?” Instead a trial balloon answers questions like “Which people hate the idea of the policy/law–even if it would work?” or “What problems might enacting it create?” In other words, a trial balloon answers questions that a politician wants to know BEFORE implementing a policy so that the policy or law can be tweaked to be successfully put in place.
In research, a pilot study is sort of like a trial balloon. It is “a small-scale test of the methods and procedures” of a planned full-scale study (Porta, Dictionary of Epidemiology, 5th edition, 2008). A pilot study answers questions that we want to know BEFORE doing a larger study, so that we can tweak the study plan and have a successful full-scale research project. A pilot study does NOT answer research questions or hypotheses,such as “Does this intervention work?” Insteada pilot study answers the question “Are these research procedures workable?”
A pilot study asks & answers: “Can I recruit my target population? Can the treatments be delivered per protocol? Are study conditions acceptable to participants?” and so on. A pilot study should have specific measurable benchmarks for feasibility testing. For example if the pilot is finding out whether subjects will adhere to the study, then adherence might be defined as “70 percent of participants in each [group] will attend at least 8 of 12 scheduled group sessions.” Sample size is based on practical criteria such as budget, participant flow, and the number needed to answer feasibility questions (ie. questions about whether the study is workable).
A pilot study does NOT: Test hypotheses (even preliminarily); Use inferential statistics; Assess safety of a treatment; Estimate effect size; Demonstrate safety of an intervention.