Category Archives: Reliability & validity

Testing the Test (or an intro to “Does the measurement measure up?”)

When reading a research article, you may be tempted to read only the Introduction & Background, then go straight to the Discussion, Implications, and Conclusions at the end. You skip all those pesky procedures, numbers, and p levels in the Methods & Results sections.

Perhaps you are intimidated by all those “research-y” words like content validity, construct validity, test-retest reliability, and Cronbach’s alpha because they just aren’t part of your vocabulary….YET!

WHY should you care about those terms, you ask? Well…let’s start with an example. If your bathroom scale measured your weight erratically each a.m., you probably would toss it and find a more reliable and valid bathroom scale. The data from that old scale would be useless in learning how much you weighed. Similarly in research, the researcher wants useful outcome data. To get that quality data, the researcher must collect it with a measurement instrument that measures consistently (reliability) and measures what it claims to measure (validity). A good research instrument is reliable and valid. So is a good bathroom scale.

Let’s start super-basic: Researchers collect data to answer their research question using an instrument. That test or tool might be a written questionnaire, interview questions, an EKG machine, an observation checklist, or something else. Whatever instrument the researcher uses, it needs to give them accurate data.

For example, if I want to collect BP data to find out how a new med is working, I need a BP cuff that measures systolic and diastolic BP without a lot of artifacts or interference. That accuracy in measuring BP (and only BP) is called instrument validity. Then, if I take your BP 3 times in a row, I should get basically the same answer; that consistency is called instrument reliability. I must also use the cuff as intended–correct cuff size and placement–in order to get quality data that reflect the subject’s actual BP.
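To make that concrete, here is a minimal sketch (in Python) of checking consistency across repeated readings. The readings and the “acceptable spread” cutoff are invented for illustration, not taken from any clinical standard:

```python
import statistics

# Three hypothetical systolic BP readings (mmHg) taken in a row
# on the same subject with the same cuff.
readings = [118, 121, 119]

mean_bp = statistics.mean(readings)
spread = statistics.stdev(readings)  # sample standard deviation

print(f"Mean systolic BP: {mean_bp:.1f} mmHg")
print(f"Spread across repeats: {spread:.1f} mmHg")

# A small spread relative to the mean suggests the cuff measures
# consistently (reliability). The 5 mmHg cutoff is illustrative only.
if spread < 5:
    print("Readings are consistent -- the cuff looks reliable.")
else:
    print("Readings vary widely -- question the cuff's reliability.")
```

Note that a tight spread speaks only to reliability; a cuff that reads 20 mmHg too high every single time would pass this check while still lacking validity.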

The same thing is true with questionnaires or other measurement tools. A researcher must use an instrument for the intended purpose and in the correct way. For example, a good stress scale should give me accurate data about a person’s stress level (not their pain, depression, or anxiety)–in other words, it should have instrument validity. It should measure stress without a lot of artifacts or interference from other states of mind.

NO instrument is 100% valid–it’s a matter of degree. To the extent that a stress scale measures stress, it is valid. To the extent that it also measures other things besides stress–and it will–it is less valid. So the question to ask is not “Is the instrument valid?” but “How valid is the instrument?” Validity is often reported on a 0 to 1 scale, with 1 being unachievable perfection. The same issue and question apply to reliability.
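Cronbach’s alpha, one of those “research-y” words from earlier, is exactly this kind of 0-to-1 coefficient: it estimates how consistently a questionnaire’s items measure the same underlying thing (internal consistency). Here is a minimal sketch of the standard formula, using made-up responses to a hypothetical 3-item stress scale:

```python
# Cronbach's alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
# The 4 respondents x 3 items below are invented for illustration.
scores = [
    [4, 5, 4],  # respondent 1's answers to items 1-3
    [2, 2, 3],  # respondent 2
    [5, 4, 5],  # respondent 3
    [3, 3, 2],  # respondent 4
]

def variance(xs):
    """Sample variance (n - 1 in the denominator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

k = len(scores[0])                                  # number of items
item_vars = [variance([row[i] for row in scores]) for i in range(k)]
total_var = variance([sum(row) for row in scores])  # variance of total scores

alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")  # ~0.89 here; closer to 1 = more consistent
```

A commonly cited rule of thumb is that an alpha of roughly 0.70 or higher is acceptable for an established scale, though what counts as “good enough” depends on the stakes of the measurement.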

Reliability & validity are interdependent. An instrument that yields inconsistent results under the same circumstances cannot be valid (accurate). Conversely, an instrument can consistently (reliably) measure the wrong thing–that is, something other than what the researcher intended to measure. Research instruments need both strong reliability AND validity to be most useful; they need to measure the outcome variable of interest accurately and consistently.

Valid for a specific purpose: Researchers must also use measurement instruments as intended. First, instruments are often validated for use with a particular population; they may not be valid for measuring the same variable in other populations. For example, different cultures, genders, professions, and ages may respond differently to the same question. Second, instruments may be valid in predicting certain outcomes (e.g., SAT & ACT have higher validity in predicting NCLEX success than does GPA). As Sullivan (2011) wrote: “Determining validity can be viewed as constructing an evidence-based argument regarding how well a tool measures what it is supposed to do. Evidence can be assembled to support, or not support, a specific use of the assessment tool.”

In summary….

  1. Instrument validity = how accurate the tool is in measuring a particular variable
  2. Instrument reliability = how consistently the tool measures whatever it measures

Fun Practice: In your own words, how does the following article excerpt relate to the concept of validity? “To assess content validity [of the Moral Distress Scale], 10 nurses were asked to provide comments on grammar, use of appropriate words, proper placement of phrases, and appropriate scoring” (p. 3). From Ghafouri et al. (2021). Psychometrics of the moral distress scale in Iranian mental health nurses. BMC Nursing. https://doi.org/10.1186/s12912-021-00674-4

On Target all the time and every time!

“Measure twice. Cut once!” goes the old carpenter’s adage. Why? Because measuring accurately means you’ll get the outcomes you want!

Same in research. Consistent and accurate measurement will get you the outcome data you want. Whether an instrument measures something consistently is called reliability. Whether it measures accurately is called validity. So, before you use a tool, check its reported reliability and validity.

A good resource for understanding the concepts of reliability (consistency) and validity (accuracy) of research tools is at https://opentextbc.ca/researchmethods/chapter/reliability-and-validity-of-measurement/. Below are quoted Key Takeaways, followed by a short worked example:

  • Psychological researchers do not simply assume that their measures work. Instead, they conduct research to show that they work. If they cannot show that they work, they stop using them.
  • There are two distinct criteria by which researchers evaluate their measures: reliability and validity. Reliability is consistency across time (test-retest reliability), across items (internal consistency), and across researchers (interrater reliability). Validity is the extent to which the scores actually represent the variable they are intended to.
  • Validity is a judgment based on various types of evidence. The relevant evidence includes the measure’s reliability, whether it covers the construct of interest, and whether the scores it produces are correlated with other variables they are expected to be correlated with and not correlated with variables that are conceptually distinct.
  • The reliability and validity of a measure is not established by any single study but by the pattern of results across multiple studies. The assessment of reliability and validity is an ongoing process.
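To see what “consistency across time” (test-retest reliability) looks like in numbers, here is a minimal sketch: the same participants complete the same scale twice, and the correlation between the two sets of scores serves as the reliability estimate. The scores are invented for illustration:

```python
import statistics  # statistics.correlation requires Python 3.10+

# Hypothetical total scores for 5 participants on the same scale,
# administered two weeks apart.
time1 = [10, 14, 8, 12, 16]
time2 = [11, 13, 9, 12, 15]

# Pearson's r between the two administrations estimates test-retest
# reliability (near 0 = no consistency, near 1 = highly consistent).
r = statistics.correlation(time1, time2)
print(f"Test-retest reliability (Pearson's r) = {r:.2f}")  # ~0.99 here
```

Interrater reliability can be estimated the same way, by correlating scores from two raters instead of two time points (more refined statistics such as Cohen’s kappa exist for categorical ratings).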

Research Words of the Week: Reliability & Validity

Reliability & validity are terms that refer to the consistency and accuracy of a quantitative measurement instrument–a questionnaire, a technical device, a ruler, or any other measuring device. Together they mean that the outcome measure can be trusted and is relatively error free.

  • Reliability – This means that the instrument measures CONSISTENTLY.
  • Validity – This means that the instrument measures ACCURATELY. In other words, it measures what it is supposed to measure and not something else.

For example: If your bathroom scale measures weight, then it is a valid measure of weight (e.g., it doesn’t measure BP or stress). You might say it has high validity. If your bathroom scale shows the same weight each time you step on and off of it several times, then it is measuring weight reliably, or consistently, and you might say it has high reliability.

“Please answer!” – How to increase the odds in your favor when it comes to questionnaires

Self-report by participants is one of the most common ways that researchers collect data, yet it is fraught with problems. Some worries for researchers are: “Will participants be honest or will they say what they think I want to hear?” “Will they understand the questions correctly?” “Will those who respond (as opposed to those who don’t respond) have unique ways of thinking so that my respondents do not represent everyone well?” and a BIG worry “Will they even fill out and return the questionnaire?”

One way to solve at least the latter 2 problems is to increase the response rate, and Edwards et al. (2009) reviewed randomized trials to learn how to do just that!

If you want to improve your questionnaire response rates, check it out!  Here is Edwards et al.’s plain language summary as published in Cochrane Database of Systematic Reviews, where you can read the entire report.

Methods to increase response to postal and electronic questionnaires

Postal and electronic questionnaires are a relatively inexpensive way to collect information from people for research purposes. If people do not reply (so called ‘non-responders’), the research results will tend to be less accurate. This systematic review found several ways to increase response. People can be contacted before they are sent a postal questionnaire. Postal questionnaires can be sent by first class post or recorded delivery, and a stamped-return envelope can be provided. Questionnaires, letters and e-mails can be made more personal, and preferably kept short. Incentives can be offered, for example, a small amount of money with a postal questionnaire. One or more reminders can be sent with a copy of the questionnaire to people who do not reply.


Critical/reflective thinking: Imagine that you were asked to participate in a survey. Which of these strategies do you think would motivate or remind you to respond, and why?

For more info read the full report: Methods to increase response to postal and electronic questionnaires


Consistency wins! High reliability = Zero harm

“What’s important is not where an organization begins its patient safety journey, but instead the degree to which it exhibits a relentless commitment to improvement.” – TJC, 2016, p. 68

The path to zero harm, according to TJC, begins with high reliability. Reliability in research = consistency. TJC says that to reach zero harm, we as providers must be consistent in these ways:

  • Never be satisfied with your safety record. Always be alert for danger.
  • Be alert for early signs of potential danger. Don’t oversimplify your observations.
  • Notice small changes in the organization; they may have longer-range or unintended effects.
  • Commit to resilience so that when errors do happen, you bounce back quickly.
  • When confronted by a threat, put its resolution in the hands of those with the most expertise in that area.

Using evidence in practice can be part of our “relentless commitment to improvement,” especially when coupled with the 5 actions above, and it can support zero harm to patients. That evidence can come from research, from process improvement, from evaluation of clinical innovations, or from experts.

For more, read TJC’s High Reliability: The Path to Zero Harm online at http://www.jointcommission.org/assets/1/18/HC_Exec_article.pdf

