Construct Validity: Taking it to the next level

Collecting data is tricky. Data collection tools, like questionnaires, measure research study outcomes more or less well. A tool's level of validity is how comprehensively & accurately the tool measures what it is supposed to measure (like stress, hope, etc.); and its reliability is how consistently it measures what it is supposed to measure. We've all had the experience of a weight-measuring bathroom scale breaking bad and changing our weight each time we step on it. That scale has validity, but no reliability. (See earlier post "On Target all the time and everytime")

Tools are more or less reliable & valid; none are perfect.

Validity is like hitting the right outcome target, and there are four (4) types of validity: 1) Face validity, 2) Content validity, 3) Construct validity, & 4) Criterion-related validity. Earlier posts focused on face & content validity, as linked above. This post focuses on #3: construct validity.

Construct validity is the degree to which a tool accurately measures the concept (construct) it targets, & it can be established by these statistical measures: a) convergent validity, b) discriminant/divergent validity, c) known groups, and d) factor analysis. For each of these, subjects (Ss) complete the measurement tool, & results are analyzed.

To illustrate, let’s assume we have a new pain data collection tool. In convergent validity, the same group of Ss complete the new pain tool and an already established pain tool (like self-report on 1-10 scale). Convergent construct validity exists when there is a positive correlation between the results from both tools. Scores on both tools should be similar for convergent validity.
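
Here is a minimal sketch in Python of what that check could look like. The scores are invented for illustration (not real study data), and the variable names (`new_tool`, `established`) are hypothetical:

```python
# Convergent validity sketch: correlate scores from the SAME subjects
# on the new pain tool and an established 1-10 self-report.
# All numbers are invented for illustration.
import numpy as np
from scipy import stats

new_tool = np.array([2, 7, 5, 9, 3, 6, 8, 4, 2, 7])     # new pain tool scores
established = np.array([3, 8, 5, 9, 2, 6, 7, 4, 2, 8])  # established 1-10 self-report

r, p = stats.pearsonr(new_tool, established)
print(f"r = {r:.2f}, p = {p:.3f}")
# A strong positive r supports convergent construct validity.
```

With ordinal ratings like these, a researcher might prefer Spearman's rho (`stats.spearmanr`) over Pearson's r, but the logic is the same.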

For discriminant (or divergent) validity, a single group of Ss complete the new pain tool and an established tool that measures the "opposite" or a dissimilar concept, such as feeling comfortable. Divergent validity of the new tool is revealed when there is no or low correlation between results from these 2 dissimilar tools. That's a good thing! We should expect a big difference because the tools are measuring very different things. Pain & feeling comfortable should be very different in the same person at the same time for divergent validity.
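
Divergent validity is the same calculation run against the dissimilar tool, except now we want a weak relationship. Again, the numbers below are invented:

```python
# Divergent (discriminant) validity sketch: correlate the new pain tool
# with an established comfort tool completed by the same subjects.
import numpy as np
from scipy import stats

pain = np.array([2, 7, 5, 9, 3, 6, 8, 4, 2, 7])     # new pain tool
comfort = np.array([5, 6, 3, 5, 7, 4, 6, 3, 4, 6])  # established comfort tool

r, p = stats.pearsonr(pain, comfort)
print(f"r = {r:.2f}")
# A low or near-zero r supports divergent validity: the new tool is NOT
# simply measuring the dissimilar concept.
```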

Known-groups validity means that we compare scores from subjects who exhibit & from those who do NOT exhibit what our tool is supposed to measure. For example, a group in pain and a group who are known to be in complete Zen and NOT in pain may fill out the new pain tool. Scores of those in pain should obviously be very different from scores of the "Zenned-out" group who is NOT in pain. If the two groups' average scores are compared (using ANOVA or a t-test), those means should be very different. These differences between groups = known-groups construct validity.
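
A sketch of that known-groups comparison with two invented groups follows; a real study would also check the t-test's assumptions or use a nonparametric alternative:

```python
# Known-groups validity sketch: compare new-pain-tool scores from a group
# known to be in pain and a group known NOT to be in pain.
import numpy as np
from scipy import stats

in_pain = np.array([8, 7, 9, 6, 8, 7, 9, 8])      # e.g., post-operative patients
not_in_pain = np.array([1, 2, 1, 3, 2, 1, 2, 2])  # the "Zenned-out" group

t, p = stats.ttest_ind(in_pain, not_in_pain)
print(f"t = {t:.2f}, p = {p:.5f}")
# A large, statistically significant difference between the group means
# supports known-groups construct validity. With 3+ groups, use ANOVA
# (stats.f_oneway) instead.
```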


Finally, a single group of subjects (Ss) may complete the instrument, and factor analysis is performed on their responses. Factor analysis arranges items into groups of similar items. The researcher examines each group of items (a factor) and labels it. In our fictitious pain tool example, factor analysis may group items into three (3) main factors that the researcher labels as "physical aspects of pain," "psychological aspects of pain," and "disruption of relationships."
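
To give a flavor of the mechanics, here is a hedged sketch using scikit-learn's `FactorAnalysis`. The 9-item responses are simulated so that a three-factor structure is recoverable; a real analysis involves more judgment (how many factors to keep, rotation method, loading cutoffs):

```python
# Factor analysis sketch on simulated responses to a fictitious 9-item pain tool.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(42)
n_subjects = 200

# Simulate 3 latent aspects of pain, each driving 3 of the 9 items (plus noise),
# so the factor structure can be recovered from the responses.
latent = rng.normal(size=(n_subjects, 3))
loadings = np.zeros((3, 9))
for f in range(3):
    loadings[f, f * 3:(f + 1) * 3] = 1.0
responses = latent @ loadings + 0.3 * rng.normal(size=(n_subjects, 9))

fa = FactorAnalysis(n_components=3, rotation="varimax").fit(responses)
for i, factor in enumerate(fa.components_, start=1):
    items = (np.abs(factor) > 0.5).nonzero()[0] + 1  # 1-based item numbers
    print(f"Factor {i}: items {items.tolist()}")
# Each printed cluster of items is a candidate factor for the researcher
# to examine and label, e.g., "physical aspects of pain."
```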

FOR MORE INFO: Check out Highfield, M.E.F. (2025). Select Data Collection Tool. In: Doing Research. Springer, Cham. https://doi.org/10.1007/978-3-031-79044-7_8

CRITICAL THINKING EXERCISE: Read this Google AI overview to test out your new (or renewed) knowledge of construct validity. See any now-familiar ideas?

Pain scale construct validity is established when instruments (e.g., VAS, NRS, FPS-R) accurately measure the theoretical, multi-dimensional concept of pain—intensity, affect, and interference—rather than just a physical sensation. Evidence shows strong convergence between these tools (r=0.82–0.95), confirming they measure similar constructs. 

Convergent Validity: High correlations exist between different, established pain scales (e.g., Numerical Rating Scale (NRS) and Visual Analogue Scale (VAS)), indicating they measure the same construct.

Discriminant Validity: Pain scales show lower, non-significant correlations with unrelated variables (e.g., age, irrelevant behavioral factors), proving they specifically measure pain, not general distress.

Dimensionality: Construct validity in tools like the Brief Pain Inventory (BPI) is confirmed through factor analysis, which differentiates between pain intensity and pain interference.

Face Validity: Judging a book by its cover

“Don’t judge a book by its cover.” That’s good advice about not evaluating persons merely by the way they look to you. I suggest we all take it.

But…when it comes to evaluating data collection tools, things are different. When we ask the question, "Does this questionnaire, interview, or measurement instrument look like it measures what it is supposed to measure?" then we are legitimately judging a book (instrument) by its cover (appearance). We call that judgment face validity. In other words, the tool appears to us on its face to measure what it is designed to measure.

For example, items on the well-established Beck Depression Inventory (BDI) cover a range of symptoms, such as sadness, pessimism, feelings of failure, loss of pleasure, guilt, crying, and so on. If you read all BDI items, you could reasonably conclude just by looking at them that those items do indeed measure depression. That judgment is made without the benefit of statistics, and thus you are judging that book (the BDI) by its cover (how it appears to you). That is face validity.

Face validity is only one of four types of data collection tool validity.

In research, tool validity is defined as how well a research tool measures what it is designed to measure. The four broad types of validity are: a) face, b) content, c) construct, and d) criterion-related validity. And make no mistake, face validity is the weakest of the four. Nonetheless, it makes a good starting point. Just don't stop there; you will need one or more of its three statistical validity cousins (content, construct, and criterion-related) to have a strong data collection tool.

And…referring back to the BDI example…the BDI probably looks valid because it has been verified as valid by other types of validity.

Thoughts about why we need face validity at all?

New book: “Doing Research: A Practical Guide”

Author: Martha “Marty” E. Farrar Highfield

NOW AVAILABLE ELECTRONICALLY & SOON IN PRINT.

CHECK OUT: https://link.springer.com/book/10.1007/978-3-031-79044-7

This book provides a step-by-step summary of how to do clinical research. It explains what research is and isn't, where to begin and end, and the meaning of key terms. A project planning worksheet is included and can be used to develop a research protocol as readers work their way through the book. The purpose of this book is to empower curious clinicians who want data-based answers.

Doing Research is a concise, user-friendly guide to conducting research, rather than a comprehensive research text. The book contains 12 main chapters followed by the protocol worksheet. Chapter 1 offers a dozen tips to get started, Chapter 2 defines research, and Chapters 3-9 focus on planning. Chapters 10-12 then guide readers through challenges of conducting a study, getting answers from the data, and disseminating results. Useful key points, tips, and alerts are strewn throughout the book to advise and encourage readers.