Construct Validity: Taking it to the next level

Collecting data is tricky. Data collection tools, like questionnaires, measure research study outcomes more or less well. A tool's validity is how comprehensively & accurately it measures what it is supposed to measure (like stress, hope, etc.), and its reliability is how consistently it measures what it is supposed to measure. We've all had the experience of a weight-measuring bathroom scale breaking bad and changing our weight each time we step on it. That scale has validity, but no reliability. (See earlier post "On Target all the time and every time.")

Tools are more or less reliable & valid; none are perfect.

Validity is like hitting the right outcome target, and there are four (4) types of validity: 1) Face validity, 2) Content validity, 3) Construct validity, & 4) Criterion-related validity. Earlier posts focused on face & content validity as linked above. This blog focuses on #3: construct validity.

Construct validity is the degree to which a tool accurately measures the underlying concept (construct) it claims to measure, and it can be established by these statistical approaches: a) convergent validity, b) discriminant/divergent validity, c) known groups, and d) factor analysis. For each of these, subjects (Ss) complete the measurement tool, & results are analyzed.

To illustrate, let's assume we have a new pain data collection tool. In convergent validity, the same group of Ss complete the new pain tool and an already established pain tool (like self-report on a 1-10 scale). Convergent construct validity exists when there is a strong positive correlation between the results from the two tools: scores on both should be similar.

For discriminant (or divergent) validity, a single group of Ss complete the new pain tool and an established tool that measures the "opposite" or a dissimilar concept, such as feeling comfortable. Divergent validity of the new tool is revealed when there is no or low correlation between results from these 2 dissimilar tools. That's a good thing! We should expect a big difference because the tools are measuring very different things: pain & feeling comfortable should look very different in the same person at the same time.
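To make the two correlation checks above concrete, here is a minimal sketch (not from the original post) using Python and invented scores; the subject numbers, score values, and tool names are hypothetical, and scipy's pearsonr supplies the correlation.

```python
# Hypothetical illustration of convergent & divergent (discriminant) validity.
# All scores are invented for demonstration only.
from scipy.stats import pearsonr

# The same 8 subjects complete three tools:
new_pain_tool    = [2, 5, 7, 3, 8, 6, 4, 9]   # new pain instrument
established_pain = [3, 5, 8, 2, 9, 6, 4, 9]   # established 1-10 self-report pain scale
comfort_tool     = [7, 6, 2, 8, 1, 4, 6, 2]   # established "feeling comfortable" tool

r_convergent, _ = pearsonr(new_pain_tool, established_pain)
r_divergent, _  = pearsonr(new_pain_tool, comfort_tool)

print(f"Convergent validity check: r = {r_convergent:.2f}")  # expect a strong positive r
print(f"Divergent validity check:  r = {r_divergent:.2f}")   # expect a low (or negative) r
```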

Known groups validity means that we compare scores from subjects who exhibit what our tool is supposed to measure & from those who do NOT exhibit it. For example, a group in pain and a group known to be in complete Zen and NOT in pain may fill out the new pain tool. Scores of those in pain should obviously be very different from scores of the "Zenned-out" group who are NOT in pain. When the two groups' average scores are compared (using a t-test or ANOVA), those means should be very different. That difference between groups = known-groups construct validity.
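Similarly, here is a minimal sketch of the known-groups comparison just described; the group sizes and scores are invented, and scipy's independent-samples t-test compares the group means.

```python
# Hypothetical known-groups comparison; all scores are invented.
from scipy.stats import ttest_ind

pain_group = [8, 7, 9, 6, 8, 7]   # new pain tool scores from subjects known to be in pain
zen_group  = [1, 2, 1, 3, 2, 1]   # scores from subjects known NOT to be in pain

t_stat, p_value = ttest_ind(pain_group, zen_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A large, statistically significant difference between the group means
# supports known-groups construct validity.
```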


Finally, a single group of subjects (Ss) may complete the instrument, and a statistical factor analysis is performed. Factor analysis arranges items into groups of similar items. The researcher examines each group of items (a factor) and labels it. In our fictitious pain tool example, factor analysis may group items into three (3) main factors that the researcher labels as "physical aspects of pain," "psychological aspects of pain," and "disruption of relationships."
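For readers who want to see the mechanics, here is a purely illustrative sketch of a factor analysis using scikit-learn on simulated responses; the 9-item tool, the 1-5 response format, and the choice of three factors are assumptions, not the post's actual instrument.

```python
# Hypothetical factor analysis of a 9-item pain tool; responses are simulated.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(200, 9))  # 200 subjects x 9 Likert items (1-5)

fa = FactorAnalysis(n_components=3)  # assume three underlying factors
fa.fit(responses)

# Each row of the loadings matrix is one factor; items with large loadings
# "hang together," and the researcher inspects and labels each factor
# (e.g., "physical aspects of pain").
print(np.round(fa.components_, 2))
```

In real tool development the number of factors is not simply assumed; it is suggested by the data (e.g., eigenvalues or a scree plot) and then interpreted and labeled by the researcher.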

FOR MORE INFO: Check out Highfield, M.E.F. (2025). Select Data Collection Tool. In: Doing Research. Springer, Cham. https://doi.org/10.1007/978-3-031-79044-7_8

CRITICAL THINKING EXERCISE: Read this Google AI overview to test out your renewed/new knowledge of construct validity. See any now-familiar ideas?

Pain scale construct validity is established when instruments (e.g., VAS, NRS, FPS-R) accurately measure the theoretical, multi-dimensional concept of pain—intensity, affect, and interference—rather than just a physical sensation. Evidence shows strong convergence between these tools (r=0.82–0.95), confirming they measure similar constructs. 

Convergent Validity: High correlations exist between different, established pain scales (e.g., Numerical Rating Scale (NRS) and Visual Analogue Scale (VAS)), indicating they measure the same construct.

Discriminant Validity: Pain scales show lower, non-significant correlations with unrelated variables (e.g., age, irrelevant behavioral factors), proving they specifically measure pain, not general distress.

Dimensionality: Construct validity in tools like the Brief Pain Inventory (BPI) is confirmed through factor analysis, which differentiates between pain intensity and pain interference.

“Here comes Santa Claus”: What’s the evidence?

Dec 3, 2025: It’s time once again to examine the evidence. How will you apply it in your Christmas practice?

FULL TEXT ONLINE: Adv Emerg Nurs J. 2011 Oct-Dec;33(4):354-8. doi: 10.1097/TME.0b013e318234ead3. [note: Below is a full-text excerpt from AENJ; a summary was published in the DYIS blog 16 Dec 2016]

Abstract

The purpose of this article is to examine the strength of evidence regarding our holiday Santa Claus (SC) practices and the opportunities for new descriptive, correlation, or experimental research on SC. Although existing evidence generally supports SC, in the end we may conclude, “the most real things in the world are those that neither children nor men can see” (Church, as cited in Newseum, n.d.).

ARE HOLIDAY Santa Claus (SC) activities evidence based? This is a priority issue for those of us who’ve been nice, not naughty. In this article, I review the strength of current evidence supporting the existence of SC, discuss various applications of that evidence, and suggest new avenues of investigation.

[continue reading at 10.1097/TME.0b013e318234ead3]

Be Kind to Editors and Writers Month

I love this!! To all you readers, writers, editors, and wanna-be readers, wanna-be writers, and wanna-be editors: Happy September 2025!

Here’s a good-reading gift for you all about kindness: Editor & wordsmith Lillie Ammann’s blog.

If you have published letters to the editor, articles, abstracts, posters, or books, let us know in the comments. If you have authored one of these, make sure you put it up on Digital Commons for global access or ResearchGate. And check out others’ work there.

Happy writing & editing! -Dr.H/Marty

Zotero!!

No. Zotero is not a shout like "Cowabunga!" Nor is Zotero the little brother of Zorro, the masked cowboy hero.

ZoteroBib is a free, online tool that quickly formats references into the right style–whether that style be APA, Chicago, or 10,000+ other available formats. Once it generates the formatted reference you can manually correct any mistakes.

To use ZoteroBib go to zbib.org, enter reference info, and select your desired format. You can build an entire bibliography, copy it (or download an editable RTF version), and paste it into your own paper. ZoteroBib also facilitates insertion of footnotes in the text, including specific reference page #'s for quotes.

You need to enter only your reference's DOI, URL, or PMID to generate the whole, properly formatted reference. Or you can enter more article/source information.

Here’s a sparkling water toast to well-formatted and well-footnoted papers!

[Special thanks to Librarian Marcia Henry at CSU/Northridge who made me aware of this tool.]

Content Validity: Expert Judgment Required

For accurate study data, you need a tool that correctly & comprehensively measures the outcome of interest (concept). If a tool measures your outcome of interest accurately it has strong validity. If it measures that outcome consistently, it has high reliability.

For now, let’s focus on validity.

Again, validity is how well a research tool measures what it is intended to measure. 

The four (4) types of validity are 1) face, 2) content, 3) construct, & 4) criterion-related. Click here to read my blog on face validity–the weakest type. Now, let’s step it up a notch to content validity.

Content validity is the comprehensiveness of a data collection survey tool. In other words, does the instrument include items that measure all aspects of the thing (concept) you are studying–whether that thing be professional quality of life, drug toxicity, spiritual health, pain, or something else?

When you find a tool that you want to use, look for documented content validity. Content validity means that the tool creators:

  • 1) adopted a specific definition of the concept they want to measure,
  • 2) generated a list of all possible items from a review of literature and/or other sources,
  • 3) gave both their definition and item list to 3-5+ experts on the topic, &
  • 4) asked those experts independently to rate how well each item represents the adopted concept definition (or not). Often experts are asked to evaluate item clarity as well.

When a majority of the expert panel agrees that an item matches the definition, then that item becomes part of the new tool. Items without agreement are tossed. Experts may also edit items or add items to the list, and the tool creator may choose to submit edited and new items to the whole expert panel for evaluation.

Optionally, tool creators may statistically calculate a content validity index (CVI) for items and/or for the tool as a whole, but content validity is still based on experts' judgment. Some tool authors are just more comfortable with having a number to represent that judgment. An acceptable CVI is ≥ 0.78; the "≥" means "greater than or equal to." (Click here for more on item & scale CVIs.)
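As a worked illustration of the CVI arithmetic (a sketch, not part of the original post): the item-level CVI (I-CVI) is typically the proportion of experts rating an item as relevant (e.g., 3 or 4 on a 4-point relevance scale), and a scale-level CVI can be computed by averaging the I-CVIs. The expert ratings below are invented.

```python
# Hypothetical CVI calculation; expert relevance ratings (1-4) are invented.
ratings = {                      # item -> ratings from a 5-expert panel
    "item_1": [4, 4, 3, 4, 4],
    "item_2": [4, 3, 4, 2, 4],
    "item_3": [2, 3, 1, 2, 3],
}

item_cvi = {item: sum(r >= 3 for r in rates) / len(rates)
            for item, rates in ratings.items()}
scale_cvi = sum(item_cvi.values()) / len(item_cvi)   # S-CVI/Ave

for item, cvi in item_cvi.items():
    print(f"{item}: I-CVI = {cvi:.2f} ({'keep' if cvi >= 0.78 else 'revise or drop'})")
print(f"S-CVI/Ave = {scale_cvi:.2f}")
```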

When reading a research article, you might see content validity reported for the tool. Here's an example: "Content…validity of the nurse and patient [Spiritual Health] Inventories…[was] based on literature review [and] expert panel input….Using a religious-existential needs framework, 59 items for the nurse SHI were identified from the literature with the assistance of a panel of theology and psychology experts…. Parallel patient items were developed, and a series of testing and revisions was completed resulting in two 31-item tools" (p. 4, Highfield, 1992).

For more, check out this quick explanation of content validity: 3-minute YouTube video. If you are trying to establish content validity for your own new tool, consult a mentor and a research text like Polit & Beck's Nursing Research: Generating and Assessing Evidence for Nursing Practice.

Critical thinking: What is the difference between face and content validity? How are they alike? (Hint: check out the video.) What other questions do you have?

Face Validity: Judging a book by its cover

“Don’t judge a book by its cover.” That’s good advice about not evaluating persons merely by the way they look to you. I suggest we all take it.

But…when it comes to evaluating data collection tools, things are different. When we ask the question, "Does this questionnaire, interview, or measurement instrument look like it measures what it is supposed to measure?" then we are legitimately judging a book (instrument) by its cover (appearance). We call that judgment face validity. In other words, the tool appears to us on its face to measure what it is designed to measure.

For example, items on the well-established Beck Depression Inventory (BDI) cover a range of symptoms, such as sadness, pessimism, feelings of failure, loss of pleasure, guilt, crying, and so on. If you read all BDI items, you could reasonably conclude just by looking at them that those items do indeed measure depression. That judgment is made without the benefit of statistics, and thus you are judging that book (the BDI) by its cover (how it appears to you). That is face validity.

Face validity is only one of four types of data collection tool validity.

In research, tool validity is defined as how well a research tool measures what it is designed to measure. The four broad types of validity are: a) face, b) content, c) construct, and d) criterion-related validity. And make no mistake, face validity is the weakest of the four. Nonetheless, it makes a good starting point. Just don't stop there; you will need one or more of its three validity cousins–content, construct, and criterion-related–to have a strong data collection tool.

And, referring back to the BDI example, the BDI probably looks valid because it has been verified as valid by the other types of validity.

Thots about why we need face validity at all?

Essentials for Clinical Researchers

[note: bonus 20% book discount from publisher. See flyer below.]

My 2025 book, Doing Research, is a user-friendly guide, not a comprehensive text. Chapter 1 gives a dozen tips to get started, Chapter 2 defines research, and Chapters 3-9 focus on planning. The remaining Chapters 10-12 guide you through challenges of conducting a study, getting answers from the data, and sharing with others what you learned. Italicized key terms are defined in the glossary, and a bibliography lists additional resources.

Five (5) great AI tools for research: Using without hallucinating

AI is getting better at 1) organizing information & 2) making suggestions for planning and writing research.

1st—a word of warning: Always verify AI-generated content USING YOUR OWN KNOWLEDGE!! Otherwise you’ll likely have AI hallucinations–content that is wrong, deceptive, or just plain nonsense. Scary!

Marek Kiczkowiak (speaker in the video below) gives the AI-research-assistant gold medal to SCISPACE. SCISPACE bills itself as "The Fastest Research Platform Ever: All-in-one AI tools for students and researchers." It performs a host of tasks, including creating slides from your paper. Other AI tools, like jenni or ResearchRabbit, do some things better or differently. Watch this informative video, & try the tools.

What ethics questions does this raise? Two are: 1) questions of plagiarism (stealing) and 2) questions of how much YOU are learning when being AI-assisted.

Publishers are beginning to ask authors to what extent (if any) AI was used in a submitted paper. Moreover, caution about plagiarizing is a cheap price for a clean conscience & learning what you need to learn. Hang onto those outcomes. "Above all else, guard your heart, for everything you do flows from it" -Proverbs 4:23.

Here’s a second video for some help on avoiding plagiarism.

Your thots?

———————————————————–

Also, check out my 2025 book Doing Research (~100 pp), written to help make the difficult simple.

[Best place to purchase now is this link: Springer. Amazon is stocking it erratically for reasons mysterious to the publisher.]

Hourly History: A Quick-Read Nursing & Health History Resource

Do you love historical research and wish you could read more?

Check out the 1-hour reads at Hourly History. A short biography of Florence Nightingale is among their reads. You can also sign up for free e-reads.

And remember…nursing & health history never happened (or happens) in a vacuum. Understanding the larger political, medical, geographical, ideological, musical, artistic, philosophical, religious, and cultural history (milieu) of any era is important, and these books can give quick insights. For example, knowing about Crimean or British leadership during Nightingale's time will help you understand her and her contributions.

Check out one or more of these books, and let me know what you think. -DrH

Disclaimer: I have no financial or other interest in the Hourly History site or the books it promotes; and I cannot speak to their historical quality.

Theoretically speaking…is this all “pie-in-the-sky” stuff?

Is using theory and conceptual frameworks in studies just “pie-in-the-sky” stuff? Do they have any practical use? Or are they merely for academics in ivory towers?

This blog about theory-testing research1 may affect your answers.

What is it? At its most basic, a theory or framework is a set of statements that describes part of reality. Those related statements (called propositions) outline the relationships between two or more ideas (called concepts). One example of a set of propositions is: "Work stress leads to burnout; burnout leads to poor work outcomes; mindfulness practice leads to lower burnout and thus to better work outcomes." These statements describe the relationships between the concepts of "work stress," "burnout," "poor work outcomes," and "mindfulness practice."

Each concept has 1) an abstract, dictionary-type conceptual definition & 2) a concrete, measurable operational definition. For example, Maslach conceptually defined burnout as a combination of emotional exhaustion, depersonalization, and lower personal accomplishment; burnout is then operationally defined as a self-reported score on the Maslach Burnout Inventory (MBI).
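To make "operational definition" concrete, here is a minimal, hypothetical sketch: the item numbers, subscale groupings, and scoring below are invented for illustration and do NOT reproduce the actual (proprietary) MBI scoring.

```python
# Hypothetical operationalization of "burnout" as subscale scores from survey items.
# Item groupings and scoring are invented; they are NOT the real MBI.
subscale_items = {
    "emotional_exhaustion":    [0, 1, 2],   # indexes of hypothetical items
    "depersonalization":       [3, 4],
    "personal_accomplishment": [5, 6, 7],
}

def operationalize_burnout(item_responses):
    """Turn one subject's item responses into subscale scores (the operational definition)."""
    return {name: sum(item_responses[i] for i in idx)
            for name, idx in subscale_items.items()}

print(operationalize_burnout([5, 6, 4, 3, 2, 1, 2, 0]))
```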

Some theories are named for their authors–like Einstein's theory of relativity, expressed in a single proposition (E = mc²) about the relationship between the concepts of energy, mass, & the speed of light. Einstein's theory, like the propositions of other theories/frameworks, describes our existing knowledge about a topic based on evidence and logical connections.

To connect your study with such existing knowledge, take these steps:

1) Identify a theory/framework that conceptually & operationally defines your concept of interest and states its relationship to other concepts. Start by looking in the library for articles on your topic.

2) Accept most of the theory/framework’s propositions as true without testing them yourself (called assumptions). All studies assume a lot to be true already–meaning they have a lot of assumptions. It’s the way science works because you can’t test everything at once.

3) Identify a proposition that you want to test, and write it in testable form as a hypothesis or research question. You will be testing only a tiny piece of the theory/framework, perhaps by examining the concepts in a new setting, with new methods, or in a different or larger sample. For example, you might want to test an intervention to see if it reduces burnout (e.g., Hypothesis: "ICU staff using a mindfulness phone app will report lower burnout than those who do not use the app."). A sketch of how such a hypothesis might be analyzed follows this list.

4) When your study is complete, discuss how your findings confirm or disconfirm the theory/framework. Your logic and research are now a part of what we know (or think we know).
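For step 3 above, here is the promised minimal sketch of how the example hypothesis might be analyzed; the burnout scores are invented, and an independent-samples t-test stands in for whatever analysis your actual design requires.

```python
# Hypothetical test of the example hypothesis; burnout scores are invented.
from scipy.stats import ttest_ind

app_group    = [2.1, 2.8, 1.9, 2.5, 3.0, 2.2]   # burnout scores, ICU staff using the app
no_app_group = [3.4, 3.9, 2.9, 3.6, 4.1, 3.2]   # burnout scores, staff not using the app

t_stat, p_value = ttest_ind(app_group, no_app_group)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A significantly lower mean for app users would support (not "prove")
# the theory's proposition about mindfulness and burnout.
```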

Conclusion: Of course there’s much more that could be said on this topic. Let me know what to add in the comments. -Dr.H

Questions for thot:

So, do you think theory/conceptual frameworks are just “pie in the sky” without practical value? If so, how would you build a study on existing knowledge? If you think they ARE practical, how would you use them to study your topic of interest? Explain how you have or have not used propositions in a study.

  1. Theory-building research is a different inductive path. Theory-testing is more deductive. ↩︎

Making research accessible to RNs