Criterion Validity: Definition & Examples

Key Takeaways

  • Criterion validity (or criterion-related validity) examines how well a measurement tool corresponds to other established and valid measures of the same concept.
  • It includes concurrent validity (existing criteria) and predictive validity (future outcomes).
  • Criterion validity is important because, without it, we cannot be confident that a test measures a construct in a way consistent with other validated instruments.

Criterion validity assesses how well a test predicts or relates to a specific outcome or criterion. It includes concurrent validity (correlation with existing measures) and predictive validity (predicting future outcomes).

This approach emphasizes practical applications and focuses on demonstrating that the test scores are useful for predicting or estimating a particular outcome.

For example, when measuring depression with a self-report inventory, a researcher can establish criterion validity if scores on the measure correlate with external indicators of depression such as clinician ratings, number of missed work days, or length of hospital stay.

Types of Criterion Validity

Criterion validity is often divided into subtypes based on the timing of the criterion measurement:

  • Concurrent validity: This examines the relationship between a test score and a criterion measured at the same time.
  • Predictive validity: This assesses the relationship between a test score and a criterion measured in the future.

Predictive

Predictive validity demonstrates that a test score can predict future performance on another criterion (Cohen & Swerdlik, 2005).

Good predictive validity is important when choosing measures for employment or educational purposes, as it increases the likelihood of selecting individuals who will perform well.

Predictive criterion validity is established by demonstrating that a measure correlates with an external criterion measured at a later point in time.

The correlation between scores on standardized tests like the SAT or ACT and a student’s first-year GPA is often used as evidence for the predictive validity of these tests.

These tests aim to predict future academic performance, and a strong positive correlation between test scores and subsequent GPA would support their ability to do so.
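
As a minimal illustration, the predictive-validity correlation could be computed like this in Python. All the scores below are invented for the sake of the sketch; they are not real SAT or GPA data.

```python
# Minimal sketch of a predictive-validity check: correlate admission-test
# scores with the first-year GPA obtained roughly a year later.
# All numbers are invented for illustration.
from scipy.stats import pearsonr

sat_scores = [1100, 1250, 1320, 1480, 1190, 1400, 1010, 1550]  # measured at admission
first_year_gpa = [2.8, 3.1, 3.4, 3.8, 2.9, 3.5, 2.5, 3.9]      # measured a year later

r, p = pearsonr(sat_scores, first_year_gpa)
print(f"r = {r:.2f}, p = {p:.3f}")  # a strong positive r supports predictive validity
```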

Concurrent 

Concurrent criterion validity is established by demonstrating that a measure correlates with an external criterion assessed simultaneously.

This can be shown when scores on a new test correlate highly with scores on an established test measuring similar constructs (Barrett et al., 1981).

This approach is valuable for:

  1. Measuring similar but not perfectly overlapping constructs, where the new measure should explain variance beyond that captured by existing measures
  2. Evaluating practical outcomes rather than theoretical constructs (Barrett et al., 1981)

While correlational analyses are most common, researchers may also use regression.
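
For instance, a simple regression sketch along these lines (with entirely hypothetical scores) shows how much criterion variance the new measure accounts for:

```python
# Hypothetical sketch: regress established-test scores on new-test scores.
# R-squared estimates the share of criterion variance the new measure
# accounts for. Scores are invented for illustration.
from scipy.stats import linregress

new_test = [12, 18, 9, 22, 15, 20, 7, 17]      # scores on the new measure
criterion = [30, 44, 25, 52, 38, 49, 21, 42]   # scores on the established measure

fit = linregress(new_test, criterion)
print(f"slope = {fit.slope:.2f}, R^2 = {fit.rvalue ** 2:.2f}")
```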

Validation methods include comparing responses between new and established measures given to the same group, or comparing responses to expert judgments (Fink, 2010).

Note that concurrent validity does not guarantee predictive validity.

How to measure criterion validity

Identify a well-established, validated measure (criterion) that assesses the same construct as the new measure you want to validate.

This criterion measure should have demonstrated reliability and validity, serving as a benchmark for comparison.

Establishing Concurrent Validity:

  • Concurrent validity is a type of criterion-related validity that evaluates how well a new measure correlates with an existing, well-established measure or criterion that assesses the same construct.
  • To establish concurrent validity, you would administer the new measurement technique and the established criterion measure to the same group of participants at approximately the same time.
  • Ensure that the criterion measure is assessed independently of the test scores. If knowledge of test scores influences the criterion assessment, an artificially inflated correlation can occur, leading to an overestimation of the test’s criterion validity.
  • Then, you would statistically analyze the relationship between the scores obtained from the new technique and the scores from the established criterion.
  • This is typically done using correlation coefficients, such as Pearson’s correlation coefficient (for continuous data) or Spearman’s rank correlation coefficient (for ordinal data); a brief sketch follows this list.
  • A strong, positive correlation between the new technique and the established criterion would indicate good concurrent validity, suggesting that the new technique measures the same construct as the established one.
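
A minimal sketch of this analysis, assuming fabricated scores from the same participants tested at the same time, might look like this:

```python
# Illustrative concurrent-validity analysis: both measures are given to
# the same (hypothetical) participants at the same time.
from scipy.stats import pearsonr, spearmanr

new_measure = [14, 22, 9, 31, 18, 27, 12, 25]
established_measure = [15, 24, 11, 33, 17, 29, 10, 26]

r, p = pearsonr(new_measure, established_measure)          # continuous scores
rho, p_rank = spearmanr(new_measure, established_measure)  # ordinal/ranked scores
print(f"Pearson r = {r:.2f} (p = {p:.3f}); Spearman rho = {rho:.2f}")
```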

Establishing Predictive Validity:

  • Predictive validity is another type of criterion-related validity that assesses how well a measure can predict future performance or outcomes.
  • To establish predictive validity, you would administer the new measurement technique to a group of participants and then wait for a specified period (e.g., several months or years) to assess their performance or outcomes on a relevant criterion.
  • Identify and control for extraneous variables that might influence the relationship between the test scores and the criterion.
  • For example, in a study investigating the predictive validity of a college admissions test, factors such as socioeconomic background, prior academic preparation, and motivation could all potentially influence college GPA (the criterion).
  • Statistically controlling for these variables can help isolate the specific contribution of the test scores to the criterion variance.
  • You would then statistically analyze the relationship between the scores obtained from the new technique and the future performance or outcomes. Again, correlation coefficients are commonly used, often alongside regression (see the sketch after this list).
  • A strong, positive correlation between the new technique scores and the future performance would indicate good predictive validity, suggesting that the new technique can accurately predict future outcomes.
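
A sketch of such an analysis, using invented data and ordinary least squares regression to hold the covariates constant, might look like this:

```python
# Hypothetical sketch of controlling for extraneous variables: regress the
# future criterion (college GPA) on test scores plus covariates.
# The data frame below is entirely invented for illustration.
import pandas as pd
import statsmodels.api as sm

data = pd.DataFrame({
    "test_score":  [1100, 1250, 1320, 1480, 1190, 1400, 1010, 1550],
    "ses_index":   [2, 3, 3, 5, 2, 4, 1, 5],                   # socioeconomic background
    "prior_gpa":   [2.9, 3.2, 3.3, 3.9, 3.0, 3.6, 2.6, 4.0],   # prior academic preparation
    "college_gpa": [2.8, 3.1, 3.4, 3.8, 2.9, 3.5, 2.5, 3.9],   # the criterion
})

X = sm.add_constant(data[["test_score", "ses_index", "prior_gpa"]])
model = sm.OLS(data["college_gpa"], X).fit()
print(model.params["test_score"])  # the test's contribution with covariates held constant
```

If the test score’s coefficient remains meaningful after the covariates are included, the test adds predictive information beyond those background factors.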

Examples of criterion-related validity

Intelligence tests

Researchers developing a new, shorter intelligence test might administer it alongside a well-established test, such as the Stanford-Binet.

If there is a high correlation between the scores from the two tests, it suggests the new test measures the same construct (intelligence), supporting its concurrent validity.

Risk assessment and dental treatment

Bader et al. (2005) studied the predictive validity of a subjective method for dentists to assess patients’ caries risk.

They analyzed data from practices that had used this method for several years to see if the risk categorization predicted the subsequent need for caries-related treatment.

Their findings showed that patients categorized as high-risk were four times more likely to receive treatment than those categorized as low-risk, while those categorized as moderate-risk were twice as likely.

This supports the predictive validity of this assessment method.

Minnesota Multiphasic Personality Inventory

The initial validation of the MMPI involved identifying items that differentiated between individuals with specific psychiatric diagnoses and those without, contributing to the development of scales for various psychopathologies.

This method of establishing validity, where the test is compared to an existing criterion measured at the same time, exemplifies concurrent validity.

FAQs

What is the difference between criterion and construct validity?

Criterion validity examines the relationship between test scores and a specific external criterion the test aims to measure or predict.

This criterion is a separate, independent measure of the construct of interest.

Construct validity seeks to establish whether the test actually measures the underlying psychological construct it is designed to measure.

It goes beyond simply predicting a criterion and aims to understand the test’s theoretical meaning.

How do you increase criterion validity?

There are several ways to increase criterion validity, including (Fink, 2010):

  • Making sure the content of the test is representative of what will be measured in the future
  • Using well-validated measures
  • Ensuring good test-taking conditions
  • Training raters to be consistent in their scoring

References

Aboraya, A., France, C., Young, J., Curci, K., & LePage, J. (2005). The validity of psychiatric diagnosis revisited: the clinician’s guide to improve the validity of psychiatric diagnosis. Psychiatry (Edgmont), 2(9), 48.

Bader, J. D., Perrin, N. A., Maupomé, G., Rindal, B., & Rush, W. A. (2005). Validation of a simple approach to caries risk assessment. Journal of Public Health Dentistry, 65(2), 76-81.

Barrett, G. V., Phillips, J. S., & Alexander, R. A. (1981). Concurrent and predictive validity designs: A critical reanalysis. Journal of Applied Psychology, 66(1), 1.

Conte, J. M. (2005). A review and critique of emotional intelligence measures. Journal of Organizational Behavior, 26(4), 433-440.

Fink, A. (2010). Survey research methods. In G. McCulloch & D. Crook (Eds.), The Routledge international encyclopedia of education. Routledge.

Prince, M. (2012). Epidemiology. In P. Wright, J. Stern, & M. Phelan (Eds.), Core psychiatry. Elsevier Health Sciences.

Schmidt, F. L. (2012). Cognitive tests used in selection can have content validity as well as criterion validity: A broader research review and implications for practice. International Journal of Selection and Assessment, 20(1), 1-13.

Cohen, R. J., & Swerdlik, M. E. (2005). Psychological testing and assessment: An introduction to tests and measurement. McGraw-Hill.

Saul McLeod, PhD

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Editor-in-Chief for Simply Psychology

Saul McLeod, PhD, is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.


Olivia Guy-Evans, MSc

BSc (Hons) Psychology, MSc Psychology of Education

Associate Editor for Simply Psychology

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Charlotte Nickerson

Research Assistant at Harvard University

Undergraduate at Harvard University

Charlotte Nickerson is a student at Harvard University obsessed with the intersection of mental health, productivity, and design.
