The Science Behind Assessments

10 min

Assessments at Work

How we assess and evaluate talent ultimately determines how we select, allocate, and operationalize it. I think it's overwhelmingly the case that talent is underselected, misallocated, and inefficiently operationalized. Much of this can be attributed to the imperfect tools most companies deploy to screen and signal talent, namely resumes and interviews. While valuable in their own right, these screening methods can be highly subjective, light on data, and unreliable for a few reasons: they tend to be applied inconsistently across candidates, a large body of research shows that we humans are poor evaluators of other people, and most candidates aren't presenting an entirely authentic version of themselves, so most objectivity gets thrown out the window.

Keeping with the analogy used above, just as a football coach should watch players play football in order to determine their skill level, organizations should see how prospective candidates perform on job-relevant tasks before making mission-critical talent decisions. This is where assessments come in handy.

The purpose of deploying assessments to screen and signal talent is ultimately to obtain more objective, useful, and data-driven information that helps recruiters and managers make higher-quality, better-informed selection decisions. Assessments give these parties information they might not otherwise have gotten by relying on traditional evaluation methods alone. Although imperfect in their own right, assessments offer an incredibly viable way to gather accurate information about an individual's job-relevant characteristics.

Assessment Considerations

The most important factor for organizations to consider before using assessments that influence selection decisions is ensuring that the assessment is of high quality and compliant with relevant guidelines, laws, and standards around assessment development. This includes, but is not limited to, evidence of assessment validity, reliability, relevance, and fairness. If an employer becomes subject to an evaluation of their assessment and evaluation methods (a meta-evaluation, some might call it) and is unable to produce sufficient evidence on those factors, they put themselves at significant litigation risk without a basis for legal defensibility. Below, we provide a brief description of some of these important assessment characteristics.

Validity and Validation

The validity of an assessment must be evaluated against its intended purpose. Stated differently, validity is a measure of effectiveness that gives meaning to test scores in accordance with some theoretical framework. Validation is the process through which the proposed interpretation of test scores is evaluated and supported with evidence. Validation is the means; validity is the end.

When an assessment has demonstrated evidence of validity, this indicates that there is a linkage between test performance and job performance. This means that someone who does well on the assessment is more likely to perform successfully on the job, and vice versa. For example, if a candidate were applying for a job as a speech writer, you might expect an assessment requiring the candidate to write a speech to be valid.

There are generally three types of validity evidence for organizations to consider: content validity, criterion validity, and construct validity.

These strategies each establish assessment validity in their own unique way. In practice, there is no one-size-fits-all validation strategy that tops all the others. The strategy or group of strategies an assessment provider pursues depends on various factors, including the type of job, the skills being measured, and the resources available.

Content Validity

Establishing content validity involves systematically tying the contents of the assessment to important job-related behaviors, or demonstrating that the selection procedure provides a representative sample of the work product of the job. This requires showing that the content of the assessment is representative of the knowledge, skills, and abilities needed to succeed in that job. A prerequisite to establishing content validity is an in-depth job analysis that outlines what those important job-related behaviors are. Each item of the assessment is then reviewed to determine whether it is essential and contributes to that aim.
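As a toy illustration, the item-by-item review can be thought of as a coverage check between the job analysis and the assessment. All KSA names and item mappings below are invented for demonstration, not drawn from any real job analysis:

```python
# Hypothetical sketch of a content-validity check: compare the
# knowledge/skills/abilities (KSAs) identified in a job analysis
# against the KSA each assessment item is meant to measure.
# All names below are made up for illustration.

# KSAs the (hypothetical) job analysis identified for a speech-writer role.
job_analysis_ksas = {"speech writing", "research", "editing", "audience analysis"}

# Each assessment item, mapped to the KSA it claims to measure.
assessment_items = {
    "item_1": "speech writing",
    "item_2": "editing",
    "item_3": "research",
    "item_4": "trivia recall",   # not tied to the job analysis
}

covered = set(assessment_items.values()) & job_analysis_ksas
missing = job_analysis_ksas - covered
unlinked = [item for item, ksa in assessment_items.items()
            if ksa not in job_analysis_ksas]

print(f"KSAs covered by the assessment:   {sorted(covered)}")
print(f"KSAs with no items (coverage gap): {sorted(missing)}")
print(f"Items not linked to any job KSA:   {unlinked}")
```

Here the review would flag a coverage gap ("audience analysis" has no items) and an item ("item_4") that contributes nothing to measuring the job's requirements.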

Criterion Validity

Criterion validation is another strategy used to establish the validity of an assessment. To establish criterion validity, the assessment provider must demonstrate a correlational relationship between performance on the assessment and performance on the job. In theory, if an assessment is shown to be criterion valid, then those who score high on the test will tend to perform better on the job, and vice versa. The aim, then, is to establish that relationship between the predictors and the selected criterion.

Criterion validity comes in two forms: predictive and concurrent. The distinction between the two is determined by the amount of time that has elapsed between drawing a correlation between assessment performance and the selected criterion. In predictive validity procedures, the data from the selection procedure are typically collected at the time the individuals are selected, and after a specified period of time (usually a few months), the criterion data are collected. In the concurrent validity approach, the assessment performance data and the criterion data are generally collected at around the same time from an incumbent employee base.
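To make the correlational idea concrete, here is a minimal sketch of how a validity coefficient might be computed as a Pearson correlation between assessment scores and later job-performance ratings. All scores and ratings are made up for illustration:

```python
# Hypothetical illustration: criterion validity as the correlation
# between assessment scores (predictor) and later supervisor
# performance ratings (criterion). All data are invented.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

# Assessment scores collected at selection time...
assessment_scores = [62, 75, 81, 55, 90, 68, 72, 85]
# ...and performance ratings gathered months later (predictive design).
performance_ratings = [3.1, 3.8, 4.0, 2.7, 4.5, 3.3, 3.6, 4.2]

r = pearson_r(assessment_scores, performance_ratings)
print(f"validity coefficient r = {r:.2f}")
```

A coefficient near zero would suggest the assessment tells you little about job performance; real-world validity coefficients are far more modest than this cherry-picked example.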

Construct Validity

A construct simply represents something you are trying to measure. Generally speaking, constructs cannot be directly seen or heard, but the effects they produce on behavior can be observed. For example, intelligence and personality are considered constructs: you can't see them directly, but you can observe their effects. Construct validity refers to the extent to which an assessment correlates with a theorized scientific construct, which requires an empirical demonstration that the test measures the construct it claims to measure.

Reliability

An assessment's reliability refers to the consistency with which it measures what it claims to measure. Drawing an analogy: if I were to step on a weight scale and it told me that I weigh X pounds on Tuesday and X + 50 pounds on Wednesday, that likely isn't a very reliable scale (unless Taco Tuesday was extra good). Reliability is seen as a prerequisite to validity; it would be difficult to truly understand how much I weigh (that is, to have a valid weight scale) if the scale told me a different weight each day. However, an assessment can certainly be reliable without being valid: it doesn't matter if a weight scale consistently tells me I weigh X + 50 pounds if my true weight is just X.
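To make the scale analogy concrete, one common way to quantify this kind of consistency is test-retest reliability: the correlation between the same people's scores on two administrations of the same assessment. The sketch below uses invented scores:

```python
# Hypothetical sketch: test-retest reliability as the correlation
# between two administrations of the same assessment.
# All scores are invented for illustration.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# The same six candidates tested twice, a few weeks apart.
first_administration  = [70, 82, 55, 91, 64, 77]
second_administration = [72, 80, 58, 89, 66, 75]   # consistent scores
drifting_scale        = [55, 95, 70, 60, 88, 50]   # inconsistent scores

stable_r = pearson_r(first_administration, second_administration)
unstable_r = pearson_r(first_administration, drifting_scale)
print(f"test-retest r (consistent assessment):   {stable_r:.2f}")
print(f"test-retest r (inconsistent assessment): {unstable_r:.2f}")
```

The consistent assessment yields a high coefficient, like a scale that reads nearly the same weight each day, while the drifting one yields a coefficient near zero, like the scale in the analogy above.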

How to Assess Reliability?

There are a few ways in which assessment providers can assess the reliability of their assessments, and the method they choose generally depends on the resources available to them at the time of their study. The two main forms of assessing reliability that we focus on are:


Although validity and reliability are not the only scientific measures to consider when thinking about using assessments for screening and selection purposes within organizations, they are certainly two of the most important factors to consider. In the event that employers are unable to demonstrate the validity and reliability of their selection procedures, they open themselves up to litigation risk by way of Equal Employment Opportunity standards and guidelines.
