In the first installment, I discussed the difference between aptitude assessments and psychometric assessments, and the further division of psychometric assessments into “type-based” (such as MBTI or DiSC) and “trait-based”. I pointed out that “type-based” assessments are useful for understanding how people tend to react, which makes them good tools for analyzing team dynamics or building self-awareness. It is the predictive nature and the focus on competencies of “trait-based” tools, however, that make them ideal for selecting and developing talent at all levels of the organization.
In this installment, I will outline some key issues to think about when considering using “trait-based” assessments and give some specific examples of how these tools can be used.
How accurate an assessment is should be the first question on the mind of any HR professional or decision-maker when looking at assessments. In reality, this is two questions: what is the ‘reliability’ and what is the ‘validity’ of the assessment? Reliability relates to how precise and error-free a tool is in measuring what it sets out to measure. The most easily understood form of reliability is “test-retest” (there are others, such as “internal consistency”): will a person get roughly the same results if they retake the test within a short period of time? Any decent psychometric assessment, whether type-based or trait-based, should be able to provide a reliability score (and methodology) between 0 and 1, the higher the better. Of course, it needs to be remembered that, unlike aptitude assessments where the questions have correct answers, psychometric assessments are self-assessments, and so reliability scores for even the best assessments will usually be in the range of 0.7 to 0.9.
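To make the idea of test-retest reliability concrete, here is a minimal Python sketch. It assumes that reliability is estimated as the Pearson correlation between two sittings of the same assessment, which is one common approach; a provider's own methodology may differ. The function name and the scores are invented purely for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical scale scores for the same eight people,
# taken a short time apart (illustrative data only).
first_sitting = [6, 4, 7, 5, 8, 3, 6, 5]
second_sitting = [5, 4, 7, 6, 8, 4, 6, 4]

test_retest_reliability = pearson_r(first_sitting, second_sitting)
print(f"Test-retest reliability: {test_retest_reliability:.2f}")
```

A value close to 1 means people received very similar results on both sittings; a value close to 0 means the second result tells you little about the first.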
While reliability is important, the second question about validity is actually more important. Validity measures the extent to which an assessment measures what it is designed to measure. Like reliability, validity can be broken down into a variety of aspects, but I will focus on the most important considerations for HR or decision-makers.
The first issue, then, is what is actually being measured. For type-based assessments, this is a person’s ‘tendency’ or personal preference. As mentioned before, this is great for areas such as team-building, but personal preference does not indicate job performance, or potential job performance. For well-designed trait-based assessments, however, there should always be a validity score showing the correlation between the results of the assessment and an assessment of the person’s subsequent workplace performance. For example, the validity of the OPQ32 assessment for workplace performance is given as around 0.3.
Now, you may think that, for predicting workplace performance, 0.3 is not a high score, but to give some context, an interview has a validity of around 0.2. So if, for example, you were screening candidates for leadership development or an overseas assignment, using a psychometric assessment in addition to the interview process is already adding value. Not only is it better at identifying who is suitable and who is less so, it is also able to identify specific development areas for each candidate.
For example, there is one European company in Japan that uses psychometric assessments as an independent check on its new graduate and mid-career hiring process. If there is a mismatch between the interviewer’s result and the assessment result, this can be picked up in a second interview.
There are many factors that affect validity, from the fundamental construction of the questions themselves (e.g. are they poorly written or culturally biased?) to the correlation between assessment results and actual workplace performance, and a good assessment developer will be able to outline specifically what steps they have taken to increase validity.
One example of increasing validity is the innovative approach taken by Saville Consulting with their Wave assessments, where they measure both ‘talent’ (“I am good at…”) and ‘motive’ or motivation (“I like doing…”) in the same assessment, and also their use of both rating (so-called “normative”) and ranking (“ipsative”) style questions, as the example below shows:
(Above: The rating or ‘normative’ style of questionnaire has several advantages: it is quick, the assessees are free to choose, and the responses are not interrelated.)
In 2007/8, there was a comparison study, called Project Epsom, in which 308 participants each took up to 29 different assessments from a variety of providers. The results were published in a paper called “How valid is your questionnaire?”
The main results are in the chart below. The names of the assessments may not be familiar to you, but they all have validities above 0.3 and so all add value. It is worth mentioning that both MBTI and DiSC, the two major type-based questionnaires, showed little validity in predicting workplace performance, so they are not on the chart.
One final point relating to validity is about the number of people who have taken the assessment. While all assessments will use a “norm” or comparison group in their results, it is important to realize that the number of people who have taken the assessment has absolutely no relation to the validity of the assessment. It is not true that an assessment becomes more valid the higher the number of people who have taken it. If the developers of the assessment have not run a trial with a statistically significant number of participants and a clear methodology for statistically comparing the assessment results with actual workplace performance, then they are not able to provide a validity score.
There are several other points to consider when choosing to use psychometric assessments.
Time for completion
One reason why companies choose to use assessments is that they are a quick alternative to what could be a lengthy and possibly expensive process. This is particularly true when the pool of candidates is large, for example in the hiring process. However, it is worth bearing in mind that some assessments can take up to one hour to complete. If you have assessments with similar validity, why would you choose the longer one?
Return on investment
Some psychometric assessment companies require a yearly fee in addition to the cost of each assessment. For others you pay only for each participant or for each report generated. It is important to think about your needs and to calculate the yearly total cost. For most companies, flexibility is probably key, as they start to use assessments on a small scale, gradually building up more widespread use throughout the company. Of course, the higher the validity, the greater the return in terms of better selection or identification of both people and development areas.
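The yearly total cost comparison can be sketched in a few lines of Python. All prices and volumes below are hypothetical placeholders, not figures from any real provider; the point is only to show how a yearly fee changes the picture as usage grows.

```python
def yearly_total_cost(annual_fee, cost_per_assessment, assessments_per_year):
    """Total yearly cost: any fixed yearly fee plus the per-assessment charges."""
    return annual_fee + cost_per_assessment * assessments_per_year


def break_even_volume(annual_fee, fee_plan_unit_cost, pay_per_use_unit_cost):
    """Number of assessments per year at which both pricing models cost the same."""
    return annual_fee / (pay_per_use_unit_cost - fee_plan_unit_cost)


# Hypothetical pricing (assumed values for illustration only).
ANNUAL_FEE = 5000          # provider A: yearly fee...
FEE_PLAN_UNIT_COST = 20    # ...plus a lower price per assessment
PAY_PER_USE_UNIT_COST = 60  # provider B: no yearly fee, higher price per assessment

for volume in (50, 125, 300):
    with_fee = yearly_total_cost(ANNUAL_FEE, FEE_PLAN_UNIT_COST, volume)
    pay_per_use = yearly_total_cost(0, PAY_PER_USE_UNIT_COST, volume)
    print(f"{volume} assessments: with yearly fee {with_fee}, pay per use {pay_per_use}")

break_even = break_even_volume(ANNUAL_FEE, FEE_PLAN_UNIT_COST, PAY_PER_USE_UNIT_COST)
print(f"Break-even at {break_even:.0f} assessments per year")
```

With these assumed numbers, paying per use is cheaper at low volumes, which fits the case of a company starting on a small scale; the yearly-fee model only pays off once usage passes the break-even point.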
Integration with existing human resource management systems
Most providers are able to integrate with a variety of HRM systems. Even if you do not have your own platform, but are purchasing individual assessments, most providers should be able to extract data for internal company use.
Feedback to assessees
It is both ethically correct and important to give feedback to the assessees. Depending on the type of assessment and the reason for its use, feedback may be given individually or in groups. This process needs to be taken into consideration when deciding on assessments.
Cultural fit with an organization
Some assessments will also provide a profile of ‘cultural fit’, in other words, a profile of what kind of organizational or work culture the assessee prefers. I have noticed that many HR professionals are extremely interested in cultural fit because of the impact that it can have on work performance.
As a final point, one thing that continually surprises me is the number of different and innovative ways that companies can use assessments in selection, recruitment, talent development and increasing motivation among employees. A good assessment, used in conjunction with existing systems, can provide a clear benefit to an organization.
For example, the test-retest average reliability score for the Saville Consulting psychometric tools is 0.79 (1153 participants, 2-month interval).
As an example, see part 1 of this article regarding how the Myers-Briggs Foundation say it is unethical to use MBTI for screening or recruitment purposes.
SHL, OPQ32 User Manual supplement, chapter 8 (https://www.shlsolutionpartner.com/au/resources/NEWOPQ32TechManualsupplement.pdf)
A well-constructed ability test, as used in an assessment centre for example, has a validity of around 0.5. Validities of +0.7 are virtually unknown in the research literature.
Freely available online, for example from (https://www.y2cp.com/ressources/publications/articles/tests/wave/how_valid_is_your_questionnaire.pdf)