Valid or reliable – take your pick
Last week we had an interesting conversation. It started as a discussion about projects, particularly how you make sure your projects are going in the right direction. Ideally, you should choose how you’re going to measure yourself before, during, and after the project. If we wait until the end to ask “how do we determine if we were successful?” we will almost certainly design the criteria in a way that helps us look successful. For this reason, and others, we should determine the success criteria in advance.
However, what makes for good criteria in messy problem statements? I’m talking about things such as a behavioral shift or knowledge and skill building. These are not easy things to measure and evaluate.
I am often critical of companies doing what is easy, instead of what is right. Let’s explore that further.
There is often a trade-off when determining metrics between designing something that is reliable as a measure versus something that is highly valid. As an example, if you are rolling out some new training for employees, you could test them at the end of the class to see if they absorbed the information. This is a highly reliable test. You can get 100 percent participation, the test is inexpensive and therefore never compromised or cut, and the comparison from one class to the next is consistent. It is a very reliable way to accomplish the task.
However, it is not very valid. Who cares if they remember the information IN the class? What I care about is what happens after that, and I care about their ability to apply it. It is much more valid to observe the person in their environment, ideally over multiple time periods, to see how they have integrated not just the information but the skill into their work. That is the most valid way to test the outcome of the class.
But this is not perfect. This kind of test is not as comparable, because the observations are not consistent, might be done by different people, and are certainly done under different conditions. It is also very expensive to spend that much time just observing people at work, so it is subject to shortcuts and to being cut from the process entirely.
This is the trade-off. This is not easy. To find a measure that is highly reliable but also highly valid is a challenge. Don’t just do what’s easy. Think about what is important to you about your measurement and then design the appropriate solution.