Valid or reliable – take your pick

by Jamie Flinchbaugh on 07-27-10

Last week we had an interesting conversation. It started as a discussion about projects, particularly how you make sure your projects are going in the right direction. Ideally, you should choose how you’re going to measure yourself before, during, and after the project. If you wait until the end to ask “how do we determine if we were successful?” we will almost certainly design the criteria in a way that helps us look successful. For this reason, and others, we should determine the success criteria in advance.

However, what makes for good criteria when the problem statement is messy? I’m talking about things such as a behavioral shift or the building of knowledge and skills. These are not easy things to measure and evaluate.

I am often critical of companies doing what is easy instead of what is right. Let’s explore that further.

There is often a trade-off, when determining metrics, between designing a measure that is highly reliable and one that is highly valid. As an example, if you are rolling out new training for employees, you could test them at the end of the class to see if they absorbed the information. This is a highly reliable test: you can get 100 percent participation; the test is inexpensive and therefore never compromised or cut; and the comparison from one class to the next is consistent. It is a very reliable way to accomplish the task.

However, it is not very valid. Who cares if they remember the information IN the class? What I care about is what happens afterward, and whether they can apply it. It is much more valid to observe people in their own environment, ideally at multiple points in time, to see how they have integrated not just the information but the skill into their work. That is the most valid way to test the outcome of the class.

But this approach is not perfect either. This kind of test is not as comparable, because the observations are not consistent, might be done by different people, and are certainly done under different conditions. It is also very expensive to spend that much time observing people, so it is subject to shortcuts and to being cut from the process entirely.

This is the trade-off, and it is not easy. Finding a measure that is both highly reliable and highly valid is a challenge. Don’t just do what’s easy. Think about what is important to you about your measurement, and then design the appropriate solution.
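To make the trade-off concrete, here is a minimal simulation sketch. This is my illustration rather than anything from the original discussion, and every number in it is made up: it models an end-of-class exam that repeats almost identically from sitting to sitting but tracks real skill only weakly, versus a workplace observation that varies between observers but tracks what actually matters.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500  # hypothetical number of trained employees

# Latent skill each person actually gained from the class (unobservable).
skill = rng.normal(size=n)

# What we ultimately care about: on-the-job performance, driven by skill.
performance = skill + rng.normal(scale=0.5, size=n)

# Measure 1: end-of-class exam. It mostly captures a stable "recall" trait
# that is only weakly related to skill, but it repeats almost identically.
recall = 0.3 * skill + rng.normal(size=n)
exam_a = recall + rng.normal(scale=0.1, size=n)
exam_b = recall + rng.normal(scale=0.1, size=n)

# Measure 2: workplace observation. It tracks real skill, but each observer
# adds a lot of noise (different people, different days, different settings).
obs_a = skill + rng.normal(scale=1.0, size=n)
obs_b = skill + rng.normal(scale=1.0, size=n)

def corr(x, y):
    return np.corrcoef(x, y)[0, 1]

print(f"exam reliability (retest corr): {corr(exam_a, exam_b):.2f}")       # ~0.99
print(f"exam validity (vs performance): {corr(exam_a, performance):.2f}")  # ~0.26
print(f"observation reliability:        {corr(obs_a, obs_b):.2f}")         # ~0.50
print(f"observation validity:           {corr(obs_a, performance):.2f}")   # ~0.63
```

Under these assumed numbers, the exam is almost perfectly repeatable yet barely predicts performance, while the observations disagree with each other far more yet predict performance much better. That is the reliable-versus-valid tension in miniature.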

Comments

  • Jamie,

    So true that testing in the classroom does not guarantee the learned material will be used and implemented – the validation. It is very difficult to measure, and therefore something organizations don’t bother with (some don’t even bother with classroom training anymore). Even on those projects where results are relatively easy to check, many firms don’t look back and ask, “How did we do?” If they did, there is a good chance they would not implement the ‘next great project’ until they understood why the ‘last great project’ was unsuccessful.

    The challenge comes in defining success up front. Even projects that appear fairly straightforward – “this project will reduce costs by so many dollars” – can become convoluted by changing economics and market conditions. For example, I submit a project to reduce costs in an area over the next year, but six months in, a new market opportunity develops in that area, and to meet it, I add resources and costs. The project may have allowed me to minimize the additional resources, but on an absolute level, my costs went up. If I defined success as a reduction in costs in that area, the project was, by definition, a failure.

    That doesn’t mean we shouldn’t attempt to define success up front, but we do need to keep in mind the number of variables that can affect our definition.

    Great Thoughts!

    Glenn

    Glenn Whitfield July 27, 2010 at 10:03 am
  • Glenn,

    Thanks for the addition. I think that speaks to the problem of reliability versus validity. Fundamentally, those cost dollars are a valid measure. However, they are not very reliable, because so many things can affect them. Many people will then fall back on a reliable success outcome, such as “the software will do what we said it will do,” which is highly reliable but isn’t the real goal. Understanding the variables, as you mention, helps us define where on the reliable/valid line we need to be.

    Jamie Flinchbaugh July 27, 2010 at 12:02 pm
  • Hi,
    I would suggest that companies choose to do the “easy things” because those outcome metrics are compatible with their way of reporting on and analyzing their projects. Good ol’ Deming stated, “You get what you measure.”

    I have followed “best in class” engineers who simply cannot handle tasks (such as basic calculations) far easier than those they faced during their education – perhaps because there is no “right answer at the end of the book.” On the other hand, I’ve seen lots of cases where the average guy outperforms the “best in class.” To me it is living proof of the value of “doing the right things” instead of merely “doing things right.”

    When implementing and working with new processes on the factory floor, I’ve used my own set of measurements, the “4C”: commitment, constructive criticism, and constancy. When these four (or actually three) are in balance, I know the implementation is successful. You can have commitment, but without the other Cs you are not going forward, and then the implementation has not succeeded. When an implemented project starts getting constructive criticism, that is a good sign: it means the guys have started to think about improvements, which itself shows commitment.

    These kinds of metrics are, of course, hard to put into a bar or pie chart in a PowerPoint presentation. The best way is to show them in “live action” at the Gemba. It has worked a couple of times for me, but reaching 100% compatibility between management and the shop floor is not easy.

    Also, the project goal should be clearly defined. A project without a goal is like going out driving in your car without a destination. You just drive until the tank runs dry.

    Jonas Holmlund July 28, 2010 at 1:52 am
  • Jonas, thanks for the comments. I think your line of practice is focused on validity. If people are truly committed to the outcome and the solution, you can be much more confident of its success. But it’s impossible to say “we’re 86.553% committed to the outcome.” It just can’t be measured in a reliable way, so observation and engagement are required to even get a sense of it.

    Jamie Flinchbaugh July 28, 2010 at 7:27 am