Ep.14: Outcomes Over Output
Remember, you can always listen here or follow us on Apple Podcasts or Spotify. Either way, thanks for supporting us.
About this episode. It’s easy to measure output. How many trainings. How many policies. How many messages. All “the things” we did to protect the company and shape a culture of ethics and integrity. But did any of those things work? And what does it even mean for something to “work”? In this episode of The Better Way?, Zach and Hui focus on the distinction between output and outcome—and advocate for a mindset of experimentation and verification.
Drawing parallels to healthcare and public safety, they emphasize that true effectiveness involves behavioral change, risk reduction, and cultural alignment. But they also acknowledge that this is hard—so much harder than tallying up output. From FOFO (fear of finding out) to data access challenges, they unpack (some of) the reasons why compliance often falls into the output trap. And in the end, they call for a fundamental shift in mindset: one that starts with defining your desired outcomes, collecting meaningful data, and being brave enough to test whether what we’re doing is working.
Who? Zach Coseglia + Hui Chen, CDE Advisors
Full Transcript:
ZACH: Welcome back to The Better Way? Podcast brought to you by CDE Advisors. Culture. Data. Ethics. This is a curiosity podcast for those who ask, “There has to be a better way, right? There just has to be.” I'm Zach Coseglia and I am joined as always by Hui Chen. Hi, Hui.
HUI: Hi, Zach. How's it going?
ZACH: Welcome. It's going well. I'm excited for today's discussion. It is just us today, but this is a topic that I know, well, you're frankly one of the world's leading voices on. So it's really fun to be able to talk about it with you. And it's a topic that you and I and many others have been talking about for a very long time. Hui, why don't you go ahead and introduce the topic for today?
HUI: Indeed, this is something I've certainly been writing about for a long time, which we now reference as output versus outcome. And I used to call it efforts versus outcome, but you have, I think, reshaped it into output versus outcome, which I like so much better.
The issue that we're trying to address here is how we measure effectiveness. And this, of course, is a question that we've all been asking for a long time. In my time speaking and writing, that's probably the number one topic people want to hear about: how do we measure our effectiveness?
And in trying to answer that question, in looking at the way we have been trying to measure what we've been calling effectiveness, it became clear that what we were measuring was output. It was how hard we tried to achieve a certain result. So we're measuring, for example, things like training attendance rate—and I think we all know that we train people not for the purpose of them appearing at training. That's not what we're hoping to achieve with the training. What we're hoping to achieve with the training can be a number of things, like imparting knowledge to the people who attend, like getting them to change their behavior, getting them to understand how to tackle issues and how to analyze the dilemmas that they face.
We don't measure any of those. We just measure that they came to the training; and this caused us, you know, you and me and others who are interested in this issue to start looking at what else are we measuring that's really just how hard we tried? It's output, all the work that we put out in trying to achieve something, yet we don't measure what we actually achieved. And that really started a journey that we're still very much on.
ZACH: Absolutely. You know, it's funny, we were doing a compliance program assessment for a client not long ago. You'll remember this, I'm sure—it's top of mind. And we were sharing the results, doing the readout with them, and in fact preparing to present the findings to leadership and ultimately to their board. And one of the pieces of feedback that we got from the client was, “well, wait, wait, wait, the way that you're describing or the way that you're assessing these elements of our program seems like we're not getting enough credit for all of the work that we did. It seems like there's not enough credit for all of the output, all of the things that we did to get where we are today.”
And you and I just sort of took a pause and said, “exactly.” We want to credit hard work, and for sure the design and the effort should be part of the discussion, but it can't be the end of the discussion. And in fact, it shouldn't even be the balance of the discussion. Most of what we talk about in our world should be about the outcome that we achieve, the effectiveness of those efforts, and that's ultimately what we're going to talk about here today.
HUI: And that's what . . . I think the most important thing to keep in mind is, that's what all your business stakeholders are going to judge you on—and that's what they judge themselves on. There is not a single sales organization in the world that rewards salespeople based on how many hours they spend trying to sell something instead of what they actually sold.
And if you say, okay, that's not fair, that's sales—look at support functions like marketing. Same thing. Marketing departments are not rewarded for how many marketing campaigns they run. They sit down and calculate the incremental revenue that is attributed to specific marketing campaigns. So it's how much additional sales can be attributed to their efforts, not just to the output of those efforts. Really, every other major stakeholder in a corporation is rewarded and evaluated based on their outcomes, not their output. So, if we are going to earn that seat at the table, if we say we deserve that seat at the table, we have to do the same thing.
ZACH: Yeah, absolutely. One of the things that I did in preparation for the discussion was, I just sort of tried very quickly to come up with a list of places outside of compliance, sort of outside of our domain, where it just doesn't make any sense to measure performance based on output alone or effort alone. So one of the ones that came immediately to mind, because this is something you talk about all the time, was healthcare, right? It's not about the surgeries that we perform, it's about the actual patient health outcomes.
HUI: It's not about how often the surgeons wash their hands. It's about the actual outcome. So, they do . . . hospitals actually do measure those things about the hand washing practices, about communications between the surgical team and the care team, between the surgeons and the nurses. They do measure output, but they also do measure outcomes.
And think about yourself as a patient. Do you care more about the surgical outcome when you choose the hospital or a surgeon? Do you care more about how many patients successfully completed the surgery and came home without complication? Now, all the work in terms of hand washing, communication and, you know, drug dispensation after surgery—all of those contribute to those outcomes. But as a patient you don't care about those; you just want the outcome.
ZACH: Right. Absolutely. I thought about public safety. So, it's not about how many arrests were made—in fact more arrests might actually suggest more crime. It's actually about have we reduced crime? Have we built community trust? I thought about social services and hunger stability programs. You know, it's not about just how many meals were served, it's about how we've actually created stability, whether we actually are tackling the broader, longer-term hunger issues or crises. I thought about software. It's not about how many lines of code you've created. It's about whether or not that code is actually bringing business value and having a user impact.
So, it's like everywhere we look we see the value, we see the obviousness of measuring outcomes, and yet it continues to be less frequent in this community in the compliance space. And I want to talk about why, why do we think compliance often falls into the output trap?
HUI: I think one of the reasons is, in order . . . well, I can think of two reasons. One is, we often have not really defined the outcome that we are trying to achieve. And you and I see this as we work with clients. You know, again, back to training—it's a very simple example. People say, “oh, how do we measure the effectiveness of our training?” And we say, “well, effective towards what? What are you trying to achieve with this training? What is the difference this training is supposed to make in people's lives?” And oftentimes we just get sort of complete blank stares at that question.
In one case, I actually appreciated the honesty. The honesty was, well, we do training to satisfy regulators. Now, that is a very clear outcome you want. You want to have a documented training to show the regulators or enforcers—in which case training attendance rate is exactly the right metric: you're showing that you have the ability to get 90-plus percent of your company to go through this training. You document it and you show the people you want to show that that's what you're able to do.
But outside of that case, if you do a training, if you design a training, the first step needs to be articulating what is the outcome that you want that training to achieve. Because if you break it down, there are different kinds of trainings, and different kinds of trainings are aiming to achieve different things. So that step oftentimes is skipped, not even thought about. That, I think, is reason number one.
Reason number two is, in order to measure some of those outcomes, you need a lot more data points, and many of those data points or data sets are not within the control of the compliance department. So it's the inability to, one, come up with the data that would help me understand whether we have achieved the outcome, and two, figure out how to access those data.
ZACH: Let's unpack that a little bit. The first thing that you referenced was that different types of trainings are designed to do different things. Give us a couple of examples.
HUI: So one type of training may be the general code of conduct training, which you do . . . oftentimes companies do it annually, and the purpose is for people who have never been exposed to it to understand what's in there, and for people who have already taken it in the past, as a refresher. So in that one training, you have two different objectives for different participants of that training. That's one category.
Another category could be that you're rolling out a new set of procedures for something—let's say an approval system for a certain type of expense. You're rolling out a training relating to that specific set of new procedures, and the outcome that you want from that training is that people understand how to navigate those procedures. Those two different trainings have potentially three different outcomes depending on your audience.
ZACH: Yeah, for sure. And potentially others as well. I mean, I think about code of conduct training as potentially being a really wonderful place and way to evaluate alignment of values between your employees and the company, to determine whether or not decision making is going to be made in ways that are consistent with those values—beyond just understanding and where you kind of sit in terms of your journey within the company.
Let's talk about the data piece, because I think you're absolutely right, that sometimes that data is outside the reach of compliance, even if it's still data that can be influenced by compliance. But I think that the data piece actually is a little bit more complicated, at times, because I feel like sometimes the concern around collecting data that speaks to effectiveness actually goes to this concept that we talk about from time to time called FOFO. The fear of finding out.
Now, I always say this to folks when they say that their training—or even any other element of their program—is there to be able to show regulators that we did it. And you're right, training completion is a way to show that. But what do you do if the regulator or the enforcer asks the follow-up question, which is, “but did it work?” So there's that.
HUI: Absolutely.
ZACH: But then there's this FOFO issue, which is related. And that FOFO issue is, well, we're concerned about collecting data that actually shows that what we're doing isn't having the broader impact that we hope it will. Now that may seem like an understandable concern, but to me, it's actually kind of insane.
HUI: Indeed.
ZACH: It's kind of insane to me that you're concerned about having data that shows that what you're doing isn't working, rather than being concerned that what you're doing isn't working.
HUI: Oh, it's very much like saying that I don't feel well, but I'm afraid to go to the doctor to find out that I'm actually sick.
ZACH: 100%, yeah.
HUI: Yeah, so you can't get treated. If you happen to just be not having a good day, lucky for you. But if you actually are sick and you take that approach, you're just going to get sicker.
ZACH: Yeah, 100%. And look, I think that part of the concern around training effectiveness, even culture and performance, and some other elements of compliance—part of the data challenge—is some of the ways that we might collect that data: questionnaires, surveys, pulse checks. There is a lot of perceived fatigue around these things. Now, those are by no means the only ways that one can measure things, and we'll talk more about measurement strategies in a bit. But I get, and I hear a lot, that there's fatigue around some of these means of collecting data.
But what I always say to people is, the people who are perhaps concerned about the fatigue are the same people who expect you, or who should expect you, to be able to prove the desired outcomes, to be able to prove the effectiveness of these things. And so I feel like we've got to be thoughtful and mindful about the fatigue, but we've also got to get past it because they are good, meaningful ways of collecting data, whether it's the quizzes that we can embed within training, or whether it's follow-up questions, or whether it's pulse checks. These are ways in which we can get data at scale, and there are few other ways to get data at scale like that for some of the types of activities that we're talking about.
HUI: For some of the types of activities, yes. But I think there are certainly, again, depending on what the objective is, there are certainly other ways. So, if we go back to that example of your training for a new set of procedures that you're rolling out, what you really need to do is monitor, you know, breaches of that procedure. In fact, you don't need to quiz anyone. Even at the end of the . . . I'm not a big fan of the quiz at the end of training sessions, because I think they test nothing but short-term memory, right?
ZACH: Yes. I agree with that.
HUI: So, what is your whole point for that type of training? It's so that they know how to use this new set of processes that you have rolled out. So let's see if they do.
ZACH: Yeah, for sure.
HUI: It's that simple.
ZACH: But this goes to a bigger . . . I will say on the point about short-term memory, I fully agree with that. I also find that many of the quizzes that you see embedded within trainings are almost comically easy.
HUI: Yes.
ZACH: And therefore, not doing much of anything for anyone. And so, much as we always say around culture surveys, there's a better way to do it. I think that we've tapped into some of those better ways, and I feel the same way about quizzes within training. It's like, if you're going to do it, make it at least something that's going to be thought-provoking, that's going to get people to think—and I'm much more interested in questions that aren't necessarily quizzing on process, but that are telling us more about individual priorities associated with decision-making.
HUI: So, we've been using training as an example, but this applies to every aspect of compliance, and in fact I think it applies to the compliance program as a whole. Compliance programs exist for very specific desired outcomes, which are to prevent, detect, and remediate misconduct. There are actually programmatic goals against which measurements should be taken about effectiveness towards those goals. But let's take another component of the compliance program that people love to talk about, which is policies and procedures.
What is your purpose for writing policies and procedures? Again, you can . . . in some cases people can honestly say it's really just there so we can fire people when they violate them. So, we have a piece of documentation that says you're not supposed to do such and such; and when they do such and such we say, look at this policy that's clearly prohibited, so now you're disciplined. If that's your purpose, again, they don't even need to understand it. You just have to have documentation that they actually are aware of these prohibitions and you're done.
But if your policy is really aiming to make clear what your expectations are in this area of behavior—which is what I think compliance-related policies, certainly, are oftentimes supposed to be aiming to do—then isn't it important that you look to see whether people do the things that you expect them to do? And this is where I think a lot of people say, “I just don't know how to measure that.” I have heard a lot of people say, “yes, I agree that's the desired outcome for these policies, but I don't know how to measure that”—because now we're asking people to measure behavior, awareness, beliefs, and we don't know how to measure that.
So interestingly, certainly in one case, we've had researchers study the effect of policies in terms of the traditional goals: do people understand what we're trying to say? Do they understand the policy objective that we're trying to articulate? And also, do they behave according to the policy expectations that are laid out? You always ask me to talk about that study, but I'm going to reverse that. I'm going to ask you to talk about that study done by our friend Benjamin van Rooij and his collaborating researchers.
ZACH: Well, I'm going to paraphrase, so hopefully I do it justice. But the short of it is they wanted to evaluate employee reactions and the effectiveness of different versions of an anti-corruption policy. And so, there was a pretty typical 19-page—I think it was 19 pages—pretty typical anti-corruption policy, written in legalese, dense text. It's probably what most of us see when we look at anti-corruption policies. There was a shorter version of it that I think was about 3 pages long, much tighter, but still written in traditional legalese language.
Then there was a third version that was a one-page infographic. And then there was actually a fourth control group of people who received no policy at all. So this was conducted very much like a clinical trial. You had four arms, a control or placebo arm, I guess, where they got nothing and then three different versions of the policy. And the goal was to figure out which one of these policies was most effective. And they looked at it both from the perspective of . . .
HUI: Most effective in achieving understanding of the expectation and in changing behavior. So just want to add that in.
ZACH: Exactly. And we love this study and we talk about it a lot in different contexts. And we always ask people, which version do you think worked the best? And we get different responses. I think that the version that most people pick, though, when they're being honest and they're not trying to game the experience . . . the option that most people pick is the infographic.
HUI: Yes.
ZACH: That there is this sort of assumption that when we make things shorter, when we make things pithier, when we utilize visuals as opposed to just words, that we create a better product for the people who are consuming said policy.
But that is not what the research found. So Hui, what did the research find?
HUI: We're going to do a dramatic pause. The research found, get this, no difference among the four groups, no statistically meaningful difference among the four groups. Let's take that in for a second.
ZACH: Yeah. Now, sometimes when we present that, we see a lot of troubled faces. People think, well, what are we doing, then? It's almost existential. But it shouldn't be. Because what they actually found, through a series of controlled exercises or games that they had the participants play, was that there was something that made a difference in how people made decisions or how people behaved. It wasn't the length of the policy or the pithiness of the language. It was social norms.
People were more likely to behave in an unethical way or in an ethical way when they saw their peers and colleagues behave in one of those ways. So it wasn't the policy that really made the difference. It was really the culture. It was the people. It was what they saw to their right and to their left on that org chart. And that to me is really powerful, but also really inspiring. Because when you present the topic of, well, what is the effectiveness of our policies? My instinct, my hot take, is that the policy doesn't matter.
It's just a piece of paper. Few people are actually going to read it, unless they actually need it. And I think a lot of people, when they actually need it, are probably going to seek out the answer somewhere other than that piece of paper. And so, what I am always interested in when we're talking to people about their policies or some of the written elements of their program is less so about what is on that page and more so the context in which that piece of paper exists.
HUI: So true. And just to unpack a few things there, too. I can't cite you the research studies, but there's certainly a common belief that people turn to their colleagues on policy-related questions first. I know that's true for me. When I am in a new organization and I want to know how to do something, I ask someone. Now, being who I am, a compliance lawyer, I will also seek out the policy just to verify the information that I've heard. I did sort of do something like an experiment when I was at DOJ. One of the things I did at DOJ: when companies came in to present on their compliance program, they would bring in reams and reams of paper, their policies and procedures. And when they brought in those binders, I would kind of roll my eyes and say, if you can show me one single person in your company who has read these, other than the people who wrote them, then I will read them. Otherwise you can take them back. And this was actually the genesis of the Evaluation of Corporate Compliance Programs. Because I wanted to stop the killing of trees for really no good purpose.
So, the other thing I did, just to drive this point home to the prosecutors that I worked with: I killed some trees. I went to the regulations and printed out all the travel-related regulations that govern travel requirements for federal employees and for DOJ lawyers. It was a lot of paper. I mean, it was a good stack of paper. And the prosecutors in the Fraud Section, they travel a lot, not surprisingly. So, I went to one of their trainings and I said, you know, how many people here have read—and I cited a regulation and showed the pile of paper—and they're like, what's that? And I said, “These are the regulations that govern your travel,” and they kind of looked stunned. And I said, “How do you know you're in compliance with these regulations?”
And they said, “We ask the travel desk. We arrange all of our travel through the travel desk.” And I said, “When you were new, when you didn't know about the travel desk, what did you do?” They said, I asked someone on my team: I have travel coming up, what do I do? They say, go to the travel desk. I go to the travel desk and I assume they take care of whatever they need to do to meet all the requirements in that stack of paper that I was showing, right? So nobody had read those regulations.
ZACH: Yeah, of course.
HUI: They relied on a process. They relied on their colleagues to point them to that process, certainly in that little experiment that I ran. It's not really an experiment—it's like a little survey that I did, a live survey. That was how people approached it. So again, back to what are you trying to accomplish with your policies? Now, with this particular study, I kind of wish they hadn't gone with anti-corruption, because anti-corruption is kind of instinctive to most people, even if the people who engage in it know that they're not supposed to.
If they had tried this experiment with something a little less intuitive, like money laundering—which is not . . . most people, I think, don't know: what does money laundering look like? How does it work? And, you know, how are we involved? This is so much less intuitive to most people that I would be curious as to what the result of that would be.
ZACH: Yeah, I agree. The other thing that I want to emphasize, or maybe take the conversation to, is this isn't just about evaluating individual elements of the program, like training or policies or risk assessment. It's also about investing in measuring the overall effectiveness of the program as this interconnected set of activities and imperatives. And I've heard you say this many times over many years: the end goal is to prevent, detect, and remediate misconduct. That's why we do what we do.
And so, we also need to make sure that we're measuring against that when we look at all of these things together. And I think that this is actually the part that's conceptually more difficult for folks to really get their head around. It's hard. I just want to say that first and foremost, it's hard. But the beauty for us is that there is no one way to do this. There is no right way to do this. But what I encourage people to do and to think about when trying to measure those overarching programmatic outcomes of prevent, detect, remediate misconduct is to just sort of take stock of all of the data that you have.
It could be data around employee perceptions from your culture surveys. It could be employee approaches, or data that you have, on dilemma-based hypotheticals that maybe you embed in your trainings. It could be about the nature and occurrence of control breaches that you have from audits or from monitoring or from investigations. It could be transactional anomalies or high-risk expenditures that you have from analyzing your financial data or other transactional data that speaks to a particular area of risk, like loss or security breaches or whatever area of the program you're evaluating. It could be looking more generally at audit exceptions. It could be looking at reporting stats—so, not necessarily what is an investigation, but what's being reported, and what we know about the people and the types of issues that are being reported. And then, of course, looking at investigative data. So just take stock of that data and be creative around solving the problem, or coming up with an equation that takes all of these various data pieces and tells a story about how well the program is doing. Again, there is no one or right way to do this.
But to me, it's more than just presenting a series of KPIs. It's transforming that data into some sort of a mathematical equation or series of equations that you use to capture and tell a story about your program's effectiveness.
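To make that idea a little more concrete, here is a minimal sketch of one way such an "equation" could be assembled. It is illustrative only and not something prescribed in this episode: the signal names, values, and weights below are assumptions invented for the example, and any real version would be built around the outcomes a program has actually defined.

```python
# Illustrative sketch only: one hypothetical way to roll several compliance
# data signals into a single program-effectiveness score. The signal names,
# values, and weights are made up for illustration.
from dataclasses import dataclass


@dataclass
class Signal:
    name: str
    value: float   # observed result, scaled to 0..1 (1 = fully achieving the desired outcome)
    weight: float  # how heavily this signal counts toward the overall story


def effectiveness_score(signals: list[Signal]) -> float:
    """Weighted average of normalized signals, returned on a 0..1 scale."""
    total_weight = sum(s.weight for s in signals)
    if total_weight == 0:
        raise ValueError("at least one signal must carry weight")
    return sum(s.value * s.weight for s in signals) / total_weight


# Hypothetical quarterly inputs drawn from the kinds of sources mentioned above.
quarter = [
    Signal("culture survey: comfort raising concerns", 0.78, 3.0),
    Signal("dilemma questions answered in line with stated values", 0.64, 2.0),
    Signal("monitored transactions free of high-risk anomalies", 0.91, 2.0),
    Signal("audited controls operating without exception", 0.70, 1.5),
    Signal("reported issues remediated on time", 0.55, 1.5),
]

print(f"Program effectiveness this quarter: {effectiveness_score(quarter):.2f}")
```

However it is parameterized, the point is the one made above: a composite score like this forces you to name your desired outcomes and defend the weights you give them, rather than presenting a pile of disconnected KPIs.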
HUI: Very much so. I think, to the extent that compliance professionals have now begun to work with data, what we often see is a data dump. So, for example, we do program reviews, and often as part of our review we see people's presentations to their senior management, to the board, and we see things like a list of charitable donations.
Well, I can't imagine that makes a whole lot of sense to the people on the board. I mean, is that list good? Is that bad? It's completely not contextualized. What does it mean? Are they violating the policy? Are they not violating the policy? What is it, exactly? Why am I even looking at these numbers? And a lot of the struggle with the data, I think, also comes from this lack of definition, at the beginning of every project, of what is the outcome we're trying to achieve.
ZACH: Yeah.
HUI: So, I think if you are to walk away from this podcast with anything, it is to always ask, at the beginning of everything you do: what is it that I'm trying to accomplish with this?
ZACH: Yeah. I think that what folks will find . . . I think most of our clients would attest to this. When we're in very early discussions with people, there are a couple of things that we will pretty much routinely do. If we're, for example, going to do a program assessment and we're having an introductory call where we're maybe talking for the first or second time, we'll often ask folks, how's it going? Tell us about your risk assessment. How do you feel about it? And we'll often hear people say, we feel really good about it. We think it's . . . we feel like it's doing exactly what we intended it to do. Well, the first question we'll ask will usually be: what do you intend it to do? And the second question will usually be: why do you think it's working?
HUI: Yep.
ZACH: And honestly, we get good responses. A lot of the time we get very qualitative responses, which by the way is data as well. I don't want to discount that.
HUI: Absolutely.
ZACH: We also sometimes get silence. And rarely do we get quantitatively driven analysis. And so, I don't know, my takeaway from that is a couple of things. One, we just need to always be checking our assumptions. If we start a sentence with “I think,” that's a red flag. Why do we think that? What evidence do we have to support that? When we make a statement about an outcome—this is good, this is working well, people like this—we should be curious about and challenge those assumptions to understand why.
And so whenever I hear someone say, think about using a more engaging, humorous approach to your training—we did these cartoons, or we have this puppet—my question is always, okay, that sounds fun. That sounds exciting. I'm excited about that. But how do we know it's actually achieving a different outcome than something else? The other thing that comes to my mind when I see some of the discourse in our community is, there isn't a silver bullet. Just because something worked in one place doesn't necessarily mean it's going to work in another.
HUI: Very true.
ZACH: So even if there is some data to support that articulated “better way,” we still need to make sure that we're doing our diligence to measure how it might work or how it does work within our own shop. And I think this is one of the challenges that I have sometimes with the way that behavioral science is interpreted within the compliance community.
I fear it sometimes gets misunderstood as, well, this study says this, and so that's exactly what's going to happen within my world. But to go back to something that we said at the beginning, and a topic that we talk a lot about: context matters. So it might not, because your culture is different, your people are different, your business is different, lots of things are different. And so we need to be aware of that, test it, see how it goes, and then go from there.
HUI: I think this spirit of experimentation—or you can call it the spirit of verification—is so important. We can take that study that Benjamin and his colleagues did to a very different culture and a very different organization, and maybe we'll have a different result than what he found, right? But how do you know? How do you defend this against anyone's question about, is your program working? Is your training working? Is your policy working? Is your risk assessment working? To answer those questions, what would really be helpful is, one, articulating the outcome; two, collecting the data; three, being willing to experiment and verify.
ZACH: We see it from time to time, but it still tends to be I think a topic that folks are aware of and striving for but struggling with.
HUI: I think the first time I raised this idea was actually at a very small academic conference with a bunch of researchers, you know, compliance researchers. And when I raised the point that we should be measuring prevention, detection, and remediation, one of them immediately said, but you can't measure prevention.
Well, everybody else in any kind of preventive industry does—public health, public safety, all of those industries measure prevention. So what can we learn from them? And a lot of it is through the verification and experimentation procedures that we talked about.
ZACH: Hui, this is a topic that you've been talking about for a long time, that we've written about together and that we very much will continue to talk about in this space. So it's been fun. Thank you as always.
HUI: Thank you. This is so important I think for our profession.
ZACH: Absolutely.
ZACH: Thank you all for tuning in to The Better Way? Podcast. For more information about this or anything else that’s happening with CDE Advisors, visit our website at www.CDEAdvisors.com, where you can also check out the Better Way blog. And please like and subscribe to this series on Apple Podcasts or Spotify. And, finally, if you have thoughts about what we talked about today, the work we do here at CDE, or just have ideas for Better Ways we should explore, please don’t hesitate to reach out—we’d love to hear from you. Thanks again for listening.