Going From Accuracy to Loss Measures
And why enterprise AI kept optimizing the wrong objective for so long
Every week, it seems, I see a new announcement. A data science team somewhere has built a model with 95% accuracy. Maybe even 99%. Everyone high-fives, the project gets a green checkmark, and executives hear that their business is now “AI-driven.”
But then, a few months down the line, something funny happens. The business users are quietly ignoring the model’s outputs. The promised ROI never materializes. The model, for all its benchmark-crushing glory, is operationally irrelevant.
What went wrong? It’s not about “organizational resistance” or “lack of adoption.” That’s the easy excuse. The real problem is that we, the data scientists and analysts, have been chasing the wrong thing. We’re obsessed with accuracy, but accuracy is just a proxy for what really matters. It’s like a soccer team that celebrates having 80% ball possession but still loses the game 3-0. You won on a metric that doesn’t decide the outcome.
The real game isn’t about accuracy. It’s about minimizing loss. And if you don’t explicitly tell your model what a “loss” looks like in the real world, it will make its own assumptions. And trust me, it will almost certainly get it wrong.
The Problem with Perfect Scores
Here’s a little secret: machine learning models don’t actually optimize for accuracy. They’re built to minimize a loss function. This function is where we’re supposed to define what an error is, how much different errors hurt, and what trade-offs we’re willing to make. Accuracy is just a simplified, often misleading, summary of that process.
Its biggest, and most dangerous, assumption is that all errors are equally bad. A false positive costs the same as a false negative. This is a neat and tidy mathematical convenience, but in the real world, it’s almost never true.
Think about it in finance. Imagine you’re building a model to predict loan defaults, where “positive” means the applicant will default. A false positive means you deny a loan to someone who would have paid it back. That’s an opportunity cost, and it stings. But a false negative means you approve a loan for someone who defaults. That’s a direct, and often massive, financial loss. They are not the same. By optimizing for accuracy, you’re telling your model they are, which is a rookie mistake no seasoned analyst would ever make.
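To make this concrete, here’s a toy sketch. The dollar figures, the tiny dataset, and both “models” are invented for illustration; the point is only that the accuracy ranking and the cost ranking can disagree.

```python
# Two hypothetical loan-default classifiers, scored by accuracy
# versus by business cost. Positive class (1) = "will default".

def business_cost(y_true, y_pred, fp_cost=200, fn_cost=5000):
    """Total cost of errors, with made-up asymmetric prices.
    FP: flag a good borrower -> deny a good loan (~$200 lost income).
    FN: miss a defaulter -> approve a bad loan (~$5,000 write-off)."""
    cost = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 0:
            cost += fp_cost   # denied a good loan
        elif p == 0 and t == 1:
            cost += fn_cost   # approved a bad loan
    return cost

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# 20 applicants, 4 of whom actually default
y_true  = [0] * 16 + [1] * 4
model_a = [0] * 16 + [1, 0, 0, 0]       # timid: catches 1 default, misses 3
model_b = [0] * 12 + [1] * 4 + [1] * 4  # aggressive: 4 false alarms, 0 misses

print(accuracy(y_true, model_a), business_cost(y_true, model_a))  # 0.85, $15,000
print(accuracy(y_true, model_b), business_cost(y_true, model_b))  # 0.8,  $800
```

Model A “wins” on accuracy while costing the business almost twenty times as much, which is exactly the gap between the benchmark slide and the P&L.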
Your Loss Function Is Your Strategy
Choosing a loss function isn’t a technicality you leave to the defaults in your favorite library. It’s a strategy session. It’s where you answer the big questions: what kind of mistakes can we live with, and which ones will kill us?
Too often, we just grab what’s familiar—Mean Squared Error, Cross-Entropy—because they’re easy to work with. We let mathematical convenience dictate business strategy. The result is a model that’s a genius in a vacuum but a fool on the field. It’s disconnected from the very decision it’s supposed to improve.
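One way off the defaults is to weight the errors yourself. Here’s a minimal sketch of an asymmetrically weighted binary cross-entropy; the 10x weight on missed positives is an arbitrary illustration, not a recommendation.

```python
import math

def weighted_bce(y_true, p_pred, fn_weight=10.0, fp_weight=1.0):
    """Binary cross-entropy with asymmetric error weights: misses on
    the positive class are penalized fn_weight times harder than
    false alarms on the negative class."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, 1e-12), 1 - 1e-12)  # clip for numerical safety
        if y == 1:
            total += -fn_weight * math.log(p)
        else:
            total += -fp_weight * math.log(1 - p)
    return total / len(y_true)

y = [1, 0]
# One error each, but on different sides of the asymmetry:
print(weighted_bce(y, [0.2, 0.2]))  # misses the positive: expensive
print(weighted_bce(y, [0.8, 0.8]))  # false alarm instead: much cheaper
```

With the default symmetric weights those two mistakes would score identically; the weighted version finally tells the optimizer which one the business actually fears.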
This is where we get burned by the gap between empirical risk (how the model does on the data it’s already seen) and expected loss (how it will perform on the messy, unpredictable data of the future). Accuracy loves looking in the rearview mirror. It tells you how well you did on yesterday’s game. But it tells you almost nothing about how you’ll do tomorrow, when the opponent has changed their formation.
The Real World Is Always Asymmetric
This isn’t some niche, corner-case problem. Asymmetry is the name of the game in any decision that matters. This is especially true in finance and the rapidly growing field of sustainability, where the stakes can be enormous.
| Domain | False Positive (The “Oops”) | False Negative (The “Oh No”) | The Real Cost | The Real-World Impact |
| --- | --- | --- | --- | --- |
| Credit Risk | Deny a good loan | Approve a bad loan | High | You miss a goal vs. you score an own-goal. |
| Fraud Detection | Annoy a good customer | Miss a huge fraud case | Medium | A yellow card vs. a red card and a penalty. |
| Climate Modeling | Over-invest in green tech | Underestimate climate risk | Very High | A costly substitution vs. getting relegated from the league of habitable planets. |
| ESG Investment | Wrongly flag a good company | Miss a company hiding a toxic culture | High | A minor PR headache vs. a full-blown portfolio disaster. |

In sustainability, underestimating the risk of a climate event isn’t just a bad quarter; it’s potentially catastrophic. The cost of being wrong in one direction is existential. Building an AI to guide sustainability strategy with a symmetric loss function isn’t just bad data science; it’s a failure of responsibility.
That 0.5 Threshold? You Just Made a Huge Decision.
Nowhere is this lazy thinking more obvious than with the default probability threshold of 0.5 in classification models. By using that default, you are implicitly stating that a false positive is exactly as bad as a false negative. You just made a massive strategic decision without even thinking about it.
Honestly, tweaking your decision threshold will often have a bigger impact on your business than spending another month trying to squeeze 0.5% more accuracy out of your model. The threshold is your risk tolerance, made tangible. It’s not a technical afterthought; it’s the decision itself.
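Under the standard expected-cost argument, the break-even threshold falls straight out of the two error costs: act on a predicted probability p whenever (1 − p) × FP cost < p × FN cost. Here’s a sketch using the same hypothetical credit-risk prices as above:

```python
def cost_optimal_threshold(fp_cost, fn_cost):
    """Predict positive when the expected cost of a false alarm is
    below the expected cost of a miss:
    (1 - p) * fp_cost < p * fn_cost  =>  p > fp_cost / (fp_cost + fn_cost)."""
    return fp_cost / (fp_cost + fn_cost)

# Symmetric costs recover the familiar 0.5 default...
print(cost_optimal_threshold(fp_cost=1, fn_cost=1))       # 0.5
# ...but with a $200 false alarm vs. a $5,000 miss, the
# break-even point is nowhere near it:
print(round(cost_optimal_threshold(200, 5000), 3))        # 0.038
```

With those (invented) costs, you should flag anyone with even a ~4% default probability. Leaving the threshold at 0.5 silently asserts the two costs are equal.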
The Seductive Comfort of a Meaningless Number
So if accuracy is so flawed, why are we all still so obsessed with it? Because it’s easy. It’s a single number. It looks objective. It lets you avoid the hard, messy, and often political conversations about what the organization truly values and what risks it’s willing to take.
Formalizing a loss function forces you to have those conversations. It makes you put a number on how much it hurts to lose a customer versus how much it hurts to let a fraudulent transaction slip through. It forces clarity. Accuracy provides plausible deniability. A custom loss function exposes the stakes. It’s no wonder so many organizations prefer to stick with the comforting lie.
From Model-Obsessed to Decision-Driven
So how do we fix this? We need a new playbook. One that starts with the decision, not the data.
Before you write a single line of code, ask these questions:
What’s the actual decision we’re trying to make here? (Not “predict churn,” but “decide whether to offer a customer a 20% discount to prevent them from churning.”)
What are the possible actions we can take? (Offer discount, send an email, do nothing.)
What are the real-world costs and benefits of every possible outcome? (What’s the cost of the discount? What’s the lifetime value of the customer we save? What’s the cost of annoying a happy customer with a discount?)
Who is on the hook for these costs? (The marketing budget? The sales team?)
Once you have those answers, then you can start thinking about a model. Accuracy becomes a constraint, not the goal. The model needs to be good enough to be better than a coin flip, but your real objective is to build a system that makes the best possible decision by minimizing expected loss.
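The churn example above can be sketched as a thin decision layer on top of whatever model produces the churn probability. Every number here (customer value, discount cost, save rate) is a placeholder you would replace with answers from those strategy questions:

```python
LTV = 1000           # hypothetical value of a retained customer
DISCOUNT_COST = 200  # hypothetical cost of the 20% discount
SAVE_RATE = 0.7      # assumed fraction of would-be churners the discount retains

def expected_loss(action, p_churn):
    """Expected cost of each action, given the model's churn probability."""
    if action == "do_nothing":
        return p_churn * LTV                               # lose the churners
    if action == "discount":
        # pay the discount either way; still lose the churners it fails to save
        return DISCOUNT_COST + p_churn * (1 - SAVE_RATE) * LTV
    raise ValueError(f"unknown action: {action}")

def best_action(p_churn):
    return min(["do_nothing", "discount"],
               key=lambda a: expected_loss(a, p_churn))

for p in (0.1, 0.3, 0.6):
    print(p, best_action(p))
```

With these numbers the break-even churn probability is about 0.29, so the model only needs to separate customers around that point; shaving another half-percent off its error rate elsewhere changes nothing about the decision.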
This is a fundamental shift. It’s a move from being a model-builder to being a problem-solver. It’s less about celebrating a high score and more about building a robust, reliable system that wins, quietly and consistently, in the real world.
The challenge for all of us is to lead this change. It’s time to stop chasing the seductive glow of a meaningless accuracy score and start optimizing for what truly matters. It’s time to decide what game we’re playing, and then play to win.



