Bias is a bad word

Dec 2025

The meaning of a word is its use in the language.

Ludwig Wittgenstein, Philosophical Investigations

The Label is the Argument

In the abortion debate, each side chose its own name. One calls itself pro-life; the other calls itself pro-choice. Neither side named itself anti- anything. Each picked the label that sounds principled and left the ugly framing for the opponent: anti-abortion, anti-choice.¹

The underlying positions haven’t changed. But the labels carry an emotional charge that the arguments themselves don’t. Pro-life sounds like basic decency. Anti-choice sounds authoritarian. Same person, same belief, different word — and you feel differently about them before they’ve said a thing.

We tend to worry about profane or hateful words. We shouldn’t, or at least not only about those. The words that do the most damage are the ones that sound reasonable while smuggling in a verdict. Words that are imprecise or coercive, or that carry several meanings at once. Words that chill discussion before it begins.

Bias is one of those words.

A Word Doing Too Much Work

Look up “bias” and you will find it doing at least three different jobs.

In statistics, it means systematic error; a biased estimator consistently overshoots or undershoots the true value.
In psychology, it means a cognitive shortcut that leads us astray: confirmation bias, anchoring bias, the availability heuristic.
In everyday speech, it means prejudice — an unfair disposition against a group of people.
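
The statistical sense is the only one of the three that can be checked with arithmetic. Here is a minimal sketch, assuming standard-normal data and a sample size of five (both are illustrative choices, and the function names are mine): a variance estimate that divides by n undershoots the true value on average, while one that divides by n - 1 does not.

```python
import random


def variance_divide_by_n(xs):
    """Variance estimate that divides by n (the biased estimator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)


def variance_divide_by_n_minus_1(xs):
    """Variance estimate that divides by n - 1 (the unbiased estimator)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def average_estimate(estimator, sample_size=5, trials=20000, seed=0):
    """Average the estimator over many small samples from N(0, 1).

    The true variance is 1.0, so a systematic gap from 1.0 is bias
    in the statistical sense.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total += estimator(sample)
    return total / trials


biased_mean = average_estimate(variance_divide_by_n)
unbiased_mean = average_estimate(variance_divide_by_n_minus_1)
print(f"divide by n:     {biased_mean:.3f}")
print(f"divide by n - 1: {unbiased_mean:.3f}")
```

With these settings the first average should settle near 0.8 and the second near 1.0. Nothing about the undershoot involves intent or unfairness; it is a property of the formula.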

A biased coin and a biased jury share a word, but they share almost nothing else. The coin has no intent. The jury does. Yet when we say a machine learning model is “biased,” we borrow from all three meanings at once, and the listener gets to pick whichever feels most alarming.

When a researcher writes that a language model exhibits “bias,” what do they mean? That the model’s predictions are statistically skewed? That it has learned patterns we find socially unacceptable? That the model is prejudiced? The word lets you mean all of these simultaneously, which means it lets you mean none of them precisely.

What Models Actually Learn

A language model trained on a large corpus of text picks up the statistical regularities of that text. Tables typically have legs. Birds typically fly. Nurses appear as “she” more often than “he.” None of these are decisions. They are patterns, absorbed from millions of documents written by millions of people.
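
To make “patterns, not decisions” concrete, here is a deliberately tiny sketch. The five-sentence corpus is invented for illustration, and real models learn far richer statistics than raw counts, but the principle is the same: whatever pronoun frequencies come out were already in the text that went in.

```python
from collections import Counter

# Hypothetical toy corpus, invented for illustration only.
corpus = [
    "the nurse said she would check the chart",
    "the nurse said she was on the night shift",
    "the nurse said he would call the doctor",
    "the engineer said he fixed the bridge",
    "the engineer said she reviewed the design",
]


def pronoun_counts(sentences, noun, pronouns=("she", "he")):
    """Count pronouns in every sentence that mentions the given noun."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if noun in tokens:
            counts.update(t for t in tokens if t in pronouns)
    return counts


def pronoun_frequencies(sentences, noun):
    """Turn the raw counts into relative frequencies."""
    counts = pronoun_counts(sentences, noun)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()} if total else {}


nurse_freq = pronoun_frequencies(corpus, "nurse")
print(nurse_freq)
```

In this toy corpus “nurse” co-occurs with “she” twice and “he” once, so the frequencies are two thirds and one third. The code made no choice about nurses; it counted.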

We need these patterns. The ability to represent that some things are more typical than others is not a flaw; it is the entire point. Yet when the same mechanism picks up that certain professions are gendered, we call it “bias” — the same word we use for racial prejudice and rigged courts. The model has no id to motivate malicious action. It reflects the statistical reality of the text we wrote down. And some of what it reflects is uncomfortable.

Caliskan et al. showed that word embeddings trained on ordinary English text reproduce the same associations found in human implicit association tests.² The model “knows” that flowers are more pleasant than insects, and also that European-American names are more associated with pleasant words than African-American names. Both associations come from the same mechanism. Both are accurate reflections of the training data.
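
The measurement behind that result is simple enough to sketch. The code below is written in the spirit of the paper's association test, not as a reproduction of it: the two-dimensional vectors are hand-made toy values rather than real embeddings, and the function names are my own. It compares how close each target word sits to a set of pleasant words versus a set of unpleasant ones.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word_vec, attr_a, attr_b):
    """Mean similarity to attribute set A minus mean similarity to set B."""
    sim_a = [cosine(word_vec, a) for a in attr_a]
    sim_b = [cosine(word_vec, b) for b in attr_b]
    return mean(sim_a) - mean(sim_b)


def effect_size(targets_x, targets_y, attr_a, attr_b):
    """Difference in mean association between two target sets,
    scaled by the spread of associations across all target words."""
    s_x = [association(x, attr_a, attr_b) for x in targets_x]
    s_y = [association(y, attr_a, attr_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical 2-d "embeddings", invented for illustration only.
flowers = {"rose": (0.9, 0.1), "tulip": (0.8, 0.2)}
insects = {"wasp": (0.1, 0.9), "moth": (0.2, 0.8)}
pleasant = [(1.0, 0.0), (0.9, 0.2)]
unpleasant = [(0.0, 1.0), (0.2, 0.9)]

score = effect_size(list(flowers.values()), list(insects.values()),
                    pleasant, unpleasant)
print(f"flowers vs insects effect size: {score:.2f}")
```

The arithmetic is identical whichever word lists are plugged in. The same few lines measure the harmless association and the uncomfortable one, which is the point: there is one mechanism, not two.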

The authors are extremely cautious about saying anything positive about these associations. They reach for “veridical” — meaning truthful — rather than risk saying a “bias” could be correct. A bad word forces good researchers into needless jargon, because the plain statement (“the model’s learned association here is accurate”) sounds like an endorsement of prejudice.

Priors, Not Bias

There is a better word, and it already exists: priors.

In Bayesian statistics, a prior is a belief held before seeing new evidence. It is neither good nor bad; just a starting point, the background knowledge you bring to a problem. Priors can be updated. They can be wrong. The word carries no moral judgement.
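
A worked example shows how undramatic this is. The sketch below assumes the textbook Beta-Binomial setup for estimating a coin's chance of heads; the class name and the specific numbers are illustrative. The prior is two numbers, and updating it is addition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPrior:
    """A Beta(alpha, beta) belief about a probability between 0 and 1."""
    alpha: float
    beta: float

    @property
    def mean(self):
        """Expected value of the probability under this belief."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, successes, failures):
        """Return the posterior after observing new evidence.

        For a Beta prior with binomial data, the posterior is again
        a Beta distribution with the observed counts added on.
        """
        return BetaPrior(self.alpha + successes, self.beta + failures)


prior = BetaPrior(alpha=2, beta=2)                  # starts out expecting a fair coin
posterior = prior.update(successes=8, failures=2)   # then sees 8 heads, 2 tails

print(prior.mean)                 # 0.5
print(round(posterior.mean, 3))   # 0.714
```

The starting belief was not a moral failing, and neither is the updated one. Both are positions held given the evidence seen so far.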

When we say a model has priors instead of biases, several things change. The moral panic subsides. Priors are something to examine, not something to condemn. The conversation shifts from blame to mechanics: where did this prior come from? Is it accurate? Should we update it? And we stop pretending the model is doing something the training data didn’t. The model has priors because we have priors; it learned them from us.

Accuracy is Not Aspiration

“Bias” makes one thing difficult to say plainly: accuracy and aspiration can point in different directions. Certain professions are gendered; the model picks up on this because the data says so. But we don’t want it to be the case. We want a world where “nurse” and “engineer” carry no gender signal. The model’s prior is accurate; we just wish the world it reflects were different.

Not all priors are facts — there is no universal rule that insects are unpleasant — but they all reflect the text we have written down.

The Blame Belongs Elsewhere

Learned priors can cause real harm. A hiring algorithm that associates “engineer” with “male” reinforces the pattern it learned. A language model that completes “The criminal was” with a racial stereotype amplifies the association each time it runs.

But calling these “biases” misplaces the blame. It makes the model sound like a bigot when it is closer to a parrot. You can be biased; it just has priors. The machine picked up our habits, much like a child who picks up a swear word: the child did nothing wrong by learning it; we just wish they hadn’t.

The distinction changes what we do next. If the model is biased, the fix sounds like moral correction: make it less biased, remove the bias, debias it. If the model has priors, the fix sounds like engineering: identify which priors are harmful, decide what to replace them with, and update accordingly. One framing invites outrage. The other invites solutions.

Teaching Machines, Teaching Ourselves

Perhaps, much like children, we can teach machines to hold better priors than the ones we gave them. But we won’t get there by calling every learned association a bias and every pattern a prejudice. We’ll get there by being precise about what models have learned, where those patterns came from, and what we want instead.

The first step is easy. It costs nothing. It is just a word.

When you want to say “bias,” say “priors” instead.

¹ Brent Hodgeson explores this in Labels Matter. The pattern extends beyond abortion: he traces how the same framing trick plays out across vaccination, politics, and data science.
² Caliskan, Aylin, Joanna J. Bryson, and Arvind Narayanan. “Semantics derived automatically from language corpora contain human-like biases.” Science 356.6334 (2017): 183-186.