WHEN AI GETS THE RIGHT ANSWER FOR THE WRONG REASON
- Mar 24
- 2 min read
An AI system passed every science benchmark, and its strong results were reported in Nature. Yet it was categorically wrong.
The system was supposed to predict which drug molecules bind to a biological target - a critical step in getting new treatments to patients.
What it actually delivered was the preferences of individual chemists.
The training data was shaped by their habits. Certain chemists specialise in certain targets. They produce structurally similar compounds. They get good results. The model learned to recognise these chemists' styles - not molecular behaviour.
Nobody caught it from the outputs. The predictions looked right. The benchmarks looked right. It took people who understood the science - deeply, intuitively, from years of practice - to see that the model had learned a shortcut through the data rather than the thing the data was supposed to represent.
The implications stretch well beyond drug discovery. Digital intelligence can score perfectly on every test and still be encoding a proxy - the pattern that correlates with the answer rather than the pattern that causes it. In insurance underwriting, that's a pricing model that looks accurate until the claims come in. In financial services, it's a compliance screen that learned formatting conventions rather than regulatory logic.
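That proxy failure is easy to demonstrate on toy data. The sketch below is a minimal, hypothetical illustration - not the study's actual data or model. Compounds carry only each chemist's stylistic "signature" in feature space, and the label follows the chemist's specialty. A simple nearest-neighbour model looks excellent on an ordinary random train/test split, because the same chemists appear on both sides. Hold out whole chemists - the split a domain expert would insist on - and the score collapses towards chance.

```python
import random

random.seed(0)

DIM, N_CHEMISTS, PER_CHEMIST, N_TARGETS = 5, 24, 30, 4

# Hypothetical setup: each chemist has a stylistic signature (a cluster
# centre in feature space) and a specialty target. The label depends only
# on the specialty - there is no genuine chemistry in the features at all.
centers = [[random.gauss(0, 3) for _ in range(DIM)] for _ in range(N_CHEMISTS)]
data = []  # (features, label, chemist_id)
for c in range(N_CHEMISTS):
    label = c % N_TARGETS  # the chemist's specialty determines the label
    for _ in range(PER_CHEMIST):
        x = [centers[c][d] + random.gauss(0, 0.5) for d in range(DIM)]
        data.append((x, label, c))

def dist2(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nn_accuracy(train, test):
    # 1-nearest-neighbour: predict the label of the closest training compound.
    hits = sum(min(train, key=lambda t: dist2(t[0], x))[1] == y
               for x, y, _ in test)
    return hits / len(test)

# Random split: the same chemists appear in train and test,
# so the model can match each test compound to its author's style.
shuffled = data[:]
random.shuffle(shuffled)
cut = int(0.8 * len(shuffled))
random_acc = nn_accuracy(shuffled[:cut], shuffled[cut:])

# Group split: hold out whole chemists, so style is no longer a shortcut.
held_out = set(range(16, 24))
train = [d for d in data if d[2] not in held_out]
test = [d for d in data if d[2] in held_out]
group_acc = nn_accuracy(train, test)

print(f"random split: {random_acc:.2f}, chemist-held-out split: {group_acc:.2f}")
```

The benchmark in the first number is the one that "looked right"; the second is the one the scientists' intuition effectively ran.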
Which is exactly why the agentic AI era needs to be one in which we build more and better human subject matter expertise, not one that makes it redundant. The underwriter who spots the proxy learned pricing by doing pricing. The radiologist who catches the AI's mistake learned to catch it by reading thousands of scans without AI assistance. Human intelligence in partnership with digital intelligence is nearly always the best answer.
Yesterday we wrote about the gap between AI exposure and AI displacement - how the roles most touched by AI may grow rather than shrink. And this is an example of why. The roles most exposed to AI are often the ones whose judgement you need most to keep AI honest.
So what's today's in-the-end-at-the-end?
The model learned the chemist, not the chemistry. The only person who could tell the difference was someone who already knew the science.
Protect that person.