Correlation, Causation, and Law Enforcement

Charles Fain Lehman, writing for National Review (“Progressives Are Overreacting to a Startling Crime Study,” April 18, 2021), discusses a recently published paper on the prosecution of misdemeanor crimes.

To oversimplify, the paper concludes that prosecuting relatively minor offenses dramatically increases the odds of the person prosecuted becoming a repeat offender. The devil, however, is in the details.

Lehman does good work in picking through the details of the paper. He does not refute it. He, rightly, points out that some are probably putting too much emphasis on it (I suggest you read the entire NR article).

I am not qualified to comment on the accuracy or relevance of the conclusions in Lehman’s article or in the paper. However, Lehman points out that the paper’s authors suggest the increased odds of reoffending may be the result of the criminal record making that person less employable. This leads me to ask: What is the causal factor for criminal behavior? According to Lehman, the paper concludes it is the unnecessary prosecution of low-level offenses. Is that the case, or is it that the person is unable to find a job? One may cause the other, but which is the causal factor? People can have difficulty finding a job without having a criminal record, so I would argue that the two can be separated.

Correlation and causation are not the same, not even close. Insurance companies live and breathe statistics. If you are a young, unmarried driver who just got your license, you can expect (all other things being equal) to pay much more for car insurance than someone 30 years old, married, with a clean driving record. The insurance companies’ statistical analysis says younger drivers are at higher risk, and this is statistically true. But age is not the causal factor; behavior is. There are responsible 16-year-olds, and there are irresponsible 55-year-olds.

Take a look at conviction rates. Federal or state, doesn’t matter. There are some pretty staggering differences in the conviction rates for different categories of offense. Some are as low as single digits (6%). A reasonable person might conclude several things from such statistics. A certain category of crime may be very hard to prosecute, or the standard of proof may be difficult to meet. One might also conclude that where the conviction rate is as low as 6%, no crime was committed in 94% of the cases, or that the evidence was so weak that the arrest and charge should never have happened. The point is, from such statistics we cannot draw a reasonable conclusion as to cause without more detailed data.

I’ll just make up a scenario. I am not aware of any such study, but let’s just pretend it exists. The made-up study says that, on average, the life span of people without a high school diploma is shorter than that of people with one. Would any reasonable person conclude that life span is tied to a simple piece of paper? No, there is no reasonable causal connection. People without a high school diploma may generally be poorer, may not have as much access to medical care, and may end up in more dangerous jobs. Those would all be reasonable causal factors for a reduced life span; having a high school diploma would not be.
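To make the made-up scenario concrete, here is a minimal sketch in Python. Every number in it is invented for illustration: the `simulate` and `gap` functions, the 50% poverty rate, the diploma probabilities, and the life spans are all assumptions, not data from any study. In this toy world, poverty affects both the odds of holding a diploma and the life span, while the diploma itself has no effect on life span at all.

```python
import random
from statistics import mean

def simulate(n=20000, seed=0):
    """Build a toy population of (poor, diploma, lifespan) tuples.

    All probabilities and life spans are invented for illustration.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        poor = rng.random() < 0.5
        # Poverty lowers the chance of finishing high school.
        diploma = rng.random() < (0.4 if poor else 0.9)
        # Life span depends on poverty only; the diploma plays no role here.
        lifespan = rng.gauss(72.0 if poor else 80.0, 5.0)
        people.append((poor, diploma, lifespan))
    return people

def gap(people):
    """Average life span with a diploma minus average without one."""
    with_diploma = [life for _, d, life in people if d]
    without_diploma = [life for _, d, life in people if not d]
    return mean(with_diploma) - mean(without_diploma)

people = simulate()

# Comparing everyone at once mixes the effect of poverty into the diploma gap.
naive_gap = gap(people)

# Comparing like with like (poor vs. poor, non-poor vs. non-poor) removes it.
poor_gap = gap([p for p in people if p[0]])
nonpoor_gap = gap([p for p in people if not p[0]])

print(f"Gap across everyone:      {naive_gap:.2f} years")
print(f"Gap among the poor:       {poor_gap:.2f} years")
print(f"Gap among the non-poor:   {nonpoor_gap:.2f} years")
```

Across the whole population the diploma holders live noticeably longer, yet within each income group the gap all but disappears. The correlation is real; the piece of paper is not the cause.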

We see this confusion (accidental or intentional) between causation and correlation in the news constantly. Reporters rarely question it. There are “elevated health risks” because there is a firearm in the home. Well, there are also elevated health risks if you live in certain areas of Chicago. Is the risk tied to geography? If the firearm is in the house next door, does that reduce the risk? This is not causation; it is correlation. As a veteran of countless root cause analysis exercises, I can assure you that if you don’t know the difference between causation and correlation, you will almost certainly never arrive at the root cause.

You have to ask the right question. What do you expect the odds are that you will arrive at the right answer if you ask the wrong question?

The reporting throughout the last year on COVID is a good example. We’ve all witnessed the changes coming from health experts and the media. The simple answer is “they don’t know.” Trouble is, they won’t admit it.

You don’t often hear of Bonini’s paradox. Simply stated, when modeling a complex system, the model can never be completely accurate. If the model were to achieve complete accuracy, it would, in fact, become reality. A year ago, when the folks at the University of Washington were publishing their forecasts of the death toll from COVID, they had almost no data. The complexities of how it is transmitted could not have been factored in accurately because the data simply didn’t exist (and probably still does not). According to Bonini’s paradox, the UW model had to be wrong; it was a virtual certainty. The only question was how wrong.

Statistical analysis is a powerful and useful tool. Strong statistical correlations should not be ignored. But without knowing the actual mechanics of the causal link, they are just that – correlations.

Progressives are attaching too much relevance to the paper noted earlier. They are arguing for reduced prosecutions AND defunding the police AND disarming law-abiding citizens. They are doing all of this based on bad data. Does any reasonable person expect that the confluence of those three issues, if progressives get their way, will result in anything positive?

Is there room for improvement in our criminal justice system? Yes, I think there is. I think there is a lot to the argument that police officers should not be sent out to enforce silly laws at gunpoint, if necessary. If some scumbag is selling untaxed cigarettes on the street corner, I don’t think that warrants a gunfight, and I think police and prosecutors ought to be allowed to determine what is and is not worth pursuing without fear of losing their jobs.

But the difference between correlation and causation is still there, and confusing the two drives a significant amount of bad information – in the eyes of this author.



The Paper
