Correlation !=, ==, ?? Causation


One of the first things those of us working in data science and machine learning have to grapple with is that our claims are predictive, not inferential. Instead of making careful, logical moves from one clear, proven claim to another, we mix together a set of features, build a model, and then present our results as a reasonably probable forecast of the future.

Thoughtful people, however, know that “correlation does not equal causation.” Just because two things can be shown to occur in some sort of reliable sequence does not mean that Thing A necessarily caused Thing B. When I leave for work, I notice that the sun is often rising. But unlike the rooster, I can’t really take credit for that.
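
To make the sunrise example concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the seasonal drift in sunrise time, the habit of leaving about 30 minutes after sunrise, and the names `sunrise`, `observed_leave`, and `forced_leave` are all assumptions, not real data. It compares the correlation we would observe under my usual habit with the correlation we would see if I left at a random time each morning.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
days = 365

# Hypothetical sunrise times, in minutes after midnight, drifting with the season.
sunrise = [360 + 60 * math.cos(2 * math.pi * d / days) for d in range(days)]

# Observed habit (assumed): I leave roughly 30 minutes after sunrise.
observed_leave = [s + 30 + rng.gauss(0, 10) for s in sunrise]

# Experiment (assumed): I leave at a random time, ignoring the sun entirely.
forced_leave = [rng.uniform(300, 480) for _ in range(days)]

r_observed = pearson(observed_leave, sunrise)
r_forced = pearson(forced_leave, sunrise)

print(f"habitual departures vs. sunrise: r = {r_observed:.2f}")
print(f"random departures vs. sunrise:   r = {r_forced:.2f}")
```

Under the habit, departure time and sunrise move together, so the correlation is strong. Once I choose my departure time at random, the sun keeps its schedule and the correlation falls away, which is what we would expect if my leaving never caused the sunrise in the first place.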

