DESCRIPTIVE VS. PREDICTIVE STATISTICS
What is the difference between descriptive and predictive statistics?
Stephen G. Barone
barodine marketing communications & research
Imagine you’re the president of a credit-card company. You survey your customers, and one of the questions you ask is whether they have checked their own credit score in the last year. Further, imagine that 30 percent of the respondents said they had.
In descriptive statistics you are simply reporting that 30 percent of the respondents checked their credit scores. You are not making any inferences or predictions; you are simply describing the situation. This might be useful in any number of ways. For example, knowing that 30 percent are doing so, you might decide to make it easier for people to check their credit on your website and advertise that as a feature and benefit.
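As a minimal sketch of what "simply describing" looks like in practice, the snippet below computes the share of respondents who answered yes. The response list and the function name share_who_checked are invented for illustration and do not come from any real survey.

```python
# Hypothetical survey answers (invented data): True means the respondent
# said they checked their credit score in the last year.
responses = [True, False, False, True, False, False, False, True, False, False]


def share_who_checked(answers):
    """Return the fraction of respondents who answered yes."""
    # True counts as 1 and False as 0, so sum() gives the number of yeses.
    return sum(answers) / len(answers)


print(f"{share_who_checked(responses):.0%} of respondents checked their score")
```

Nothing here goes beyond the sample itself: the number describes these respondents and makes no claim about anyone else.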
Predictive or Inferential Statistics
In predictive (or inferential) statistics, however, you would be interested in knowing if your customers who checked their credit scores are different in some way from those who did not. For example, do the 30 percent who checked their credit within the last year have better or worse credit than those who did not?
To assess this, you might look at the credit scores of the 30 percent and compare them to those of the rest of the respondents by using any number of statistical tests, e.g., chi-square, Pearson's r correlation, F-test, ANOVA, etc. Then:
- If you found that people who checked their credit scores had, as a group, lower credit ratings than the rest of the sample, then checking one’s credit would be a negative predictor of credit rating.
- Conversely, if you found that people who checked their credit scores had, as a group, higher credit ratings than the rest of the sample, then checking one’s credit would be a positive predictor of credit rating.
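The comparison described above can be sketched in a few lines. The example below uses a permutation test on the difference in group means, which is a swapped-in technique chosen because it needs only the Python standard library; it is not one of the tests named in this article. The credit scores, the function name permutation_test, and its parameters are all hypothetical.

```python
import random
from statistics import fmean


def permutation_test(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = fmean(group_a) - fmean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = fmean(pooled[:n_a]) - fmean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the estimated p-value above zero.
    p_value = (extreme + 1) / (n_shuffles + 1)
    return observed, p_value


# Invented credit scores, for illustration only.
checked = [720, 705, 690, 740, 710, 735]
did_not_check = [650, 700, 640, 680, 660, 670, 655, 690]

diff, p = permutation_test(checked, did_not_check)
direction = "positive" if diff > 0 else "negative"
print(f"difference in means: {diff:.1f}, p = {p:.4f}, {direction} association")
```

A small p-value suggests the difference between the groups is unlikely to be a fluke of this particular sample. The sign of the difference tells you whether checking one’s credit looks like a positive or a negative predictor.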
The size and representativeness of the sample relative to the population are much more crucial in predictive statistics than in descriptive statistics. Why?
That 30 percent of the sample checked their credit is indisputable. But using that datum to predict that an individual is more or less likely to default on a loan if she checked her credit in the last year is a much more aggressive and consequential claim.
So what’s the upshot?
Predictive (or inferential) statistics demand larger samples and higher response rates than descriptive statistics do. The sample must be representative of the target population about which you want to make inferences. And your survey needs to demonstrate adequate validity, reliability, and internal consistency.
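To see why sample size matters once you generalize beyond the respondents, here is a rough sketch of the margin of error around a 30 percent result at several sample sizes. It assumes a simple random sample and the usual normal approximation for a proportion; the function name margin_of_error and the sample sizes are chosen purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(p_hat, n, confidence=0.95):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation.
    """
    # Two-sided critical value, roughly 1.96 at 95 percent confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return z * sqrt(p_hat * (1 - p_hat) / n)


for n in (100, 400, 1600):
    moe = margin_of_error(0.30, n)
    print(f"n = {n:>4}: 30% plus or minus {moe:.1%}")
```

Under these assumptions, quadrupling the sample halves the margin of error. Note that this addresses sampling error only: a large sample that is unrepresentative, or that suffers from a low response rate, can still mislead.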
After all, saying that a ball team won seven of its last ten games is one thing. Predicting that they will win the 11th based on that datum is quite another!
Stephen G. Barone is a marketing communications specialist and co-principal at barodine marketing communications & research, a general contractor of creative and analytical marketing talent to the science, technology, engineering, medical, professional, and general business communities.
Please feel free to duplicate and link to this article, with the following notice:
© 2014 barodine marketing/communications/design