I can pretty much step you through what most first-time forecasters
do, and then explain where you'll sooner-or-later end up: Using a
logarithmic scale.
Bad Method 1: Using a Linear Scale
[Chart caption: "Crappier than it seems."]
This works, right? Sort of, but it has two big drawbacks:
Static Level of Precision
You probably don't want all of your error buckets to be
the same size; this gets ridiculous pretty quickly.
I mean, reporting on a 100-110% bucket makes sense -- but what about the 170-180% bucket (is it sufficiently different from 180-190% to warrant its own bucket)? And it makes zero sense to report a 780-790% bucket or (even more ridiculous) an 8,920-8,930% bucket. Yet that's exactly what you'll get with this approach.
Ideally, your buckets will grow along with your error: you need fine precision when your forecasts are close, and coarser precision for your big misses -- and fixed-width buckets can't give you both.
Asymmetrically Bounded
On a percentage chart like the one above, the smallest error bucket is fixed: 0-10%. Your forecast can't get any lower than 0% of actual. However, your forecast could be a bazillion times higher than actual.
As a result, when you plot out your error, you'll get a very short "low-tail", and a very long "high-tail". Among other problems, this makes it impossible to visually scan for forecast bias, because your too-high and too-low forecasts aren't reported in a consistent fashion.
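To make the fixed-width problem concrete, here's a minimal Python sketch (the function name and the 10%-wide buckets are my own illustration, not anything standard):

```python
# A sketch of linear bucketing: every bucket is 10 percentage points wide,
# no matter how extreme the forecast/actual ratio gets.
def linear_bucket(forecast, actual):
    ratio_pct = 100.0 * forecast / actual
    low = int(ratio_pct // 10) * 10  # floor to the nearest 10%
    return f"{low}-{low + 10}%"

print(linear_bucket(105, 100))   # 100-110% -- a sensible bucket
print(linear_bucket(8925, 100))  # 8920-8930% -- a useless one-entry bucket
```

Every wild overshoot lands in its own tiny, nearly-empty bucket, exactly as described above.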
Bad Method 2: Using a Custom Scale
From there, most people move on to creating customized
buckets. So you'll write statements like this:
If err = 0 then "Match"
else if err > 0 and err <= 1% then "0-1%"
else if err > 1% and err <= 5% then "1-5%"
else if err > 5% and err <= 10% then "5-10%"
else if err > 10% and err <= 25% then "10-25%"
...
...But you'll probably discover that your buckets are always somewhat arbitrary and never quite pleasing. And, of course, any time you want more- or less-granular buckets, you gotta mess with your equation.
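For what it's worth, that hand-rolled cascade might look like this in Python (the edges and labels here are the same arbitrary choices, just table-driven); note that changing granularity still means hand-editing two parallel lists:

```python
import bisect

# Hand-picked bucket edges (percent error) -- arbitrary, as noted above.
EDGES = [0, 1, 5, 10, 25]
LABELS = ["Match", "0-1%", "1-5%", "5-10%", "10-25%", ">25%"]

def custom_bucket(err_pct):
    # bisect_left finds the first edge >= err_pct, which maps an exact 0
    # to "Match" and puts boundary values (1, 5, 10, 25) in the lower bucket.
    return LABELS[bisect.bisect_left(EDGES, err_pct)]

print(custom_bucket(0))   # Match
print(custom_bucket(3))   # 1-5%
print(custom_bucket(40))  # >25%
```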
Proper Method: Use a Log Scale
Reporting upon forecasting error on a logarithmic scale
solves all of these problems.
But before I jump into using log-based error
groups, let me remind you about logarithms. (For the rare few of you who
are already savvy with logs, feel free to jump ahead.)
[Chart caption: "Easy, once you get the hang of it."]
Logs are much easier to understand if you just see them in action. The chart to the right shows the log values for certain bases. As you can see, for log base 5, each value represents five times the previous value. For log base 2, each value represents two times the previous value.
To convert your forecasting error to a log scale, just take the log of the forecast/actual ratio. Any base is fine. With base 2, if your forecast were double your actual, log(200%, 2) = 1. If your forecast were half of your actual, log(50%, 2) = -1. And if your forecast matched your actual, log(100%, 2) = 0.
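In code form (Python's math.log takes the base as its second argument, mirroring Excel's LOG; the function name is mine):

```python
import math

def log_error(forecast, actual, base=2):
    # Log of the forecast/actual ratio: 0 means a perfect forecast,
    # positive means too high, negative means too low.
    return math.log(forecast / actual, base)

print(log_error(200, 100))  # 1.0  (forecast is double the actual)
print(log_error(50, 100))   # -1.0 (forecast is half the actual)
print(log_error(100, 100))  # 0.0  (forecast matches the actual)
```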
What
does this buy us? A lot!
Dynamic Level of Precision
First and foremost, each successive log unit covers a wider range of raw error, so your buckets grow from very constrained (near a perfect forecast) to enormous (at the extremes).
Let's say that your forecast is often off by 30%, but
you're occasionally off by upwards of 100,000%. (This especially happens
when your actual value unexpectedly drops near zero, where the forecast/actual
ratio can be sky-high even for modest forecasts.)
On a linear scale, you can't meaningfully show 30% and 100,000% in the same chart without the chart being enormous. Yet on a log base 2 chart, the 30% miss (a forecast/actual ratio of 130%) gets a value of about 0.38, and the 100,000% ratio (1000x) gets a value of about 10.
Amazingly, a log scale allows fine-tuned precision for accurate forecasts (i.e., you can see the difference between a 5% and a 10% miss), even when they're reported right next to a 100,000% miss!
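A quick check of that arithmetic (a sketch; I'm reading the 30% miss as a forecast/actual ratio of 130%, and the 100,000% figure as a 1000x ratio):

```python
import math

# Log base 2 of each forecast/actual ratio.
small_miss = math.log(1.30, 2)    # 130% ratio -> roughly 0.38
huge_miss  = math.log(1000.0, 2)  # 100,000% ratio -> roughly 9.97

print(round(small_miss, 2), round(huge_miss, 2))  # 0.38 9.97
```

Two values five orders of magnitude apart fit comfortably on the same axis.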
Symmetrically Bounded
As I described earlier, a linear scale gives too little precision for forecasts that are very low (they're all lumped into the 0-10% bucket), and too much precision for forecasts that are very high (each one gets its own useless bucket, like 8050-8060%).
Logarithms don't have that problem: they represent high and low values on equal terms. If you were ever 100x too high, using log base 2, you'd get a log value of 6.64. If you were 100x too low, you'd get a log value of -6.64.
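The symmetry is easy to demonstrate:

```python
import math

# Being 100x too high and 100x too low land at mirror-image log values.
too_high = math.log(100.0, 2)        # roughly  6.64
too_low  = math.log(1.0 / 100.0, 2)  # roughly -6.64

print(round(too_high, 2), round(too_low, 2))  # 6.64 -6.64
```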
Working Example:
Here, I used Excel to generate 1000 random numbers (each
from 1 to 100), and then "forecast" each with another set of
semi-random numbers -- based partly upon the original number, and partly upon
another random number.
For kicks, I gave my forecast a subtle positive bias --
it's higher than the actual number a bit too often.
Now, let's plot the error both linearly and
logarithmically, and see what we find.
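The original run was done in Excel; here's a rough Python equivalent (the blend weights, the 1.05 bias factor, and the bucket widths are illustrative guesses, not the exact spreadsheet formulas):

```python
import math
import random
from collections import Counter

random.seed(42)  # reproducible run

linear_buckets, log_buckets = Counter(), Counter()
for _ in range(1000):
    actual = random.uniform(1, 100)
    # Semi-random forecast: part actual, part noise, with a subtle high bias.
    forecast = (0.7 * actual + 0.3 * random.uniform(1, 100)) * 1.05
    ratio = forecast / actual
    # 10-percentage-point linear buckets vs. quarter-unit log-2 buckets.
    linear_buckets[int(ratio * 100 // 10) * 10] += 1
    log_buckets[round(math.log(ratio, 2) * 4) / 4] += 1

print(len(linear_buckets), "linear buckets vs.", len(log_buckets), "log buckets")
```

The linear histogram sprawls across far more buckets than the logarithmic one, for the same thousand data points.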
Linear Scale
Ugh, what a disaster.
First, for one thousand data points, we have 629
buckets.
Second, the ten "low" buckets (0-10%, 10-20%, etc.) averaged 28 entries each, while the six hundred and nineteen "high" buckets averaged just one entry each.
Third, because the chart is naturally lopsided, we can't
visually see the overt bias in the forecast.
Logarithmic Scale
This is more like it!
First, our chart has only 100 buckets, instead of six hundred and twenty-nine. Yet our precision is fine where it matters (i.e., forecasts close to 100% of actuals), and coarse where it doesn't.
Second, you can see that both our high and low values slope away, roughly in symmetry. Instead of our low forecasts crammed into ten groups and our high values spread out over 619 groups, the two tails are much closer to equal.
Third, you can see instantly that there is a positive
forecast bias. (Focus on the center bar, which represents forecasts near
100% of actuals. Now compare the bars immediately to its right and left,
and then bars two to the right and left. See how the right
bars are always higher? That's a tell-tale sign of positive forecast
bias.)
Conclusion
I probably could have written this entire article in a
single sentence: When reporting forecast
error, use a logarithmic scale.
The trouble is, other people had told me that in the past, and I never quite understood why I was doing it, or even knew whether I was doing it right. Hopefully, by walking through the drawbacks of the more conventional linear approach, I've given you a bit better understanding of why you should embrace logarithmic error reporting sooner rather than later!