Johnnie Walker Blue: Great Scotch? Or Great Marketing?

1. The Set-up

Johnnie Walker Blue Label retails (at least in Seattle) for more than $300 a bottle. It comes in an impressive presentation box, with a heavy blue-glass bottle, carefully designed to look far more impressive than the 750ml it really is. It is very much a luxury good, suitable for the most refined of tastes. But, if you read the label, there are some worrisome signs. For instance, it doesn’t state an age. Under Scottish law, anything labeled “Scotch Whisky” must be aged at least three years. If an age is stated on the label, it must be the age of the youngest whisky used in the blend. Johnnie Walker Black Label (~$50) is labeled 12 years; Green Label (~$75) 15 years; and Gold Label (~$100) 18 years. All of these are significantly cheaper than Blue Label, so you’d expect Blue to at least be able to drink in the US (21 years). But, alas, there is no age at all. Blue may not even be quite out of diapers yet. (Red Label, their cheapest, similarly does not state an age.)

Also, because Blue is labeled “Blended Scotch Whisky,” it may contain grain whisky, meaning a cheaper spirit made from grains other than malted barley. Whiskies labeled “single malt” and “blended malt” (such as Johnnie Walker Green) are made from 100% malted barley, and are generally considered to be of higher quality.

Three hundred dollars is a lot of money. I can get some very impressive bottles for a third of that: for instance, Edradour “Caledonia” 12 year, a non-chill filtered single malt ($100), or even just a bottle of Johnnie Walker Green Label, a blended malt ($75). The open question is whether, for $300, I am getting a Scotch that is truly three times as good. This calls for a (rigorous) test.

2. The Methodology

The test was done in conjunction with a friend. He helped to procure three Scotchen[1], a bottle each of:

  • Johnnie Walker Green Label (15 year blended malt, ~$75/bottle)
  • Edradour “Caledonia” (12 year, non-chill filtered single malt, ~$100/bottle)
  • Johnnie Walker Blue Label (no age stated, blended whisky, ~$300/bottle)

Together, we carefully selected 18 discerning individuals to blind sample the three Scotches. Each person got a 1oz pour of each Scotch, labelled only with an “A”, “B” or “C.” They were told that this was a tasting to determine a best all-round Scotch; they were told nothing about the price or the background of the actual experiment. In addition to the three glasses of Scotch, the subjects were given water, both to cleanse the palate and to dilute the Scotch as desired; they were also provided with crackers to nibble on. The subjects were to rank the Scotchii[2] in order, from favorite to least favorite.

Once all the subjects had a chance to taste and rank the Scotchii, the results were tabulated. An in-depth statistical analysis was conducted to understand the results.

Note: Certain individuals were asked not to dilute the Scotchen. A non-chill filtered Scotch will often turn cloudy with water, and this would have given away details of the samples to a more knowledgeable subject. Also, due to modern human subject protection rules, we were unable to use electroshock for training purposes, much to the dismay of at least one subject.

3. The Results

First, the raw results, looking only at the top-ranked Scotch here.

Green Label    Edradour    Blue Label
     4             9            5
n = 18

As can be seen, the Edradour was the most popular, followed by the Blue Label, and finally the Green Label. But are these significant differences? I will take two approaches to answering this question. The first approach is to assume that all of the Scotches are the same. That is, I expect that people would pick each one more or less at random, which works out to six first-place votes for each of the Scotches. So, are there statistically significant differences in these rankings?


Using an exact multinomial test, we find no significant differences in the rankings (p = .419). (Similarly, there is no difference using a chi-square test, χ² = 2.33, df = 2, p = .31; again, not significant.) This means that we cannot reject our baseline assumption that the Scotches are equivalent.
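For anyone who wants to check the arithmetic, here is a minimal sketch of both tests using only the Python standard library. The function names are my own, and the exact test uses one common definition of the p-value (the total probability of every outcome no more likely than the observed one), which may or may not be precisely what the original analysis software did.

```python
import math

observed = [4, 9, 5]            # first-place votes: Green Label, Edradour, Blue Label
n = sum(observed)
expected = [n / 3] * 3          # equal preference: six votes each

# Pearson chi-square statistic against the equal-preference expectation
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# With df = 2, the chi-square upper-tail probability is exactly exp(-x / 2)
p_chi2 = math.exp(-chi2 / 2)


def multinomial_pmf(counts, probs):
    """Probability of one exact vector of counts under a multinomial model."""
    coef = math.factorial(sum(counts))
    for c in counts:
        coef //= math.factorial(c)
    return coef * math.prod(p ** c for p, c in zip(probs, counts))


def exact_multinomial_p(counts, probs):
    """Sum the probabilities of all three-way outcomes no more likely than `counts`."""
    total_n = sum(counts)
    p_obs = multinomial_pmf(counts, probs)
    total = 0.0
    for a in range(total_n + 1):
        for b in range(total_n - a + 1):
            p = multinomial_pmf((a, b, total_n - a - b), probs)
            if p <= p_obs * (1 + 1e-9):   # small tolerance for floating-point ties
                total += p
    return total


p_exact = exact_multinomial_p(observed, [1 / 3] * 3)
print(f"chi2 = {chi2:.2f}, p = {p_chi2:.2f}; exact multinomial p = {p_exact:.3f}")
```

With 18 tasters and three Scotches there are only 190 possible vote splits, so brute-force enumeration is instant.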

But, there is a huge variation in the prices of the Scitch[3]: from $75 to $300. This suggests that I should revise the assumption that there is no difference between the Scotches: at $300, I expect Blue Label to be significantly better. As a rough approximation, I’ll use the cost to reflect the expected distribution of the first-place rankings.

            Green Label    Edradour    Blue Label
Observed    4              9           5
Expected    16% (3)        21% (4)     63% (11)
n = 18
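The expected row is consistent with weighting each Scotch by its share of the combined price of the three bottles. A small sketch of that arithmetic, assuming the approximate prices quoted earlier ($75, $100, $300); the variable names are mine:

```python
# Approximate bottle prices quoted in the post (US dollars)
prices = {"Green Label": 75, "Edradour": 100, "Blue Label": 300}
n = 18  # number of tasters

total = sum(prices.values())

# Each Scotch's share of the combined price, used as its expected share of first-place votes
shares = {name: price / total for name, price in prices.items()}
expected = {name: n * share for name, share in shares.items()}

for name in prices:
    print(f"{name}: {shares[name]:.0%} ({expected[name]:.0f})")
```

In other words, if votes followed price, Blue Label should have taken about 11 of the 18 first-place rankings.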

The question is: with a price-based weighting, is there a significant difference in the first-place rankings?

Yes, but…

There is a significant difference between the observed and expected rankings when I do a cost-weighted comparison (exact multinomial p = .0004; chi-square, χ² = 11.19, df = 2, p = .003). However, what’s interesting here is that the significance is being driven by two things: first, Edradour does significantly better than expected (chi-square residual = 2.68), and second, Johnnie Walker Blue does significantly worse (chi-square residual = -1.88).
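The chi-square figure and the residuals can be reproduced with a few lines of standard-library Python. This sketch assumes the expected counts come from the rounded percentages in the table (16%, 21%, 63%), which is what reproduces the reported χ² of 11.19; the residuals are Pearson residuals, (observed - expected) / sqrt(expected).

```python
import math

observed = [4, 9, 5]             # Green Label, Edradour, Blue Label
weights = [0.16, 0.21, 0.63]     # rounded price shares (assumed from the table)
n = sum(observed)

expected = [n * w for w in weights]

# Pearson residuals: how far each count sits from expectation, in rough standard-deviation units
residuals = [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]

# The chi-square statistic is the sum of the squared residuals
chi2 = sum(r ** 2 for r in residuals)

# With df = 2, the upper-tail probability is exactly exp(-x / 2)
p = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
for name, r in zip(["Green Label", "Edradour", "Blue Label"], residuals):
    print(f"{name}: residual = {r:+.2f}")
```

By my calculation the p-value lands just under .004, in the same neighborhood as the .003 reported above. The Green Label residual is small and positive, so nearly all of the statistic comes from Edradour overperforming and Blue Label underperforming.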

4. Conclusions

TL;DR: Johnnie Walker Blue Label is great marketing,…

… but not a great Scotch.

Overall, Johnnie Walker Blue is a disappointment: using an assumption of equivalence, it doesn’t even beat its cheaper relative, Johnnie Walker Green. On a price-adjusted basis, it fails against both the Green Label and Edradour “Caledonia.” In fact, the Edradour scores very well: perhaps the non-chill filtering is making a difference. In any case, in this tasting, it ends up being the overall favorite. Thanks to PSD for spearheading this.

  1. Surprisingly, certain British sources say that “Scotchen” is a valid plural for Scotch. This is a relic of a Germanic / Viking invasion from around 1100. 

  2. American sources say that “Scotchii” is correct. Maybe the melting pot of America, where the Scots and the Greeks mixed freely, caused this unusual plural.

  3. Australian English uses “Scitch” (like mouse / mice) to refer to more than one Scotch. This is probably an example of Foster’s Rule.