When exploring data using multivariate analysis, we typically treat ratings as interval scales. By that we mean that a rating of “5” is given a value of 5, a rating of “4” a value of 4, and so on. We also assume that the differences between adjacent ratings are equal (e.g., the difference between a 5 and a 4 is the same as the difference between a 4 and a 3). This assumption is useful, for example, in both creating and applying a regression equation.

Consequently, data modelers are partial to mean scores, since our models typically use the entire rating scale as input. However, our clients often prefer to measure ratings in terms of the percent of respondents who gave the highest rating (Top Box) or the percent of respondents who gave one of the two highest ratings (Top 2 Box). To bridge the gap between models built around mean ratings and the need for results expressed as Top Box and Top 2 Box scores, we researched how best to link these two concepts.
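To make the three summaries concrete, here is a minimal sketch that computes them from one item's raw ratings. The function name `summarize_ratings`, the sample responses, and the default 6-point scale are illustrative assumptions, not data from any study.

```python
from collections import Counter


def summarize_ratings(ratings, scale_max=6):
    """Return the mean, Top Box, and Top 2 Box shares for one rated item."""
    n = len(ratings)
    counts = Counter(ratings)
    return {
        "mean": sum(ratings) / n,
        # Share of respondents who gave the highest rating.
        "top_box": counts[scale_max] / n,
        # Share of respondents who gave one of the two highest ratings.
        "top_2_box": (counts[scale_max] + counts[scale_max - 1]) / n,
    }


# Hypothetical responses from ten respondents on a 6-point scale.
sample = [6, 5, 5, 4, 3, 6, 2, 5, 4, 6]
summary = summarize_ratings(sample)
print(summary)
```

The mean uses every rating, while the box scores only count respondents at the top of the scale, which is why the two can move somewhat independently.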

We evaluated numerous studies, but focused on one particular study that contains two question batteries covering 21 separate items. Each item uses a 6-point anchored scale where 6 means “Agree Completely” and 1 means “Disagree Completely.” Unlike in many satisfaction studies, the mean scores are spread over a wide range: the lowest mean score is 2.07, and the highest is 4.75.

The chart above plots raw Top Box, Top 2 Box, and Top 3 Box scores for each of the 21 rating questions. Of interest from a modeling standpoint is that it looks possible to fit three curved lines through the three data series. Modeling these curved lines requires some simplifying assumptions. The first assumption is that rating frequencies roughly take on the shape of a gamma distribution. If an item that uses the 6-point scale described above had a mean of 4.5 and a variance of 2.0, it would have frequencies similar to those shown in the bar chart below. The corresponding gamma distribution is superimposed on the bars.
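The text does not say how the gamma parameters were chosen. One common choice, assumed here, is to match the distribution's mean and variance (the method of moments). With shape k and scale θ as illustrative symbols, the example item with a mean of 4.5 and a variance of 2.0 would give:

```latex
k = \frac{\mu^2}{\sigma^2}, \qquad \theta = \frac{\sigma^2}{\mu}
\quad\Longrightarrow\quad
k = \frac{4.5^2}{2.0} = 10.125, \qquad \theta = \frac{2.0}{4.5} \approx 0.444
```

As a check, the gamma mean kθ is 4.5 and the gamma variance kθ² is 2.0, matching the inputs.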

The second assumption involves how the variance changes across the possible range of mean scores. A mean of 1.00 (or 6.00 at the other extreme) is possible only if all respondents gave a rating of 1 (or 6). So at both ends of the rating scale, the variance necessarily equals 0. The highest potential variance is found in the middle of the range. We tried several curves to model the variance, including a third-order polynomial. Eventually, we settled on a sine wave as a simple and reasonable approximation.
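The exact sine curve is not given. One form consistent with the description, offered only as a sketch, is a half-period sine over the 6-point scale, where the peak variance V_max is a hypothetical parameter that would have to be fitted to the data:

```latex
\sigma^2(\mu) = V_{\max} \, \sin\!\left( \pi \, \frac{\mu - 1}{6 - 1} \right), \qquad 1 \le \mu \le 6
```

This curve equals 0 at means of 1 and 6 and reaches V_max at the midpoint of 3.5.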

This gives us everything we need as inputs to estimate Top Box scores (and Top 2 Box or Top 3 Box scores) over the range of possible mean scores. The chart below shows how closely the predicted values (the trend lines) match the actual ratings received (the data points). The calculations themselves are based on the cumulative distribution function of the gamma distribution, which can be calculated in Excel without any add-ins or the use of a statistical package. Having a sound statistical model for predicting Top Box (and Top 2 Box and Top 3 Box) from mean scores alone opens up new opportunities for modeling ratings data with our impact analysis tool and presenting the results to clients using their preferred method of summarizing ratings.
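The chain from mean score to box score can be sketched in a few lines of Python. This is an illustration under several assumptions that the text does not confirm: a half-period sine for the variance with a hypothetical peak `V_MAX`, gamma parameters chosen by matching mean and variance, and rating boundaries placed halfway between scale points (so Top Box is the gamma probability above 5.5). The gamma CDF is computed with a standard series so that only the standard library is needed. In Excel, the `GAMMA.DIST` function with its cumulative argument set to TRUE plays the same role.

```python
import math

SCALE_MIN, SCALE_MAX = 1, 6   # 6-point scale used in the study
V_MAX = 2.5                   # hypothetical peak variance (not given in the text)


def modeled_variance(mean):
    """Half-period sine curve: zero at both scale ends, V_MAX at the midpoint."""
    return V_MAX * math.sin(math.pi * (mean - SCALE_MIN) / (SCALE_MAX - SCALE_MIN))


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the series P(a, z) = sum_n exp(-z) * z**(a+n) / Gamma(a+n+1)."""
    if x <= 0:
        return 0.0
    z = x / scale
    log_z = math.log(z)
    total, n = 0.0, 0
    while True:
        # Each term is evaluated in log space so large shapes do not overflow.
        term = math.exp(-z + (shape + n) * log_z - math.lgamma(shape + n + 1.0))
        total += term
        # Terms shrink monotonically once n exceeds z - shape.
        if n > z - shape and term < 1e-16:
            break
        n += 1
    return min(total, 1.0)


def top_n_box(mean, n_boxes=1):
    """Estimated share of ratings in the top n boxes for a given mean score."""
    cut = SCALE_MAX - n_boxes + 0.5      # assumed boundary, e.g. 5.5 for Top Box
    variance = modeled_variance(mean)
    if variance <= 1e-12:                # degenerate case at the scale ends
        return 1.0 if mean > cut else 0.0
    shape = mean * mean / variance       # method-of-moments gamma parameters
    scale = variance / mean
    return 1.0 - gamma_cdf(cut, shape, scale)


# Estimated Top Box, Top 2 Box, and Top 3 Box at a few mean scores.
for m in (2.07, 3.5, 4.75):
    print(m, [round(top_n_box(m, k), 3) for k in (1, 2, 3)])
```

Because `V_MAX` and the 0.5 boundaries are placeholders, the printed values are not expected to reproduce the trend lines in the chart. Fitting those choices to observed box scores would be the natural next step.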