# Comparing percentages based on Likert scale by year

Cross Validated Asked by Chris Beeley on November 26, 2020

I’m reviewing an analysis someone else has done on some Likert scale data. They’ve assigned each point on the scale 1-5 (1 = bad, 2 = poor etc.), found the average score in each area, and then converted to a percentage (by multiplying by 20) to give a percentage of total score (100% being the best, 20% being the worst).

I’m okay with this, but then they’ve computed a significance test as if the percentages were actual percentages, as if they’d gone out and asked people “Do you own your own home? Yes/No”. They’ve used a method similar to the one described here:

https://www.dummies.com/education/math/statistics/how-to-compare-two-population-proportions/

I want to tell them that this is a completely invalid way of analysing the data, and they’ve ignored the variance in the scores by collapsing everything into a percentage. I feel they should use ordinary t-tests on the data to determine significant difference. But I’m doubting myself. Any thoughts appreciated.
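
To make the concern concrete, here is a sketch of the two statistics side by side. The notation is my own and is not taken from the analysis I’m reviewing: $\hat p_i = \bar x_i / 5$ is the converted score for group $i$, $n_i$ the group size, $s_i^2$ the sample variance of the raw scores, and I am assuming the pooled form of the two-proportion test.

$$
z = \frac{\hat p_1 - \hat p_2}{\sqrt{\hat p\,(1-\hat p)\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}},
\qquad
\hat p = \frac{n_1 \hat p_1 + n_2 \hat p_2}{n_1 + n_2}
$$

$$
t = \frac{\bar x_1 - \bar x_2}{\sqrt{\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}}}
$$

The denominator of $z$ is fixed by the means and sample sizes alone, whereas the denominator of $t$ uses the observed variances $s_1^2$ and $s_2^2$ of the individual scores.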

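The R session below should help you understand this better. I run an unpaired (Welch) t-test, i.e. `paired = FALSE`, first on the raw scores and then on two rescaled versions of them.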

> # create 10 random scores between 1 and 5 (rounding runif() makes 1 and 5 half as likely as 2-4)
> x = round(runif(10, 1, 5), 0)
> x
[1] 3 3 2 4 1 1 3 4 4 4
> y = round(runif(10, 1, 5), 0)
> y
[1] 2 2 3 4 5 2 1 3 5 4
>
> #Perform T-test
> t.test(x,y, paired = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  x and y
t = -0.34757, df = 17.681, p-value = 0.7323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.410481  1.010481
sample estimates:
mean of x mean of y
2.9       3.1

>
> # multiply by 20 to rescale the scores to 20-100 (a linear rescaling, not a true percentage)
> x1 = x*20
> x1
[1] 60 60 40 80 20 20 60 80 80 80
> y1 = y*20
> y1
[1]  40  40  60  80 100  40  20  60 100  80
>
> #perform t-test on new data
> t.test(x1,y1, paired = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  x1 and y1
t = -0.34757, df = 17.681, p-value = 0.7323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-28.20962  20.20962
sample estimates:
mean of x mean of y
58        62

>
> # divide by 5 to express each score as a proportion of the maximum (0.2 to 1)
> x2 = x/5
> x2
[1] 0.6 0.6 0.4 0.8 0.2 0.2 0.6 0.8 0.8 0.8
> y2 = y/5
>
> #perform t-test on new data
> t.test(x2,y2, paired = FALSE, conf.level = 0.95)

Welch Two Sample t-test

data:  x2 and y2
t = -0.34757, df = 17.681, p-value = 0.7323
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-0.2820962  0.2020962
sample estimates:
mean of x mean of y
0.58      0.62


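Irrespective of the scale, you end up with the same t statistic, degrees of freedom and p-value, because the t-test is unaffected by multiplying every score by a constant. So, technically, rescaling the scores to a percentage should not by itself change the result of a t-test.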
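
As a rough check on the above, here is a minimal sketch that reuses the ten scores from the session and also runs a proportion-style test on the converted means, which is what the question describes. The helper `prop_z` is hypothetical and assumes the pooled two-proportion z-test; it is not taken from the analysis under review.

```r
# Scores copied from the R session above
x <- c(3, 3, 2, 4, 1, 1, 3, 4, 4, 4)
y <- c(2, 2, 3, 4, 5, 2, 1, 3, 5, 4)

# Welch t-test on the raw 1-5 scores and on the x20 "percentage" scale
t_raw <- t.test(x, y, paired = FALSE)
t_pct <- t.test(x * 20, y * 20, paired = FALSE)

# Hypothetical helper: pooled two-proportion z-test, applied as if
# mean/5 were the share of "yes" answers among n respondents
prop_z <- function(p1, p2, n1, n2) {
  p_pool <- (n1 * p1 + n2 * p2) / (n1 + n2)
  se <- sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
  z <- (p1 - p2) / se
  list(statistic = z, se = se, p.value = 2 * pnorm(-abs(z)))
}
z_res <- prop_z(mean(x) / 5, mean(y) / 5, length(x), length(y))

# Standard error the t-test uses, expressed on the same 0-1 scale
se_t <- sqrt(var(x / 5) / length(x) + var(y / 5) / length(y))

# Compare the two approaches side by side
print(c(t_raw = unname(t_raw$statistic),
        t_pct = unname(t_pct$statistic),
        z_prop = z_res$statistic))
print(c(se_t = se_t, se_prop = z_res$se))
```

On these ten made-up scores the t statistic is identical on both scales, while the proportion-style test works from a standard error that depends only on the means and sample sizes, so it returns a different statistic and p-value.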

Answered by Not_Dave on November 26, 2020