Cross Validated Asked by afternoon on November 29, 2021
Suppose that I have info about a sample, and
In one University, we have 70% females in the population and 30% males. In another University, the numbers are interchanged: 30% are females and 70% are males. Now assume that a random sample of 100 students is picked from each university (total number of observations: 200).
What is the probability that a sample of this size would be able to reject the null hypothesis that the proportion of females in the first population is greater than the second population at an alpha level of 0.05?
How do you find the probability that a sample rejects the null hypothesis?
Suppose I take Success to mean Female. Then the number of Females in a random sample from University A is $X \sim \mathsf{Binom}(n=100,p=0.7)$ and the number of Females in a random sample from University B is $Y \sim \mathsf{Binom}(n=100,p=0.3).$
Try one test. Let's try using prop.test in R to analyze one such experiment with 200 students:
set.seed(2020)
x = rbinom(1, 100, .7); y = rbinom(1, 100, .3)
x; y
[1] 68
[1] 32
prop.test(c(x,y),c(100,100), cor=F)
2-sample test for equality of proportions
without continuity correction
data: c(x, y) out of c(100, 100)
X-squared = 25.92, df = 1, p-value = 3.559e-07
alternative hypothesis: two.sided
95 percent confidence interval:
0.2307018 0.4892982
sample estimates:
prop 1 prop 2
0.68 0.32
So in this particular experiment, the test finds very strong evidence to reject $H_0: p_a = p_b,$ with a P-value very near $0.$ [A continuity correction is not useful for samples of size 100, so I used the argument cor=F (which R matches to correct=FALSE) in prop.test to turn the correction off.]
Then the question is whether I somehow got an outrageously atypical pair of samples in the example above, or whether prop.test really does have good power to detect the large difference in the proportions of Female students at the two universities, based on samples of $n_a = n_b = 100$ from each university.
Simulate 100,000 tests to estimate power. By doing the experiment 100,000 times, I can closely estimate the power of this test. [Computations in R.]
set.seed(722)
pv = replicate(10^5, prop.test(c(rbinom(1,100,.3),
rbinom(1,100,.7)), c(100,100),cor=F)$p.val)
mean(pv <= .05)
[1] 0.99996
The answer is that the power of the test to detect the difference in proportions (at the 5% level) is above 99%. So it would be extremely rare for such an experiment not to show a difference in proportions. Specifically, the answer is 'a probability of almost 1'.
There are several versions of this test, depending on whether a normal approximation is involved, whether a continuity correction is used, and whether the test uses a 'pooled' standard error (under the null hypothesis that the proportions are equal). Not knowing the version of the test you will use, I can't give an algebraic solution. (Also, this is a 'self-study' problem and you have not shown what you have tried, so I have no way to guess what approach you might be planning or expected to use.)
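For one specific version, the two-sided test based on a normal approximation, a closed-form power calculation does exist, and R's built-in power.prop.test (in the stats package) implements it. The following is only a sketch under that assumption, not necessarily the version you are expected to use; it reuses $n = 100$ per group, proportions 0.3 and 0.7, and the 5% level from above.

```r
# Approximate power from the normal approximation (stats::power.prop.test).
# n is the sample size PER GROUP; the alternative is two-sided by default.
pw <- power.prop.test(n = 100, p1 = 0.3, p2 = 0.7, sig.level = 0.05)
pw$power   # should be very close to 1, in line with the simulation
```

Because this is an approximation, small discrepancies from the simulated power are to be expected.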
Lower bound on power. Here is one possible approach that does not use simulation:
If we have $X=60, Y=40,$ then prop.test rejects, so it will also reject for more extreme differences such as $X=61, Y=39,$ and so on. [You might use your favorite test here instead of R's implementation of prop.test.]
prop.test(c(40,60), c(100,100), cor=F)$p.val
[1] 0.004677735
However, the exact binomial probability of this event is $P(X \ge 60, Y \le 40) = P(X \ge 60)P(Y \le 40) \approx 0.975$ (each factor is about $0.9875$). Because this is only a lower bound on the power, it gives a pretty good idea that rejection is nearly certain.
pbinom(40, 100, .3)*(1-pbinom(59, 100, .7))
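The same idea can be pushed further to get the exact power rather than a lower bound: enumerate every possible pair of counts, decide whether the test rejects, and add up the binomial probabilities of the rejecting pairs. The sketch below assumes the pooled chi-squared statistic that prop.test uses without continuity correction; the function name exact_power and its arguments are my own, chosen for illustration.

```r
# Exact power of the two-sided pooled test (no continuity correction),
# obtained by enumerating all (x, y) count pairs. Illustrative helper.
exact_power <- function(n, pa, pb, alpha = 0.05) {
  counts <- 0:n
  # Squared pooled z statistic for every pair of counts (x, y)
  stat <- outer(counts, counts, function(x, y) {
    pbar <- (x + y) / (2 * n)            # pooled proportion under H0
    se2  <- pbar * (1 - pbar) * 2 / n    # pooled variance of the difference
    ifelse(se2 > 0, (x / n - y / n)^2 / se2, 0)  # no rejection if se2 == 0
  })
  reject <- stat >= qchisq(1 - alpha, df = 1)
  # Joint probability of each pair under the true proportions
  prob <- outer(dbinom(counts, n, pa), dbinom(counts, n, pb))
  sum(prob[reject])
}

exact_power(100, 0.7, 0.3)
```

This involves no simulation error, so it can serve as a check on the simulated power.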
The plot below shows that the PDFs of $\mathsf{Binom}(100, 0.3)$ and $\mathsf{Binom}(100, 0.7)$ hardly overlap.
x = 0:100; pdf.x = dbinom(x, 100, .7)
y = 0:100; pdf.y = dbinom(y, 100, .3)
hdr="PDFs of BINOM(100,.3) [left] and BINOM(100,.7)"
plot(x-.1, pdf.x, type="h", col="blue", lwd=2,
ylab="PDF", xlab="Nr of Females", main=hdr)
points(y+.1, pdf.y, type="h", col="brown", lwd=2)
Addendum, per Comment: With proportions 0.48 and 0.53 and samples of 100 from each university, the answer is about 12% power.
set.seed(723)
pv = replicate(10^5, prop.test(c(rbinom(1,100,.48),
rbinom(1,100,.53)), c(100,100),cor=F)$p.val)
mean(pv <= .05)
[1] 0.11845
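Power this low raises the question of how large the samples would have to be. As a rough sketch, R's power.prop.test can be used with the same two proportions, first to get the normal-approximation power at $n = 100$ per group, and then to solve for the per-group sample size. The 80% power target is my own illustrative assumption, not part of the question.

```r
# Normal-approximation power for proportions 0.48 and 0.53, n = 100 per group
approx_pw <- power.prop.test(n = 100, p1 = 0.48, p2 = 0.53)$power
approx_pw

# Per-group sample size for 80% power (the 80% target is an assumed example)
need <- power.prop.test(p1 = 0.48, p2 = 0.53, power = 0.80)
ceiling(need$n)
```

The approximate power need not match the simulated value exactly, and the required sample size comes out at well over a thousand students per university.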
Answered by BruceET on November 29, 2021