Cross Validated Asked by user291972 on November 14, 2021
I am trying to design an email test to measure the demand lift obtained from a marketing promotion (treatment) versus no promotion (control). I want to calculate the per-group sample size required to detect a statistically significant difference in average demand per customer, separately for different marketing segments.
To do so, I am applying the following formula (for each segment):
$$
N = \frac{2(Z_{1-\alpha/2}+Z_{\pi})^2\sigma^2}{\Delta^2}
$$
Where:
$Z_{1-\alpha/2}$ = percentile of the normal distribution used as the critical value in a two-tailed test (1.96)
$Z_{\pi}$ = percentile of the normal distribution where $\pi$ is the power of the test (0.84 for 80th percentile)
$\sigma$ = within-group standard deviation
$\Delta$ = expected mean difference between the treatment and control populations
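As a minimal sketch, the formula above can be written as a small R function that takes the quantiles from `qnorm` instead of the rounded values 1.96 and 0.84. The helper name `n_per_group` and the example inputs `sigma = 15` and `delta = 5` are illustrative assumptions, not values from real campaign data:

```r
# Per-group sample size from the normal-approximation formula above.
# n_per_group is a hypothetical helper name; sigma and delta are example inputs.
n_per_group <- function(sigma, delta, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha / 2)  # two-sided critical value, about 1.96
  z_power <- qnorm(power)          # quantile for the desired power, about 0.84
  2 * (z_alpha + z_power)^2 * sigma^2 / delta^2
}

n_raw <- n_per_group(sigma = 15, delta = 5)
n_raw           # about 141.3
ceiling(n_raw)  # round up to a whole number of customers per group
```

Note that the result depends on `sigma` and `delta` only through the ratio `sigma / delta`, so two segments with the same ratio get the same required sample size.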
To estimate the standard deviation and expected mean difference above, I pulled historical response data from last year, for the same period in which this year's test will run. My question is this: should the group means and standard deviations be estimated from the total population that was exposed to the treatment (and control), respectively, or from respondents only? Put another way, should I use the mean and variance for the full audience exposed to a given treatment in the past, or the mean and variance for responders only, and then back-solve for the required full audience?
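To make the two options concrete, here is a rough sketch of how full-audience moments relate to responder-only moments, under the assumption that non-responders contribute exactly zero demand. The response rate `p`, responder mean `mu_r`, and responder standard deviation `sigma_r` are made-up illustrative values, not figures from any real campaign:

```r
# Hypothetical inputs: response rate and demand among responders only.
p       <- 0.02   # assumed response rate
mu_r    <- 80     # assumed mean demand among responders
sigma_r <- 40     # assumed SD of demand among responders

# Full-audience demand is 0 with probability 1 - p, otherwise a responder draw,
# so the moments follow from the standard mixture formulas.
mu_all    <- p * mu_r
var_all   <- p * sigma_r^2 + p * (1 - p) * mu_r^2
sigma_all <- sqrt(var_all)

c(mean = mu_all, sd = sigma_all)
```

Under this assumption the full-audience standard deviation depends heavily on the response rate, so the two approaches can feed quite different values of the standard deviation and mean difference into the formula.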
The results that I’m getting appear counter-intuitive, with similar required sample sizes among the most-engaged and least-engaged audiences, so I know I must be doing this wrong.
Most of the material that I’ve come across from the marketing community involves using a desired difference in response rate to solve for appropriate per-group sample sizes. In my case, however, the metric of interest is demand-based rather than raw response (average demand per customer). That said, the response rate is an important metric, as it is particularly low for certain groups of customers, but it does not directly reflect the metric of interest.
Thanks in advance!
Here is a simulation to show that your approximate formula for sample size $n$ gives a reasonable answer for a particular case, which may be realistic.
Suppose $\sigma^2/\Delta^2 = 9,$ significance level is 5% and desired power is 80%. Then the formula gives $n \approx 141.$ [An exact formula would use a noncentral t distribution, but with $n > 100,$ the approximate formula should be OK.]
    n <- 2 * (1.96 + 0.84)^2 * 9; n
    [1] 141.12
Now suppose I do $m = 100{,}000$ two-sided pooled two-sample t tests using samples of size $n = 150$ to try to detect a significant difference (5% level) in sample means from populations $\mathsf{Norm}(\mu_1 = 100, 15)$ and $\mathsf{Norm}(\mu_2 = 105, 15),$ so that $\Delta = 5, \sigma = 15$ and $\sigma^2/\Delta^2 = (15/5)^2 = 9.$ [For the population means, only $\Delta = |\mu_1 - \mu_2| = 5$ matters.]
Then I should reject at the 5% level a little more than 80% of the time. The simulation shows rejection 82% of the time, so the simulation is in substantial agreement with your formula.
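    set.seed(2020)
    # p-values from 10^5 pooled two-sample t tests, n = 150 per group
    pv <- replicate(10^5, t.test(rnorm(150, 100, 15),
                                 rnorm(150, 105, 15),
                                 var.equal = TRUE)$p.value)
    mean(pv <= .05)   # proportion of tests rejecting at the 5% level
    [1] 0.82189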
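As a cross-check that does not rely on simulation, base R's `power.t.test` (in the `stats` package) does the same calculation with the noncentral t distribution mentioned above. This is a sketch using the same $\Delta = 5$ and $\sigma = 15$; the variable names `exact` and `achieved` are just illustrative:

```r
# Exact per-group n for a two-sided, two-sample t test (noncentral t).
exact <- power.t.test(delta = 5, sd = 15, sig.level = 0.05, power = 0.80,
                      type = "two.sample", alternative = "two.sided")
exact$n           # per-group n; should be close to the approximate 141

# Power achieved with n = 150 per group, for comparison with the simulation.
achieved <- power.t.test(n = 150, delta = 5, sd = 15, sig.level = 0.05,
                         type = "two.sample", alternative = "two.sided")
achieved$power
```

If the approximation is adequate, `exact$n` should land within a few units of 141 and `achieved$power` should be near the simulated rejection rate.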
Answered by BruceET on November 14, 2021