Normalize sample to match the mean and the standard deviation

Question

There are two samples (sufficiently large and independent). One, size $n_1$ has the mean $m_1$ and the standard deviation $s_1$, the other, size $n_2$ with the mean $m_2$ and the standard deviation $s_2$. Is there a procedure to normalize the second sample so that it has the mean and standard deviation of the first?

BruceET · Answer

Consider the following two samples, from R:
set.seed(2020)
x1 = rnorm(20, 100, 15)
m1 = mean(x1);  m1;   s1 = sd(x1); s1
[1] 98.46448
[1] 21.39371

x2 = rnorm(30, 50, 10)
m2 = mean(x2);  m2;   s2 = sd(x2); s2
[1] 52.77616
[1] 8.347496

Step 1: Standardize x2
z = (x2 - m2)/s2
mz = mean(z); mz;  sz = sd(z); sz
[1] 1.494302e-16   # essentially 0
[1] 1

Step 2: Rescale z2 (called y2)  to match sample mean and SD of x1.
y2 = s1*z + m1
mean(y2);  sd(y2)
[1] 98.46448  ## compare 98.46448 above
[1] 21.39371  ## compare 21.39371

Stripchart (bottom to top) of original x1 and x2 and y2 (original
x2 rescaled to match sample mean and SD of x1.
stripchart(list(x1, x2, y2), ylim = c(.7,3.3), pch="|", 
           group.names=c("x1","x2","y2"))
  abline(v=mean(y2), col="green2")  # means of `x1` and `y2`.

Notes: (1) If x1 and x2 are rounded to only a few places, and
then y2 is similarly rounded, then the mean and SD of y2 will
typically not match those of x1 exactly. Minor adjustments may help.
(2) I am aware that the procedure shown above can be 'collapsed' into one
more complicated step, but I find the two-step method shown easier to remember.
(3) When OPs on this site give only means and variances (not the whole dataset)
it is sometimes useful to use something like this to contrive a dataset
to use in R that is very similar to OP's. By contrast to R, some
procedures in other software (e.g, Minitab) will perform various tests
based only on summary statistics.

Normalize sample to match the mean and the standard deviation

One Answer

Add your own answers!

Ask a Question