There are two samples (sufficiently large and independent). One, of size $n_1$, has mean $m_1$ and standard deviation $s_1$; the other, of size $n_2$, has mean $m_2$ and standard deviation $s_2$. Is there a procedure to normalize the second sample so that it has the mean and standard deviation of the first?

Consider the following two samples, generated in R:

```
set.seed(2020)
x1 = rnorm(20, 100, 15)
m1 = mean(x1); m1; s1 = sd(x1); s1
[1] 98.46448
[1] 21.39371
x2 = rnorm(30, 50, 10)
m2 = mean(x2); m2; s2 = sd(x2); s2
[1] 52.77616
[1] 8.347496
```

Step 1: Standardize `x2`

```
z = (x2 - m2)/s2
mz = mean(z); mz; sz = sd(z); sz
[1] 1.494302e-16 # essentially 0
[1] 1
```

Step 2: Rescale `z` (call the result `y2`) to match the sample mean and SD of `x1`.

```
y2 = s1*z + m1
mean(y2); sd(y2)
[1] 98.46448 ## compare 98.46448 above
[1] 21.39371 ## compare 21.39371
```
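Why the two steps work can be checked directly. In the sketch below, $\bar z$, $s_z$, $\bar y_2$, and $s_{y_2}$ are notation introduced here for the sample means and SDs of `z` and `y2`; $m_1, s_1, m_2, s_2$ are as in the question.

$$
z_i = \frac{x_{2,i} - m_2}{s_2}
\;\Longrightarrow\; \bar z = 0,\quad s_z = 1,
\qquad
y_{2,i} = s_1 z_i + m_1
\;\Longrightarrow\; \bar y_2 = s_1 \bar z + m_1 = m_1,\quad s_{y_2} = s_1 s_z = s_1 .
$$

Adding a constant shifts the mean and leaves the SD alone; multiplying by a positive constant scales both.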

Stripchart (bottom to top) of the original `x1` and `x2`, and of `y2` (the original `x2` rescaled to match the sample mean and SD of `x1`).

```
stripchart(list(x1, x2, y2), ylim = c(.7,3.3), pch="|",
group.names=c("x1","x2","y2"))
abline(v=mean(y2), col="green2") # common mean of x1 and y2
```

*Notes:* (1) If `x1` and `x2` are rounded to only a few places, and then `y2` is similarly rounded, then the mean and SD of `y2` will typically not match those of `x1` *exactly.* Minor adjustments may help.
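To illustrate the rounding caveat in note (1), here is a small sketch. Rounding to one decimal place is chosen purely for illustration, and `y2r` is a name introduced here.

```r
set.seed(2020)
x1 = round(rnorm(20, 100, 15), 1)   # data recorded to one decimal place
x2 = round(rnorm(30, 50, 10), 1)
z  = (x2 - mean(x2))/sd(x2)         # standardize x2
y2 = sd(x1)*z + mean(x1)            # matches x1 up to floating point
y2r = round(y2, 1)                  # rounding y2 perturbs its mean and SD
mean(x1) - mean(y2r)                # small, but typically not exactly 0
sd(x1) - sd(y2r)                    # likewise
```

The discrepancies are bounded by the size of the rounding error, so they are usually negligible for practical purposes.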

(2) I am aware that the procedure shown above can be 'collapsed' into one more complicated step, but I find the two-step method shown easier to remember.
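For reference, the collapsed one-step version might look like the sketch below. The helper name `rescale_to` is hypothetical (it is not a base R function); it simply composes Steps 1 and 2.

```r
# Hypothetical helper: shift and scale x so that its sample mean is m
# and its sample SD is s (Steps 1 and 2 in a single expression).
rescale_to = function(x, m, s) m + s*(x - mean(x))/sd(x)

set.seed(2020)
x1 = rnorm(20, 100, 15)
x2 = rnorm(30, 50, 10)
y2 = rescale_to(x2, mean(x1), sd(x1))
all.equal(mean(y2), mean(x1))   # agreement up to floating-point tolerance
all.equal(sd(y2), sd(x1))
```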

(3) When OPs on this site give only means and variances (not the whole dataset), it is sometimes useful to use something like this to contrive a dataset in R that is very similar to the OP's. In contrast to R, some procedures in other software (e.g., Minitab) will perform various tests based only on summary statistics.
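A minimal sketch of that use, assuming a question reports only a sample size, mean, and SD. The values of `n`, `m`, and `s` below are made up for illustration, and starting from normal values is an assumption; the real data may have a different shape.

```r
set.seed(1122)                  # arbitrary seed, for reproducibility
n = 25; m = 12.3; s = 4.5       # made-up summary statistics
w = rnorm(n)                    # any starting sample of size n
x = m + s*(w - mean(w))/sd(w)   # standardize, then rescale
mean(x); sd(x)                  # match m and s up to floating point
t.test(x, mu = 10)              # e.g., a one-sample t test on the contrived data
```

Because the one-sample t statistic depends on the data only through the sample size, mean, and SD, the test result for the contrived `x` agrees with what the original data would give. Procedures that depend on the shape of the data (for example, rank-based tests) would not carry over.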

Answered by BruceET on November 22, 2020

