Mathematics Asked on December 15, 2021
I have a black-box discrete probability distribution $mathbf{p}$ (I can only get samples from it), two items in the support: $sigma_1, sigma_2$, and a procedure which estimates the following quantity:
$$
r = frac{mathbf{p}(sigma_1)}{mathbf{p}(sigma_1)+mathbf{p}(sigma_2)}
$$
The procedure runs for $T$ iterations ($T$ is not known in advance).
At each iteration $t$, I get a multiset $S_t$ of $n_t$ independent samples from $mathbf{p}$ conditioned on the set ${sigma_1, sigma_2}$. Then, my estimate for $r$ is:
$$
hat{r}_t = frac{1}{n_t} cdot sum_{s in S_t} [s = sigma_1]
$$
(where $[P]$ is the Iverson bracket notation).
Think of this procedure as estimating the bias of a coin, with ${sigma_1, sigma_2} = {H, T}$.
Can I optimize the procedure by reusing the samples from the previous iteration when computing the estimate $hat{r}_{t+1}$? That is, instead of drawing $n_{t+1}$ samples at the $(t+1)$-th iteration, can I just draw the difference $n_{t+1} – n_t$ and reuse the $n_t$ samples from the $t$-th iteration? Note that $n_1 < n_2 < dots < n_T$.
Since samples are independent at each iteration, I suppose this is fine.
And if this is the case, my next question is: how to formally prove that reusing samples is fine and does not affect the estimate?
Clarifications
The whole discussion stemmed from the following goal, that is, I want:
$$
Pr bigg[ hat{r} in big[r (1-epsilon), r (1+epsilon) big] bigg] geq 1-delta
$$
Thus, I have derived a lower bound on the number of samples needed, using a multiplicative Chernoff bound, which yielded:
$$
n = Theta left( frac{1}{(epsilon cdot hat{r})^2} cdot log frac1delta right)
$$
so the estimation is part of the bound. To this extent, we need to make reasonable guesses: for the $t$-th iteration, we guess that $r geq 1/2^t$, and set $hat{r} = 1/2^t$ in the bound for the number of samples, $n_t$. Then, after sampling $n_t$ values, if we get that the estimate is indeed $geq 1/2^t$, we stop; otherwise $t gets t + 1$ and we continue.
It’s worth noting that, assuming $r sim mathsf{Uniform}(0,1)$, we have that $T sim mathsf{Geometric}(1/2)$.
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP