Cross Validated Asked on November 18, 2021
Consider tests of randomness of bit sequences of fixed size $n$ bits as in cryptography (e.g. NIST Special Publication 800-22 page 1-4; notice this uses captital-$P$ for p-values). Define such test as any deterministic function $T$ that accepts a vector $V$ of $n$ bits, and outputs a p-value $p$ in $]0dots1]$ (with null hypothesis that $V$ consists of random independent unbiased bits), that is with the defining property
$$forallalphain[0dots1],;;Prbig(T(V)lealphabig),le,alpha$$
where the probability is computed with $V$ a vector of random independent unbiased bits (or equivalently, is computed as the proportion of $V$ such that $T(V)lealpha$ among the $2^n$ vectors $V$).
Example tests matching this definition are
There’s a natural partial order relation among tests: $T$ implies $T’$ when $forall V, T(V)le T'(V)$. Any test implies True. Balanced implies Non-stuck, but does not imply Non-zero. Some tests, including Balanced and Non-zero, are optimal in the sense that no other test implies them.
Section 2 of the above reference describes 15 tests for large $n$ (thousands bits), that are intended to catch some defects relevant to actual random number generators, and be near-optimal (in the above sense). For example, section 2.1 is an approximation of Balanced for large $n$ using the complementary error function, designated The Frequency (Monobit) Test.
Q1: Assume that all the $n$ bits in a vector $V$ tested are random independent bits having same odds $q={1over2}+epsilon$ to be set, with $epsilon$ unknown (besides being smallish), corresponding to Shannon entropy per bit $$E=-qlog_2(q)-(1-q)log_2(1-q)=1-{2overlog2}epsilon^2+mathcal O(epsilon^4)$$
The Balanced test for some (large) number $n$ of such bits is applied once, and outputs a small p-value (say $ple0.001$). That allows us to reject the null hypothesis $H_0$ that $E=1$ (equivalently, $epsilon=0$) with high confidence (corresponding to the p-value $p$).
What is a tight function $E(p,n)$ such that we can reject the hypothesis $H_1$ that $Ege E(p,n)$ with some good confidence (corresponding to some known p-value higher than $p$, perhaps $2p$, or $sqrt p$, or something on that tune)? By “tight function” I mean that the lowest $E(p,n)$ we manage to prove for some confidence, the better.
Q2: Things are as in Q1, except that the test is unspecified beyond the defining property of p-values. Can we reject the hypothesis $H_1$ that $Ege E(p,n)$ with good confidence, for whatever $E(p,n)$ and at least the confidence level that was established in Q1? If that conjecture was false, what’s a counterexample, or/and is that reparable?
Q3: Things are as in Q2 (or in Q1 if the property thought in Q2 does not apply), except that the bits in the input $V$ might be dependent, but still with Shannon entropy per bit $E$; that is, the distribution of the inputs $V$ is such that $$nE;=;-sum_{Vtext{ with }Pr(V)ne0}Pr(V)log_2(Pr(V))$$
Can we reject the hypothesis $H_1$ that $Ege E(p,n)$ with good confidence, for whatever $E(p,n)$ and at least the confidence level that was established in Q1? If that conjecture was false, what’s a counterexample, or/and is that reparable?
If I understand your problem correctly, the choice would be multinomial logistic regression, also known as the "conditional maximum entropy model". Concerning your second question I guess that two different concerns can be emphasized: the accuracy of the confidence interval or the statistical power of the inference. I would especially be concerned with correcting for multiple testing, for example using the Holm-Bonferroni correction of the alpha values or something more contemporary.
For the final question, a method that takes prior probabilities into account would be a Bayesian method. Although I don't have a concrete suggestion here, the problem reminds me of random forests and naive Bayes classifiers. However, these models not only involve dependence, but training, so I am not sure if those topics would be relevant to the problem at hand.
Answered by noumenal on November 18, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP