Cross Validated Asked on November 6, 2021
I have a sequence of $N$ binary values (+ and -). At $t=0$, all values are minuses (-):
t=0: ----------------------------
Over time, segments of various sizes in this sequence get set to +. I call each of these a "painted segment". Note that a value never gets set back to - (the paint cannot be removed). The dynamics might look like:
t=0: ----------------------------
t=1: ----------------------------
t=2: ----------------------------
t=3: ----------------------------
t=4: -------------------++++-----
t=5: -------------------++++-----
t=6: --+++--------------++++-----
t=7: --+++--------------++++-----
t=8: --+++--------------++++-----
t=9: --+++--------------++++-----
t=10: --+++-++-----------++++-----
t=11: --+++-++-----++++++++++-----
t=12: --+++-++-----++++++++++-----
t=13: --+++-++-----++++++++++-----
t=14: --+++-++-----++++++++++-----
t=15: --+++-++-----++++++++++-----
t=16: --+++-++-----++++++++++--+++
t=17: --+++-++-----++++++++++--+++
t=18: --++++++-----++++++++++--+++
t=19: --++++++-----++++++++++--+++
t=20: --++++++-----+++++++++++++++
t=21: --++++++-----+++++++++++++++
t=22: --+++++++++--+++++++++++++++
t=23: --+++++++++--+++++++++++++++
t=24: +-+++++++++--+++++++++++++++
I would like to figure out whether the painted segments occur at random locations within the unpainted parts of the sequence, whether they tend to be clustered, or whether they tend to be spread out. In short, my question is: "Are segments painted randomly with respect to previously painted segments?"
Note that the concept of repainting an already painted segment is meaningless.
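To make the data concrete, here is a minimal sketch of how the newly painted segments could be read off two consecutive snapshots. The function names `runs` and `new_segments` are hypothetical, and the sketch assumes each snapshot is a string of + and - characters of equal length. Two segments painted side by side in the same time step would be reported as a single run.

```python
from itertools import groupby

def runs(row, symbol):
    """Return (start, length) for each maximal run of `symbol` in `row`."""
    out, pos = [], 0
    for ch, grp in groupby(row):
        n = len(list(grp))
        if ch == symbol:
            out.append((pos, n))
        pos += n
    return out

def new_segments(prev, curr):
    """Runs of cells that are '-' in `prev` and '+' in `curr`."""
    diff = "".join(
        "+" if p == "-" and c == "+" else "-" for p, c in zip(prev, curr)
    )
    return runs(diff, "+")

# Snapshots t=9 and t=10 of the first example, built piecewise for clarity.
prev = "--+++" + "-" * 14 + "++++" + "-" * 5
curr = "--+++-++" + "-" * 11 + "++++" + "-" * 5
print(new_segments(prev, curr))  # start and length of the new paint
print(runs(curr, "-"))           # remaining unpainted gaps
```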
Here is an example where the painted segments are clustered:
t=0: ----------------------------
t=1: ----------------------------
t=2: -------------------++++-----
t=3: -------------------++++-----
t=4: ---------------++++++++-----
t=5: ---------------++++++++-----
t=6: ----------+++--++++++++-----
t=7: ----------+++++++++++++-----
t=8: ----------+++++++++++++-----
t=9: ----------+++++++++++++-----
t=10: ----------++++++++++++++++--
t=11: ------++--++++++++++++++++--
t=12: ------++--++++++++++++++++--
t=13: ------++++++++++++++++++++--
t=14: ------++++++++++++++++++++++
And here is an example where the painted segments are spread out:
t=0: ----------------------------
t=1: ----------------------------
t=2: -------------------++++-----
t=3: -------------------++++-----
t=4: ++-----------------++++-----
t=5: ++-----------------++++-----
t=6: ++------+++--------++++-----
t=7: ++------+++--------++++-----
t=8: ++------+++--------++++-----
t=9: ++------+++--------++++---++
t=10: ++------+++--------++++---++
t=11: ++------+++--------++++---++
t=12: ++------+++-+++----++++---++
t=13: ++------+++-+++----++++---++
t=14: ++------+++-+++----++++---++
t=15: ++-+++--+++-+++----++++---++
t=16: ++-+++--+++-+++----++++---++
t=17: ++-+++--+++-+++--++++++---++
How can I investigate this? What null distribution(s) could I use for hypothesis testing? Do I need data over time, or is a single snapshot at $t=\tau$ enough?
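To illustrate the kind of null distribution I have in mind, here is a rough sketch of one candidate, not a settled method. It assumes that each new segment of a given length is placed uniformly at random among all positions where it fits entirely inside an unpainted gap. The test statistic is the number of unpainted cells between the new segment and the nearest existing paint. All function names are hypothetical, and sequence ends are assumed not to count as paint.

```python
from itertools import groupby

def unpainted_gaps(row):
    """(start, length) of each maximal run of '-' in `row`."""
    gaps, pos = [], 0
    for ch, grp in groupby(row):
        n = len(list(grp))
        if ch == "-":
            gaps.append((pos, n))
        pos += n
    return gaps

def side_distance(gap_start, gap_len, off, seg_len, n):
    """Unpainted cells between the segment and the nearest paint, or None."""
    sides = []
    if gap_start > 0:              # paint exists to the left of this gap
        sides.append(off)
    if gap_start + gap_len < n:    # paint exists to the right of this gap
        sides.append(gap_len - seg_len - off)
    return min(sides) if sides else None

def null_distances(row, seg_len):
    """Statistic for every valid placement (exhaustive, so small N only)."""
    n, dists = len(row), []
    for start, length in unpainted_gaps(row):
        for off in range(length - seg_len + 1):
            d = side_distance(start, length, off, seg_len, n)
            if d is not None:
                dists.append(d)
    return dists

def mid_p(row, seg_start, seg_len):
    """Mid-p value of the observed distance under uniform placement."""
    n = len(row)
    for start, length in unpainted_gaps(row):
        if start <= seg_start and seg_start + seg_len <= start + length:
            observed = side_distance(start, length, seg_start - start, seg_len, n)
            break
    else:
        raise ValueError("segment is not inside an unpainted gap")
    null = null_distances(row, seg_len)
    if observed is None or not null:
        return None                # no paint yet, statistic undefined
    below = sum(d < observed for d in null)
    ties = sum(d == observed for d in null)
    return (below + 0.5 * ties) / len(null)

# State at t=9 of the first example; the segment painted at t=10 is (6, 2).
row = "--+++" + "-" * 14 + "++++" + "-" * 5
print(mid_p(row, 6, 2))
```

Under this null, the per-event values should be roughly uniform on average. Values piling up near 0 would suggest clustering and values near 1 would suggest spreading. This requires data over time, since each event is compared against the state just before it.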
It does not matter much, but in my specific case $N$ is of the order of 100 million and the largest $t$ value is of the order of 1000. I do not necessarily have access to $t$ values where the entire sequence is set to +.
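At this scale, it is probably more practical to store the painted state as a sorted list of disjoint intervals than as a string of $N$ characters. The following is a minimal sketch under that assumption. The names `paint` and `gaps` are hypothetical, and intervals are half-open, `[start, end)`.

```python
def paint(intervals, start, end):
    """Add [start, end) to a sorted list of disjoint intervals.

    Overlapping or touching intervals are merged into one.
    """
    merged, placed = [], False
    for s, e in intervals:
        if e < start:                  # entirely to the left, keep as is
            merged.append((s, e))
        elif end < s:                  # entirely to the right
            if not placed:
                merged.append((start, end))
                placed = True
            merged.append((s, e))
        else:                          # overlaps or touches: absorb it
            start, end = min(s, start), max(e, end)
    if not placed:
        merged.append((start, end))
    return merged

def gaps(intervals, n):
    """Unpainted [start, end) intervals of a length-n sequence."""
    out, pos = [], 0
    for s, e in intervals:
        if s > pos:
            out.append((pos, s))
        pos = e
    if pos < n:
        out.append((pos, n))
    return out

# Replay the paint events at t=4, t=6 and t=11 of the first example (N = 28).
state = []
for seg in [(19, 23), (2, 5), (13, 19)]:
    state = paint(state, *seg)
print(state)
print(gaps(state, 28))
```

Each update is linear in the number of painted intervals rather than in $N$, which should stay small when there are only about 1000 time steps.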