Computer Science Asked on December 4, 2021
Consider $n$ sets, $X_i$, each having $n$ elements or fewer, drawn among a set of at most $m gt n$ elements. In other words
$$forall i in [1 ldots n],~|X_i| le n~wedge~left|bigcup_{i=1}^n X_iright| le m$$
Consider the complete graph $G$ formed by taking every $X_i$ as a node, and weighing every edge $(i,j)$ by the cardinal of the symmetric difference $X_i triangle X_j$.
An immediate bound on the weight of the minimal spanning tree is $mathcal{O}(n^2)$, since each edge is at most $2 n$, but can we refine this to $mathcal{O}(m)$?
For illustration, consider $2 p$ sets, $p$ of which contain the integers between $1$ and $p$ and $p$ of which contain the integers of between $p+1$ and $2p$. A minimal spanning tree has weight $p$ but a poorly chose tree on this graph would have weight $(p-1)p$. Intuitively, if there are only $m$ values to chose from, the sets can’t all be that different from one another.
EDIT:
Contributor Dmitry gives a nice counterexample below in which $m$ is nearly but not quite $n^2$.
A counterexample or proof would still be of interest in the case where $m = mathcal{O}(k n)$. Can the weight of spanning-tree be bound by $mathcal{O}(f(k) n)$? By $mathcal{O}(f(k) n log^c n)$?
Interesting question.
The right intuition should probably be along the guideline that two random subsets of cardinality $n$ drawn from some $cn$ elements for some constant $c$ differ from each other significantly with a probability very close to 1 and, hence, the weight of the minimum spanning tree of the graph $G$ should be $mathcalTheta(n^2)$ on average. I cannot prove that guideline is correct, however.
Instead, I will present one series of such examples. More specifically, from some $n$ (that can be arbitrary large), there are $n$ sets, each having $(n-1)/2$ elements drawn from a set of $n$ elements, such that the cardinality of the symmetric difference between any two sets is no less than $(n-1)/2$. So the weight of the minimum spanning tree is no less than $(n-1)^2/2=mathcalTheta(n^2)$.
Here is the construction, using quadratic residue.
Example. Let $n=p$ be an odd prime. Let $X_0$ be the set of all non-zero quadratic residues of $p$ between 0 and $p-1$ inclusive. In other words, $$X_0={0le klt p: left(frac {k}pright)=1}$$ where $left(frac{cdot}pright)$ is the Legendre symbol. For $0le ilt p$, let $X_i$ be "$X_0$ plus $i$", i.e., $$X_i={0le klt p: left(frac {k-i}pright)=1}.$$ Then $|X_i|=frac{p-1}2$ for all $i$ and $|X_i triangle X_j|ge frac{p-1}2$ for all $inot=j$.
Proof: Since $left(frac{cdot}pright)$ is either $-1$, $0$, or $1$, we have $1+left(frac{cdot}pright)ge0$. Hence, $$begin{aligned} &quadquad sum_{0le klt p}left(1+left(frac {k-i}pright)right)left(1+left(frac {k-j}pright)right)\ &gesum_{0le klt p,land,left(frac {k-i}pright)=1,land, left(frac {k-j}pright)=1}left(1+left(frac {k-i}pright)right)left(1+left(frac {k-j}pright)right)\ &=sum_{0le klt p,land,left(frac {k-i}pright)=1,land, left(frac {k-j}pright)=1}4\ &=4,|X_icap X_j| end{aligned}$$ On the other hand, we have $$begin{aligned} &quadquad sum_{0le klt p}left(1+left(frac {k-i}pright)right)left(1+left(frac {k-j}pright)right)\ &=sum_{0le klt p}left(1 + left(frac {k-i}pright) + left(frac {k-j}pright)+ left(frac {k-i}pright)left(frac {k-j}pright)right)\ &=p + 0 + 0 + sum_{0le klt p} frac {k^2-(i+j)k+ij}p\ &=p-1 end{aligned}$$ Since $pnmid(-(i+j))^2-4ij=(i-j)^2$, the last equality above comes from the case of $pnmid b^2-4ac$, theorem 1 in the paper On Certain Sums with Quadratic Expressions Involving the Legendre Symbol. So we have $|X_icap X_j|le frac{p-1}4.$
Since $|X_i|=|X_j|=frac{p-1}2$, $ |X_i triangle X_j|=|X_i|+|X_j|-2|X_icap X_j|ge frac{p-1}2.$ $quadcheckmark$
For people who appreciate concrete examples, here are the sets when $n=17$, where $|X_i triangle X_j|ge 8$. $$begin{aligned} X_{0}&={phantom{1}1, phantom{1}2, phantom{1}4, phantom{1}8, phantom{1}9, 13, 15, 16 }\ X_{1}&={phantom{1}2, phantom{1}3, phantom{1}5, phantom{1}9, 10, 14, 16, phantom{1}0 }\ X_{2}&={phantom{1}3, phantom{1}4, phantom{1}6, 10, 11, 15, phantom{1}0, phantom{1}1 }\ X_{3}&={phantom{1}4, phantom{1}5, phantom{1}7, 11, 12, 16, phantom{1}1, phantom{1}2 }\ X_{4}&={phantom{1}5, phantom{1}6, phantom{1}8, 12, 13, phantom{1}0, phantom{1}2, phantom{1}3 }\ X_{5}&={phantom{1}6, phantom{1}7, phantom{1}9, 13, 14, phantom{1}1, phantom{1}3, phantom{1}4 }\ X_{6}&={phantom{1}7, phantom{1}8, 10, 14, 15, phantom{1}2, phantom{1}4, phantom{1}5 }\ X_{7}&={phantom{1}8, phantom{1}9, 11, 15, 16, phantom{1}3, phantom{1}5, phantom{1}6 }\ X_{8}&={phantom{1}9, 10, 12, 16, phantom{1}0, phantom{1}4, phantom{1}6, phantom{1}7 }\ X_{9}&={10, 11, 13, phantom{1}0, phantom{1}1, phantom{1}5, phantom{1}7, phantom{1}8 }\ X_{10}&={11, 12, 14, phantom{1}1, phantom{1}2, phantom{1}6, phantom{1}8, phantom{1}9 }\ X_{11}&={12, 13, 15, phantom{1}2, phantom{1}3, phantom{1}7, phantom{1}9, 10 }\ X_{12}&={13, 14, 16, phantom{1}3, phantom{1}4, phantom{1}8, 10, 11 }\ X_{13}&={14, 15, phantom{1}0, phantom{1}4, phantom{1}5, phantom{1}9, 11, 12 }\ X_{14}&={15, 16, phantom{1}1, phantom{1}5, phantom{1}6, 10, 12, 13 }\ X_{15}&={16, phantom{1}0, phantom{1}2, phantom{1}6, phantom{1}7, 11, 13, 14 }\ X_{16}&={phantom{1}0, phantom{1}1, phantom{1}3, phantom{1}7, phantom{1}8, 12, 14, 15 }\ end{aligned}$$
Answered by John L. on December 4, 2021
You can't. Consider the following sets for some $k$, with $m=k^2$ (they both are powers of $2$):
Each symmetric difference is at least $frac k2$. Each level has $frac mk$ sets, and there are $1 + log frac mk$ levels. Therefore, there are $frac mk(1 + log frac mk)$ sets. Since each set must have cardinality at most the number of sets, we must have $k le frac mk (1 + log frac mk)$, and it's satisfied when $m = k^2$.
The size of the minimum spanning tree is at least $frac k 2 cdot frac mk (1 + log frac mk) = Omega(m log m)$.
Answered by user114966 on December 4, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP