Computer Science Asked by Gizmo on February 10, 2021
Given a directed acyclic graph $G$ and a start vertex $s$ and an end vertex $e$, consider a coloring of the edges valid if, for every path from $s$ to $e$ and every color $c$, either $c$ is never encountered along that path, or every edge that is colored $c$ is visited by that path.
Given $G,s,e$, I would like to find a valid coloring that uses the minimal number of colors. Is there an efficient algorithm for this problem?
I show below an example graph and a sample solution. The circle on the left is the starting vertex, the filled circle on the right is the end vertex.
This answer is an improvement for my (already accepted) original answer, which describes an exact, but potentially very slow algorithm. This improvement was inspired by the @pcpthm answer, however I don't employ any randomization here, so this algorithm also produces the exact coloring.
For each arc $a in A$ let's consider a set of arcs $R(a) subset A$, reachable from the arc $a$ in both forward and backward directions. Any arc $a in A$ is reachable from itself, so $a in R(a)$. We can color a pair of arcs $(a_1,a_2)$ by the same color, if and only if $R(a_1) = R(a_2)$. So, as in my original answer, we can define an equivalence relation on the set $A$:
$$(a_1 sim a_2) equiv (R(a_1) = R(a_2))$$
A minimum number of colors, needed to color all the arcs in the graph $G$ according to your restriction, will be equal to the number of equivalence classes for the relation above.
We can compute sets $R(a)$ for each arc $a in A$, using the iterative process, similar to one, described in the @pcpthm answer. For each vertex $u in V$ we define two sets of arcs - a set $F(u) subset A$ of arcs, forward-reachable from the vertex $u$, and a set $B(u) subset A$ of arcs, backward-reachable from the vertex $u$. For any arc $a=(u,v)$ we can represent its $R(a)$ as a union of three non-intersecting sets:
$$R(a) = B(u) cup {(u,v)} cup F(v)$$
Sets $F(u)$ and $B(u)$ for each vertex $u in V$ can be computed using topological ordering on the set $V$. For example, the backward (according to this ordering) scan of the set $V$ will give us all the sets $F(u)$ using the formula:
$$F(u) = bigcup_{v in N_{out}(u)}({(u,v)} cup F(v))$$
where $N_{out}(u)$ - open "out"-neighborhood of the vertex $u$. Each such set can be represented by a binary number of the length $m=|A|$. The union operation here for large $m$ can be performed in $O(m)$ time, so the total time to compute all the $R(a)$ will be $O((n+m)m)$.
This algorithm is slower than the randomized algorithm in the @pcpthm answer, however it looks like it's the price we need to pay for the exact solution. The culprit is a necessity to work with large sets of arcs, which can't be represented by a single machine word.
Answered by HEKTO on February 10, 2021
There is a simple randomized linear-time algorithm (one-sided error). It is based on HEKTO's idea, using the equivalent relation.
The algorithm chooses weight $w_a$ for each arc $a$. Then, the algorithm computes the weighted sum of paths $W(a) = sum_{p in P(a)} prod_{a' in p} w_{a'}$ for each each arc $a$. All $W$ values can be computed by using two dynamic programming (combining "forward" DP and "backward" DP) and in $Theta(n + m)$ arithmetic operations. The algorithm then assign one color for each $W$ value, using a hash map.
Pseudo code:
forward = [ 1 if it is the source, 0 otherwise | vertices ]
for each arc a in topological order:
forward[a.to] += forward[a.from] * w[a]
backward = [ 1 if it is the destination, 0 otherwise | vertices ]
for each arc a in reverse topological order:
backward[a.from] += w[a] * backward[a.to]
for each arc a:
W[a] = forward[a.from] * w[a] * backward[a.to]
It is easy to see $W(a) = W(b)$ if and only if $P(a) = P(b)$ if weights $w$ are treated as formal variables. According to Schwartz–Zippel lemma, if we choose weight randomly from a finite field $F$, then one particular equality fails in at most probability $m/|F|$. The overall success probability of the algorithm can be bounded by $1 - m^3 / 2|F|$ because we have at most $m choose 2$ equations we want to distinguish, but it should be more like $approx 1 - m^2/|F|$ for "typical input" (though I'm not very sure). We can choose a large prime of size $p approx 2^{64}$ and do a modular arithmetic $F = GF(p)$ to implement the algorithm.
Answered by pcpthm on February 10, 2021
I present a refinement on HEKTO's algorithm that I think works and should be more efficient: it runs in $O^*(min(n^3,m^2))$ time.
Let $P(a)$ denote the set of paths that start at $s$, go through the arc $a$, and end at $e$.
Lemma 1. $a_1,a_2$ can be given the same color iff $P(a_1)=P(a_2)$.
Let $G^*$ be the dual graph of $G$, i.e., each arc of $G$ is a vertex of $G^*$, and for each pair of arcs $u to v$, $v to w$ in $G$ we connect the corresponding vertices with a directed arc in $G^*$. The start vertex of $G^*$ is a new vertex $s_0$, and it has an arc in $G^*$ to each arc $s to v$ in $G$; and similarly for its end vertex.
Lemma 2. An arc $a_2$ is in every path of $P(a_1)$ iff $a_2$ is a dominator or post-dominator of $a_1$ in $G^*$.
Say that $a_1 prec a_2$ if $a_1$ is the immediate dominator of $a_2$ and $a_2$ is the immediate post-dominator of $a_1$ in $G^*$.
Lemma 3. $P(a)=P(a')$ iff there exists a sequence of arcs $a_1,dots,a_n$ such that $a=a_1 prec a_2 prec cdots prec a_n=a'$.
This theory immediately leads to an efficient algorithm for your problem:
Compute the dominator tree $D$ and post-dominator tree $D'$ of $G^*$.
Initialize a Union-Find data structure with each arc of $G$ in its own set.
For arc $a_1$ of $G$, let $a_2$ be its immediate dominator in $D$; if $a_1$ is the immediate dominator of $a_2$ in $D'$, call Union($a_1,a_2$).
Assign a different color to each set of the Union-Find data structure.
If $G$ has $n$ vertices and $m$ arcs, then $G^*$ has $m$ vertices and $min(n^3,m^2)$ arcs. Computing the dominator tree can be done in nearly linear time (see e.g., https://en.wikipedia.org/wiki/Dominator_(graph_theory)#Algorithms or Dominator Tree for DAG). The Union-Find algorithm can be done in nearly linear time. Thus, the running time is essentially $O(min(n^3,m^2))$, ignoring logarithmic factors.
I wouldn't be surprised if there is a more efficient way to compute the dominator tree of $G^*$ without constructing $G^*$ explicitly, which would lead to improvements in the running time of this algorithm.
Proof of Lemma 1. If $P(a_1) ne P(a_2)$, there is some path that goes through $a_1$ but not $a_2$ (or vice versa), and then by the requirements, $a_1,a_2$ cannot be given the same color.
For the converse, suppose we form equivalence classes on the arcs where $a_1,a_2$ are equivalent if $P(a_1)=P(a_2)$, give each equivalence class a unique color, and color each edge according to the color of the equivalence class it is contained in. Then this satisfies all of the constraints: for any color $c$ and any two arcs $a_1,a_2$ colored $c$, we have $P(a_1)=P(a_2)$, so any path $p in P(a_1)$ also satisfies $p in P(a_2)$ and thus visits $a_2$; and any path $p notin P(a_1)$ also satisfies $p notin P(a_2)$ and thus does not visit $a_2$.
I haven't written out the proofs of Lemmas 2-3, so I recommend you do that and check my reasoning before using this algorithm.
Answered by D.W. on February 10, 2021
You can color a pair of arcs $(a_1,a_2)$ by the same color, if and only if all the paths from the source to the sink, passing through the arc $a_1$, also pass through the arc $a_2$.
Let's consider the set $P$ of all paths from the source to the sink in the graph $G=(V,A)$. Let's denote the subset $P(a) subset P$ of all paths, passing through the arc $a$. Then we can define an equivalence relation on the set $A$:
$$(a_1 sim a_2) equiv (P(a_1) = P(a_2))$$
A minimum number of colors, needed to color all the arcs in the graph $G$ according to your restriction, will be equal to the number of equivalence classes for the relation above.
The algorithm to split all the arcs into such equivalence classes is exact, but may be slow for large graphs. It consists of two steps:
Step 1. For each arc $a in A$ compute the subset $P(a) subset P$. This can be done by scanning all the paths in the set $P$, and updating all the subsets $P(a)$ along each such path.
Step 2. Let's assume that we store all the subsets $P(a)$ as binary numbers. Sort the set $A$ by these numbers - this will allow us to group together all the arcs with the same subset of paths. Scan this sorted set of arcs, assigning the same color to arcs in each group.
Answered by HEKTO on February 10, 2021
It sounds to me like the greedy algorithm should work, I'm not able to come up with any counter-examples, however, I haven't had time to try to prove the claim either.
Terminology
Definition. Let $s$ be the start and $t$ be the end vertices (source and sink, respectively). Let $a$ and $b$ be vertices where there is a path from $a$ to $b$ (from now on $a < b$), i.e. $s leq a < b leq t$. We say that an $st$-path $P = left(s=v_1, dots, v_ell=tright)$ is $(a,b)$-non-laminar if $a in P$ and $b notin P$, or $a notin P$ and $b in P$. Intuitively, this means that $P$ "branches" out between $a$ and $b$, and branches in after $b$, or branches out before $a$ and branches in between $a$ and $b$.
Let us define $text{lca}(v)$ to be the vertex that is the lowest common ancestor of the vertices ${u in V mid (u,v) in E}$, i.e. the lowest common ancestor of the in-neighbors of $v$.
We say that a path is $v$-non-laminar if it is $(text{lca}(v), v)$-non-laminar.
Greedy algorithm.
(1) If a vertex has in-degree = 1 and out-degree = 1, then you use the in-arc's color for the out-arc.
(2) Every time you fan out, that is, whenever you have a vertex with out-degree at least two, each out-arc needs a new color.
(3a) Every time you fan in, i.e., there is a vertex $v$ with in-degree at least two, and there is no $v$-non-laminar path, you take the color of an in-arc of $text{lca}(v)$.
(3b) Every time you fan in and there is a $v$-non-laminar path, you need a new color.
That should cover all possible cases, and I think it shouldn't be too difficult to show that you can do it in $O(n^2)$ time. It might be possible to lower the time to $O(n + m)$, but I can't think of it now.
Answered by Pål GD on February 10, 2021
Get help from others!
Recent Questions
Recent Answers
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP