2-dimensional ranking of multiple arrays

Question

Given a fixed dimension $d$, say $d=2$ we want the following:
Input: $A_1ldots A_m$: $m$ arrays of length $n$ of integers
Each input array $A_i$ must be a permutation of the numbers $1..n$, so in each array each number from $1$ to $n$ appears exactly once.
Output: For each tuple (pairs in the case $d=2$; triplets in the case of $d=3$ etc. In this example we'll use pairs) of numbers $(1,1),(1,2)dots(n,n)$, we want a count for in how many input arrays the first number of the tuple is also the first to appear in the array (among the numbers of that tuple). The order in which the other numbers of the tuple appear in an array doesn't matter, as long as they come later than the first number of the tuple.

Question: Can this be done quicker than $O(mn^d)$ in the worst case?

Upper and lower bounds
The output is represented as a $d$-dimensional array of length $n$. Therefore a lower bound for the runtime complexity is $O(n^d)$.
The naive approach is to create $m$ mappings from each number to its index for each input array. Then for all $n^d$ tuples, walk through the $m$ mappings, yielding a runtime complexity upper bound of $O(dmn^d)$ and since $d$ is a constant this is $O(mn^d)$.
Examples
A = (1,2,3,4),        Output =  1 2 3 4
    (1,2,3,4),                  -------
    (1,2,3,4),     =>       1 | 4 4 4 4
    (1,2,3,4)               2 | 0 4 4 4
                            3 | 0 0 4 4
    d=2, m=4, n=4           4 | 0 0 0 4

=======================================

A = (4,3,2,1),         Output = 1 2 3 4
    (1,2,3,4),                  -------
    (1,2,3,4)      =>       1 | 3 2 2 2
                            2 | 1 3 2 2
    d=2, m=3, n=4           3 | 1 1 3 2
                            4 | 1 1 1 3

Application
While writing poker analysis software, I'm particularly interested in the case $d=3, mapprox 1250, napprox 1250$. I estimate that the naive $O(mn^d)$ solution takes multiple hours but less than a day when using native Java arrays (no hashmaps etc) on a single thread.
$d$ stands for the number of players that are still active during a poker hand. Normal poker software handles the case $d=2$. Some high-end software handles the case $d=3$.
I'm interested in the case $d=2$, but then the naive approach is already quick enough in most situations. I'm mainly interested in the case $d=3$. I'm less (but still) interested in the case $d=4$ which is probably unfeasible and even less interested in greater values. I'm not interested in $d>10$. A poker table has 10 players max. The values of $m$ and $n$ do not increase/decrease with $d$.

Optidad · Answer

Here is a solution quite simple to implement, that only improves the $d$ factor.
For the increasing series $[1, 2, .., n]$, get the tuple list of the coordinates to increase in your output array. For $d = 2$ and $n = 4$, this is $H=[(1, 1), (1, 2), (1, 3), (1, 4), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]$. Asymptotically (not exactly due to $(a, a)$ being always valid), there are $O(frac{n^d}{d})$ terms in this list.
Initialize an output array $B$ of size $n^d$ to zeros.
Then, for each of the $m$ arrays $A$ and for each tuple $(a_0, a_1, ..., a_d)in H$, increase $B[A[a_0], A[a_1], ..., A[a_d]]]$ by 1.
It achieves $O(frac{m n^d}{d})$.
EDIT
For $d=3$, basically, both tuples $(0, 1, 2)$ and $(0, 2, 1)$ are in the coordinates list to increase. But if you want to use the knowledge that $B[a, b, c] = B[a, c, b]$, you have to keep only tuples non-decreasing after the 2nd integer. So here you leave $(0, 2, 1)$. Be careful to still keep $(a, b, b)$ tuples like $(0, 2, 2)$.
Then you will divide the number of terms in the list by about a factor $2(d-1)!$ for the general case).
At the very end, you have to add a $O(n^d)$ step to gather counts on $B[a, b, c]$ and $B[a, c, b]$. For every tuple $(a, b, c)$ with $b < c$,

$S = B[a, b, c] + B[a, c, b]$
$B[a, b, c] = S$
$B[a, c, b] = S$

This runs in $O(frac{m n^d}{d!})$.

2-dimensional ranking of multiple arrays

One Answer

Add your own answers!

Ask a Question