Mathematica Asked on April 6, 2021
I am trying to classify a list sorted in increasing order into several classes in order to build a frequency distribution table. I am having difficulty nesting two pure functions: one for the second argument of TakeWhile and the other for mapping with /@. To which pure function does each # belong? It is confusing!
TakeWhile[data, #[[1]] <= # <= #[[2]]] & /@ class
Then I tried to use Function to remove the ambiguity, as follows.
TakeWhile[data, Function[u, #[[1]] <= u <= #[[2]]]] & /@ class
Unfortunately, it produces an unexpected result, as follows.
ClearAll[data, class]
data = RandomInteger[100, 20] // Sort
class = Table[{10 i, 10 i + 9}, {i, 0, 9}]
TakeWhile[data, Function[u, #[[1]] <= u <= #[[2]]]] & /@ class
outputs
{0, 5, 5, 10, 19, 22, 23, 24, 25, 33, 34, 40, 40, 42, 53, 62, 69, 74, 91, 91}
{{0, 9}, {10, 19}, {20, 29}, {30, 39}, {40, 49}, {50, 59}, {60, 69}, {70, 79}, {80, 89}, {90, 99}}
{{0, 5, 5}, {}, {}, {}, {}, {}, {}, {}, {}, {}}
To make the existing answers usable in my real scenario, let me explain a bit more about the class intervals. The class list given above is just a simplification of my real scenario.
Consider the following code, where l is the list of lower bounds of the class intervals.
ClearAll[data, n, r, k, w]
data = {
26, 22, 44, 60, 55, 58, 45, 42, 41, 44, 39, 55, 57, 52, 59, 46,
54, 56, 22, 58, 54, 34, 69, 33, 61, 20, 62, 29, 24, 53, 51, 23,
20, 38, 34, 52, 36, 52, 52, 43, 30, 51, 49, 45, 39, 42, 32, 29,
34, 47, 34, 35, 21, 54, 52, 51, 38, 57, 58, 53, 55, 44, 27, 29,
52, 22, 34, 56, 45, 53, 18, 46, 53, 51, 63, 57, 56, 28, 22, 17,
49, 21, 58, 61, 51, 28, 35, 42, 24, 55, 19, 34, 62, 30, 35, 32,
57, 47, 20, 36
} // Sort;
n = Length@data;
r = Last@data - First@data;
k = 1 + 3.322 Log[10, n] // Ceiling;
w = r/k // Ceiling;
l = Table[First@data + w*i , {i, 0, k - 1}]
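For reference, the arithmetic in the code above can be checked by hand. The sketch below assumes the values read off the sorted data (100 observations, minimum 17, maximum 69); the names nObs, lo, hi, kCheck, wCheck and lCheck are illustrative only and are not part of the original code.

```mathematica
(* Worked check of the class-count and class-width arithmetic. *)
(* nObs, lo and hi are assumed values read off the sorted data. *)
nObs = 100; lo = 17; hi = 69;
kCheck = Ceiling[1 + 3.322 Log[10, nObs]]   (* 1 + 3.322*2 = 7.644, so 8 *)
wCheck = Ceiling[(hi - lo)/kCheck]          (* 52/8 = 6.5, so 7 *)
lCheck = Table[lo + wCheck i, {i, 0, kCheck - 1}]
(* {17, 24, 31, 38, 45, 52, 59, 66} *)
```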
data = {0, 5, 5, 10, 19, 22, 23, 24, 25, 33, 34, 40, 40, 42, 53, 62, 69, 74, 91, 91};
GatherBy[data, Quotient[#, 10] &]
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {91, 91}}
Split[data, SameQ @@ Quotient[{##}, 10] &]
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {91, 91}}
Values @ GroupBy[Quotient[#, 10] &] @ data
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {91, 91}}
DeleteCases[{}] @ BinLists[data, 10]
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {91, 91}}
Or use the second column of your class as bin limits:
binlims = {Join[{-Infinity}, class[[All, 2]], {Infinity}]};
DeleteCases[{}] @ BinLists[data, binlims]
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {91, 91}}
Correct answer by kglr on April 6, 2021
A slot # always binds to the nearest enclosing &. In your first attempt:
TakeWhile[data, #[[1]] <= # <= #[[2]]] & /@ class
the three slots in #[[1]] <= # <= #[[2]] all belong to the only &, so each of them is filled with an element of class.
What is desired? We want #[[1]] and #[[2]] to belong to the function mapped over class, and the middle # to belong to the second argument of TakeWhile. One might hence suggest
TakeWhile[data, #[[1]] <= #1 <= #[[2]] &] & /@ class
This looks right, but don't forget that # is equivalent to #1. All three slots now bind to the nearest &, which is the one that closes the second argument of TakeWhile.
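As a small illustration of the binding rule, here is a sketch with made-up inputs. The names outer, r1 and r2 are hypothetical, and Select is used instead of TakeWhile only so that the result does not depend on element order.

```mathematica
(* The inner & owns the # in "# + 1"; the outer & owns the leading #. *)
outer = {#, Map[# + 1 &, {10, 20}]} &;
r1 = outer[1]
(* {1, {11, 21}} *)

(* Naming the outer argument with Function leaves # free for the inner test. *)
r2 = Function[c, Select[{1, 5, 12, 15, 30}, c[[1]] <= # <= c[[2]] &]] /@ {{0, 9}, {10, 19}}
(* {{1, 5}, {12, 15}} *)
```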
So the solution (to the #-& problem) is the OP's second attempt:
TakeWhile[data, Function[u, #[[1]] <= u <= #[[2]]]] & /@ class
but it still doesn't work. Now the problem is the logic. TakeWhile scans from the first element of the list and stops at the first element that fails the test. Here, every element of class is given the same list, data. As a result, for {10, 19} in class, for example, the first element 0 of data is not within the range, so TakeWhile halts immediately and returns {}, as can be seen in the output.
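The early halt is easy to reproduce on a short list. The following is only a sketch; sample, t1, t2 and s2 are illustrative names, and Select is shown purely for contrast.

```mathematica
sample = {0, 5, 5, 10, 19, 22};
t1 = TakeWhile[sample, 0 <= # <= 9 &]    (* {0, 5, 5} *)
t2 = TakeWhile[sample, 10 <= # <= 19 &]  (* {}: the first element 0 already fails *)
s2 = Select[sample, 10 <= # <= 19 &]     (* {10, 19}: Select tests every element *)
```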
As a solution, we should take out the elements that we want and pass the rest (which we don't need yet) on to the next element of class. This can be done with TakeDrop, and the whole iteration can be done with FoldPairList:
FoldPairList[
  TakeDrop[
    #1, LengthWhile[#1, u \[Function] Between[u, #2]]
    (* the test can also be written Between[#2], the operator form, which is shorter *)
  ] &, data, class]
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {}, {91, 91}}
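To see how the two building blocks cooperate, here is a minimal sketch on toy input (td and fp are illustrative names). TakeDrop returns the pair {taken, rest}; FoldPairList collects the first part of each pair and feeds the second part into the next step.

```mathematica
td = TakeDrop[{a, b, c, d, e}, 2]
(* {{a, b}, {c, d, e}} *)

(* Take 1, then 2, then 3 elements from whatever is left over. *)
fp = FoldPairList[TakeDrop, Range[6], {1, 2, 3}]
(* {{1}, {2, 3}, {4, 5, 6}} *)
```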
Of course, the wheel has often already been invented:
BinLists[data, {Table[10 i, {i, 0, 10}]}]
{{0, 5, 5}, {10, 19}, {22, 23, 24, 25}, {33, 34}, {40, 40, 42}, {53}, {62, 69}, {74}, {}, {91, 91}}
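One point worth keeping in mind: as far as I know, BinLists with explicit limits {b1, b2, ...} uses half-open bins, b1 <= x < b2 and so on, so a value equal to an interior limit goes into the bin on its right. A quick sketch with toy numbers (b1 is an illustrative name):

```mathematica
(* Bins are [1, 3) and [3, 6): the value 3 lands in the second bin. *)
b1 = BinLists[{1, 2, 3, 4, 5}, {{1, 3, 6}}]
(* {{1, 2}, {3, 4, 5}} *)
```

For integer data this makes the limits 0, 10, 20, ... equivalent to the closed classes {0, 9}, {10, 19}, ... used above.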
We can get the classes from the lower bounds l:
class = Partition[Flatten@{l, Infinity}, 2, 1]
{{17, 24}, {24, 31}, {31, 38}, {38, 45}, {45, 52}, {52, 59}, {59, 66}, {66, \[Infinity]}}
and use the TakeWhile method. Or, simply
BinLists[data, {Flatten@{l, Infinity}}]
{{17, 18, 19, 20, 20, 20, 21, 21, 22, 22, 22, 22, 23}, {24, 24, 26, 27, 28, 28, 29, 29, 29, 30, 30}, {32, 32, 33, 34, 34, 34, 34, 34, 34, 35, 35, 35, 36, 36}, {38, 38, 39, 39, 41, 42, 42, 42, 43, 44, 44, 44}, {45, 45, 45, 46, 46, 47, 47, 49, 49, 51, 51, 51, 51, 51}, {52, 52, 52, 52, 52, 52, 53, 53, 53, 53, 54, 54, 54, 55, 55, 55, 55, 56, 56, 56, 57, 57, 57, 57, 58, 58, 58, 58}, {59, 60, 61, 61, 62, 62, 63}, {69}}
Answered by SneezeFor16Min on April 6, 2021
Just a quick benchmark of kglr's methods in version 10.1.
f1[data_] := GatherBy[data, Quotient[#, 10] &];
f2[data_] := Split[data, SameQ @@ Quotient[{##}, 10] &];
f3[data_] := Values@GroupBy[Quotient[#, 10] &]@data;
f4[data_] := DeleteCases[{}]@BinLists[data, 10];
f5[data_] := SplitBy[data, Quotient[#, 10] &];
Needs["GeneralUtilities`"]
BenchmarkPlot[{f1, f2, f3, f4, f5}, Sort@RandomInteger[5 #, #] &,
"IncludeFits" -> True]
Answered by Mr.Wizard on April 6, 2021