Mathematica Asked by user6546 on February 11, 2021
I have a Dataset which is derived from the output of another program. I have written some functions to retrieve and format this data. I can make this work as intended by using Table to apply the function to each row of the Dataset, but I cannot achieve the same result when attempting to use some of the built-in capabilities of Dataset. Can someone point me in the right direction?
Below is the statement that works with Table, followed by the alternate syntax that doesn't work. Both lines are intended to apply the function dsGetValueList to each row of dsAllApples.
dsAllAppleParamValues = Table[dsGetValueList[dsAllApples[i], dsApplesAllParams], {i, 1, Length@dsAllApples}];
dsAllAppleParamValues2 = dsAllApples[All, dsGetValueList[#, dsApplesAllParams] &] // Normal;
The structure of the Dataset might be non-standard, but it is derived from another program and that can't be changed. Further background: the source file is a JSON file, which can be imported with Import using the option "RawJSON" to obtain a Dataset.
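Code for a test case is below. In summary, the code changes the nested data, where each item carries a list of name/value parameter records, into a flat table with one row per item and one column per parameter.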
(*sample data*)
item01 = <| "name" -> "item01", "class" -> "apples" ,
"params" -> {<| "name" -> "TYPE", "value" -> "fuji"|>
, <| "name" -> "WEIGHT", "value" -> "0.5"|>
, <| "name" -> "COLOR", "value" -> "red"|>
}|>;
item02 = <| "name" -> "item02", "class" -> "apples" ,
"params" -> {<| "name" -> "TYPE", "value" -> "gala"|>
, <| "name" -> "COLOR", "value" -> "red"|>
, <| "name" -> "EXP_DATE", "value" -> "10/10/20"|>
, <| "name" -> "WEIGHT", "value" -> "1.5"|>
}|>;
item03 = <| "name" -> "item03", "class" -> "apples" ,
"params" -> {<| "name" -> "TYPE", "value" -> "granny"|>
, <| "name" -> "COLOR", "value" -> "green"|>
}|>;
item04 = <| "name" -> "item04", "class" -> "oranges" ,
"params" -> {<| "name" -> "TYPE", "value" -> "navwl"|>
, <| "name" -> "WEIGHT", "value" -> "3.5"|>
, <| "name" -> "EXP_DATE", "value" -> "09/10/20"|>
}|>;
item05 = <| "name" -> "item05", "class" -> "oranges" ,
"params" -> {<| "name" -> "TYPE", "value" -> "seville"|>
, <| "name" -> "WEIGHT", "value" -> "1.5"|>
, <| "name" -> "EXP_DATE", "value" -> "09/10/20"|>
}|>;
dsAll = Dataset[{item01, item02, item03, item04, item05}];
(*useful functions*)
dsGetName[ds_] := ds["name"]
dsGetValue[ds_, pName_] := Module[{paramDS, valueList},
  paramDS = ds["params"];
  valueList = Normal@paramDS[Select[#name == pName &], "value"];
  If[Length[valueList] > 0, First[valueList], "-"]
]
dsGetValueList[ds_, pList_List] :=
  Module[{}, dsGetValue[ds, #] & /@ pList]
(*retrieve metadata about apples: their names and parameters*)
dsAllApples = dsAll[Select[#class == "apples" &] ]
dsAllAppleNames = dsAllApples[All, dsGetName] // Normal;
dsApplesAllParams =
dsAllApples[All, "params", All, "name"] // Normal // Flatten //
Union;
(*retrieve parameter values for each apple; there may be missing values*)
(* -- the first statement works as intended*)
(* -- the second statement does not*)
dsAllAppleParamValues =
Table[dsGetValueList[dsAllApples[i], dsApplesAllParams], {i, 1,
Length@dsAllApples}];
dsAllAppleParamValues2 =
dsAllApples[All, dsGetValueList[#, dsApplesAllParams] &] // Normal;
Equal[dsAllAppleParamValues2, dsAllAppleParamValues]
(*format results*)
r1 = Prepend[Transpose[dsAllAppleParamValues], dsAllAppleNames] //
Transpose ;
TableForm[r1,
TableHeadings -> {None, Prepend[dsApplesAllParams, "Name"]}]
This is a bit awkward, but perhaps you can use it as a starting point:
dsApples = dsAll[Select[#class === "apples" &], {"name", "params"}];
tmp = Join[dsApples[All, Key["name"] /* <|"Name" -> Identity|>],
Dataset[KeyUnion[(Apply[AssociationThread] @* Transpose) /@
Normal[dsApples[All, Lookup["params"] /* Values]],
Missing[] &]], 2];
tmp[All, {"Name", "COLOR", "EXP_DATE", "TYPE", "WEIGHT"}]
I'll leave the reformatting to a TableForm[] object up to you.
Answered by J. M.'s ennui on February 11, 2021
The difference between your two approaches is that in the first version, extracting a part of a dataset with dsAllApples[i] returns the part wrapped in Dataset, while in the second approach the function receives each row without the Dataset wrapper. So, you can just add the Dataset wrapper yourself with:
dsAllAppleParamValues2 = dsAllApples[
All,
dsGetValueList[Dataset@#, dsApplesAllParams]&
] //Normal;
dsAllAppleParamValues == dsAllAppleParamValues2
True
That being said, the version without the Dataset head is probably easier to work with, so I would modify your dsGetValueList function to work with non-Dataset objects (in this case, just an Association).
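A minimal sketch of that suggestion follows. It is not part of the original answer: the names getValue, getValueList, sampleRows, sampleParams and sampleValues are hypothetical, and each row is assumed to be a plain Association whose "params" entry is a list of name/value records, as in the question's sample data.

```mathematica
(* Sketch only: getValue and getValueList are hypothetical names, not from the post. *)
(* Each row is assumed to be a plain Association with a "params" list of records. *)
getValue[item_Association, pName_String] :=
  Module[{matches},
    (* keep the parameter records whose "name" equals pName *)
    matches = Select[item["params"], #["name"] === pName &];
    (* return the first matching value, or "-" when the parameter is absent *)
    If[matches === {}, "-", First[matches]["value"]]
  ]

getValueList[item_Association, pList_List] := getValue[item, #] & /@ pList

(* two rows copied from the sample data in the question *)
sampleRows = {
   <|"name" -> "item01", "class" -> "apples",
     "params" -> {<|"name" -> "TYPE", "value" -> "fuji"|>,
       <|"name" -> "WEIGHT", "value" -> "0.5"|>,
       <|"name" -> "COLOR", "value" -> "red"|>}|>,
   <|"name" -> "item03", "class" -> "apples",
     "params" -> {<|"name" -> "TYPE", "value" -> "granny"|>,
       <|"name" -> "COLOR", "value" -> "green"|>}|>};
sampleParams = {"COLOR", "EXP_DATE", "TYPE", "WEIGHT"};

(* one list of values per row, in the order given by sampleParams *)
sampleValues = getValueList[#, sampleParams] & /@ sampleRows
```

With definitions like these, the Dataset form from the question would presumably become dsAllApples[All, getValueList[#, dsApplesAllParams] &] // Normal, since the row function is handed an Association rather than a Dataset.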
Answered by Carl Woll on February 11, 2021
Here is a way that generates the columns in the order that they occur in the original dataset:
dsAll[
Select[#class==="apples"&] /* KeyUnion
, <| "Name" -> #name, #name -> #value& /@ #params |>&
]
If the exact order of the columns is important, an additional re-ordering stage can be added:
dsAll[
Select[#class==="apples"&] /* KeyUnion
, <| "Name" -> #name, #name -> #value& /@ #params |>&
][All, {"Name", "COLOR", "EXP_DATE", "TYPE", "WEIGHT"}]
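To see what the second argument of that query does to each row, here is a small sketch on plain associations, outside of Dataset. The names flattenRow, sampleRows and flatRows are hypothetical and only for illustration; the two rows are copied from the question's sample data.

```mathematica
(* Sketch only: flattenRow, sampleRows and flatRows are hypothetical names. *)
sampleRows = {
   <|"name" -> "item01", "class" -> "apples",
     "params" -> {<|"name" -> "TYPE", "value" -> "fuji"|>,
       <|"name" -> "WEIGHT", "value" -> "0.5"|>,
       <|"name" -> "COLOR", "value" -> "red"|>}|>,
   <|"name" -> "item03", "class" -> "apples",
     "params" -> {<|"name" -> "TYPE", "value" -> "granny"|>,
       <|"name" -> "COLOR", "value" -> "green"|>}|>};

(* turn one nested row into a flat association: "Name" plus one key per parameter; *)
(* the inner pure function maps each parameter record to a rule name -> value *)
flattenRow = <|"Name" -> #name, #name -> #value & /@ #params|> &;

(* KeyUnion pads each row with Missing[...] entries so that all rows share the same keys *)
flatRows = KeyUnion[flattenRow /@ sampleRows]
```

Here the second row has no "WEIGHT" parameter, so after KeyUnion that key holds a Missing[...] value instead of being absent, which is what lets the rows line up as table columns.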
Answered by WReach on February 11, 2021