Mathematica Asked by Stephan on December 1, 2020
I’d like to transform this dataset:
Dataset[{
<|"Name" -> "Steven","Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
<|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
<|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70|>,
<|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}]
into this dataset:
Dataset[{
<|"Name" -> "Steven", "Born" -> 1980, "Score_2017" -> 115, "Score_2018" -> 230|>,
<|"Name" -> "Joe", "Born" -> 1981, "Score_2017" -> 70, "Score_2018" -> 300|>
}]
In R’s tidyr this operation is called pivot_wider
.
Any pointers?
I know that it requires a GroupBy["Name"]
This is the simplest way I could think off:
ds = Dataset[{
<|"Name" -> "Steven","Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
<|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
<|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70|>,
<|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}];
ds[
GroupBy[#Name &],
ResourceFunction["MergeByKey"][{{"Name", "Born"} -> First}]
][
Values,
Append[KeyDrop[#, {"Year", "Score"}],
AssociationThread["Score_" <> ToString[#]& /@ #Year, #Score]
]&
]
If different people have different years, you might want to use KeyUnion
to make the dataset square again. Documentation for MergeByKey
is here
Correct answer by Sjoerd Smit on December 1, 2020
You can use the package "DataReshape.m" as shown below. (It is in my ToDo list to submit the corresponding functions to Wolfram Function Repository very soon.)
Load the package:
Import["https://raw.githubusercontent.com/antononcube/MathematicaForPrediction/master/DataReshape.m"]
Here is the original dataset:
dsData = Dataset[{<|"Name" -> "Steven", "Born" -> 1980,
"Year" -> 2017, "Score" -> 115|>, <|"Name" -> "Steven",
"Born" -> 1980, "Year" -> 2018, "Score" -> 230|>, <|
"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017,
"Score" -> 70|>, <|"Name" -> "Joe", "Born" -> 1981,
"Year" -> 2018, "Score" -> 300|>}];
dsData
Let us convert it to a “proper” long form with ID variables all except “Score”:
dsLong = ToLongForm[dsData, {"Name", "Born", "Year"}, Automatic]
Here we change the values of the column “Variable” to have the values of the column “Year” as suffixes:
dsLong2 =
dsLong[All,
Join[KeyTake[#, {"Name", "Born", "Value"}], <|
"Variable" ->
StringRiffle[{#Variable, ToString[#Year]}, "_"]|>] &]
Here we convert to wide form with ID variables “Name” and “Born”:
ToWideForm[dsLong2, {"Name", "Born"}, "Variable", "Value"]
Answered by Anton Antonov on December 1, 2020
Both Sjoerd's and Anton's answers work nicely. Upon Sjoerd's solution using MergeByKey
from the Wolfram Function Repository, I just wanted to post here for completeness that one can find almost the same solution with Merge
from the Wolfram Language:
ds = Dataset[{
<|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2017, "Score" -> 115|>,
<|"Name" -> "Steven", "Born" -> 1980, "Year" -> 2018, "Score" -> 230|>,
<|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2017, "Score" -> 70 |>,
<|"Name" -> "Joe", "Born" -> 1981, "Year" -> 2018, "Score" -> 300|>
}];
ds[GroupBy["Name"], Merge[#, Identity] &][Values,
Append[Map[First, KeyDrop[#, {"Year", "Score"}]],
AssociationThread["Score_" <> ToString[#] & /@ #Year, #Score]] &]
Answered by Stephan on December 1, 2020
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP