Issues Applying Selections to Hierarchical Datasets

Mathematica Asked on November 20, 2020

I have a large dataset in the format below. I’m able to graph them as polygons easily enough, but am having trouble doing fine selections of the coordinates in Dataset format for further checks and processing. (I have other functions downstream that do geographic intersections that are being thrown off.) Some days I feel like I’m finally starting to understand Mathematica dataset operators, and other days I’m just confused.

ds = Dataset[{
{<|lat -> 49.275, lng -> -123.03|>, <|lat -> 49.2753, lng -> -123.03|>, <|lat -> 49.2753, lng -> -123.03|>, <|lat -> 49.275, lng -> -123.03|>, <|lat -> 49.275, lng -> -123.03|>},
{<|lat -> 49.275, lng -> -123.029|>, <|lat -> 49.2753, lng -> -123.029|>, <|lat -> 49.2753, lng -> -123.029|>, <|lat -> 49.275, lng -> -123.029|>, <|lat -> 49.275, lng -> -123.029|>},
{<|lat -> 49.275, lng -> -123.029|>, <|lat -> 49.275, lng -> -123.029|>, <|lat -> 49.2753, lng -> -123.029|>, <|lat -> 49.2753, lng -> -123.029|>, <|lat -> 49.275, lng -> -123.029|>}}]

I have a simple check that I wanted to break out because I will probably add complexity later. I thought applying it to the dataset would be straightforward, but none of the configurations I've tried returns anything. I thought I was following the "Select Elements from Dataset" doc closely, but I still can't connect what's written there to the behaviour I'm seeing in this example. I assumed the operators would take each small polygon dataset as input, but something else seems to be happening.

testcoords[shape_] := shape[Min, "lat"] > 40 

testcoords[ds[[2]]] (* True *)

ds[Select[testcoords]] 
(* returns empty Dataset *)

ds[All, Select[testcoords]] 
(* returns {} ... *)

How can I configure a selection operator to check each polygon?

One Answer

I don't think your code is doing what you think it's doing. For example:

ds[1, 1, "lat"]
(* Missing[KeyAbsent, lat] *)

ds[[2]][Min]
(* -123.029 *)

ds[[2]][Min, "lat"]
(* Infinity *)

The first one fails because the keys in your dataset are the symbol lat, while the query (like your testcoords) looks up the string "lat". Switching the query to the symbol doesn't help either: ds[1, 1, lat] also fails, I think because lat has no value and Dataset isn't expecting a bare variable as a part specification. I could be wrong, but I think you need to put quotes around all the lat and lng keys for them to work properly in a Dataset.

When you call testcoords[ds[[2]]], I think that's the same as calling ds[[2]][Min, "lat"]. I don't really understand what that means, but it's returning Infinity, which I can't imagine is what you want. Since Infinity > 40 evaluates to True, the check passes without ever comparing an actual latitude.

Your Dataset doesn't look like any of the examples I've come across so far where each column has its own label. I'm not familiar enough with Dataset to figure out how to select items with the way you've set up your Dataset. If you can, I would recommend working with lists directly, but if you need it to be a Dataset then hopefully someone else will have a better answer.

data = {lat, lng} /. Normal[ds]
testcoords[shape_] := Min[shape[[All, 1]]] > 40
Select[data, testcoords]

Here ds is exactly as you've defined it in your question. If you have the list of associations before it's turned into a Dataset, you can drop the Normal. I find working with lists easier when it comes to graphics as well:

Graphics[Polygon/@Select[data, testcoords]]

This doesn't show anything since many of the coordinates are identical, but it is fairly straightforward. If you want the order of latitude and longitude switched, you can just do:

data = {lng, lat} /. Normal[ds]

instead, but make sure you're testing the second coordinate with testcoords.
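
If you do need to stay in Dataset, something along these lines might work. This is only a sketch under two assumptions: that the keys are rewritten as strings ("lat", "lng"), and that a function passed to Select inside a query receives each polygon as an ordinary list of associations rather than as a Dataset, which would explain why shape[Min, "lat"] does not behave there. The names polys, dsQuoted and testcoordsList are made up for illustration, and the polygons are shortened to two points each.

```mathematica
(* Sketch only: string keys are an assumption; the question used bare symbols lat and lng *)
polys = {
   {<|"lat" -> 49.275, "lng" -> -123.03|>, <|"lat" -> 49.2753, "lng" -> -123.03|>},
   {<|"lat" -> 49.275, "lng" -> -123.029|>, <|"lat" -> 49.2753, "lng" -> -123.029|>}
   };
dsQuoted = Dataset[polys];

(* Assumed: each polygon arrives here as a plain list of associations,
   so pull out the latitudes with Lookup instead of calling shape[Min, "lat"] *)
testcoordsList[shape_] := Min[Lookup[shape, "lat"]] > 40

(* Keep only the polygons whose smallest latitude exceeds 40 *)
dsQuoted[Select[testcoordsList]]
```

If the keys have to stay as bare symbols, Lookup[shape, lat] should behave the same way as long as lat has no value assigned, but I haven't checked that against your full dataset.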

Answered by MassDefect on November 20, 2020
