Geographic Information Systems Asked by Dave X on June 7, 2021
I have two CSV files, one with spatial information, and one with temporal information, to be joined on id
It looks like there are several methods,
Layer/Add Layer/Add Vector Layer of both CSVs (or .vrts of .csvs) and then do a Properties/Join on the spatial layer to make an in-memory layer
Use MMQGIS/Combine/Attributes join from CSV file to make on-disk layer
Use processing/Vector General/Join attributes by field value
Samples of my files are like:
station_in_gis.csv: 222 stations:
"lon","lat","id","x","y","z","note"
-76.11299003247764,36.96681296908706,1,97289.70404745551,4115155.18968458,0.0," !CBBT"
-75.98798724775321,37.165139869138926,2,110001.7354024933,4137233.7577695996,0.0," !Kipp"
...
staout_melted.csv: 220 stations x 166 times = 36852 rows
"id","secs","valid_time","elev"
1,1800.0,"2020-10-22 00:30:00",0.0186178
1,3600.0,"2020-10-22 01:00:00",0.051513800000000005
1,5400.0,"2020-10-22 01:30:00",0.0897535
1,7200.0,"2020-10-22 02:00:00",0.124426
1,9000.0,"2020-10-22 02:30:00",0.198678
1,10800.0,"2020-10-22 03:00:00",0.292794
1,12600.0,"2020-10-22 03:30:00",0.40008499999999997
1,14400.0,"2020-10-22 04:00:00",0.497468
1,16200.0,"2020-10-22 04:30:00",0.5340550000000001
2,1800.0,"2020-10-22 00:30:00",-0.036708199999999996
2,3600.0,"2020-10-22 01:00:00",0.000558725
2,5400.0,"2020-10-22 01:30:00",0.0470811
2,7200.0,"2020-10-22 02:00:00",0.08760880000000001
2,9000.0,"2020-10-22 02:30:00",0.148671
2,10800.0,"2020-10-22 03:00:00",0.23444600000000002
2,12600.0,"2020-10-22 03:30:00",0.33513699999999996
2,14400.0,"2020-10-22 04:00:00",0.44592600000000004
2,16200.0,"2020-10-22 04:30:00",0.512962
2,18000.0,"2020-10-22 05:00:00",0.548772
...
I did write a couple .vrt files:
<OGRVRTDataSource>
<OGRVRTLayer name="staout_melted">
<SrcDataSource relativeToVRT="1">staout_melted.csv</SrcDataSource>
<Field name="id" type="String"/>
<Field name="secs" type="Real"/>
<Field name="valid_time" type="DateTime" />
<Field name="elev" type="Real" />
</OGRVRTLayer>
</OGRVRTDataSource>
<OGRVRTDataSource>
<OGRVRTLayer name="station_in_gis">
<SrcDataSource relativeToVRT="1">station_in_gis.csv</SrcDataSource>
<GeometryType>wkbPoint</GeometryType>
<LayerSRS>WGS84</LayerSRS>
<GeometryField encoding="PointFromColumns" x="lon" y="lat" />
<Field name="lon" type="Real" />
<Field name="lat" type="Real" />
<Field name="id" type="String"/>
<Field name="x" type="Real"/>
<Field name="y" type="Real" />
<Field name="z" type="Real" />
<Field name="note" type="String" />
</OGRVRTLayer>
</OGRVRTDataSource>
What I would like to do is join these two together and end up with a layer that ends up time aware, ultimately enabling one to color points by elev
scanning across time with the temporal control or click-to-ID station 1
(CBBT) at 2020-10-22 03:30:00
as having elev=0.4000
What steps do I need to do to:
elev
?elev
data?I’ve tried MMQGIS and the Layer/Properties/Join, but an id
on a point only seems to get one time stamp, and enabling the temporal controller seems to make the points vanish. I think I’ve not joined properly.
With MMQGIS, I can use the .vrt file on the spatial file before the join, but MMQGIS doesn’t seem to join .vrt, just CSVs, so I can’t control the field types, in particular, the valid_time field. I do end up with a large inner join with 36582 spatio-temporal observations, but without temporal awareness.
With the layer/properties/join, I seem only to get my 222 stations with one single timestamp.
From the comments:
I tried a join external to QGIS using the csvkit’s csvjoin
with a csvjoin -c id -u 0 station_in_gis.csv staout_melted.csv >csvjoined_schism.csv
command, and, with a .vrt file this makes it into what seems to be a temporally compatible file. At the layer level I can Properties/Temporal/Single field
with date/time and I get a clock icon on the layer. I can then Project/Properties/Temporal/Calculate
from Layers and it sets temporal extents. But under the ‘id’ tool each point has multiple ids, one for each timestamp. Am I doing this right?
I’m able to make a Virtual layer join with /Layer/Create Layer/Add Virtual Layer
on my tables with a SELECT * FROM station_in_gis, staout_melted where station_in_gis.id=staout_melted.id
However, I can’t set temporal on that layer’s valid_time field though. I think I’ll need a richer SQL for the join. (it would be nice if you could recover or edit the SQL from the virtual table’s properties…You can with right-click on the layer and Edit Virtual Layer
)
SELECT lon,lat,station_in_gis.id as id_0, staout_melted.id as id_1, valid_time /*:DateTime*/,elev, station_in_gis.geometry FROM station_in_gis, staout_melted where station_in_gis.id = staout_melted.id
seems to work to Join, in Layer/Create Layer/Add Virtual Layer
as shown in the screenshot below, but the valid_time
columns remains stubbornly a String field.
The best performance seems to be that one can join spatial and temporal CSV files outside of QGIS with a database program or csvkit
's csvjoin:
csvjoin -c id -u 0 station_in_gis.csv staout_melted.csv > csvjoined_schism.csv
with a .vrt file:
<OGRVRTDataSource>
<OGRVRTLayer name="csvjoined_schism">
<SrcDataSource relativeToVRT="1">csvjoined_schism.csv</SrcDataSource>
<GeometryType>wkbPoint</GeometryType>
<LayerSRS>WGS84</LayerSRS>
<GeometryField encoding="PointFromColumns" x="lon" y="lat" />
<Field name="lon" type="Real" />
<Field name="lat" type="Real" />
<Field name="id" type="String"/>
<Field name="x" type="Real"/>
<Field name="y" type="Real" />
<Field name="z" type="Real" />
<Field name="secs" type="Real"/>
<Field name="valid_time" type="DateTime" />
<Field name="elev" type="Real" />
<Field name="note" type="String" />
</OGRVRTLayer>
</OGRVRTDataSource>
This makes it into a temporally compatible file. At the layer level you can Properties/Temporal/Single field with date/time
and get a clock icon on the layer overview. You can then Project/Properties/Temporal/Calculate from Layers
and it sets temporal extents of the project to match the layers.
The symbology can be set as normal, but by using the 'Data Driven Override` boxes to the right of the symbology-defining fields, you can make the symbology vary with time.
Another way is to add the spatially aware CSV file and the temporally aware VRT file as separate layers using their VRT files, and then join the files with /Layer/Create Layer/Add Virtual Layer/
with SQL like:
SELECT lon,lat,station_in_gis.id as id_0, staout_melted.id as id_1, valid_time ,elev, station_in_gis.geometry FROM station_in_gis, staout_melted where station_in_gis.id = staout_melted.id
...but virtual layers use SQLITE and do not represent DateTime fields. You can work around this by using expressions with the for the temporal information. Under Layer/Properties/Temporal
you can apply an expression to a field to get the temporal timestamps with Temporal/Start and End Date/Time from Expressions
with, for example to_datetime( valid_time)-make_interval(minutes:=15)
and to_datetime( valid_time)+make_interval(minutes:=15)
:
The virtual join seems to work, but the performance seems to be slow compared to the externally-joined table.
Answered by Dave X on June 7, 2021
Get help from others!
Recent Answers
Recent Questions
© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP