TransWikia.com

How does one make and use temporal joins of CSV files in QGIS?

Geographic Information Systems Asked by Dave X on June 7, 2021

I have two CSV files, one with spatial information, and one with temporal information, to be joined on id

It looks like there are several methods,

  1. Layer/Add Layer/Add Vector Layer of both CSVs (or .vrts of .csvs) and then do a Properties/Join on the spatial layer to make an in-memory layer

  2. Use MMQGIS/Combine/Attributes join from CSV file to make on-disk layer

  3. Use processing/Vector General/Join attributes by field value

Samples of my files are like:

 station_in_gis.csv: 222 stations:

"lon","lat","id","x","y","z","note"
-76.11299003247764,36.96681296908706,1,97289.70404745551,4115155.18968458,0.0,"    !CBBT"
-75.98798724775321,37.165139869138926,2,110001.7354024933,4137233.7577695996,0.0,"    !Kipp"
...

staout_melted.csv: 220 stations x 166 times = 36852 rows

"id","secs","valid_time","elev"
1,1800.0,"2020-10-22 00:30:00",0.0186178
1,3600.0,"2020-10-22 01:00:00",0.051513800000000005
1,5400.0,"2020-10-22 01:30:00",0.0897535
1,7200.0,"2020-10-22 02:00:00",0.124426
1,9000.0,"2020-10-22 02:30:00",0.198678
1,10800.0,"2020-10-22 03:00:00",0.292794
1,12600.0,"2020-10-22 03:30:00",0.40008499999999997
1,14400.0,"2020-10-22 04:00:00",0.497468
1,16200.0,"2020-10-22 04:30:00",0.5340550000000001
2,1800.0,"2020-10-22 00:30:00",-0.036708199999999996
2,3600.0,"2020-10-22 01:00:00",0.000558725
2,5400.0,"2020-10-22 01:30:00",0.0470811
2,7200.0,"2020-10-22 02:00:00",0.08760880000000001
2,9000.0,"2020-10-22 02:30:00",0.148671
2,10800.0,"2020-10-22 03:00:00",0.23444600000000002
2,12600.0,"2020-10-22 03:30:00",0.33513699999999996
2,14400.0,"2020-10-22 04:00:00",0.44592600000000004
2,16200.0,"2020-10-22 04:30:00",0.512962
2,18000.0,"2020-10-22 05:00:00",0.548772
...

I did write a couple .vrt files:

<OGRVRTDataSource>
    <OGRVRTLayer name="staout_melted">
        <SrcDataSource relativeToVRT="1">staout_melted.csv</SrcDataSource>
        <Field name="id"  type="String"/>
        <Field name="secs"  type="Real"/>
        <Field name="valid_time"  type="DateTime" />
        <Field name="elev" type="Real" />
    </OGRVRTLayer>
</OGRVRTDataSource>




<OGRVRTDataSource>
    <OGRVRTLayer name="station_in_gis">
        <SrcDataSource relativeToVRT="1">station_in_gis.csv</SrcDataSource>
        <GeometryType>wkbPoint</GeometryType>
        <LayerSRS>WGS84</LayerSRS>
        <GeometryField encoding="PointFromColumns" x="lon" y="lat" />
        <Field name="lon" type="Real" />
        <Field name="lat" type="Real" />
        <Field name="id"  type="String"/>
        <Field name="x"  type="Real"/>
        <Field name="y" type="Real" />
        <Field name="z" type="Real" />
        <Field name="note"  type="String" />

    </OGRVRTLayer>
</OGRVRTDataSource>

What I would like to do is join these two together and end up with a layer that ends up time aware, ultimately enabling one to color points by elev scanning across time with the temporal control or click-to-ID station 1 (CBBT) at 2020-10-22 03:30:00 as having elev=0.4000

What steps do I need to do to:

  1. Get the CSV files into QGIS as a time-aware joined layer?
  2. Use the Temporal Controller and id to get the temporally-varied elev?
  3. Make the symbology use this time-sensitive elev data?

I’ve tried MMQGIS and the Layer/Properties/Join, but an id on a point only seems to get one time stamp, and enabling the temporal controller seems to make the points vanish. I think I’ve not joined properly.


With MMQGIS, I can use the .vrt file on the spatial file before the join, but MMQGIS doesn’t seem to join .vrt, just CSVs, so I can’t control the field types, in particular, the valid_time field. I do end up with a large inner join with 36582 spatio-temporal observations, but without temporal awareness.

With the layer/properties/join, I seem only to get my 222 stations with one single timestamp.


From the comments:

I tried a join external to QGIS using the csvkit’s csvjoin with a csvjoin -c id -u 0 station_in_gis.csv staout_melted.csv >csvjoined_schism.csv command, and, with a .vrt file this makes it into what seems to be a temporally compatible file. At the layer level I can Properties/Temporal/Single field with date/time and I get a clock icon on the layer. I can then Project/Properties/Temporal/Calculate from Layers and it sets temporal extents. But under the ‘id’ tool each point has multiple ids, one for each timestamp. Am I doing this right?

I’m able to make a Virtual layer join with /Layer/Create Layer/Add Virtual Layer on my tables with a SELECT * FROM station_in_gis, staout_melted where station_in_gis.id=staout_melted.id However, I can’t set temporal on that layer’s valid_time field though. I think I’ll need a richer SQL for the join. (it would be nice if you could recover or edit the SQL from the virtual table’s properties…You can with right-click on the layer and Edit Virtual Layer)

SELECT lon,lat,station_in_gis.id as id_0, staout_melted.id as id_1, valid_time /*:DateTime*/,elev, station_in_gis.geometry FROM station_in_gis, staout_melted where station_in_gis.id = staout_melted.id seems to work to Join, in Layer/Create Layer/Add Virtual Layer as shown in the screenshot below, but the valid_time columns remains stubbornly a String field.

Create Virtual Layer screenshot

One Answer

The best performance seems to be that one can join spatial and temporal CSV files outside of QGIS with a database program or csvkit's csvjoin:

 csvjoin -c id -u 0 station_in_gis.csv staout_melted.csv > csvjoined_schism.csv 

with a .vrt file:

<OGRVRTDataSource>
    <OGRVRTLayer name="csvjoined_schism">
        <SrcDataSource relativeToVRT="1">csvjoined_schism.csv</SrcDataSource>
        <GeometryType>wkbPoint</GeometryType>
        <LayerSRS>WGS84</LayerSRS>
        <GeometryField encoding="PointFromColumns" x="lon" y="lat" />
        <Field name="lon" type="Real" />
        <Field name="lat" type="Real" />
        <Field name="id"  type="String"/>
        <Field name="x"  type="Real"/>
        <Field name="y" type="Real" />
        <Field name="z" type="Real" />
        <Field name="secs"  type="Real"/>
        <Field name="valid_time"  type="DateTime" />
        <Field name="elev" type="Real" />
        <Field name="note"  type="String" />

    </OGRVRTLayer>
</OGRVRTDataSource>

This makes it into a temporally compatible file. At the layer level you can Properties/Temporal/Single field with date/time and get a clock icon on the layer overview. You can then Project/Properties/Temporal/Calculate from Layers and it sets temporal extents of the project to match the layers.

The symbology can be set as normal, but by using the 'Data Driven Override` boxes to the right of the symbology-defining fields, you can make the symbology vary with time.

enter image description here

enter image description here

Another way is to add the spatially aware CSV file and the temporally aware VRT file as separate layers using their VRT files, and then join the files with /Layer/Create Layer/Add Virtual Layer/ with SQL like:

SELECT lon,lat,station_in_gis.id as id_0, staout_melted.id as id_1, valid_time ,elev, station_in_gis.geometry FROM station_in_gis, staout_melted where station_in_gis.id = staout_melted.id 

...but virtual layers use SQLITE and do not represent DateTime fields. You can work around this by using expressions with the for the temporal information. Under Layer/Properties/Temporal you can apply an expression to a field to get the temporal timestamps with Temporal/Start and End Date/Time from Expressions with, for example to_datetime( valid_time)-make_interval(minutes:=15) and to_datetime( valid_time)+make_interval(minutes:=15):

The virtual join seems to work, but the performance seems to be slow compared to the externally-joined table.

enter image description here

Answered by Dave X on June 7, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP