TransWikia.com

Converting xls into csv in QGIS with correct column definitions and exporting the data into postgis

Geographic Information Systems Asked by rtaani on January 13, 2021

I’m using QGIS 3.10.2 A Coruña.
After reading this tutorial (https://ieqgis.wordpress.com/2015/02/08/importing-csv-files-into-postgresql-using-the-db-manager-in-qgis/) I thought I had finally found a simple way to convert XLS data to CSV and then load that CSV into PostGIS. My earlier attempts to load the data directly through pgAdmin were very tedious, because the columns were never imported with the correct definitions. When I convert XLS to CSV via QGIS, all columns are converted to strings, which is incorrect.
Does anyone have suggestions for how Excel sheets can be converted to CSV with correct column definitions and then imported into PostGIS with those types intact?

2 Answers

You have to add a .csvt file next to the CSV (same base name) so that QGIS can read the intended data types from it.
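As a rough illustration of the sidecar file mentioned above: under the GDAL/OGR CSV driver convention, the .csvt file holds a single line with one quoted type per column, in column order (for example Integer, Real, String, Date, Time, DateTime). The file names and columns below are made up for the example, not taken from the question.

```shell
# Hypothetical sample data; replace with your own CSV export.
printf 'id,name,value,measured_on\n1,Alpha,3.5,2021-01-13\n2,Beta,4.25,2021-01-14\n' > sample.csv

# The sidecar shares the CSV's base name and lists one type per column,
# in the same order as the columns in sample.csv.
printf '"Integer","String","Real","Date"\n' > sample.csvt

# Show the result so the type line can be checked against the header.
head -n 1 sample.csv
cat sample.csvt
```

With both files in the same folder, adding sample.csv as a layer should pick up the declared types instead of treating every column as a string.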

See QGIS 3.2 - Forcing column type when importing csv

Alternatively, consider skipping CSV and loading the XLS/XLSX file directly, which may let you avoid the problem altogether.

Correct answer by bugmenot123 on January 13, 2021

I would avoid depending on QGIS to load CSV data, especially given the amount of non-spatial data that CSVs are likely to contain, whose data types can be misread.

Instead, I recommend using CSVKit to define the columns of the tables that will result from your CSV import, and then using PostGIS functions to build the spatial data, etc.

CSVKit can read a CSV and create a column definition:

csvsql -i postgresql crime.csv

Yields this result:

CREATE TABLE crime (
    "INCIDENT_ID" FLOAT NOT NULL,
    "OFFENSE_ID" BIGINT NOT NULL,
    "OFFENSE_CODE" VARCHAR(4) NOT NULL,
    "OFFENSE_CODE_EXTENSION" INTEGER NOT NULL,
    "OFFENSE_TYPE_ID" VARCHAR(30) NOT NULL,
    "OFFENSE_CATEGORY_ID" VARCHAR(28) NOT NULL,
    "FIRST_OCCURRENCE_DATE" TIMESTAMP WITHOUT TIME ZONE NOT NULL,
    "LAST_OCCURRENCE_DATE" TIMESTAMP WITHOUT TIME ZONE,
    "REPORTED_DATE" TIMESTAMP WITHOUT TIME ZONE NOT NULL,
    "INCIDENT_ADDRESS" VARCHAR(97),
    "GEO_X" FLOAT NOT NULL,
    "GEO_Y" FLOAT NOT NULL,
    "GEO_LON" FLOAT,
    "GEO_LAT" FLOAT,
    "DISTRICT_ID" INTEGER,
    "PRECINCT_ID" INTEGER,
    "NEIGHBORHOOD_ID" VARCHAR(26),
    "IS_CRIME" INTEGER NOT NULL,
    "IS_TRAFFIC" INTEGER NOT NULL
);

But even better, you can accomplish the above and load the CSV in one command:

csvsql --db postgresql://username:password@servername/databasename --tables denver_crime --insert crime.csv

Here is a tutorial I created when I needed to do just this:

https://github.com/dpsspatial/Installation-Instructions/blob/master/csvkit.md
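Once the table is loaded, the PostGIS step mentioned earlier might look something like the sketch below. It assumes that GEO_LON and GEO_LAT hold WGS 84 degrees (SRID 4326), which the source data does not confirm. The file name add_geom.sql, the geom column, the index name, and the PG_URL variable are all placeholders chosen for this example. Column names are double-quoted because csvsql created them in upper case.

```shell
# Write the SQL to a file so it can be reviewed before it touches the database.
cat > add_geom.sql <<'SQL'
-- Assumption: GEO_LON / GEO_LAT are WGS 84 longitude / latitude.
ALTER TABLE denver_crime ADD COLUMN geom geometry(Point, 4326);

UPDATE denver_crime
SET geom = ST_SetSRID(ST_MakePoint("GEO_LON", "GEO_LAT"), 4326)
WHERE "GEO_LON" IS NOT NULL AND "GEO_LAT" IS NOT NULL;

CREATE INDEX denver_crime_geom_idx ON denver_crime USING GIST (geom);
SQL

# Run the file only when a connection string is supplied.
# PG_URL is a placeholder, e.g. postgresql://username:password@servername/databasename
if [ -n "${PG_URL:-}" ]; then
    psql "$PG_URL" -f add_geom.sql
fi
```

Keeping the SQL in its own file means the same statements can also be pasted into a GUI client instead of being run through psql.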

I also recommend using DBeaver instead of pgAdmin for any of this work, as it is a much more user-friendly, analyst-oriented GUI for your database than the DBA-oriented pgAdmin. (I will have to update the tutorial screenshots: pgAdmin 3 was OK, pgAdmin 4 is way too heavy, and luckily DBeaver came along for us at the right time.)

Answered by DPSSpatial on January 13, 2021
