Database Administrators Asked by Jodocus on October 28, 2021
I am working in computational science, and we frequently produce large amounts of highly structured data that depend on various input parameters. Right now, storing the data is managed via files, but analyzing and querying it with a relational database and SQL seems far superior to me.
I am suggesting we use a central database server (e.g. PostgreSQL), which would be especially helpful for reproducibly reporting and "backing up" the data, or for making it accessible to others. However, the data we produce is sometimes worthless and known to be immediately discardable after a quick inspection, so it only makes sense to store it (if at all) on a central server if it has some worth. Each data-production run can generate anywhere from hundreds of KB to tens of GB, so depending on the amount, transferring it over the network to the server can be expensive. This is no problem when querying (you can decide on the server side what you need), but when filling the database, large amounts of data have to be moved.
What I would like to do is generate the data locally (i.e. write it to a local hard drive) and then explicitly (or implicitly, during the night) commit it to the real SQL server if it is good. Ideally, the codes that generate this data would not have to be adapted depending on whether the data is destined for the local database or to be sent directly to the remote server, somewhat like git, where I can push a local repository to the remote one (and vice versa, if required).
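One way to get that "same code, either target" property is to write the generation code against Python's DB-API (PEP 249), which both `sqlite3` and `psycopg2` implement, and pass the connection in from outside. A minimal sketch, assuming hypothetical table and column names (`runs`, `run_id`, etc.) and a simple key/value result layout; the only real wrinkle between the two drivers is the placeholder style:

```python
import sqlite3

# Sketch of a backend-agnostic writer (all table/column names hypothetical).
# Both sqlite3 and psycopg2 follow the DB-API (PEP 249); the placeholder
# style differs ("?" vs "%s"), so it is passed in alongside the connection.

def store_run(conn, paramstyle, run_id, results):
    """Insert one production run; works unchanged on SQLite or PostgreSQL."""
    ph = "?" if paramstyle == "qmark" else "%s"
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS runs (run_id TEXT, key TEXT, value REAL)"
    )
    cur.executemany(
        f"INSERT INTO runs (run_id, key, value) VALUES ({ph}, {ph}, {ph})",
        [(run_id, k, v) for k, v in results.items()],
    )
    conn.commit()

# Local target: a throwaway SQLite database (":memory:" here for the demo;
# in practice this would be a file on the local hard drive).
local = sqlite3.connect(":memory:")
store_run(local, "qmark", "run-001", {"energy": -1.5, "pressure": 2.0})

# The remote target would be the same call with a different connection, e.g.:
#   remote = psycopg2.connect(host="db-server", dbname="simdata")
#   store_run(remote, "pyformat", "run-001", results)

print(local.execute("SELECT COUNT(*) FROM runs").fetchone()[0])  # → 2
```

The "push" step then reduces to re-running `store_run` against the remote connection for every run you decided to keep.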
I was thinking of combining SQLite and PostgreSQL, such that local SQLite databases are migrated to the server when required. Done this way, however, only features supported by both systems could be used (not a showstopper, but a limitation). Also, migrating databases and table structures from SQLite to PostgreSQL is surely not trivial; I guess it requires some additional scripting.
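The scripting involved does not have to be large for simple schemas. There are off-the-shelf tools for this direction (pgloader, for example, migrates SQLite databases into PostgreSQL), but a minimal hand-rolled sketch gives a feel for what is required, mainly mapping SQLite's loose column types onto PostgreSQL ones and re-emitting the rows as SQL. The table name and type map below are illustrative assumptions, not a complete translation:

```python
import sqlite3

# Minimal sketch of an SQLite -> PostgreSQL export (a hand-rolled stand-in
# for tools like pgloader; handles only plain tables with simple types).

TYPE_MAP = {"INTEGER": "bigint", "REAL": "double precision",
            "TEXT": "text", "BLOB": "bytea"}

def export_table(conn, table):
    """Yield PostgreSQL-compatible DDL and INSERT statements for one table."""
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    coldefs = ", ".join(
        f"{name} {TYPE_MAP.get(ctype.upper(), 'text')}"
        for _, name, ctype, *_ in cols
    )
    yield f"CREATE TABLE {table} ({coldefs});"
    for row in conn.execute(f"SELECT * FROM {table}"):
        vals = ", ".join(
            "NULL" if v is None
            # Double single quotes to escape strings for SQL:
            else "'" + str(v).replace("'", "''") + "'" if isinstance(v, str)
            else repr(v)
            for v in row
        )
        yield f"INSERT INTO {table} VALUES ({vals});"

# Demo on a throwaway database; in practice, pipe the printed statements
# into psql on the central server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (run_id TEXT, energy REAL)")
conn.execute("INSERT INTO runs VALUES ('run-001', -1.5)")
for stmt in export_table(conn, "runs"):
    print(stmt)
```

Anything beyond flat tables (foreign keys, indexes, SQLite's dynamic typing quirks) is where the real migration effort lives, which is an argument for a dedicated tool over a home-grown script.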
The other option would be to install a local PostgreSQL server on each computer (which seems like overkill to me), but these would be fully compatible with the central server, and I could use tools like pg_dump to migrate the data together with its table structure.
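With a local PostgreSQL on each machine, the transfer step becomes a one-line pipeline using the standard tools (the database name `simdata`, the table `runs`, and the host `central-server` below are hypothetical):

```shell
# Dump schema + data from the local PostgreSQL and replay it on the
# central server in one pipeline, without an intermediate file:
pg_dump --no-owner --dbname=simdata | psql --host=central-server --dbname=simdata

# Or push a single table only:
pg_dump --no-owner --table=runs --dbname=simdata \
    | psql --host=central-server --dbname=simdata
```

Note this replays plain INSERT/COPY statements, so it works for appending new runs but will fail on rows that already exist under a unique constraint; incremental syncing needs more care.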
As I am fairly inexperienced with the possibilities: is there already some (open-source) software available that can do approximately this, or do I have to come up with a completely home-brewed solution?