Database Administrators Asked by Khom Nazid on December 28, 2021
We have a server that's disconnected from the world. It's a high-end system with 48GB of memory, a 500GB SSD, and a 16-core CPU. We're trying to do a pg_restore of a simple database dump of fewer than 10 tables, with no binary data or blobs, just plain text (a comments system). One table holds about 200GB of data, though, so it's large.
There are no other jobs for this DB, only this maintenance task. What are the best settings for this purpose given the config above? PostgreSQL's documentation doesn't help me much here; my specific question is about the WAL settings.
If we're OK with dedicating the whole server to this pg_restore, and nothing other than PostgreSQL runs on it, what settings should we use? This is what we have now:
maintenance_work_mem = 1500MB
fsync = off
synchronous_commit = off
wal_level = minimal
full_page_writes = off
wal_buffers = 64MB
#----- checkpoint_segments = 512
#----- max_wal_size = (3 * checkpoint_segments) * 16MB
#-- min_wal_size = 100MB # 80MB is the default
max_wal_size = 24576MB # based on 512 checkpoint_segments
max_wal_senders = 0
wal_keep_segments = 0
archive_mode = off
autovacuum = off
Note that using top, we find that memory is not the issue. The CPU cores spike to around 100% and then dip. This is a write-intensive process, so that makes sense. I'd welcome any simple-to-understand guidance on how min_wal_size should be set (note that it's currently commented out for us).
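For reference, the commented-out formula in the config above works out to the max_wal_size value that was set. A quick sanity check, assuming the default 16MB WAL segment size:

```python
# Sanity check for the commented formula above:
# max_wal_size = (3 * checkpoint_segments) * wal_segment_size
checkpoint_segments = 512
wal_segment_size_mb = 16  # default WAL segment size in PostgreSQL

max_wal_size_mb = 3 * checkpoint_segments * wal_segment_size_mb
print(max_wal_size_mb)  # 24576, i.e. the 24576MB used above
```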
I've given a +1 to Strahinja's answer, but have more to add than is likely to fit in a comment, so adding an answer.
The only quibble I have with any advice from Strahinja is that the optimal WAL sizes may vary depending on hardware and OS. I have seen benchmarks where leaving sizes at the default performed better than boosting them for restoring a dump. That was a surprise, but it does happen sometimes. If you do boost them, I would recommend setting the sizes to equal values not more than a third of the size of that dedicated drive that was recommended.
I'm fine with disabling autovacuum for this purpose, but be sure not to consider your restore done until you run VACUUM ANALYZE as a database superuser in the database. This will set hint bits and build the visibility map and free space map efficiently, rather than burdening your foreground queries with that work at first. The statistics will help the planner make good choices.
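The post-restore step described above can be run in the restored database (the database name here is a placeholder):

```sql
-- Run once, as a superuser, in the restored database,
-- before considering the restore complete:
VACUUM ANALYZE;
```

or from the shell, e.g. `psql -d your_database -c 'VACUUM ANALYZE;'`.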
And be absolutely sure you set the configuration back to a sane production configuration and restart the database service before starting any production work on the system!
Answered by kgrittn on December 28, 2021
I assume you only want to test whether restoring the dump works and nothing more, which means you can make some unsafe configuration changes. Let's start with your current settings.
These are good calls:
fsync = off
synchronous_commit = off
full_page_writes = off
wal_level = minimal
autovacuum = off
These three only matter if you are using replication, and because you already set wal_level to minimal, you are not using it, so they are irrelevant here:
wal_keep_segments = 0
archive_mode = off
max_wal_senders = 0
You have a lot of RAM that otherwise won't be used, so I would bump this one up even more:
maintenance_work_mem = 3GB
I would leave wal_buffers at its default:
wal_buffers = -1
and bump shared_buffers (wal_buffers will then be sized automatically from shared_buffers):
shared_buffers = 4GB
What you should concentrate on is having no checkpoints, or as few as possible, during your restore. Checkpoints are triggered by max_wal_size and checkpoint_timeout. First, bump checkpoint_timeout to something like 20h, so that a timed checkpoint never happens while you restore:
checkpoint_timeout = 20h
Then set max_wal_size to a value as high as your disk space allows. If your restored DB is 200GB and your disk is 500GB, you should be safe setting max_wal_size to 100GB, because Postgres can keep roughly two checkpoints' worth of WAL on disk (about 2 x max_wal_size):
max_wal_size = 100GB
min_wal_size won't matter that much in your case, but you can probably bump it to 10GB:
min_wal_size = 10GB
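Putting these recommendations together, a restore-time postgresql.conf fragment would look roughly like this (the values are the ones suggested above; tune them to your hardware):

```
# Restore-only settings -- revert before production use!
fsync = off
synchronous_commit = off
full_page_writes = off
wal_level = minimal
autovacuum = off
maintenance_work_mem = 3GB
shared_buffers = 4GB
wal_buffers = -1            # auto-sized from shared_buffers
checkpoint_timeout = 20h    # avoid timed checkpoints during the restore
max_wal_size = 100GB
min_wal_size = 10GB
```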
I would also recommend running pg_restore with --jobs=NUM, where NUM is probably the number of CPU cores; the best value also depends on the speed of your SSD, so experiment with this parameter.
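As a concrete starting point on this 16-core box, the invocation might look like this (the database name and dump path are placeholders; note that --jobs requires a custom-format or directory-format dump):

```shell
# Parallel restore using all 16 cores; adjust --jobs to taste.
pg_restore --jobs=16 --dbname=comments_db /path/to/dump.custom
```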
Besides the Postgres settings, I would also recommend, if possible, adding an additional SATA drive (7200RPM will be fine) and symlinking the pg_wal directory to that SATA disk. That is the directory where Postgres stores WAL, and because WAL is written append-only, a SATA drive is fast enough for it. This reduces the load on the SSD, and it also means you can bump max_wal_size even higher (depending on the size of the SATA disk).
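The relocation described above can be done while the server is stopped, roughly like this (the mount point, data directory, and service name are assumptions; adjust for your system):

```shell
# Stop PostgreSQL first, then move the WAL directory and symlink it.
systemctl stop postgresql
mv /var/lib/pgsql/data/pg_wal /mnt/sata/pg_wal
ln -s /mnt/sata/pg_wal /var/lib/pgsql/data/pg_wal
systemctl start postgresql
```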
Finally, don't forget to restore your settings to sane values after the dump is restored.
Answered by Strahinja Kustudic on December 28, 2021