Out of memory, but swap available

Unix & Linux Asked on October 31, 2021

My server runs out of memory even though there is swap available.

Why?

I can reproduce it this way:

eat_20GB_RAM() {
  perl -e '$a="c"x10000000000;print "OK\n";sleep 10000';
}
export -f eat_20GB_RAM
parallel -j0 eat_20GB_RAM ::: {1..25} &

When that stabilizes (i.e. all processes reach sleep) I run a few more:

parallel --delay 5 -j0 eat_20GB_RAM ::: {1..25} &

When that stabilizes (i.e. all processes reach sleep) around 800 GB RAM/swap is used:

$ free -m
              total        used        free      shared  buff/cache   available
Mem:         515966      440676       74514           1         775       73392
Swap:       1256720      341124      915596

When I run a few more:

parallel --delay 15 -j0 eat_20GB_RAM ::: {1..50} &

I start to get:

Out of memory!

even though there is clearly swap available.

$ free
              total        used        free      shared  buff/cache   available
Mem:      528349276   518336524     7675784       14128     2336968     7316984
Swap:    1286882284  1017746244   269136040

Why?

$ cat /proc/meminfo 
MemTotal:       528349276 kB
MemFree:         7647352 kB
MemAvailable:    7281164 kB
Buffers:           70616 kB
Cached:          1503044 kB
SwapCached:        10404 kB
Active:         476833404 kB
Inactive:       20837620 kB
Active(anon):   476445828 kB
Inactive(anon): 19673864 kB
Active(file):     387576 kB
Inactive(file):  1163756 kB
Unevictable:       18776 kB
Mlocked:           18776 kB
SwapTotal:      1286882284 kB
SwapFree:       269134804 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:      496106244 kB
Mapped:           190524 kB
Shmem:             14128 kB
KReclaimable:     753204 kB
Slab:           15772584 kB
SReclaimable:     753204 kB
SUnreclaim:     15019380 kB
KernelStack:       46640 kB
PageTables:      3081488 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    1551056920 kB
Committed_AS:   1549560424 kB
VmallocTotal:   34359738367 kB
VmallocUsed:     1682132 kB
VmallocChunk:          0 kB
Percpu:           202752 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:    12251620 kB
DirectMap2M:    522496000 kB
DirectMap1G:     3145728 kB

One Answer

In /proc/meminfo you find:

CommitLimit:    1551056920 kB
Committed_AS:   1549560424 kB

So you are at the commit limit.

If you have disabled overcommitting of memory (to avoid the OOM killer) with:

echo 2 > /proc/sys/vm/overcommit_memory

Then the commit limit is computed as:

2   -   Don't overcommit. The total address space commit
        for the system is not permitted to exceed swap + a
        configurable amount (default is 50%) of physical RAM.
        Depending on the amount you use, in most situations
        this means a process will not be killed while accessing
        pages but will receive errors on memory allocation as
        appropriate.

(From: https://www.kernel.org/doc/Documentation/vm/overcommit-accounting)

You can use the full memory by:

echo 100 > /proc/sys/vm/overcommit_ratio

Then you will get out-of-memory errors when physical RAM and swap are all reserved.

The name overcommit_ratio is in this case a bit misleading: You are not overcommitting anything.
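The arithmetic behind both settings can be checked by hand. The sketch below is an approximation of the kernel's calculation, not a copy of it: it assumes 4 kB pages, no reserved huge pages (HugePages_Total is 0 in the question) and that vm.overcommit_kbytes is unset. The function name commit_limit_kb is made up for illustration; the inputs are MemTotal and SwapTotal from the question's /proc/meminfo.

```shell
# commit_limit_kb MEM_TOTAL_KB SWAP_TOTAL_KB RATIO
# Hypothetical helper: estimates CommitLimit for overcommit_memory=2.
# Assumes 4 kB pages, no huge pages reserved, overcommit_kbytes unset.
commit_limit_kb() {
  mem_pages=$(( $1 / 4 ))     # MemTotal in pages
  swap_pages=$(( $2 / 4 ))    # SwapTotal in pages
  ratio=$3                    # vm.overcommit_ratio in percent
  # Integer division, as the limit is counted in whole pages
  echo $(( (mem_pages * ratio / 100 + swap_pages) * 4 ))
}

# MemTotal and SwapTotal taken from the question
commit_limit_kb 528349276 1286882284 50     # default ratio
commit_limit_kb 528349276 1286882284 100    # after raising the ratio
```

This prints 1551056920 and 1815231560, which are the CommitLimit values reported by /proc/meminfo on this machine before and after the ratio is raised to 100.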

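Even with this setup you may see out-of-memory errors while free still reports unused swap, because memory counts against the limit as soon as it is reserved, not when it is first used. This can be demonstrated with malloc.c: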

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv) {
  long bytes, sleep_sec;
  if(argc != 3) {
    printf("Usage: malloc bytes sleep_sec\n");
    exit(1);
  }
  sscanf(argv[1],"%ld",&bytes);
  sscanf(argv[2],"%ld",&sleep_sec);
  printf("Bytes: %ld Sleep: %ld\n",bytes,sleep_sec);
  /* Reserve the memory but never touch it: it counts towards
     Committed_AS without using any RAM or swap. */
  if(malloc(bytes)) {
    sleep(sleep_sec);
  } else {
    printf("Out of memory\n");
    exit(1);
  }
  return 0;
}

Compile as:

gcc -o malloc malloc.c

Run as (reserve 1 GB for 10 seconds):

./malloc 1073741824 10

If you run this you may see OOM even though there is swap free:

# Plenty of ram+swap free before we start
$ free -m
              total        used        free      shared  buff/cache   available
Mem:         515966        2824      512361          16         780      511234
Swap:       1256720           0     1256720

# Reserve 1.8 TB
$ ./malloc 1800000000000 100 &
Bytes: 1800000000000 Sleep: 100

# It looks as if there is plenty of ram+swap free
$ free -m
              total        used        free      shared  buff/cache   available
Mem:         515966        2824      512361          16         780      511234
Swap:       1256720           0     1256720

# But there isn't: It is all reserved (just not used yet)
$ cat /proc/meminfo |grep omm
CommitLimit:    1815231560 kB
Committed_AS:   1761680484 kB

# Thus this fails (as you would expect)
$ ./malloc 180000000000 100
Bytes: 180000000000 Sleep: 100
Out of memory

So while free will often give the right picture in practice, comparing CommitLimit and Committed_AS in /proc/meminfo seems to be the more reliable check.
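That comparison is easy to script. The following is a minimal sketch, assuming input in /proc/meminfo format on stdin; the function name commit_headroom_kb is hypothetical. It is fed the two lines from the demo above so the result can be checked against the transcript.

```shell
# commit_headroom_kb: reads /proc/meminfo-style text on stdin and prints
# CommitLimit minus Committed_AS in kB (hypothetical helper name).
commit_headroom_kb() {
  awk '$1 == "CommitLimit:"  { limit = $2 }
       $1 == "Committed_AS:" { used  = $2 }
       END { printf "%.0f\n", limit - used }'
}

# Values from the demo after the 1.8 TB reservation
commit_headroom_kb <<'EOF'
CommitLimit:    1815231560 kB
Committed_AS:   1761680484 kB
EOF
```

This prints 53551076, i.e. roughly 55 GB that can still be reserved, which is why the 180 GB request fails. On a live system the same function can be run as commit_headroom_kb < /proc/meminfo.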

Answered by Ole Tange on October 31, 2021
