Unix & Linux Asked on January 15, 2021
I have spent months on this problem, and I’m just about at my wits’ end. I have a home media server that runs its services in Docker containers, all defined in a single docker-compose file. The box itself is given a static IP by the router (an eero in this case). I run docker-compose up -d and leave it to host my stuff.
Anywhere from a day to a week later (it’s inconsistent), the machine loses network connectivity. The current network path is modem -> eero -> network switch -> server. The only way for me to reach the server again is to reboot it; only then does the network come back online. I originally had this problem on Debian (both 9 and 10), and since a friend of mine runs Ubuntu without issue, I switched to Ubuntu Server 20.04, but the issue persists. I briefly looked at https://github.com/moby/moby/issues/36153 as a possible root cause, but adding the files suggested there didn’t seem to make a difference.
The next consideration was that maybe it was a hardware issue, so I switched from using my onboard ethernet to using a USB-C ethernet adapter. That seemed to work for 3 days, but then I had the same problem.
At this point, I’m lost as to what I can do to narrow down the problem. I’ve looked through syslog, but nothing stands out to me there. I’ve checked the container logs, but all the containers are fine. On Debian I was using NetworkManager, but on Ubuntu I’m using systemd-networkd. Both experience this issue.
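One way to narrow down when the connection actually drops is to log reachability from the server itself at a fixed interval, then compare the first DOWN timestamp against syslog. The following is a minimal Python sketch, not something from the original setup: the function names, the 30-second interval, and the choice of a TCP check against the router on port 53 are all assumptions made for illustration.

```python
import socket
import time
from datetime import datetime


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and "network unreachable".
        return False


def format_status(when: datetime, host: str, port: int, ok: bool) -> str:
    """Build one log line: ISO timestamp, target, and UP or DOWN."""
    state = "UP" if ok else "DOWN"
    return f"{when.isoformat(timespec='seconds')} {host}:{port} {state}"


def watch(host: str, port: int, interval: float = 30.0) -> None:
    """Print one status line every `interval` seconds until interrupted."""
    while True:
        ok = is_reachable(host, port)
        print(format_status(datetime.now(), host, port, ok), flush=True)
        time.sleep(interval)


# Hypothetical usage (assumes the router answers TCP on port 53):
# watch("192.168.7.1", 53)
```

Redirecting the output to a file on local disk keeps the record available after a reboot, since the network is the thing that fails.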
My Ubuntu version is Ubuntu 20.04 LTS (GNU/Linux 5.4.0-37-generic x86_64).
My hardware info is below, in case that helps:
H/W path              Device           Class          Description
=================================================================
                                       system         System Product Name (SKU)
/0                                     bus            PRIME X370-PRO
/0/0                                   memory         64KiB BIOS
/0/2c                                  memory         16GiB System Memory
/0/2c/0                                memory         8GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2133 MHz (0.5 ns)
/0/2c/1                                memory         [empty]
/0/2c/2                                memory         8GiB DIMM DDR4 Synchronous Unbuffered (Unregistered) 2133 MHz (0.5 ns)
/0/2c/3                                memory         [empty]
/0/2e                                  memory         576KiB L1 cache
/0/2f                                  memory         3MiB L2 cache
/0/30                                  memory         16MiB L3 cache
/0/31                                  processor      AMD Ryzen 5 1600 Six-Core Processor
/0/100                                 bridge         Family 17h (Models 00h-0fh) Root Complex
/0/100/0.2                             generic        Family 17h (Models 00h-0fh) I/O Memory Management Unit
/0/100/1.3                             bridge         Family 17h (Models 00h-0fh) PCIe GPP Bridge
/0/100/1.3/0                           bus            X370 Series Chipset USB 3.1 xHCI Controller
/0/100/1.3/0/0        usb1             bus            xHCI Host Controller
/0/100/1.3/0/0/7                       generic        Belkin USB-C LAN
/0/100/1.3/0/1        usb2             bus            xHCI Host Controller
/0/100/1.3/0.1        scsi0            storage        X370 Series Chipset SATA Controller
/0/100/1.3/0.1/0      /dev/sda         disk           120GB SanDisk SDSSDA12
/0/100/1.3/0.1/0/1    /dev/sda1        volume         511MiB Windows FAT volume
/0/100/1.3/0.1/0/2    /dev/sda2        volume         111GiB EXT4 volume
/0/100/1.3/0.1/1      /dev/sdb         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/2      /dev/sdc         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/3      /dev/sdd         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/4      /dev/sde         disk           3TB Hitachi HUS72403
/0/100/1.3/0.1/5      /dev/sdf         disk           3TB Hitachi HUS72403
/0/100/1.3/0.2                         bridge         X370 Series Chipset PCIe Upstream Port
/0/100/1.3/0.2/0                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/2                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/3                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/4                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/4/0                     bus            ASM1142 USB 3.1 Host Controller
/0/100/1.3/0.2/4/0/0  usb3             bus            xHCI Host Controller
/0/100/1.3/0.2/4/0/1  usb4             bus            xHCI Host Controller
/0/100/1.3/0.2/6                       bridge         300 Series Chipset PCIe Port
/0/100/1.3/0.2/6/0    enp7s0           network        I211 Gigabit Network Connection
/0/100/1.3/0.2/7                       bridge         300 Series Chipset PCIe Port
/0/100/3.2                             bridge         Family 17h (Models 00h-0fh) PCIe GPP Bridge
/0/100/3.2/0                           display        GP107 [GeForce GTX 1050]
/0/100/3.2/0.1                         multimedia     GP107GL High Definition Audio Controller
/0/100/7.1                             bridge         Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
/0/100/7.1/0                           generic        Zeppelin/Raven/Raven2 PCIe Dummy Function
/0/100/7.1/0.2                         generic        Family 17h (Models 00h-0fh) Platform Security Processor
/0/100/7.1/0.3                         bus            Family 17h (Models 00h-0fh) USB 3.0 Host Controller
/0/100/7.1/0.3/0      usb5             bus            xHCI Host Controller
/0/100/7.1/0.3/1      usb6             bus            xHCI Host Controller
/0/100/8.1                             bridge         Family 17h (Models 00h-0fh) Internal PCIe GPP Bridge 0 to Bus B
/0/100/8.1/0                           generic        Zeppelin/Renoir PCIe Dummy Function
/0/100/8.1/0.2                         storage        FCH SATA Controller [AHCI mode]
/0/100/8.1/0.3                         multimedia     Family 17h (Models 00h-0fh) HD Audio Controller
/0/100/14                              bus            FCH SMBus Controller
/0/100/14.3                            bridge         FCH LPC Bridge
/0/101                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/102                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/103                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/104                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/105                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/106                                 bridge         Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
/0/107                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
/0/108                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
/0/109                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
/0/10a                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
/0/10b                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
/0/10c                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
/0/10d                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
/0/10e                                 bridge         Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
/0/1                                   system         PnP device PNP0c01
/0/2                                   system         PnP device PNP0b00
/0/3                                   system         PnP device PNP0c02
/0/4                                   communication  PnP device PNP0501
/0/5                                   system         PnP device PNP0c02
/1                    br-10d6cc4b0f64  network        Ethernet interface
/2                    veth80c7cea      network        Ethernet interface
/3                    enx302303052de3  network        Ethernet interface
/4                    vethf4fd33e      network        Ethernet interface
/5                    vethab1d028      network        Ethernet interface
/6                    vethb9ac1e0      network        Ethernet interface
/7                    veth00d454b      network        Ethernet interface
/8                    docker0          network        Ethernet interface
Here is my docker-compose file too. My current Docker version is Docker version 19.03.11, build dd360c7, and my docker-compose version is docker-compose version 1.26.0, build d4451659.
version: "3.7"
services:
  plex:
    image: plexinc/pms-docker
    container_name: plex
    volumes:
      - /mnt/plex/config:/config
      - /mnt/plex/Movies:/data/movies
      - /mnt/plex/Shows:/data/tvshows
      - /mnt/plex/transcode:/data/transcode
    ports:
      - 32400:32400/tcp
      - 3005:3005/tcp
      - 8324:8324/tcp
      - 32469:32469/tcp
      - 1900:1900/udp
      - 32410:32410/udp
      - 32412:32412/udp
      - 32413:32413/udp
      - 32414:32414/udp
    restart: unless-stopped
    environment:
      - PUID=1000
      - PGID=1000
      - VERSION=latest
      - TZ=America/Los_Angeles
  homebridge:
    image: oznu/homebridge:latest
    container_name: homebridge
    restart: unless-stopped
    network_mode: host
    environment:
      - TZ=America/Los_Angeles
      - PGID=1000
      - PUID=1000
      - HOMEBRIDGE_CONFIG_UI=1
      - HOMEBRIDGE_CONFIG_UI_PORT=8008
    volumes:
      - /mnt/homebridge:/homebridge
  nzbget:
    image: linuxserver/nzbget:latest
    container_name: nzbget
    volumes:
      - /mnt/nzbget/config:/config
      - /mnt/nzbget/downloads:/downloads
    restart: unless-stopped
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - PGID=1000
    ports:
      - 6789:6789
  sonarr:
    image: linuxserver/sonarr:latest
    container_name: sonarr
    restart: unless-stopped
    depends_on:
      - nzbget
    volumes:
      - /mnt/sonarr/config:/config
      - /mnt/nzbget/downloads:/downloads
      - /mnt/plex/Shows:/tv
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - PGID=1000
    ports:
      - 8989:8989
  radarr:
    image: linuxserver/radarr:latest
    container_name: radarr
    restart: unless-stopped
    depends_on:
      - nzbget
    volumes:
      - /mnt/radarr/config:/config
      - /mnt/nzbget/downloads:/downloads
      - /mnt/plex/Movies:/movies
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - PGID=1000
    ports:
      - 7878:7878
  tautulli:
    image: linuxserver/tautulli:latest
    container_name: tautulli
    depends_on:
      - plex
    restart: unless-stopped
    environment:
      - TZ=America/Los_Angeles
      - PUID=1000
      - GUID=1000
    volumes:
      - /mnt/tautulli/config:/config
      - /mnt/tautulli/logs:/logs:ro
    ports:
      - 8181:8181
If I’ve missed anything, please let me know and I’m happy to provide more information.
EDIT:
Last night I also tried updating the Realtek driver to the latest version, to see whether the driver might be the cause, because I found the following in journalctl:
Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: xHCI host not responding to stop endpoint command.
Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: xHCI host controller not responding, assume dead
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx status -108
Jun 14 01:17:25 phoenix kernel: xhci_hcd 0000:01:00.0: HC died; cleaning up
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Tx timeout
Jun 14 01:17:25 phoenix kernel: usb 1-7: USB disconnect, device number 2
Jun 14 01:17:25 phoenix kernel: r8152 1-7:1.0 enx302303052de3: Get ether addr fail
Jun 14 01:17:25 phoenix systemd-networkd[933]: enx302303052de3: Link DOWN
I did so following https://www.pcsuggest.com/install-rtl8153-driver-linux/. However, it seems the machine disconnected again this morning, so I can’t say for sure whether this helped.
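In the journal excerpt above, the order of events matters: the xHCI host controller stops responding first, then the r8152 adapter reports transmit failures, then the USB device disconnects, and only then does systemd-networkd report the link down. A small Python sketch for pulling that ordering out of a longer log follows; the patterns are copied from the excerpt, while the labels and the function name are made up for illustration.

```python
import re

# Patterns taken from the journal excerpt; the labels are illustrative only.
SIGNATURES = [
    ("usb-controller-dead", re.compile(r"xhci_hcd .*(not responding|HC died)")),
    ("nic-tx-failure", re.compile(r"r8152 .*(Tx status -\d+|Tx timeout)")),
    ("usb-disconnect", re.compile(r"USB disconnect")),
    ("link-down", re.compile(r"Link DOWN")),
]


def first_occurrences(lines):
    """Return the labels in the order each signature first appears."""
    seen = []
    for line in lines:
        for label, pattern in SIGNATURES:
            if label not in seen and pattern.search(line):
                seen.append(label)
    return seen
```

If the controller label comes before the adapter label, as it does in this excerpt, the adapter and its driver are more likely victims than the cause.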
EDIT 2:
It seems docker may be failing or restarting due to snap?
Jun 24 05:01:47 phoenix docker.dockerd[998]: failed to start containerd: timeout waiting for containerd to start
Jun 24 05:01:47 phoenix systemd[1]: snap.docker.dockerd.service: Main process exited, code=exited, status=1/FAILURE
Jun 24 05:01:47 phoenix systemd[1]: snap.docker.dockerd.service: Failed with result 'exit-code'.
Jun 24 05:01:47 phoenix systemd[1]: snap.docker.dockerd.service: Scheduled restart job, restart counter is at 1.
Jun 24 05:01:47 phoenix systemd[1]: Stopped Service for snap application docker.dockerd.
Jun 24 05:01:47 phoenix systemd[1]: Started Service for snap application docker.dockerd.
After this I can clearly see an IP reassignment being triggered, which then caused my box to go offline.
EDIT 3:
Here is a snippet from the iplog:
[2020-07-05T00:24:28.507613] Deleted dev vetha537571 lladdr 02:42:ac:13:00:02 STALE
[2020-07-05T00:24:29.019491] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
[2020-07-05T00:24:29.019674] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed REACHABLE
[2020-07-05T00:24:32.603688] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 STALE
[2020-07-05T00:24:59.227481] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 STALE
[2020-07-05T00:25:01.275258] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 STALE
[2020-07-05T00:25:30.715499] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 PROBE
[2020-07-05T00:25:30.715641] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 REACHABLE
[2020-07-05T00:25:34.299181] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE
[2020-07-05T00:25:38.139499] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 STALE
[2020-07-05T00:25:38.139586] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 STALE
[2020-07-05T00:25:39.931537] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
[2020-07-05T00:25:39.931823] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed REACHABLE
[2020-07-05T00:25:47.099314] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 PROBE
[2020-07-05T00:25:47.099401] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 PROBE
[2020-07-05T00:25:47.101034] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 REACHABLE
[2020-07-05T00:25:47.102485] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 REACHABLE
[2020-07-05T00:25:57.595220] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 PROBE
[2020-07-05T00:25:57.595308] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 REACHABLE
[2020-07-05T00:25:58.363503] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 PROBE
[2020-07-05T00:25:58.363730] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 REACHABLE
[2020-07-05T00:26:00.667505] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 STALE
[2020-07-05T00:26:12.955465] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE
[2020-07-05T00:26:19.099249] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
[2020-07-05T00:26:19.099393] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed REACHABLE
[2020-07-05T00:26:29.339502] 172.19.0.6 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:06 STALE
[2020-07-05T00:26:29.339583] 172.19.0.7 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:07 STALE
[2020-07-05T00:26:37.531222] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 STALE
[2020-07-05T00:26:37.531304] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 STALE
[2020-07-05T00:26:47.003597] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 PROBE
[2020-07-05T00:26:47.003678] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 PROBE
[2020-07-05T00:26:47.005742] 192.168.7.55 dev enp7s0 lladdr 30:23:03:01:33:c5 REACHABLE
[2020-07-05T00:26:47.007351] 192.168.7.50 dev enp7s0 lladdr 24:f5:a2:94:74:e9 REACHABLE
[2020-07-05T00:27:00.827525] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 PROBE
[2020-07-05T00:27:00.827816] 172.19.0.3 dev br-c5ca2723d156 lladdr 02:42:ac:13:00:03 REACHABLE
[2020-07-05T00:27:12.859480] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed STALE
[2020-07-05T00:27:19.003172] 192.168.7.1 dev enp7s0 lladdr 14:22:db:9c:4d:ed PROBE
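For reading a log like this, it helps to know that a neighbour entry cycling through STALE, PROBE, and REACHABLE is the kernel's normal neighbour unreachability detection; entries that end up FAILED or INCOMPLETE are the ones that would point to a problem. The Python sketch below tallies states per neighbour so the unusual ones are easier to spot. It assumes the line layout shown in the snippet above, and the function name is hypothetical.

```python
import re
from collections import Counter

# Expected layout, as in the snippet:
# [timestamp] <ip> dev <interface> lladdr <mac> <STATE>
NEIGH_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<ip>[0-9a-fA-F:.]+) dev (?P<dev>\S+) "
    r"lladdr (?P<mac>\S+) (?P<state>[A-Z]+)$"
)


def count_states(lines):
    """Count occurrences of each (ip, dev, state); other lines are skipped."""
    counts = Counter()
    for line in lines:
        match = NEIGH_LINE.match(line.strip())
        if match:
            counts[(match["ip"], match["dev"], match["state"])] += 1
    return counts
```

Lines in a different shape, such as the "Deleted dev ..." entry at the top of the snippet, are ignored rather than guessed at.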
This took some more digging to figure out. Eventually I left the server running with a monitor plugged in, and the next time I lost network connectivity I saw a CPU lockup on the screen.
Some quick searching points to a possible power-state problem with Ryzen CPUs: https://askubuntu.com/a/1259021
Following that answer, I used this guide to disable the C6 power state: https://forum.manjaro.org/t/fix-ryzen-lockups-related-to-low-system-usage/39723
I'm verging on 3 days of uptime without any issue. The machine is currently on Wi-Fi, but I intend to switch it back to wired. I will update in a month with how the uptime has been since then. Hopefully this helps the next person who experiences a similar issue.
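As a sanity check after changing power-state settings, the idle states the kernel exposes can be read from the standard cpuidle sysfs directory. The Python sketch below only lists those states and whether each is disabled; it does not disable C6 itself (the linked guide covers that). Whether any listed state corresponds to the Ryzen C6 state depends on the idle driver and firmware, so treat that mapping as an assumption. The function name is hypothetical.

```python
from pathlib import Path


def list_idle_states(cpu_dir):
    """Return (state, name, disabled) tuples read from a cpuidle directory."""
    state_dirs = sorted(
        Path(cpu_dir).glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 < state10
    )
    states = []
    for state_dir in state_dirs:
        name = (state_dir / "name").read_text().strip()
        disabled = (state_dir / "disable").read_text().strip() == "1"
        states.append((state_dir.name, name, disabled))
    return states


# Typical call on a Linux host:
# for state, name, disabled in list_idle_states("/sys/devices/system/cpu/cpu0/cpuidle"):
#     print(state, name, "disabled" if disabled else "enabled")
```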
Answered by David on January 15, 2021