packet_write_wait Broken pipe even leaving top running?

Question

This error is giving me a bigger headache every day. I have never run into a situation like this before.

After authenticating over SSH successfully and doing a few things, my SSH connection is suddenly dropped.

Here is my error message: packet_write_wait: Connection to XXX.XX.XX.XXX: Broken pipe

I would much rather be seeing the usual Write failed: Broken pipe message, believe me!

I have tried a ton of fixes from the Internet, such as adding ServerAliveInterval, ServerAliveCountMax, ClientAlive....

Someone said to set TCPKeepAlive to no and add the ServerAlive options. I did that too, but I still get the same error.

I have had no luck so far.

Any help will be appreciated.

Toan Nguyen · Accepted Answer

Dear 2018 and later readers,

Let me quote a comment from MelBurslan:

If you are in a corporate environment, check with your firewall admins and see if they were updating rules and/or restarting the firewall after some sort of a change when this happens. If it is happening to a personal server of yours, you need to provide more information on what were you doing on the sshd server side, when this happened. Broken pipe generally means there was a network disconnect for some reason.

So basically, if you are running ssh username@0.0.0.0 over a VPN (corporate environment), you will keep hitting this error over and over.

The only solution I have found so far is mosh (mobile shell). Thanks to whoever created it.

You will need to install mosh-server on your target (the server you want to SSH into) and mosh-client on your host machine.

It reconnects automatically when packets are lost, which is pretty cool and should suit most needs, I think.

Update 03/2020:

If you can't install mosh-server on your servers, then you could use my script here: https://github.com/ohmybash/oh-my-bash/blob/master/tools/autossh.sh

It reconnects automatically whenever the SSH session dies.
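
For readers curious how such a wrapper can work, here is a minimal sketch of a reconnect loop. It is not the linked script: the function name reconnect and its arguments are made up for illustration. It relies on the documented behaviour that ssh exits with status 255 when an error occurs (a dropped connection, for example) and otherwise returns the exit status of the remote command, so the loop retries only on 255.

```shell
# reconnect MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND while it exits with status 255,
# which is what ssh returns when the connection itself fails.
reconnect() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  local status
  while :; do
    status=0
    "$@" || status=$?
    # Any status other than 255 (clean logout, remote command's own
    # exit code) ends the loop and is passed back to the caller.
    if [ "$status" -ne 255 ]; then
      return "$status"
    fi
    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 255
    fi
    attempt=$((attempt + 1))
    echo "connection lost, retrying (attempt $attempt of $max_attempts)" >&2
    sleep "$delay"
  done
}

# Example (user and host are placeholders):
#   reconnect 20 5 ssh -o ServerAliveInterval=30 username@example.com
```

This only restarts the session; anything running in the old shell is lost unless it was started inside something like tmux or screen on the server. Key-based authentication avoids a password prompt on every retry.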

Happy ssh'ing!

knowah · Answer

I kept getting this error when connecting remotely to a server based in my office. We do not use a VPN so all external connections go through a proxy server. The IT department had provided the following SSH configuration entry, which I copied straight into my ~/.ssh/config:

Host xx.xx.xx.xx
  ProxyCommand ssh username@proxyserver exec netcat -w 5 %h %p

I don't know why they included the -w 5 because that sets a very short timeout of 5 seconds of inactivity before the connection is broken. Removing or increasing the -w parameter eliminates the issue (as does switching to just ProxyJump username@proxyserver, which is what IT is now recommending).

vicky penkova · Answer

ssh -o IPQoS=throughput user@{ip}

rupert160 · Answer

I discovered it was an IPQoS option issue on my VMware guest setup.
On the VM I changed the IPQoS value in ~/.ssh/config from the default of "IPQoS af21 cs1" (the first value, af21, is low-latency data for interactive sessions; the second, cs1, is lower effort for non-interactive ones). Replacing that default with a new value was my solution:

Host *
     IPQoS throughput

This worked for me. Mosh also worked, but it doesn't handle my proxy setup in a convenient way, so I stick with ProxyJump commands in

DonFeraRRi · Answer

Open the sshd_config file on the target server with the command below:

sudo nano /etc/ssh/sshd_config

Add the lines below at the end of that file:

ClientAliveInterval 300
ClientAliveCountMax 2

Press Ctrl+O and Enter to save, then reboot:

sudo reboot

This actually worked for me. I was in the same situation and tried all sorts of things, but these steps alone were enough.
I hope it works for you too.
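
ClientAliveInterval and ClientAliveCountMax are server-side options read by sshd from its sshd_config file. As a rough sketch of the same steps without a manual edit or a full reboot, the helper below sets a keyword in a config file. The name set_option is hypothetical, the match is case-sensitive for simplicity, and the service name in the usage comment is an assumption that varies by distribution (ssh on some, sshd on others).

```shell
# set_option FILE KEY VALUE
# Hypothetical helper: rewrites an existing uncommented "KEY ..." line,
# or appends "KEY VALUE" if the file has no such line yet.
set_option() {
  local file="$1"
  local key="$2"
  local value="$3"
  if grep -q "^[[:space:]]*${key}[[:space:]]" "$file"; then
    # sshd uses the first value it reads for a keyword, so an existing
    # line is replaced in place rather than overridden further down.
    sed "s/^[[:space:]]*${key}[[:space:]].*/${key} ${value}/" "$file" > "${file}.tmp"
    # Write back through the original file to keep its owner and mode.
    cat "${file}.tmp" > "$file"
    rm -f "${file}.tmp"
  else
    printf '%s %s\n' "$key" "$value" >> "$file"
  fi
}

# Example, run as root on the server:
#   set_option /etc/ssh/sshd_config ClientAliveInterval 300
#   set_option /etc/ssh/sshd_config ClientAliveCountMax 2
#   sshd -t && systemctl restart sshd   # validate the config, then restart
```

Running sshd -t before the restart checks the configuration for syntax errors, which matters on a remote machine where a broken sshd_config could lock you out. If the file ends with a Match block, an appended line becomes part of that block, so check where the new lines land.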

wget · Answer

First, make sure your issue is not related to this one.

If not and the problem is still present, read on.

I experienced this problem as well and spent a few days trying to bisect it.

Like specified, playing with SSH KeepAlive parameters or kernel TCP parameters (TCPKeepAlive on/off) does not solve the problem.

After playing with usb to ethernet drivers and TCP dump, I realized the issue was due to the kernel 4.8. I switched the source (sending side) to 4.4 LTS and the problem disappeared (rsync, scp were working nicely again). The destination side can remain on 4.8 if you want, in my use case this was working (tested).

On the technical side, we can narrow the issue down a little thanks to the Wireshark dump I made below. We can see the TCP channel of the SSHv2 protocol being reset (RST flag of TCP set to 1), causing the connection to abort. I don't know the cause of the RST yet. I need to bisect from 4.8.1 to 4.8.11 for that.

I'm not saying your problem is specifically due to kernel 4.8, but given the date you posted your question, you may have been using a kernel version that was actually buggy.

Answered initially on StackOverflow.
