Unix & Linux - Asked on November 14, 2021
We have a Hadoop cluster; the machines are RHEL 7.5 and the NameNode uses port 50070.
The NameNode log shows that port 50070 is in use, but the interesting thing is that when we run netstat -tulpn | grep 50070 to find the PID, it returns nothing:
netstat -tulpn | grep 50070    (no output)
So how can that be, and how do we clear the port? Here is the relevant part of the NameNode log:
2020-07-18 21:26:22,753 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(606)) - NameNode metrics system shutdown complete.
2020-07-18 21:26:22,753 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
java.net.BindException: Port in use: linux.gg.com:50070
at org.apache.hadoop.http.HttpServer2.constructBindException(HttpServer2.java:1001)
at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1023)
at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:1080)
at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:937)
at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:170)
at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:942)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:755)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
at org.apache.hadoop.http.HttpServer2.bindListener(HttpServer2.java:988)
at org.apache.hadoop.http.HttpServer2.bindForSinglePort(HttpServer2.java:1019)
... 9 more
2020-07-18 21:26:22,755 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2020-07-18 21:26:22,757 INFO namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at
************************************************************/
[root@linux hdfs]# netstat -tulpn | grep 50070    (no PID number is returned)
The messages indicate the problem is with an HTTP server that belongs to Hadoop. Port 50070 is the default port of the HDFS NameNode web UI in Hadoop 2.x.
With netstat -tulpn, you are looking at TCP and UDP ports that are listening for incoming connections. Since the problem is with Hadoop's HTTP server, you don't need to look at UDP ports at all: HTTP only uses TCP. But since the port number is so high, it could also be occupied by an outgoing connection rather than a listener. Try netstat -tapn | grep 50070 instead.
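For example, a quick way to check for sockets in any state on that port (assuming the ss and lsof tools are installed, which is typical on RHEL 7) might be:
# all TCP sockets involving port 50070, in any state, not just LISTEN
netstat -tapn | grep 50070
# the same information from the newer ss tool
ss -tanp | grep 50070
# any process holding an open socket on port 50070
lsof -nP -i :50070
If one of these shows an ESTABLISHED connection whose local port is 50070, the port was grabbed as an ephemeral (outgoing) port by some unrelated process, which would explain why nothing shows up among the listening sockets.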
To see the range of ports that can be dynamically allocated for outgoing connections, run cat /proc/sys/net/ipv4/ip_local_port_range. You can set net.ipv4.ip_local_port_range = min_value max_value in /etc/sysctl.conf or a file under /etc/sysctl.d/ to adjust the range, but restricting the range on a busy server with a lot of outgoing connections might not be a good idea. The default range on my Debian 10 is from port 32768 to 60999; enterprise distributions may use an expanded range by default.
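As a sketch (the range shown and the file name 99-port-range.conf are just illustrative values, not a recommendation):
# show the current ephemeral port range; typically prints something like "32768 60999"
cat /proc/sys/net/ipv4/ip_local_port_range
# change it at runtime
sysctl -w net.ipv4.ip_local_port_range="32768 60999"
# make the change persistent and reload the sysctl configuration
echo "net.ipv4.ip_local_port_range = 32768 60999" > /etc/sysctl.d/99-port-range.conf
sysctl --system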
Instead, you might want to choose a non-default port for the HDFS web UI, outside the range of ports used for outgoing connections. The property dfs.namenode.http-address in hdfs-site.xml has a default value of 0.0.0.0:50070 if it is not set, so you could set it explicitly to 0.0.0.0:<some_other_port>. In other words, to move the web UI to e.g. port 32070, you could add this to your hdfs-site.xml:
<property>
  <name>dfs.namenode.http-address</name>
  <value>0.0.0.0:32070</value>
</property>
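After restarting the NameNode, the listener should then be visible again (32070 here is just the example port chosen above):
netstat -tulpn | grep 32070
ss -tlnp | grep 32070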
The 0.0.0.0 means "any IP address the system running the web UI has". You could replace it with a specific IP address if the system has multiple network interfaces with different addresses and you want the HDFS web UI to be reachable on a single IP address only.
Of course, you'll also need to document that the HDFS web UI is now on a non-default port, so that the administrators who need the web UI functionality will be able to find it.
Answered by telcoM on November 14, 2021