14. unable to find local peer 172.16.26.250:8848

Problem description

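After getting a cluster running on virtual machines (note that when all nodes are VMs on a single host, you must configure the LAN IP rather than 127.0.0.1), I planned to use three real cloud servers to build a Nacos cluster suitable for production.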

However, I ran into some problems. The main exception was the following:

java.lang.IllegalStateException: unable to find local peer: 172.16.26.250:8848, all peers: [120.79.167.88:8848, 119.23.104.130:8848, 47.101.47.127:8848]
at com.alibaba.nacos.naming.consistency.persistent.raft.RaftPeerSet.local(RaftPeerSet.java:224)
at com.alibaba.nacos.naming.monitor.PerformanceLoggerThread.collectMetrics(PerformanceLoggerThread.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:84)
at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:93)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-10-18 14:06:45,000 ERROR Unexpected error occurred in scheduled task.

This log entry was found in logs/nacos.log. When deploying, it is worth watching the following three log files:

logs/nacos.log
logs/naming-raft.log
logs/start.out
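To locate this exception quickly in any of these files, a plain grep is enough. The following is a minimal sketch; the function name find_peer_errors is hypothetical, and the search string is taken from the exception message shown above.

```shell
# Hypothetical helper: print every line (with its line number) of the given
# log file that contains the "unable to find local peer" message.
find_peer_errors() {
    grep -n 'unable to find local peer' "$1"
}
```

For example, `find_peer_errors logs/nacos.log` lists each occurrence together with the unresolved address that Nacos tried to use.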

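The main file to check is nacos.log. At first all three nodes started without errors, but the node list in the console never showed any node information. I therefore suspected that the cluster had not really come up, and learned online which log files to check for startup problems (listed above). The log showed that Nacos was reading the intranet IP, which does not appear in the cluster list, hence the exception. Most cluster examples online use IPs in the same network segment, and there configuring the external IPs directly works; following that approach, I could never get the cluster to come up. Only after consulting the article below and modifying the startup configuration did the cluster build succeed.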
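The mismatch can be checked before starting a node by testing whether the address Nacos will use appears in the peer list. This is a minimal sketch, assuming the peer list is a text file with one ip:port entry per line (as in conf/cluster.conf); the function name peer_in_list is hypothetical.

```shell
# Hypothetical helper: succeed if "ip:port" ($1) appears as a whole line
# in the peer list file ($2). Comment lines starting with '#' are ignored.
peer_in_list() {
    local peer="$1" file="$2"
    grep -v '^[[:space:]]*#' "$file" | grep -qxF "$peer"
}
```

With the addresses from the exception above, `peer_in_list 172.16.26.250:8848 conf/cluster.conf` fails, because only the three external addresses are listed.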

https://www.wandouip.com/t5i278697/

The result is as follows:

Solution

Modify the startup.sh startup script on each node:

#===========================================================================================
# JVM Configuration
#===========================================================================================
if [[ "${MODE}" == "standalone" ]]; then
    JAVA_OPT="${JAVA_OPT} -Xms512m -Xmx512m -Xmn256m"
    JAVA_OPT="${JAVA_OPT} -Dnacos.standalone=true"
else
    JAVA_OPT="${JAVA_OPT} -server -Xms512m -Xmx512m -Xmn256m -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=320m"
    JAVA_OPT="${JAVA_OPT} -XX:-OmitStackTraceInFastThrow -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${BASE_DIR}/logs/java_heapdump.hprof"
    JAVA_OPT="${JAVA_OPT} -XX:-UseLargePages"
    JAVA_OPT="${JAVA_OPT} -Dnacos.server.ip=120.79.167.88"
fi

if [[ "${FUNCTION_MODE}" == "config" ]]; then
    JAVA_OPT="${JAVA_OPT} -Dnacos.functionMode=config"
elif [[ "${FUNCTION_MODE}" == "naming" ]]; then
    JAVA_OPT="${JAVA_OPT} -Dnacos.functionMode=naming"
fi

The added line is JAVA_OPT="${JAVA_OPT} -Dnacos.server.ip=120.79.167.88" in the cluster (else) branch. On each node, change the IP to that host's own external IP.
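Hard-coding the address means startup.sh differs on every host. One possible variation, shown only as a sketch, is to take the address from an environment variable so the same script can be copied to all nodes. Both the variable name NODE_PUBLIC_IP and the function name append_server_ip are assumptions made for this example; Nacos itself only sees the resulting -Dnacos.server.ip option.

```shell
# Hypothetical helper: append -Dnacos.server.ip to JAVA_OPT only when an
# address has been supplied. NODE_PUBLIC_IP is an assumed variable name that
# the operator would export on each host; it is not read by Nacos itself.
append_server_ip() {
    if [ -n "${NODE_PUBLIC_IP:-}" ]; then
        JAVA_OPT="${JAVA_OPT:-} -Dnacos.server.ip=${NODE_PUBLIC_IP}"
    fi
}
```

On the first node this would be used as `NODE_PUBLIC_IP=120.79.167.88 sh startup.sh`, with append_server_ip called in the cluster branch in place of the hard-coded line. When the variable is unset, JAVA_OPT is left unchanged and Nacos falls back to its own address detection.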

Let's also look at the source code:

It is in the config module, in utils/SystemConfig.java.

public class SystemConfig {
    public static final String LOCAL_IP = getHostAddress();
    private static final Logger log = LoggerFactory.getLogger(SystemConfig.class);

    private static String getHostAddress() {
        // An explicitly configured address takes precedence over detection.
        String address = System.getProperty("nacos.server.ip");
        if (StringUtils.isNotEmpty(address)) {
            return address;
        } else {
            address = "127.0.0.1";
        }
        try {
            Enumeration<NetworkInterface> en = NetworkInterface.getNetworkInterfaces();
            while (en.hasMoreElements()) {
                NetworkInterface ni = en.nextElement();
                Enumeration<InetAddress> ads = ni.getInetAddresses();
                while (ads.hasMoreElements()) {
                    InetAddress ip = ads.nextElement();
                    // Compatible with the group's non-standard 11.x network segment
                    if (!ip.isLoopbackAddress()
                        && ip.getHostAddress().indexOf(":") == -1
                        /* && ip.isSiteLocalAddress() */) {
                        return ip.getHostAddress();
                    }
                }
            }
        } catch (Exception e) {
            log.error("get local host address error", e);
        }
        return address;
    }
}

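getHostAddress() first reads the nacos.server.ip system property and returns it if it is set. Only when it is empty does the method scan the network interfaces and return the first non-loopback address that contains no ":" (that is, an IPv4 address), which on a cloud server is the intranet IP.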
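The lookup order can be mirrored in a few lines of shell to make the precedence explicit. This is an illustration only, not part of Nacos: the function resolve_local_ip is hypothetical, and the candidate interface addresses are passed in as arguments instead of being read from the system.

```shell
# Illustration only: mirrors the precedence used by getHostAddress().
# $1 = value of nacos.server.ip (may be empty)
# remaining arguments = candidate interface addresses, in enumeration order
resolve_local_ip() {
    local explicit="$1" addr
    shift
    if [ -n "$explicit" ]; then
        echo "$explicit"      # explicit setting wins, no scan is done
        return
    fi
    for addr in "$@"; do
        case "$addr" in
            127.*|*:*) continue ;;          # skip IPv4 loopback and any address containing ':'
            *) echo "$addr"; return ;;      # first remaining address is used
        esac
    done
    echo "127.0.0.1"          # fallback when nothing else qualifies
}
```

Without the property, `resolve_local_ip "" 127.0.0.1 172.16.26.250` yields the intranet address from the exception; with it, `resolve_local_ip 120.79.167.88 127.0.0.1 172.16.26.250` yields the configured external address.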

Stopping a node

After node 130 is stopped, a new leader is elected: node 88.

We then start node 130 again. As can be seen, node 130 does not regain its previous leader role; it rejoins as a follower.
