FlinkSQL error: org.apache.flink.util.FlinkException: Could not upload job files.

1. The Flink version in use

Flink 1.12.1

2. The error scenario

While integrating Flink with Hive (3.1.2), this error appeared when running the statement select * from emp through sql-client.sh embedded.
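
For reference, a minimal sketch of the reproduction steps (the table name comes from the query above; the Hive catalog setup in the SQL client configuration is assumed):

    # from the Flink home directory, start the SQL client in embedded mode
    bin/sql-client.sh embedded

    -- inside the client, against the Hive catalog
    select * from emp;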

—> Error message

org.apache.flink.util.FlinkException: Could not upload job files.

—> Analysis

org.apache.flink.util.FlinkException: Could not upload job files

This exception on its own is too generic to localize the problem; the more telling lines are the ones below:

2022-03-31 14:50:10,856 INFO  org.apache.flink.runtime.net.ConnectionUtils                 [] - Trying to connect to redhat113/192.168.0.113:6123
2022-03-31 14:50:10,857 INFO  org.apache.flink.runtime.net.ConnectionUtils                 [] - Failed to connect from address 'redhat114/192.168.0.114': Connection refused (Connection refused)

As it turned out, this error also traces back to Flink's HA directory on HDFS: Flink could not replicate its job files there, because at that point the cluster had 0 running DataNodes. (This conclusion came from the later investigation, not directly from the error message.)

1. Since the query was launched through sql-client.sh embedded, first locate its log

  • Directory: /opt/modules/flink-1.12.1/log
  • less flink-xxx-sql-client-redhat113.log

org.apache.flink.util.FlinkException: Could not upload job files.
2022-03-31 14:50:10,856 INFO  org.apache.flink.runtime.net.ConnectionUtils                 [] - Trying to connect to redhat113/192.168.0.113:6123
2022-03-31 14:50:10,857 INFO  org.apache.flink.runtime.net.ConnectionUtils                 [] - Failed to connect from address 'redhat114/192.168.0.114': Connection refused (Connection refused)

My first suspicion was that Flink's RPC port 6123 was closed and could not be reached. Articles online describing the same error suggest adding a taskmanager.host setting to /opt/modules/flink-1.12.1/conf/flink-conf.yaml:

  • taskmanager.host: localhost

However, my Flink deployment is a cluster, not a single node, and the workers file (/opt/modules/flink-1.12.1/conf/workers) is already configured, so this fix did not apply to my case.
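
Before changing any ports, it is also worth checking directly whether the JobManager RPC port is reachable; a quick sketch using the hostnames from the log lines above:

    # on the JobManager host (redhat113): is anything listening on 6123?
    ss -tlnp | grep 6123

    # from another node (redhat114): can the port be reached?
    nc -zv redhat113 6123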

Next I changed the Flink configuration in /opt/modules/flink-1.12.1/conf/flink-conf.yaml:

  • jobmanager.rpc.port: 6124

[Note] The Flink cluster must be restarted after changing the configuration file.
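
For a standalone cluster the restart can be done with the scripts shipped with Flink, run from the Flink home directory:

    bin/stop-cluster.sh
    bin/start-cluster.sh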

The error persisted, which at least ruled out port 6123 as the cause:

2022-03-31 14:50:10,856 INFO  org.apache.flink.runtime.net.ConnectionUtils                 [] - Trying to connect to redhat113/192.168.0.113:6124
2022-03-31 14:50:10,857 INFO  org.apache.flink.runtime.net.ConnectionUtils                 [] - Failed to connect from address 'redhat114/192.168.0.114': Connection refused (Connection refused)

2. Check the newest log under /opt/modules/flink-1.12.1/log

flink-xxx-standalonesession-2-redhat113.log
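
The newest log file can be picked out by sorting on modification time:

    ls -lt /opt/modules/flink-1.12.1/log | head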

2022-03-31 10:41:42,054 WARN  org.apache.hadoop.hdfs.DataStreamer                          [] - DataStreamer Exception
org.apache.hadoop.ipc.RemoteException: File /flink/ha/default/blob/job_e2beb46ed40ee43728f876db39bbd834/blob_p-c999b9fe3580ad218137b6358353f184ee5007e7-124863f53db53f261a17d39a2c212c9e could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.

        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2312)
        at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2939)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:908)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:593)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:532)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)

        at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1553) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.ipc.Client.call(Client.java:1499) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.ipc.Client.call(Client.java:1396) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at com.sun.proxy.$Proxy26.addBlock(Unknown Source) ~[?:?]
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:520) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_291]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_291]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_291]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_291]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at com.sun.proxy.$Proxy27.addBlock(Unknown Source) ~[?:?]
        at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1085) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1866) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1668) ~[flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
        at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:716) [flink-shaded-hadoop-3-uber-3.1.1.7.1.1.0-565-9.0.jar:3.1.1.7.1.1.0-565-9.0]
2022-03-31 10:41:42,063 ERROR org.apache.flink.runtime.blob.BlobServerConnection           [] - PUT operation failed
org.apache.hadoop.ipc.RemoteException: File /flink/ha/default/blob/job_e2beb46ed40ee43728f876db39bbd834/blob_p-c999b9fe3580ad218137b6358353f184ee5007e7-124863f53db53f261a17d39a2c212c9e could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.

        ... (stack trace identical to the DataStreamer exception above)

Extracting this from the log, it matches exactly what sql-client.sh reported:

org.apache.hadoop.ipc.RemoteException: File /flink/ha/default/blob/job_e2beb46ed40ee43728f876db39bbd834/blob_p-c999b9fe3580ad218137b6358353f184ee5007e7-124863f53db53f261a17d39a2c212c9e could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.

The problem was still not obvious enough, so I tried submitting a job to the Flink cluster:

  • flink run /opt/modules/flink-1.12.1/examples/batch/WordCount.jar

2022-03-31 14:57:53,980 WARN  org.apache.hadoop.hdfs.DataStreamer                          [] - DataStreamer Exception
org.apache.hadoop.ipc.RemoteException: File /user/liuxiaoyu/.flink/application_1648691603321_0001/log4j.properties could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.

        ... (stack trace identical to the exceptions above)

Still the same error, but the scope was now narrowed to the Flink cluster itself, ruling out anything caused by the Flink SQL configuration; and since the SQL client started fine, the Hive integration settings were not to blame either.

3. Since HDFS storage is now involved, the Hadoop logs need a look as well

  • less /opt/modules/hadoop-3.3.0/logs/hadoop-liuxiaoyu-namenode-redhat113.log

The same error shows up there as well:

org.apache.hadoop.ipc.RemoteException: File /flink/ha/default/blob/job_e2beb46ed40ee43728f876db39bbd834/blob_p-c999b9fe3580ad218137b6358353f184ee5007e7-124863f53db53f261a17d39a2c212c9e could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.
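
The "0 datanode(s) running" claim can be verified directly before touching anything:

    # how many live DataNodes does the NameNode report?
    hdfs dfsadmin -report | grep -i 'live datanodes'

    # on each worker: is a DataNode process running at all?
    jps | grep DataNode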

[Summary] Taken together, every symptom involved the Flink cluster writing to HDFS, and every message hinged on the replica count, so the Hadoop DataNodes were the obvious suspect. Investigation confirmed that the cluster had no DataNodes running. Fix: delete the data files on the Hadoop master node, delete the Hadoop data directories on the other nodes, format the NameNode, and then redistribute the configuration and restart the cluster.
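
A rough sketch of that recovery; the data-directory paths below are assumptions and must match whatever dfs.namenode.name.dir / dfs.datanode.data.dir / hadoop.tmp.dir point to in your configuration:

    # on every node: stop HDFS, then remove the old data and log directories (paths assumed)
    stop-dfs.sh
    rm -rf /opt/modules/hadoop-3.3.0/data/* /opt/modules/hadoop-3.3.0/logs/*

    # on the NameNode only: re-format, then bring HDFS back up
    hdfs namenode -format
    start-dfs.sh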

[Root cause] Integrating Flink with Hive required modifying Hadoop's core-site.xml (under /opt/modules/hadoop-3.3.0/etc/hadoop/) to add the proxy-user entries below. After that change I ran hdfs namenode -format without first deleting the custom data and hdfs directories defined in the Hadoop configuration, so the DataNodes failed to start: re-formatting assigns the NameNode a new clusterID, which no longer matches the clusterID stored in the old DataNode data directories.

<property>
    <name>hadoop.proxyuser.xxx.hosts</name>
    <value>*</value>
</property>
<property>
    <name>hadoop.proxyuser.xxx.groups</name>
    <value>*</value>
</property>
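
To confirm this failure mode, compare the clusterID recorded on the NameNode with the one on a DataNode; after a re-format without cleanup they will differ (the VERSION file paths below are assumptions that depend on your configured data directories):

    # on the NameNode (path assumed)
    grep clusterID /opt/modules/hadoop-3.3.0/data/namenode/current/VERSION

    # on a DataNode host (path assumed)
    grep clusterID /opt/modules/hadoop-3.3.0/data/datanode/current/VERSION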

4. Additional notes

The error output is long, and pasting a fragment of it into a search engine rarely turns up a fix; even when the same error shows up, the proposed solutions differ. So when editing configuration files, take care that configuring one component does not change settings that other, already-configured components depend on. For example, during this troubleshooting I forgot to configure zoo.cfg referenced in Flink's configuration, and the result was exactly the same error as above, which made it hard to isolate.
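
For reference, the ZooKeeper-backed HA setup this touches is driven by entries like the following in flink-conf.yaml (the quorum hosts are assumptions; the storage directory matches the /flink/ha path visible in the logs above):

    high-availability: zookeeper
    high-availability.storageDir: hdfs:///flink/ha/
    high-availability.zookeeper.quorum: redhat113:2181,redhat114:2181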

Original: https://blog.csdn.net/yuchendejiyi/article/details/123873602
Author: 不懂书童
Title: FlinkSQL error: org.apache.flink.util.FlinkException: Could not upload job files.

