irpas技术客

Resolving the "Could not retrieve the web interface URL for the cluster" error_yfqfy


Problem description:

org.apache.flink.client.program.rest.RestClusterClient: Could not retrieve the web interface URL for the cluster.

The detailed log is as follows:

Exception in thread "main" java.util.concurrent.ExecutionException: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
    at com.dtstack.flinkx.launcher.Launcher.main(Launcher.java:131)
Caused by: org.apache.flink.runtime.client.JobSubmissionException: Failed to submit JobGraph.
    at org.apache.flink.client.program.rest.RestClusterClient.lambda$submitJob$7(RestClusterClient.java:400)
    at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:870)
    at java.util.concurrent.CompletableFuture$UniExceptionally.tryFire(CompletableFuture.java:852)
    at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
    at java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1595)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
    at org.apache.flink.runtime.concurrent.FutureUtils$Timeout.run(FutureUtils.java:1255)
    at org.apache.flink.runtime.concurrent.DirectExecutorService.execute(DirectExecutorService.java:217)
    at org.apache.flink.runtime.concurrent.FutureUtils.lambda$orTimeout$15(FutureUtils.java:582)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.co

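Download the Flink 1.13 source and build it (be sure to build the whole project; building a single module on its own can cause all sorts of problems):

Add logging to RestClusterClient: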

CompletableFuture<URL> getWebMonitorBaseUrl() {
    // Added: log the configured leader-await timeout before waiting.
    LOG.info(
            "------------------getWebMonitorBaseUrl {}, {},",
            restClusterClientConfiguration.getAwaitLeaderTimeout(),
            TimeUnit.MILLISECONDS);
    return FutureUtils.orTimeout(
                    webMonitorLeaderRetriever.getLeaderFuture(),
                    restClusterClientConfiguration.getAwaitLeaderTimeout(),
                    TimeUnit.MILLISECONDS)
            .thenApplyAsync(
                    leaderAddressSessionId -> {
                        final String url = leaderAddressSessionId.f0;
                        // Added: log the leader URL once it has been retrieved.
                        LOG.info("------------------getWebMonitorBaseUrl url is {}", url);
                        try {
                            return new URL(url);
                        } catch (MalformedURLException e) {
                            throw new IllegalArgumentException(
                                    "Could not parse URL from " + url, e);
                        }
                    },
                    executorService);
}

Note: if the build fails with the following error:

Failed to execute goal com.diffplug.spotless:spotless-maven-plugin:2.4.2:check (spotless-check) on project flink-clients_2.11: The following files had format violations: src\main\java\org\apache\flink\client\program\rest\RestClusterClient.java @@ -1,895 +1,895 @@ -/*\n

run mvn spotless:apply first to format the code.

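After the build, run the job again so that the added log lines are printed. The output looks like this:

The timeout is 30 seconds, which is the normal value.

Continuing to narrow down the problem:

In the failing environment the value is false, possibly because of the scheduling mode.

Preliminarily, the cause lies in these two places: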

The orTimeout method in org.apache.flink.runtime.concurrent.FutureUtils.

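From the Javadoc of CompletableFuture.toString(): returns a string identifying this CompletableFuture, as well as its completion state. The state, in brackets, contains the string "Completed Normally", or the string "Completed Exceptionally", or the string "Not completed" followed by the number of CompletableFutures dependent upon its completion, if any.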
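The interaction between a timeout wrapper and a future that nobody completes can be reproduced with the JDK alone. The following is a minimal sketch, not Flink's code: the class LeaderTimeoutSketch, its orTimeout helper, and the 200 ms timeout are assumptions made for illustration, and the helper only imitates the idea behind FutureUtils.orTimeout. It shows a pending future reporting "Not completed", then failing with a TimeoutException as the cause of an ExecutionException, which is the same shape as the stack trace above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class LeaderTimeoutSketch {

    /**
     * JDK-only stand-in (an assumption, not Flink's implementation): fail the
     * future with a TimeoutException if it is still pending after the timeout.
     */
    static <T> CompletableFuture<T> orTimeout(
            CompletableFuture<T> future,
            long timeout,
            TimeUnit unit,
            ScheduledExecutorService timer) {
        // completeExceptionally has no effect if the future has already completed.
        timer.schedule(
                () -> future.completeExceptionally(new TimeoutException()), timeout, unit);
        return future;
    }

    /** Extracts the bracketed completion state from CompletableFuture.toString(). */
    static String state(CompletableFuture<?> future) {
        String s = future.toString();
        return s.substring(s.indexOf('[') + 1, s.lastIndexOf(']'));
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            // Hypothetical stand-in for the leader future: nothing ever completes it,
            // as happens when no leader address is ever published to the client.
            CompletableFuture<String> leaderFuture = new CompletableFuture<>();
            System.out.println("before: " + state(leaderFuture));

            CompletableFuture<String> guarded =
                    orTimeout(leaderFuture, 200, TimeUnit.MILLISECONDS, timer);
            try {
                guarded.get();
            } catch (ExecutionException e) {
                // The TimeoutException surfaces as the cause, as in the log above.
                System.out.println("cause: " + e.getCause().getClass().getName());
            }
            System.out.println("after: " + state(leaderFuture));
        } finally {
            timer.shutdownNow();
        }
    }
}
```

Logging future.toString() at the point of the timeout, as in this sketch, is a cheap way to tell whether the leader future was still pending or had already failed for another reason. The exact capitalization and detail of the state text can differ between JDK versions.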

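The deeper underlying cause will be covered in a follow-up article once the investigation is complete.

The cause has now been identified:

Flink was started with yarn-session.sh, but ZooKeeper high availability was not configured in flink-conf.yaml.

After adding the configuration and restarting Flink, the problem was resolved.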
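The article does not list which keys were added, so the following is only a sketch of a pre-flight check under stated assumptions. The keys high-availability, high-availability.zookeeper.quorum and high-availability.storageDir are Flink's documented ZooKeeper HA options. The class HaConfigCheck, the host names and the HDFS path are placeholders, and the parser is a simplification of the flat "key: value" format, not Flink's own loader.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HaConfigCheck {

    // Illustrative flink-conf.yaml excerpt; hosts and paths are placeholders.
    static final String SAMPLE =
            "high-availability: zookeeper\n"
                    + "high-availability.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181\n"
                    + "high-availability.storageDir: hdfs:///flink/ha/\n";

    /** Parses flat "key: value" lines, skipping blank lines and comments. */
    static Map<String, String> parse(String text) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            int sep = line.indexOf(": ");
            if (line.isEmpty() || line.startsWith("#") || sep < 0) {
                continue;
            }
            conf.put(line.substring(0, sep).trim(), line.substring(sep + 2).trim());
        }
        return conf;
    }

    /** Returns one message per ZooKeeper HA setting that is missing or wrong. */
    static List<String> missingHaSettings(Map<String, String> conf) {
        List<String> problems = new ArrayList<>();
        if (!"zookeeper".equalsIgnoreCase(conf.get("high-availability"))) {
            problems.add("high-availability is not set to zookeeper");
        }
        String[] required = {
            "high-availability.zookeeper.quorum", "high-availability.storageDir"
        };
        for (String key : required) {
            String value = conf.get(key);
            if (value == null || value.isEmpty()) {
                problems.add(key + " is not set");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // A config with the HA keys present yields no problems.
        System.out.println("sample: " + missingHaSettings(parse(SAMPLE)));
        // A config without them reports each missing setting.
        System.out.println("bare:   " + missingHaSettings(parse("jobmanager.rpc.port: 6123\n")));
    }
}
```

To check a real installation, the contents of conf/flink-conf.yaml could be read into a string and passed to parse in place of SAMPLE.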
