
Integrating Flink 1.14.4 with Ambari 2.7.5 (码道功成)


For the details of integrating Flink with Ambari, see the article "Ambari 2.7.5安装Flink1.13.2" (Installing Flink 1.13.2 on Ambari 2.7.5) on 不饿同学's CSDN blog.

This post covers the problems I ran into during installation.

1. Installation fails with: Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-710.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-710.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']

Solution:

cd /var/lib/ambari-server/resources/scripts
python configs.py -u admin -p admin -n zsCluster -l hadoop01 -t 8080 -a set -c cluster-env -k ignore_groupsusers_create -v true

Output from my run (I mistyped the value as "ture" at the time, which is why it appears that way in the log; use "true"):

2022-03-25 09:38:14,125 INFO ### Performing "set":
2022-03-25 09:38:14,125 INFO ### new property - "ignore_groupsusers_create":"ture"
2022-03-25 09:38:14,156 INFO ### on (Site:cluster-env, Tag:d88402a7-13d7-4e4d-8d8d-e8007e1319e8)
2022-03-25 09:38:14,170 INFO ### PUTting json into: doSet_version1648172294170011.json
2022-03-25 09:38:14,403 INFO ### NEW Site:cluster-env, Tag:version1648172294170011

Here, replace zsCluster with the name of your own cluster; hadoop01 is the hostname of the machine running ambari-server.
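Since a misspelled value such as "ture" is accepted silently by this command, it can help to assemble the call in a small wrapper that validates the value first. The following is only a sketch: the function name build_set_command and its defaults are my own invention for illustration, and it uses only the flags that appear in the command above.

```python
# Sketch: assemble the configs.py "set" call shown above and reject
# misspelled boolean values before they are sent to Ambari.
# build_set_command is a hypothetical helper, not part of Ambari.
import shlex

CONFIGS_SCRIPT = "/var/lib/ambari-server/resources/scripts/configs.py"


def build_set_command(cluster, host, key, value, config_type="cluster-env",
                      user="admin", password="admin", port=8080):
    """Return the argument list for a configs.py 'set' call."""
    if value not in ("true", "false"):
        raise ValueError("expected 'true' or 'false', got %r" % value)
    return [
        "python", CONFIGS_SCRIPT,
        "-u", user, "-p", password,
        "-n", cluster, "-l", host, "-t", str(port),
        "-a", "set", "-c", config_type,
        "-k", key, "-v", value,
    ]


# Print the command instead of running it, so it can be reviewed first.
cmd = build_set_command("zsCluster", "hadoop01",
                        "ignore_groupsusers_create", "true")
print(" ".join(shlex.quote(part) for part in cmd))
```

The list form can be handed to subprocess.run on the ambari-server host once the printed command looks right.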

2. Installation fails with: KeyError: 'getpwnam(): name not found: flink', ... resource_management.core.exceptions.Fail: User 'flink' doesn't exist

Traceback (most recent call last):
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 51, in _ensure_metadata
    _user_entity = pwd.getpwnam(user)
KeyError: 'getpwnam(): name not found: flink'

The above exception was the cause of the following exception:

Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.1/services/FLINK/package/scripts/flink.py", line 172, in <module>
    Master().execute()
  File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/HDP/3.1/services/FLINK/package/scripts/flink.py", line 30, in install
    group=params.flink_group
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 125, in __new__
    cls(names_list.pop(0), env, provider, **kwargs)
  File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
    self.env.run()
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 199, in action_create
    recursion_follow_links=self.resource.recursion_follow_links, safemode_folders=self.resource.safemode_folders)
  File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 53, in _ensure_metadata
    raise Fail("User '{0}' doesn't exist".format(user))
resource_management.core.exceptions.Fail: User 'flink' doesn't exist

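Solution: run the following on every server where Flink is to be installed. Note that -g flink requires the flink group to exist already, so create it first if it is missing:

groupadd flink
useradd -d /home/flink -g flink flink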
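To confirm on each host that the fix took effect before retrying the install, one option is to repeat the same lookup that failed in the traceback (pwd.getpwnam). This is a minimal sketch using only the Python standard library; the helper names user_exists and group_exists are mine, not Ambari's.

```python
# Sketch: check for the flink user and group with the same lookup
# (pwd.getpwnam) that raised KeyError in the Ambari traceback.
import grp
import pwd


def user_exists(name):
    """Return True if the account database knows a user called `name`."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def group_exists(name):
    """Return True if the group database knows a group called `name`."""
    try:
        grp.getgrnam(name)
        return True
    except KeyError:
        return False


for label, present in (("group flink", group_exists("flink")),
                       ("user flink", user_exists("flink"))):
    print("%-12s %s" % (label, "present" if present else "MISSING"))
```

If either line prints MISSING on a host, the Ambari install step will most likely fail there again with the same error.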

3. Startup fails with: JobManager memory configuration failed: Sum of configured JVM Metaspace (256.000mb (268435456 bytes)) and JVM Overhead (192.000mb (201326592 bytes)) exceed configured Total Process Memory (256.000mb (268435456 bytes))

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.configuration.IllegalConfigurationException: JobManager memory configuration failed: Sum of configured JVM Metaspace (256.000mb (268435456 bytes)) and JVM Overhead (192.000mb (201326592 bytes)) exceed configured Total Process Memory (256.000mb (268435456 bytes)).
    at org.apache.flink.runtime.jobmanager.JobManagerProcessUtils.processSpecFromConfigWithNewOptionToInterpretLegacyHeap(JobManagerProcessUtils.java:78)
    at org.apache.flink.client.deployment.AbstractContainerizedClusterClientFactory.getClusterSpecification(AbstractContainerizedClusterClientFactory.java:43)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:602)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:860)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:860)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: Sum of configured JVM Metaspace (256.000mb (268435456 bytes)) and JVM Overhead (192.000mb (201326592 bytes)) exceed configured Total Process Memory (256.000mb (268435456 bytes)).
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.deriveJvmMetaspaceAndOverheadWithTotalProcessMemory(ProcessMemoryUtils.java:157)
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.deriveProcessSpecWithTotalProcessMemory(ProcessMemoryUtils.java:114)
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:84)
    at org.apache.flink.runtime.jobmanager.JobManagerProcessUtils.processSpecFromConfig(JobManagerProcessUtils.java:83)
    at org.apache.flink.runtime.jobmanager.JobManagerProcessUtils.processSpecFromConfigWithNewOptionToInterpretLegacyHeap(JobManagerProcessUtils.java:73)
    ... 8 more
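The numbers in the message already show why the start fails: JVM Metaspace and JVM Overhead alone need more than the whole process is allowed. The snippet below simply redoes that arithmetic with the values from the error. It is a simplified re-creation for illustration, not Flink's actual code, and the function name is made up.

```python
# Sketch: redo the arithmetic behind the error message.
# Simplified illustration only; Flink's real check lives in ProcessMemoryUtils.
MB = 1024 * 1024


def fits_in_process_memory(total_process_mb, metaspace_mb, overhead_mb):
    """Return (ok, needed_mb): metaspace plus overhead must fit in the total."""
    needed_mb = metaspace_mb + overhead_mb
    return needed_mb <= total_process_mb, needed_mb


# Values taken from the error message above.
ok, needed_mb = fits_in_process_memory(total_process_mb=256,
                                       metaspace_mb=256,
                                       overhead_mb=192)
print("needed for metaspace + overhead: %d MB (%d bytes)"
      % (needed_mb, needed_mb * MB))
print("fits into 256 MB of process memory:", ok)
```

With 256 MB of total process memory there is nothing left for the heap at all, so the effective total has to be raised well above the sum of these two components.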

Troubleshooting: I worked through the documentation on the official site.

Step 1: search for the error.

Step 2: click "Configuration".

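The documentation says that in most cases you only need to set either taskmanager.memory.process.size or taskmanager.memory.flink.size, and then adjust the ratio of JVM heap to managed memory with taskmanager.memory.managed.fraction. That was frustrating: if setting one of those is all it takes, why would the cluster not start? So I went looking for these parameters in my configuration, but none of them were there. I found only two other parameters instead: [screenshot of the two parameters not preserved]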

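According to the official site, these two parameters have been deprecated since version 1.11! To be clear, I am using the latest Flink version (1.14.4).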

So I removed those two parameters and tried starting again. This time the following error appeared:

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.configuration.IllegalConfigurationException: JobManager memory configuration failed: Either required fine-grained memory (jobmanager.memory.heap.size), or Total Flink Memory size (Key: 'jobmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'jobmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly.
    at org.apache.flink.runtime.jobmanager.JobManagerProcessUtils.processSpecFromConfigWithNewOptionToInterpretLegacyHeap(JobManagerProcessUtils.java:78)
    at org.apache.flink.client.deployment.AbstractContainerizedClusterClientFactory.getClusterSpecification(AbstractContainerizedClusterClientFactory.java:43)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:602)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:860)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:860)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: Either required fine-grained memory (jobmanager.memory.heap.size), or Total Flink Memory size (Key: 'jobmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'jobmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly.
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.failBecauseRequiredOptionsNotConfigured(ProcessMemoryUtils.java:129)
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:86)
    at org.apache.flink.runtime.jobmanager.JobManagerProcessUtils.processSpecFromConfig(JobManagerProcessUtils.java:83)
    at org.apache.flink.runtime.jobmanager.JobManagerProcessUtils.processSpecFromConfigWithNewOptionToInterpretLegacyHeap(JobManagerProcessUtils.java:73)
    ... 8 more

The key error message: JobManager memory configuration failed: Either required fine-grained memory (jobmanager.memory.heap.size), or Total Flink Memory size (Key: 'jobmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'jobmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly.

In plain terms: the JobManager memory configuration failed because none of the three options has a default value, so the fine-grained heap size (jobmanager.memory.heap.size), the Total Flink Memory size (jobmanager.memory.flink.size), or the Total Process Memory size (jobmanager.memory.process.size) has to be configured explicitly.

That makes things clear: one of jobmanager.memory.heap.size, jobmanager.memory.flink.size, and jobmanager.memory.process.size must be set. But what does each of the three mean?

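While looking up these three parameters on the official site, I stumbled on this:

This further confirms that we are heading in the right direction. Here is what the three parameters mean: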

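jobmanager.memory.flink.size: Total Flink Memory size for the JobManager. This includes all the memory the JobManager consumes, except JVM Metaspace and JVM Overhead. It consists of JVM heap memory and off-heap memory.

jobmanager.memory.heap.size: JVM heap memory size for the JobManager. The recommended minimum JVM heap size is 128.000mb.

jobmanager.memory.process.size: Total Process Memory size for the JobManager. This includes all the memory the JobManager JVM process consumes: Total Flink Memory, JVM Metaspace, and JVM Overhead. In containerized setups, this should be set to the container memory.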
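To make the relationship between the three options concrete, here is a rough sketch of how the other sizes follow once jobmanager.memory.flink.size is given, using an illustrative value of 1024m. The default values in the signature (metaspace 256m, off-heap 128m, overhead fraction 0.1 bounded by 192m and 1g) are my understanding of the Flink 1.14 defaults and should be checked against the configuration reference for your version; the function itself is a simplification, not Flink's implementation.

```python
# Sketch: approximate JobManager memory breakdown derived from
# jobmanager.memory.flink.size. The defaults below are assumptions taken
# from my reading of the Flink 1.14 configuration reference.
def derive_jobmanager_memory(flink_size_mb,
                             metaspace_mb=256,       # jobmanager.memory.jvm-metaspace.size
                             off_heap_mb=128,        # jobmanager.memory.off-heap.size
                             overhead_fraction=0.1,  # jobmanager.memory.jvm-overhead.fraction
                             overhead_min_mb=192,    # jobmanager.memory.jvm-overhead.min
                             overhead_max_mb=1024):  # jobmanager.memory.jvm-overhead.max
    """Return heap, overhead, and total process size in MB."""
    # Total Flink Memory = JVM heap + off-heap.
    heap_mb = flink_size_mb - off_heap_mb
    # Overhead is a fraction of the total process size, clamped to [min, max].
    base_mb = flink_size_mb + metaspace_mb
    overhead_mb = base_mb * overhead_fraction / (1 - overhead_fraction)
    overhead_mb = min(max(overhead_mb, overhead_min_mb), overhead_max_mb)
    # Total Process Memory = Total Flink Memory + Metaspace + Overhead.
    process_mb = flink_size_mb + metaspace_mb + overhead_mb
    return {"heap_mb": heap_mb,
            "overhead_mb": overhead_mb,
            "process_mb": process_mb}


print(derive_jobmanager_memory(1024))
```

Under these assumptions, the total process size ends up noticeably larger than the configured Total Flink Memory, which matters when the process has to fit into a YARN container.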

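It is still a bit confusing, so it helps to read it alongside this diagram:

From the diagram, jobmanager.memory.flink.size spans the whole Total Flink Memory block (only jobmanager.memory.process.size is larger, because it also includes JVM Metaspace and JVM Overhead). Once it is set, Flink divides it up further on its own, so that is the parameter I set: jobmanager.memory.flink.size: 1024m. Then I started it again, and yet another error came up: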

------------------------------------------------------------
 The program finished with the following exception:

org.apache.flink.configuration.IllegalConfigurationException: TaskManager memory configuration failed: Either required fine-grained memory (taskmanager.memory.task.heap.size and taskmanager.memory.managed.size), or Total Flink Memory size (Key: 'taskmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'taskmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly.
    at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:163)
    at org.apache.flink.client.deployment.AbstractContainerizedClusterClientFactory.getClusterSpecification(AbstractContainerizedClusterClientFactory.java:49)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:602)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$4(FlinkYarnSessionCli.java:860)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:860)
Caused by: org.apache.flink.configuration.IllegalConfigurationException: Either required fine-grained memory (taskmanager.memory.task.heap.size and taskmanager.memory.managed.size), or Total Flink Memory size (Key: 'taskmanager.memory.flink.size' , default: null (fallback keys: [])), or Total Process Memory size (Key: 'taskmanager.memory.process.size' , default: null (fallback keys: [])) need to be configured explicitly.
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.failBecauseRequiredOptionsNotConfigured(ProcessMemoryUtils.java:129)
    at org.apache.flink.runtime.util.config.memory.ProcessMemoryUtils.memoryProcessSpecFromConfig(ProcessMemoryUtils.java:86)
    at org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils.processSpecFromConfig(TaskExecutorProcessUtils.java:160)
    ... 8 more

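Now it is the TaskManager's turn. Its error is analogous to the JobManager error above: at least one of its three memory options has to be set. So we set it the same way: taskmanager.memory.flink.size: 1024m.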

This time it finally started! Done!
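Both failures boil down to the same rule stated in the error messages: for each component, at least one of the listed memory options must be present. A small pre-flight check over the flink-conf.yaml content can catch that before starting the session. The sketch below is my own illustration: the option names come from the error messages above, while the parser is deliberately naive (flat "key: value" lines only) and the function names are hypothetical.

```python
# Sketch: check that flink-conf.yaml content sets at least one accepted
# memory option per component. Option names are taken from the error
# messages above; the parser only understands flat "key: value" lines.
REQUIRED = {
    "jobmanager": [
        ["jobmanager.memory.heap.size"],
        ["jobmanager.memory.flink.size"],
        ["jobmanager.memory.process.size"],
    ],
    "taskmanager": [
        # The fine-grained TaskManager alternative needs both keys together.
        ["taskmanager.memory.task.heap.size", "taskmanager.memory.managed.size"],
        ["taskmanager.memory.flink.size"],
        ["taskmanager.memory.process.size"],
    ],
}


def parse_flink_conf(text):
    """Parse flat 'key: value' lines, ignoring comments and blank lines."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def missing_memory_config(conf):
    """Return the components for which no accepted alternative is fully set."""
    return [component for component, alternatives in REQUIRED.items()
            if not any(all(key in conf for key in group)
                       for group in alternatives)]


sample = """
jobmanager.memory.flink.size: 1024m
taskmanager.memory.flink.size: 1024m
"""
print("components still missing memory settings:",
      missing_memory_config(parse_flink_conf(sample)))
```

An empty list means both components have at least one accepted option; it does not verify that the chosen sizes are large enough.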

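There are other fixes for this startup failure floating around online, but they either do not work or are rather cumbersome. Better to teach someone to fish than to hand them a fish: working through the official documentation is the right way to solve this kind of problem.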

