hadoop - Spark 1.3.0 on YARN: Application failed 2 times due to AM Container


When running the Spark 1.3.0 Pi example on YARN (Hadoop 2.6.0.2.2.0.0-2041) with the following script:

# Run on YARN cluster
export HADOOP_CONF_DIR=/etc/hadoop/conf
/var/home2/test/spark/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --executor-memory 3g \
  --num-executors 50 \
  /var/home2/test/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar \
  1000

it fails with an "Application failed 2 times due to AM Container" message (please see below). As far as I understand, all the information necessary to run a Spark application in YARN mode is provided in the launch script. What else should be configured to run on YARN? What am I missing? Are there other reasons why the YARN launch could fail?

[test@etl-hdp-mgmt pi]$ ./run-pi.sh
Spark assembly has been built with Hive, including Datanucleus jars on classpath
15/04/01 12:59:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/04/01 12:59:58 INFO client.RMProxy: Connecting to ResourceManager at etl-hdp-yarn.foo.bar.com/192.168.0.16:8050
15/04/01 12:59:58 INFO yarn.Client: Requesting a new application from cluster with 4 NodeManagers
15/04/01 12:59:58 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (4096 MB per container)
15/04/01 12:59:58 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/04/01 12:59:58 INFO yarn.Client: Setting up container launch context for our AM
15/04/01 12:59:58 INFO yarn.Client: Preparing resources for our AM container
15/04/01 12:59:59 WARN hdfs.BlockReaderLocal: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
15/04/01 12:59:59 INFO yarn.Client: Uploading resource file:/var/home2/test/spark-1.3.0-bin-hadoop2.4/lib/spark-assembly-1.3.0-hadoop2.4.0.jar -> hdfs://foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0010/spark-assembly-1.3.0-hadoop2.4.0.jar
15/04/01 13:00:01 INFO yarn.Client: Uploading resource file:/var/home2/test/spark/lib/spark-examples-1.3.0-hadoop2.4.0.jar -> hdfs://foo.bar.com:8020/user/test/.sparkStaging/application_1427875242006_0010/spark-examples-1.3.0-hadoop2.4.0.jar
15/04/01 13:00:02 INFO yarn.Client: Setting up the launch environment for our AM container
15/04/01 13:00:03 INFO spark.SecurityManager: Changing view acls to: test
15/04/01 13:00:03 INFO spark.SecurityManager: Changing modify acls to: test
15/04/01 13:00:03 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(test); users with modify permissions: Set(test)
15/04/01 13:00:03 INFO yarn.Client: Submitting application 10 to ResourceManager
15/04/01 13:00:03 INFO impl.YarnClientImpl: Submitted application application_1427875242006_0010
15/04/01 13:00:04 INFO yarn.Client: Application report for application_1427875242006_0010 (state: ACCEPTED)
15/04/01 13:00:04 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1427893202566
     final status: UNDEFINED
     tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0010/
     user: test
15/04/01 13:00:05 INFO yarn.Client: Application report for application_1427875242006_0010 (state: ACCEPTED)
15/04/01 13:00:06 INFO yarn.Client: Application report for application_1427875242006_0010 (state: ACCEPTED)
15/04/01 13:00:07 INFO yarn.Client: Application report for application_1427875242006_0010 (state: ACCEPTED)
15/04/01 13:00:08 INFO yarn.Client: Application report for application_1427875242006_0010 (state: ACCEPTED)
15/04/01 13:00:09 INFO yarn.Client: Application report for application_1427875242006_0010 (state: FAILED)
15/04/01 13:00:09 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1427875242006_0010 failed 2 times due to AM Container for appattempt_1427875242006_0010_000002 exited with exitCode: 1
For more detailed output, check application tracking page:http://etl-hdp-yarn.foo.bar.com:8088/proxy/application_1427875242006_0010/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_1427875242006_0010_02_000001
Exit code: 1
Exception message: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0010/container_1427875242006_0010_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
Stack trace: ExitCodeException exitCode=1: /mnt/hdfs01/hadoop/yarn/local/usercache/test/appcache/application_1427875242006_0010/container_1427875242006_0010_02_000001/launch_container.sh: line 27: $PWD:$PWD/__spark__.jar:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:538)
    at org.apache.hadoop.util.Shell.run(Shell.java:455)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1427893202566
     final status: FAILED
     tracking URL: http://etl-hdp-yarn.foo.bar.com:8088/cluster/app/application_1427875242006_0010
     user: test
Exception in thread "main" org.apache.spark.SparkException: Application finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:622)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:647)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:569)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:166)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I totally agree with @SeanOwen. Follow the Spark building documentation.

You need to compile Spark for YARN using the configuration that matches your Hadoop cluster (Hadoop version, Hive support, etc.).
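
For example, a build for this cluster could look like the sketch below, following the Spark 1.3 building docs (the hadoop-2.4 profile covers Hadoop 2.4.x and later; -Dhadoop.version=2.6.0 is an assumption, the plain Apache version closest to HDP's 2.6.0.2.2.0.0-2041):

# Run from the Spark 1.3.0 source root (needs Maven 3 and a JDK).
# -Pyarn enables YARN support; -Phive/-Phive-thriftserver add Hive support.
# hadoop.version 2.6.0 is an assumed stand-in for the cluster's HDP build.
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 \
    -Phive -Phive-thriftserver \
    -DskipTests clean package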

Then the problem won't persist!
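
Once rebuilt, the submit script itself stays the same; it just needs to point at the newly built artifacts instead of the hadoop2.4 ones (the jar name below is hypothetical, since the actual file name depends on the chosen build profiles):

export HADOOP_CONF_DIR=/etc/hadoop/conf
/var/home2/test/spark/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --executor-memory 3g \
  --num-executors 50 \
  /var/home2/test/spark/lib/spark-examples-1.3.0-hadoop2.6.0.jar \
  1000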

