This repository was archived by the owner on Nov 16, 2019. It is now read-only.

Exception in thread "main" org.apache.spark.SparkException: Application application_ finished with failed status #247  #254

Description

@libeiUCAS

I get the same problem as #247, and changed the source location in lenet_memory_train_test to the HDFS path, as @arundasan91 suggested.
However, I still hit the same problem.
////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////
17/05/03 18:50:15 INFO yarn.Client: Application report for application_1493801577689_0009 (state: RUNNING)
17/05/03 18:50:16 INFO yarn.Client: Application report for application_1493801577689_0009 (state: FINISHED)
17/05/03 18:50:16 INFO yarn.Client:
	 client token: N/A
	 diagnostics: N/A
	 ApplicationMaster host: 192.168.191.3
	 ApplicationMaster RPC port: 0
	 queue: default
	 start time: 1493808511908
	 final status: FAILED
	 tracking URL: http://sky:8088/proxy/application_1493801577689_0009/
	 user: hadoop
Exception in thread "main" org.apache.spark.SparkException: Application application_1493801577689_0009 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1029)
	at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1076)
	at org.apache.spark.deploy.yarn.Client.main(Client.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/05/03 18:50:16 INFO util.ShutdownHookManager: Shutdown hook called
17/05/03 18:50:16 INFO util.ShutdownHookManager: Deleting directory /home/hadoop/deep_learning/spark-1.6.0-bin-hadoop2.6/spark-2136e9ab-1b64-4d32-85d0-a6eb6fce0ea1
/////////////////////////////////////////////////////////////////////////////////////////////////////////
I have two machines: 192.168.191.2 is the master (32 GB RAM, 8 cores) and 192.168.191.3 is the slave (32 GB RAM, 8 cores).

As step 8 says, I set: export SPARK_WORKER_INSTANCES=2; export DEVICES=1
The error on the log page is: "Diagnostics: User class threw exception: java.lang.IllegalStateException: actual number of executors is not as expected"
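That "actual number of executors is not as expected" check fails when the number of executors YARN actually grants differs from what the job requested. A minimal sketch of a submission that keeps the executor count consistent with SPARK_WORKER_INSTANCES is below; the jar path, class name, and memory values are placeholders, not the exact command from this setup:

```shell
# Hypothetical spark-submit invocation (paths/values are illustrative).
# On YARN, the executor count comes from --num-executors, so tie it to
# the same variable used for SPARK_WORKER_INSTANCES to avoid a mismatch.
export SPARK_WORKER_INSTANCES=2
export DEVICES=1
spark-submit --master yarn --deploy-mode cluster \
    --num-executors "${SPARK_WORKER_INSTANCES}" \
    --executor-cores 1 \
    --executor-memory 8g \
    --class com.yahoo.ml.caffe.CaffeOnSpark \
    /path/to/caffe-grid-with-dependencies.jar \
    -train -conf lenet_memory_solver.prototxt
```

If YARN cannot actually schedule that many executors (for example, because a node lacks free memory), the count check can still fail even with matching settings, so it is worth confirming in the ResourceManager UI how many containers were really allocated.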

When I change the command to "export SPARK_WORKER_INSTANCES=2; export DEVICES=2",
the error on the log page is:
"Diagnostics: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 7, trc): ExecutorLostFailure (executor 4 exited caused by one of the running tasks) Reason: Container killed by YARN for exceeding memory limits. 46.3 GB of 4.2 GB virtual memory used. Consider boosting spark.yarn.executor.memoryOverhead.
Driver stacktrace:"
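The "Container killed by YARN for exceeding memory limits" message means the container's memory footprint exceeded what YARN allotted for it. Two common mitigations, sketched below with illustrative values only (the right numbers depend on the cluster), are raising the off-heap overhead that Spark requests per executor and, for the virtual-memory variant of this error specifically, relaxing or disabling YARN's vmem check:

```shell
# Hypothetical settings; tune the values for your cluster.
# Give each executor extra off-heap headroom beyond --executor-memory:
spark-submit --master yarn --deploy-mode cluster \
    --conf spark.yarn.executor.memoryOverhead=4096 \
    --executor-memory 8g \
    ...

# Alternatively, since the log complains about *virtual* memory,
# the vmem check can be disabled in yarn-site.xml on all nodes:
#   <property>
#     <name>yarn.nodemanager.vmem-check-enabled</name>
#     <value>false</value>
#   </property>
```

Disabling the vmem check is often suggested for JVM-heavy workloads because the JVM reserves large virtual address ranges it never touches; raising memoryOverhead is the safer first step when physical memory is the real constraint.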
