I followed the instructions: I downloaded the project, ran build/sbt assembly, and then executed python/run-tests.sh, but it gives me the following output:
List of assembly jars found, the last one will be used:
ls: /Users/lei.cui/Documents/Workspace/DeepLearninginApacheSpark/spark-deep-learning-master/python/../target/scala-2.12/spark-deep-learning-assembly*.jar: No such file or directory
============= Searching for tests in: /Users/lei.cui/Documents/Workspace/DeepLearninginApacheSpark/spark-deep-learning-master/python/tests =============
============= Running the tests in: /Users/lei.cui/Documents/Workspace/DeepLearninginApacheSpark/spark-deep-learning-master/python/tests/graph/test_builder.py =============
/usr/local/opt/python/bin/python2.7: No module named nose
Actually, the sbt build produces scala-2.11/spark-deep-learning-assembly*.jar rather than scala-2.12/spark-deep-learning-assembly*.jar. In addition, I installed python2 at /usr/local/bin/python2, so why does it report /usr/local/opt/python/bin/python2.7: No module named nose?
RayTsui commented on Oct 17, 2017
Actually, I am not sure how to use "sparkdl$ SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh". Can it be executed at the command line? It gives me "sparkdl$: command not found".
allwefantasy commented on Oct 18, 2017
sparkdl$ means that your current directory is the spark-deep-learning project; it is the shell prompt, not part of the command. SPARK_HOME is needed by pyspark, and SCALA_VERSION and SPARK_VERSION are used to locate the spark-deep-learning-assembly*.jar. ./python/run-tests.sh will set up the environment, find all the .py files in python/tests, and run them one by one.
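If you want to see which jar the script will pick up, a quick check from the project root (this reuses the glob from your error message, with scala-* so it matches whichever Scala directory you actually have):

ls target/scala-*/spark-deep-learning-assembly*.jar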
You should run

build/sbt assembly

first to make sure the assembly jar is ready, then run:

SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh

phi-dbq commented on Oct 18, 2017
@RayTsui thank you for reporting the issue!
@allwefantasy thank you for helping out!
In addition, we also have some scripts/sbt-plugins we use to facilitate the development process, which we put in #59.
You can try running
SPARK_HOME="path/to/your/spark/home/directory" ./bin/totgen.shwhich will generate pyspark (.py2.spark.shell,.py3.spark.shell) and spark-shell (.spark.shell) REPLs.RayTsui commentedon Oct 18, 2017
@allwefantasy Thanks a lot for your answer. As for the command "SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh", I have a few doubts:
1. Is the value of each variable fixed and common to all environments, or do I need to set the values based on my own environment? I installed Spark via "brew install apache-spark" instead of downloading the Spark distribution bundled with Hadoop (e.g., spark-2.1.1-bin-hadoop2.7). Should the Scala and Spark version numbers also match my environment?
2. Do I need to set the environment variables "SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1" in ~/.bash_profile, or do I run "SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh" directly at the prompt?
3. After a tentative attempt, I still came across the errors above.
If you have some suggestions, it will help me a lot.
RayTsui commented on Oct 18, 2017
@phi-dbq Thanks a lot for your response. I will try what you suggest and give feedback.
allwefantasy commented on Oct 19, 2017
2. Just keep

PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1

unchanged. As I have mentioned, these env variables are only used to locate the assembly jar. The only one you should set yourself is SPARK_HOME. I suggest that you do not configure them in .bashrc, which may have side effects on your other programs.

Or you can just run the following commands to finish this:

step 1:
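Build the assembly jar from the project root (the same command as in my earlier comment):

build/sbt assembly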
Then you should find spark-deep-learning-assembly-0.1.0-spark2.1.jar in your-project/target/scala-2.11.
step 2:
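Run the tests, with SPARK_HOME pointing at your own Spark installation; the other variables can stay exactly as shown:

SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 \
  PYSPARK_PYTHON=python2 \
  SCALA_VERSION=2.11.8 \
  SPARK_VERSION=2.1.1 \
  ./python/run-tests.sh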
Also, you can specify a single test file to run instead of all the files, which takes almost 30 minutes. Like this:
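(A sketch only; check python/run-tests.sh for the exact form, since passing the test file as a trailing argument is an assumption here.)

SPARK_HOME=/usr/local/lib/spark-2.1.1-bin-hadoop2.7 PYSPARK_PYTHON=python2 SCALA_VERSION=2.11.8 SPARK_VERSION=2.1.1 ./python/run-tests.sh python/tests/graph/test_builder.py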
RayTsui commented on Oct 20, 2017
@allwefantasy
Hi, I really appreciate your explanation. I understood it, tried again, and made a lot of progress; at least the unit tests now report coverage:
Name Stmts Miss Cover
sparkdl/graph/__init__.py 0 0 100%
sparkdl/graph/utils.py 81 64 21%
sparkdl/image/__init__.py 0 0 100%
sparkdl/image/imageIO.py 94 66 30%
sparkdl/transformers/__init__.py 0 0 100%
sparkdl/transformers/keras_utils.py 13 7 46%
sparkdl/transformers/param.py 46 26 43%
TOTAL 234 163 30%
But some errors still exist, as follows:
ModuleNotFoundError: No module named 'tensorframes'
I guess that tensorframes officially supports Linux 64-bit, but right now I am using macOS; is that the issue?
thunterdb commented on Oct 20, 2017
Hello @RayTsui, I have no problem using OSX for development purposes.
Can you run first:
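(something like a clean build:)

build/sbt clean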
followed by:
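(that is, the assembly build:)

build/sbt assembly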
You should see a line that writes:
[info] Including: tensorframes-0.2.9-s_2.11.jar

This indicates that tensorframes is properly included in the assembly jar, and that your problem is rather that the proper assembly jar cannot be found.
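If you want to inspect an assembly jar you have already built, a quick sketch (assuming the tensorframes classes end up under org/tensorframes inside the jar):

jar tf target/scala-2.11/spark-deep-learning-assembly-0.1.0-spark2.1.jar | grep -i tensorframes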
RayTsui commented on Oct 23, 2017
@thunterdb Thanks a lot for your suggestions. I ran the commands, and yes, I can see the line
[info] Including: tensorframes-0.2.8-s_2.11.jar.
And as you said, my issue is about
"List of assembly jars found, the last one will be used:
ls: $DIR/spark-deep-learning-master/python/../target/scala-2.11/spark-deep-learning-assembly*.jar: No such file or directory"
I suppose that all related jars are packaged in spark-deep-learning-assembly*.jar, but my spark-deep-learning-master-assembly-0.1.0-spark2.1.jar is generated at the path
"$DIR/spark-deep-learning-master/target/scala-2.11/spark-deep-learning-master-assembly-0.1.0-spark2.1.jar"
instead of
"$DIR/spark-deep-learning-master/python/../target/scala-2.11/spark-deep-learning-assembly*.jar". And I tried to modified the segment of the run-tests.sh file, but it does not work.
Do you know how to get the script to locate the spark-deep-learning-master-assembly-0.1.0-spark2.1.jar?