This repository was archived by the owner on Nov 16, 2019. It is now read-only.

Feature extraction mode running slow #293

@Marcteen

Hi there. I'm using CaffeOnSpark to extract deep features (dimension 4096) from pictures. The model I use is vgg_face, and the content of solver.prototxt is

net: "VGG_FACE_deploy.prototxt"
type: "Adam"
test_iter: 30
test_interval: 5000
base_lr: 0.000001
momentum: 0.9
momentum2: 0.999
lr_policy: "fixed"
gamma: 0.8
stepsize: 100000
display: 2500
max_iter: 1500000
snapshot: 5000
snapshot_prefix: "faceId-snap"
solver_mode: CPU

and the spark-submit command is

spark-submit --master yarn --deploy-mode cluster \
    --driver-memory 3g \
    --driver-cores 2 \
    --num-executors 100 \
    --executor-cores 1 \
    --executor-memory 2g \
    --files /.../adam_solver.prototxt,/.../VGG_FACE_deploy.prototxt \
    --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
    --class com.yahoo.ml.caffe.CaffeOnSpark \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
    -features fc7 \
    -clusterSize 100 \
    -label label \
    -conf adam_solver.prototxt \
    -connection ethernet \
    -model hdfs:///.../VGG_FACE.caffemodel \
    -output hdfs:///.../vggFaces

When I use a sequence file consisting of 3k images with 20 executors (clusterSize set to 20, of course), feature extraction finishes in 4 minutes. But when I process a sequence file with 450k images using 100 executors, it keeps running for more than 13 hours (I have no idea how long it would really take). Even though a deep conv net puts a heavy load on the CPU, I don't think this time cost is reasonable. Maybe I've made a mistake somewhere. Any help would be appreciated!
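As a rough sanity check on why 13 hours looks off, here is a back-of-the-envelope estimate. It assumes that per-executor throughput stays constant between the two runs and that startup overhead (model loading, scheduling) can be ignored. Both are assumptions, not measurements, and the function name is only illustrative.

```python
# Back-of-the-envelope estimate of the expected runtime for the large job.
# Assumption: each executor processes images at the same rate in both runs,
# and fixed startup cost is ignored.

def estimate_minutes(images, executors, images_per_executor_minute):
    """Expected wall-clock minutes if the work divides evenly across executors."""
    return images / executors / images_per_executor_minute

# Small run reported above: 3k images, 20 executors, 4 minutes.
small_images, small_executors, small_minutes = 3_000, 20, 4
rate = small_images / small_executors / small_minutes  # images per executor per minute

# Large run: 450k images, 100 executors.
large_minutes = estimate_minutes(450_000, 100, rate)

print(f"rate: {rate} images per executor-minute")
print(f"expected: {large_minutes} min (~{large_minutes / 60} h)")
```

Under these assumptions the large job should take on the order of 2 hours, so a run exceeding 13 hours is several times slower than linear scaling would predict.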
