arena submit horovodjob

Submit a Horovod job as a training job.

Synopsis

Submit a Horovod job as a training job.

arena submit horovodjob [flags]

Options

  -a, --annotation stringArray   the annotations
      --cpu string               the cpu resource to use for the training, like 1 for 1 core.
  -d, --data stringArray         specify the datasource to mount to the job, like <name_of_datasource>:<mount_point_on_job>
      --data-dir stringArray     the data dir. If you specify /data, it means mounting hostpath /data into container path /data
  -e, --env stringArray          the environment variables
      --gpus int                 the GPU count of each worker to run the training.
  -h, --help                     help for horovodjob
      --image string             the docker image name of training job
      --memory string            the memory resource to use for the training, like 1Gi.
      --name string              override name
      --rdma                     enable RDMA
      --retry int                retry times.
      --sshPort int              ssh port.
      --sync-image string        the docker image of syncImage
      --sync-mode string         the sync mode; supported values: rsync, hdfs, git
      --sync-source string       sync-source: for rsync, it's like 10.88.29.56::backup/data/logoRecoTrain.zip; for git, it's like https://github.com/kubeflow/tf-operator.git
      --workers int              the worker number to run the distributed training. (default 1)
      --working-dir string       working directory to extract the code into. When a sync mode is set, the code is placed in $workingDir/code (default "/root")
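
As a rough illustration of how the options above combine, the sketch below assembles a submission that requests two workers with one GPU each and syncs code from git. The job name and image are hypothetical placeholders, the git URL is the one quoted in the --sync-source help text, and the script only prints the command instead of running it.

```shell
# Dry-run sketch: build an `arena submit horovodjob` command from the options above.
# JOB_NAME and IMAGE are hypothetical placeholders, not values defined by arena.
JOB_NAME="horovod-demo"
IMAGE="registry.example.com/horovod:latest"

# Collect the arguments as positional parameters so quoting stays intact.
set -- arena submit horovodjob \
  --name "$JOB_NAME" \
  --image "$IMAGE" \
  --workers 2 \
  --gpus 1 \
  --cpu 4 \
  --memory 8Gi \
  --sync-mode git \
  --sync-source https://github.com/kubeflow/tf-operator.git \
  --working-dir /root

# Keep a printable copy and show it; replace the echo with "$@" to actually submit.
SUBMIT_CMD="$*"
echo "$SUBMIT_CMD"
```

With these values, the --working-dir description implies the synced code would end up under /root/code.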

Options inherited from parent commands

      --arena-namespace string   The namespace of arena system service, like tf-operator (default "arena-system")
      --config string            Path to a kube config. Only required if out-of-cluster
      --loglevel string          Set the logging level. One of: debug|info|warn|error (default "info")
  -n, --namespace string         the namespace of the job (default "default")
      --pprof                    enable CPU profiling
      --trace                    enable trace
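
The inherited options are given alongside the subcommand's own flags. A minimal sketch, assuming a hypothetical namespace named training and placeholder job name and image, that prints a submission targeting that namespace with debug logging:

```shell
# Dry-run sketch: add parent-command options to a horovodjob submission.
# NAMESPACE, the job name, and the image are hypothetical placeholders.
NAMESPACE="training"

set -- arena submit horovodjob \
  --namespace "$NAMESPACE" \
  --loglevel debug \
  --name horovod-demo \
  --image registry.example.com/horovod:latest

# Print the assembled command; replace the echo with "$@" to actually submit.
INHERITED_CMD="$*"
echo "$INHERITED_CMD"
```

When running outside the cluster, --config would additionally point at a kube config file, as noted in its description.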

SEE ALSO

Auto generated by spf13/cobra on 24-Apr-2019