machine-learning-service-aks-deploy-config.md

---
author: Blackmist
ms.service: machine-learning
ms.topic: include
ms.date: 03/16/2020
ms.author: larryfr
---

The entries in the `deploymentconfig.json` document map to the parameters for `AksWebservice.deploy_configuration`. The following table describes the mapping between the entities in the JSON document and the parameters for the method:

| JSON entity | Method parameter | Description |
| ----- | ----- | ----- |
| `computeType` | NA | The compute target. For AKS, the value must be `aks`. |
| `autoScaler` | NA | Contains configuration elements for autoscale. See the autoscaler table. |
| &emsp;&emsp;`autoscaleEnabled` | `autoscale_enabled` | Whether to enable autoscaling for the web service. If `numReplicas` = `0`, `True`; otherwise, `False`. |
| &emsp;&emsp;`minReplicas` | `autoscale_min_replicas` | The minimum number of containers to use when autoscaling this web service. Default, `1`. |
| &emsp;&emsp;`maxReplicas` | `autoscale_max_replicas` | The maximum number of containers to use when autoscaling this web service. Default, `10`. |
| &emsp;&emsp;`refreshPeriodInSeconds` | `autoscale_refresh_seconds` | How often the autoscaler attempts to scale this web service. Default, `1`. |
| &emsp;&emsp;`targetUtilization` | `autoscale_target_utilization` | The target utilization (in percent out of 100) that the autoscaler should attempt to maintain for this web service. Default, `70`. |
| `dataCollection` | NA | Contains configuration elements for data collection. |
| &emsp;&emsp;`storageEnabled` | `collect_model_data` | Whether to enable model data collection for the web service. Default, `False`. |
| `authEnabled` | `auth_enabled` | Whether to enable key authentication for the web service. `tokenAuthEnabled` and `authEnabled` cannot both be `True`. Default, `True`. |
| `tokenAuthEnabled` | `token_auth_enabled` | Whether to enable token authentication for the web service. `tokenAuthEnabled` and `authEnabled` cannot both be `True`. Default, `False`. |
| `containerResourceRequirements` | NA | Container for the CPU and memory entities. |
| &emsp;&emsp;`cpu` | `cpu_cores` | The number of CPU cores to allocate for this web service. Default, `0.1`. |
| &emsp;&emsp;`memoryInGB` | `memory_gb` | The amount of memory (in GB) to allocate for this web service. Default, `0.5`. |
| `appInsightsEnabled` | `enable_app_insights` | Whether to enable Application Insights logging for the web service. Default, `False`. |
| `scoringTimeoutMs` | `scoring_timeout_ms` | A timeout (in milliseconds) to enforce for scoring calls to the web service. Default, `60000`. |
| `maxConcurrentRequestsPerContainer` | `replica_max_concurrent_requests` | The maximum concurrent requests per node for this web service. Default, `1`. |
| `maxQueueWaitMs` | `max_request_wait_time` | The maximum time a request will stay in the queue (in milliseconds) before a 503 error is returned. Default, `500`. |
| `numReplicas` | `num_replicas` | The number of containers to allocate for this web service. No default value. If this parameter is not set, the autoscaler is enabled by default. |
| `keys` | NA | Contains configuration elements for keys. |
| &emsp;&emsp;`primaryKey` | `primary_key` | A primary auth key to use for this web service. |
| &emsp;&emsp;`secondaryKey` | `secondary_key` | A secondary auth key to use for this web service. |
| `gpuCores` | `gpu_cores` | The number of GPU cores (per-container replica) to allocate for this web service. Default, `1`. Only supports whole number values. |
| `livenessProbeRequirements` | NA | Contains configuration elements for liveness probe requirements. |
| &emsp;&emsp;`periodSeconds` | `period_seconds` | How often (in seconds) to perform the liveness probe. Defaults to 10 seconds. Minimum value is 1. |
| &emsp;&emsp;`initialDelaySeconds` | `initial_delay_seconds` | Number of seconds after the container has started before liveness probes are initiated. Defaults to 310. |
| &emsp;&emsp;`timeoutSeconds` | `timeout_seconds` | Number of seconds after which the liveness probe times out. Defaults to 2 seconds. Minimum value is 1. |
| &emsp;&emsp;`successThreshold` | `success_threshold` | Minimum consecutive successes for the liveness probe to be considered successful after having failed. Defaults to 1. Minimum value is 1. |
| &emsp;&emsp;`failureThreshold` | `failure_threshold` | When a Pod starts and the liveness probe fails, Kubernetes will try `failureThreshold` times before giving up. Defaults to 3. Minimum value is 1. |
| `namespace` | `namespace` | The Kubernetes namespace that the web service is deployed into. Up to 63 lowercase alphanumeric ('a'-'z', '0'-'9') and hyphen ('-') characters. The first and last characters can't be hyphens. |
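
The mapping in the table can also be applied in code. The following Python snippet is a minimal sketch, not part of the SDK: `PARAM_MAP` and `to_deploy_kwargs` are hypothetical names introduced here for illustration. It translates a parsed JSON document into keyword arguments named after the method parameters listed above. Entities marked NA have no matching parameter and are skipped.

```python
import json

# JSON entity path -> method parameter, copied from the table above.
# PARAM_MAP and to_deploy_kwargs are illustrative names, not SDK members.
PARAM_MAP = {
    ("autoScaler", "autoscaleEnabled"): "autoscale_enabled",
    ("autoScaler", "minReplicas"): "autoscale_min_replicas",
    ("autoScaler", "maxReplicas"): "autoscale_max_replicas",
    ("autoScaler", "refreshPeriodInSeconds"): "autoscale_refresh_seconds",
    ("autoScaler", "targetUtilization"): "autoscale_target_utilization",
    ("dataCollection", "storageEnabled"): "collect_model_data",
    ("authEnabled",): "auth_enabled",
    ("tokenAuthEnabled",): "token_auth_enabled",
    ("containerResourceRequirements", "cpu"): "cpu_cores",
    ("containerResourceRequirements", "memoryInGB"): "memory_gb",
    ("appInsightsEnabled",): "enable_app_insights",
    ("scoringTimeoutMs",): "scoring_timeout_ms",
    ("maxConcurrentRequestsPerContainer",): "replica_max_concurrent_requests",
    ("maxQueueWaitMs",): "max_request_wait_time",
    ("numReplicas",): "num_replicas",
    ("keys", "primaryKey"): "primary_key",
    ("keys", "secondaryKey"): "secondary_key",
    ("gpuCores",): "gpu_cores",
    ("livenessProbeRequirements", "periodSeconds"): "period_seconds",
    ("livenessProbeRequirements", "initialDelaySeconds"): "initial_delay_seconds",
    ("livenessProbeRequirements", "timeoutSeconds"): "timeout_seconds",
    ("livenessProbeRequirements", "successThreshold"): "success_threshold",
    ("livenessProbeRequirements", "failureThreshold"): "failure_threshold",
    ("namespace",): "namespace",
}


def to_deploy_kwargs(config: dict) -> dict:
    """Return only the parameters whose JSON entity is present in config."""
    kwargs = {}
    for path, param in PARAM_MAP.items():
        node = config
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break  # entity is absent, so leave the parameter unset
            node = node[key]
        else:
            kwargs[param] = node
    return kwargs


if __name__ == "__main__":
    sample = json.loads(
        '{"computeType": "aks", "authEnabled": true,'
        ' "containerResourceRequirements": {"cpu": 0.5, "memoryInGB": 1.0}}'
    )
    print(to_deploy_kwargs(sample))
```

Assuming the `azureml-core` package is installed, the resulting dictionary could be passed on as `AksWebservice.deploy_configuration(**kwargs)`. Parameters that are left out fall back to the defaults described in the table.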

The following JSON is an example deployment configuration for use with the CLI:

```json
{
    "computeType": "aks",
    "autoScaler":
    {
        "autoscaleEnabled": true,
        "minReplicas": 1,
        "maxReplicas": 3,
        "refreshPeriodInSeconds": 1,
        "targetUtilization": 70
    },
    "dataCollection":
    {
        "storageEnabled": true
    },
    "authEnabled": true,
    "containerResourceRequirements":
    {
        "cpu": 0.5,
        "memoryInGB": 1.0
    }
}
```
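
Before passing a configuration file to the CLI, it can help to check it against the constraints stated in the table. The following Python snippet is a rough sketch of such a check. The function name `check_deployment_config` is hypothetical, and the rules cover only what the table states: the `aks` compute type, the mutually exclusive authentication flags (checked only when both are set explicitly), whole-number `gpuCores`, the liveness probe minimums, and the namespace format. Comparing `computeType` case-insensitively is an assumption.

```python
import re

# Up to 63 lowercase alphanumeric or hyphen characters,
# with no hyphen in the first or last position.
NAMESPACE_RE = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def check_deployment_config(config: dict) -> list:
    """Return a list of problems; an empty list means no rule was violated."""
    problems = []

    # Assumption: the comparison is case-insensitive.
    if str(config.get("computeType", "")).lower() != "aks":
        problems.append("computeType must be 'aks' for AKS deployments")

    # Only explicitly set values are compared here.
    if config.get("authEnabled") is True and config.get("tokenAuthEnabled") is True:
        problems.append("authEnabled and tokenAuthEnabled cannot both be true")

    gpu = config.get("gpuCores")
    if gpu is not None:
        whole = isinstance(gpu, int) and not isinstance(gpu, bool)
        whole = whole or (isinstance(gpu, float) and gpu.is_integer())
        if not whole:
            problems.append("gpuCores only supports whole number values")

    probe = config.get("livenessProbeRequirements", {})
    for key in ("periodSeconds", "timeoutSeconds",
                "successThreshold", "failureThreshold"):
        if key in probe and probe[key] < 1:
            problems.append(f"{key} has a minimum value of 1")

    namespace = config.get("namespace")
    if namespace is not None and not NAMESPACE_RE.fullmatch(str(namespace)):
        problems.append("namespace does not match the allowed format")

    return problems
```

For the example configuration above, this check returns an empty list. A file can be loaded with `json.load` and passed to the function in the same way.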