Deploy with new scheduler (#729)
hunhoffe authored Sep 26, 2022
1 parent 4b3874f commit ba72e9a
Showing 19 changed files with 844 additions and 16 deletions.
1 change: 1 addition & 0 deletions .travis.yml
@@ -26,6 +26,7 @@ env:
- TRAVIS_KUBE_VERSION=v1.19 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes OW_LEAN_MODE=true
- TRAVIS_KUBE_VERSION=v1.20 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes
- TRAVIS_KUBE_VERSION=v1.21 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes
- TRAVIS_KUBE_VERSION=v1.21 OW_INCLUDE_SYSTEM_TESTS=false OW_CONTAINER_FACTORY=kubernetes OW_SCHEDULER_ENABLED=true

services:
- docker
12 changes: 12 additions & 0 deletions docs/configurationChoices.md
@@ -45,6 +45,16 @@ components is not currently supported:
better management of the database and decouples its lifecycle from that of the OpenWhisk deployment.
- The event providers: alarmprovider and kafkaprovider.
### OpenWhisk Scheduler
By default, the scheduler is disabled. To enable the scheduler, add the following
to your `mycluster.yaml`:

```yaml
scheduler:
enabled: true
```

### Using an external database

You may want to use an external CouchDB or Cloudant instance instead
@@ -180,6 +190,8 @@ k8s:
enabled: false
```

Currently, etcd persistence is not supported.

### Selectively Deploying Event Providers

The default settings of the Helm chart will deploy OpenWhisk's alarm
2 changes: 1 addition & 1 deletion docs/k8s-custom-build-cluster-scaleup.md
@@ -40,7 +40,7 @@ Modifying the above mentioned parameters, one can easily increase the concurrenc
In order to further increase the scale-up beyond `Small Scale`, one needs to modify the following additional configurations appropriately (on top of the above mentioned):
* `invoker:jvmHeapMB`: jvmHeap memory available to each invoker instance. May or may not require increase based on running functions. For more information check `troubleshooting` below.
* `invoker:containerFactory:_:replicaCount`: number of invoker instances that will be used to handle the incoming workload. By default, there is only one invoker instance which can become overwhelmed if workload goes beyond a certain threshold.
-* `controller:replicaCount`: number of controller instances that will be used to handle the incoming workload. Same as invoker instances.
+* `controller:replicaCount`: number of controller instances that will be used to handle the incoming workload. Same as invoker and scheduler instances.
* `invoker:options`: Log processing at the invoker can become a bottleneck for the KubernetesContainerFactory. One might try disabling invoker log processing by setting it to `-Dwhisk.spi.LogStoreProvider=org.apache.openwhisk.core.containerpool.logging.LogDriverLogStoreProvider`. In general, one needs to offload log processing from the invoker to a node-level log store provider if one is trying to push a large load through the system.
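
As a rough illustration of the scale-up settings listed above, a `mycluster.yaml` override might look like the following. This is a sketch, not a recommendation: the numeric values are placeholders, and it assumes the `_` in `invoker:containerFactory:_:replicaCount` stands for the `kubernetes` container factory.

```yaml
# Hypothetical scale-up overrides; tune every number to the actual workload.
controller:
  replicaCount: 2          # placeholder value
invoker:
  jvmHeapMB: "1024"        # placeholder value
  # Offloads log processing from the invoker, as described in the list above.
  options: "-Dwhisk.spi.LogStoreProvider=org.apache.openwhisk.core.containerpool.logging.LogDriverLogStoreProvider"
  containerFactory:
    impl: "kubernetes"     # assumed factory name for the `_` placeholder
    kubernetes:
      replicaCount: 2      # placeholder value
```

In this commit's `controller-pod.yaml`, a `controller.replicaCount` greater than 1 selects the cluster-bootstrap branch instead of the single seed node entry.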

## Troubleshooting
4 changes: 2 additions & 2 deletions docs/k8s-kind.md
@@ -94,8 +94,8 @@ OpenWhisk apihost property to be set to localhost:31001
## Hints and Tips

If you are working on the core OpenWhisk system and want
-to use a locally built controller or invoker image to test
-your changes, you need to push the image to the docker image
+to use a locally built controller, invoker, or scheduler image
+to test your changes, you need to push the image to the docker image
repository inside the `kind` cluster.

For example, suppose I had a local change to the controller
43 changes: 43 additions & 0 deletions helm/openwhisk/templates/_helpers.tpl
@@ -42,6 +42,11 @@ app: {{ template "openwhisk.fullname" . }}
{{ .Release.Name }}-controller.{{ .Release.Namespace }}.svc.{{ .Values.k8s.domain }}
{{- end -}}

{{/* hostname for scheduler */}}
{{- define "openwhisk.scheduler_host" -}}
{{ .Release.Name }}-scheduler.{{ .Release.Namespace }}.svc.{{ .Values.k8s.domain }}
{{- end -}}

{{/* hostname for database */}}
{{- define "openwhisk.db_host" -}}
{{- if .Values.db.external -}}
@@ -68,6 +73,15 @@ app: {{ template "openwhisk.fullname" . }}
{{- end -}}
{{- end -}}

{{/* hostname for etcd */}}
{{- define "openwhisk.etcd_host" -}}
{{- if .Values.etcd.external -}}
{{ .Values.etcd.host }}
{{- else -}}
{{ .Release.Name }}-etcd.{{ .Release.Namespace }}.svc.{{ .Values.k8s.domain }}
{{- end -}}
{{- end -}}
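
The `openwhisk.etcd_host` helper above returns `.Values.etcd.host` when `etcd.external` is true and otherwise builds the in-cluster service name. A minimal sketch of values that would take the external branch, assuming an etcd endpoint already exists outside the chart; the hostname is hypothetical, and 2379 is etcd's conventional client port rather than a value taken from this commit:

```yaml
etcd:
  external: true
  host: "etcd.example.internal"   # hypothetical hostname, not from the chart
  port: 2379                      # conventional etcd client port (assumed here)
```

With `etcd.external` set this way, the `etcd-pod.yaml` and `etcd-pvc.yaml` templates in this commit render nothing, since both are guarded by `not .Values.etcd.external`.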

{{/* client connection string for zookeeper cluster (server1:port,server2:port, ... serverN:port)*/}}
{{- define "openwhisk.zookeeper_connect" -}}
{{- if .Values.zookeeper.external -}}
@@ -196,10 +210,24 @@ app: {{ template "openwhisk.fullname" . }}
value: {{ .Values.whisk.limits.activation.payload.max | quote }}
{{- end -}}

{{/* Environment variables for configuring etcd */}}
{{- define "openwhisk.etcdConfigEnvVars" -}}
- name: "CONFIG_whisk_cluster_name"
value: {{ .Values.etcd.clusterName | quote }}
- name: "CONFIG_whisk_etcd_hosts"
value: {{ include "openwhisk.etcd_host" . }}:{{ .Values.etcd.port }}
- name: "CONFIG_whisk_etcd_lease_timeout"
value: {{ .Values.etcd.leaseTimeout | quote }}
- name: "CONFIG_whisk_etcd_pool_threads"
value: {{ .Values.etcd.poolThreads | quote }}
{{- end -}}
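
To make the helper above concrete, this is roughly what `openwhisk.etcdConfigEnvVars` might render to. Every concrete value is an assumption for illustration only: a release named `owdev` in namespace `openwhisk`, `k8s.domain` set to `cluster.local`, the in-cluster etcd on port 2379, and placeholder settings for `clusterName`, `leaseTimeout`, and `poolThreads`.

```yaml
# Illustrative rendered output; all values below are assumed, not chart defaults.
- name: "CONFIG_whisk_cluster_name"
  value: "example-cluster"
- name: "CONFIG_whisk_etcd_hosts"
  value: owdev-etcd.openwhisk.svc.cluster.local:2379   # rendered without quotes
- name: "CONFIG_whisk_etcd_lease_timeout"
  value: "1"
- name: "CONFIG_whisk_etcd_pool_threads"
  value: "10"
```

The hosts entry is the only one not passed through `quote`, so it renders as a plain YAML scalar.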

{{/* Environment variables for configuring kafka topics */}}
{{- define "openwhisk.kafkaConfigEnvVars" -}}
- name: "CONFIG_whisk_kafka_replicationFactor"
value: {{ .Values.whisk.kafka.replicationFactor | quote }}
- name: "CONFIG_whisk_kafka_topics_prefix"
value: {{ .Values.whisk.kafka.topics.prefix | quote }}
- name: "CONFIG_whisk_kafka_topics_cacheInvalidation_retentionBytes"
value: {{ .Values.whisk.kafka.topics.cacheInvalidation.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_cacheInvalidation_retentionMs"
@@ -224,12 +252,27 @@ app: {{ template "openwhisk.fullname" . }}
value: {{ .Values.whisk.kafka.topics.health.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_health_segmentBytes"
value: {{ .Values.whisk.kafka.topics.health.segmentBytes | quote }}

- name: "CONFIG_whisk_kafka_topics_invoker_retentionBytes"
value: {{ .Values.whisk.kafka.topics.invoker.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_invoker_retentionMs"
value: {{ .Values.whisk.kafka.topics.invoker.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_invoker_segmentBytes"
value: {{ .Values.whisk.kafka.topics.invoker.segmentBytes | quote }}

- name: "CONFIG_whisk_kafka_topics_scheduler_retentionBytes"
value: {{ .Values.whisk.kafka.topics.scheduler.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_scheduler_retentionMs"
value: {{ .Values.whisk.kafka.topics.scheduler.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_scheduler_segmentBytes"
value: {{ .Values.whisk.kafka.topics.scheduler.segmentBytes | quote }}

- name: "CONFIG_whisk_kafka_topics_creationAck_retentionBytes"
value: {{ .Values.whisk.kafka.topics.creationAck.retentionBytes | quote }}
- name: "CONFIG_whisk_kafka_topics_creationAck_retentionMs"
value: {{ .Values.whisk.kafka.topics.creationAck.retentionMs | quote }}
- name: "CONFIG_whisk_kafka_topics_creationAck_segmentBytes"
value: {{ .Values.whisk.kafka.topics.creationAck.segmentBytes | quote }}
{{- end -}}

{{/* tlssecretname for ingress */}}
22 changes: 22 additions & 0 deletions helm/openwhisk/templates/_readiness.tpl
@@ -38,6 +38,17 @@
command: ["sh", "-c", 'cacert="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"; token="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"; while true; do rc=$(curl -sS --cacert $cacert --header "Authorization: Bearer $token" https://kubernetes.default.svc/api/v1/namespaces/{{ .Release.Namespace }}/endpoints/{{ .Release.Name }}-kafka | jq -r ".subsets[].addresses | length"); echo "num ready kafka endpoints is $rc"; if [ $rc -gt 0 ]; then echo "Success: ready kafka endpoint!"; break; fi; echo "kafka not ready yet; sleeping for 3 seconds"; sleep 3; done;']
{{- end -}}

{{/* Init container that waits for etcd to be ready */}}
{{- define "openwhisk.readiness.waitForEtcd" -}}
- name: "wait-for-etcd"
image: "{{- .Values.docker.registry.name -}}{{- .Values.utility.imageName -}}:{{- .Values.utility.imageTag -}}"
imagePullPolicy: "IfNotPresent"
env:
- name: "READINESS_URL"
value: http://{{ include "openwhisk.etcd_host" . }}:{{ .Values.etcd.port }}/health
command: ["sh", "-c", "while true; do echo 'checking etcd readiness'; health_result=$(curl -m 5 $READINESS_URL) && echo $health_result | jq -e '. | select(.health==\"true\")'; result=$?; if [ $result -eq 0 ]; then echo 'Success: etcd is ready!'; break; fi; echo '...not ready yet; sleeping 3 seconds before retry'; sleep 3; done;"]
{{- end -}}
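
The one-line `command` above is dense, so here is a sketch of the same readiness loop written as a YAML block scalar. It is an alternative layout, not what the commit ships. It keeps the commit's image, URL, timeout, and retry interval, but pipes `curl` directly into `jq` and adds `-s` to silence progress output, both of which are changes made for this illustration.

```yaml
{{/* Sketch only: a more readable variant of the etcd readiness init container */}}
{{- define "openwhisk.readiness.waitForEtcd" -}}
- name: "wait-for-etcd"
  image: "{{- .Values.docker.registry.name -}}{{- .Values.utility.imageName -}}:{{- .Values.utility.imageTag -}}"
  imagePullPolicy: "IfNotPresent"
  env:
  - name: "READINESS_URL"
    value: http://{{ include "openwhisk.etcd_host" . }}:{{ .Values.etcd.port }}/health
  command:
  - sh
  - -c
  - |
    while true; do
      echo 'checking etcd readiness'
      # -m 5 caps each request at five seconds; jq -e exits non-zero
      # when the filter selects nothing (including on empty input).
      if curl -s -m 5 "$READINESS_URL" | jq -e 'select(.health == "true")' > /dev/null; then
        echo 'Success: etcd is ready!'
        break
      fi
      echo '...not ready yet; sleeping 3 seconds before retry'
      sleep 3
    done
{{- end -}}
```

Because `indent` prefixes every line of the included text, the block scalar keeps its relative indentation when the helper is included with `| indent 6`.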

{{/* Init container that waits for zookeeper to be ready */}}
{{- define "openwhisk.readiness.waitForZookeeper" -}}
- name: "wait-for-zookeeper"
@@ -57,6 +68,17 @@
command: ["sh", "-c", "result=1; until [ $result -eq 0 ]; do echo 'Checking controller readiness'; wget -T 5 --spider $READINESS_URL; result=$?; sleep 1; done; echo 'Success: controller is ready'"]
{{- end -}}

{{/* Init container that waits for scheduler to be ready */}}
{{- define "openwhisk.readiness.waitForScheduler" -}}
- name: "wait-for-scheduler"
image: "{{- .Values.docker.registry.name -}}{{- .Values.busybox.imageName -}}:{{- .Values.busybox.imageTag -}}"
imagePullPolicy: "IfNotPresent"
env:
- name: "READINESS_URL"
value: http://{{ include "openwhisk.scheduler_host" . }}:{{ .Values.scheduler.endpoints.port }}/ping
command: ["sh", "-c", "result=1; until [ $result -eq 0 ]; do echo 'Checking scheduler readiness'; wget -T 5 --spider $READINESS_URL; result=$?; sleep 1; done; echo 'Success: scheduler is ready'"]
{{- end -}}

{{/* Init container that waits for at least 1 healthy invoker */}}
{{- define "openwhisk.readiness.waitForHealthyInvoker" -}}
- name: "wait-for-healthy-invoker"
29 changes: 28 additions & 1 deletion helm/openwhisk/templates/controller-pod.yaml
@@ -60,6 +60,9 @@ spec:
{{- if not .Values.controller.lean }}
# The controller must wait for kafka and/or couchdb to be ready before it starts
{{ include "openwhisk.readiness.waitForKafka" . | indent 6 }}
{{- if .Values.scheduler.enabled }}
{{ include "openwhisk.readiness.waitForEtcd" . | indent 6 }}
{{- end }}
{{- end }}
{{ include "openwhisk.readiness.waitForCouchDB" . | indent 6 }}
{{- if eq .Values.activationStoreBackend "ElasticSearch" }}
@@ -85,7 +88,7 @@ spec:
- name: controller
containerPort: {{ .Values.controller.port }}
- name: akka-remoting
-containerPort: 2552
+containerPort: 25520
- name: akka-mgmt-http
containerPort: 19999
{{- if .Values.controller.lean }}
@@ -114,6 +117,11 @@ spec:
- name: "TZ"
value: {{ .Values.docker.timezone | quote }}

- name: "POD_IP"
valueFrom:
fieldRef:
fieldPath: status.podIP

- name: "CONFIG_whisk_info_date"
valueFrom:
configMapKeyRef:
@@ -137,6 +145,15 @@ spec:
- name: "RUNTIMES_MANIFEST"
value: {{ template "openwhisk.runtimes_manifest" . }}

# scheduler settings
{{ if .Values.scheduler.enabled }}
- name: "CONFIG_whisk_spi_LoadBalancerProvider"
value: "org.apache.openwhisk.core.loadBalancer.FPCPoolBalancer"

- name: "CONFIG_whisk_spi_EntitlementSpiProvider"
value: "org.apache.openwhisk.core.entitlement.FPCEntitlementProvider"
{{ end }}

# Action limits
{{ include "openwhisk.limitsEnvVars" . | indent 8 }}

@@ -151,11 +168,17 @@ spec:
value: "{{ include "openwhisk.kafka_connect" . }}"
{{ include "openwhisk.kafkaConfigEnvVars" . | indent 8 }}

# etcd properties
{{- if .Values.scheduler.enabled }}
{{ include "openwhisk.etcdConfigEnvVars" . | indent 8 }}
{{- end }}

# properties for DB connection
{{ include "openwhisk.dbEnvVars" . | indent 8 }}

- name: "CONTROLLER_INSTANCES"
value: {{ .Values.controller.replicaCount | quote }}

{{- if gt (int .Values.controller.replicaCount) 1 }}
- name: "CONFIG_whisk_cluster_useClusterBootstrap"
value: "true"
@@ -169,7 +192,11 @@ spec:
value: "name={{ .Release.Name }}-controller"
- name: "CONFIG_akka_discovery_kubernetesApi_podPortName"
value: "akka-mgmt-http"
{{- else }}
- name: "CONFIG_akka_cluster_seedNodes_0"
value: "akka://controller-actor-system@$(POD_IP):25520"
{{- end }}

{{- if .Values.metrics.prometheusEnabled }}
- name: "OPENWHISK_ENCODED_CONFIG"
value: {{ template "openwhisk.whiskconfig" . }}
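
The single-replica seed node entry in the hunks above depends on Kubernetes expanding `$(POD_IP)` inside another variable's `value`. That expansion only sees variables defined earlier in the same `env` list, which is why the `POD_IP` entry is added before `CONFIG_akka_cluster_seedNodes_0`. A small standalone sketch of the mechanism; the pod name, container name, image, and `SEED_NODE` variable are all hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: env-expansion-demo        # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: demo                  # hypothetical container
      image: busybox
      # Prints the seed node string with the pod's IP substituted in.
      command: ["sh", "-c", "echo $SEED_NODE"]
      env:
        # Must come first: later entries can only reference earlier ones.
        - name: POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: SEED_NODE
          value: "akka://controller-actor-system@$(POD_IP):25520"
```

If the two entries were swapped, the reference would not resolve and the literal text `$(POD_IP)` would remain in the value.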
114 changes: 114 additions & 0 deletions helm/openwhisk/templates/etcd-pod.yaml
@@ -0,0 +1,114 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

{{ if not .Values.etcd.external }}
apiVersion: apps/v1
kind: Deployment
metadata:
name: {{ .Release.Name }}-etcd
labels:
name: {{ .Release.Name }}-etcd
{{ include "openwhisk.label_boilerplate" . | indent 4 }}
spec:
replicas: {{ .Values.etcd.replicaCount }}
selector:
matchLabels:
name: {{ .Release.Name }}-etcd
{{- if .Values.k8s.persistence.enabled }}
strategy:
type: "Recreate"
{{- end }}
template:
metadata:
labels:
name: {{ .Release.Name }}-etcd
{{ include "openwhisk.label_boilerplate" . | indent 8 }}
spec:
restartPolicy: {{ .Values.etcd.restartPolicy }}

{{- if .Values.affinity.enabled }}
affinity:
{{ include "openwhisk.affinity.core" . | indent 8 }}
{{ include "openwhisk.affinity.selfAntiAffinity" ( printf "%s-etcd" .Release.Name | quote ) | indent 8 }}
{{- end }}

{{- if .Values.toleration.enabled }}
tolerations:
{{ include "openwhisk.toleration.core" . | indent 8 }}
{{- end }}

{{- if .Values.k8s.persistence.enabled }}
volumes:
- name: etcd-data
persistentVolumeClaim:
claimName: {{ .Release.Name }}-etcd-pvc
{{- end }}

{{- if .Values.k8s.persistence.enabled }}
initContainers:
- name: etcd-init
image: "{{- .Values.docker.registry.name -}}{{- .Values.busybox.imageName -}}:{{- .Values.busybox.imageTag -}}"
command:
- chown
- -v
- -R
- 999:999
- /data
volumeMounts:
- mountPath: /data
name: etcd-data
readOnly: false
{{- end }}
{{ include "openwhisk.docker.imagePullSecrets" . | indent 6 }}
# current command will always restart from scratch (no persistence)
containers:
- name: etcd
image: "{{- .Values.docker.registry.name -}}{{- .Values.etcd.imageName -}}:{{- .Values.etcd.imageTag -}}"
command:
- /usr/local/bin/etcd
- --data-dir=/data
- --name
- etcd0
- --initial-advertise-peer-urls
- http://127.0.0.1:2480
- --advertise-client-urls
- http://0.0.0.0:{{ .Values.etcd.port }}
- --listen-peer-urls
- http://127.0.0.1:2480
- --listen-client-urls
- http://0.0.0.0:{{ .Values.etcd.port }}
- --initial-cluster
- etcd0=http://127.0.0.1:2480
- --initial-cluster-state
- new
- --initial-cluster-token
- openwhisk-etcd-token
- --quota-backend-bytes
- "0"
- --snapshot-count
- "100000"
- --auto-compaction-retention
- "1"
- --auto-compaction-mode
- periodic
- --log-level
- info
imagePullPolicy: {{ .Values.etcd.imagePullPolicy | quote }}
ports:
- name: etcd
containerPort: {{ .Values.etcd.port }}
{{ end }}
34 changes: 34 additions & 0 deletions helm/openwhisk/templates/etcd-pvc.yaml
@@ -0,0 +1,34 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

{{- if and (not .Values.etcd.external) .Values.k8s.persistence.enabled }}
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: {{ .Release.Name }}-etcd-pvc
labels:
{{ include "openwhisk.label_boilerplate" . | indent 4 }}
spec:
{{- if not .Values.k8s.persistence.hasDefaultStorageClass }}
storageClassName: {{ .Values.k8s.persistence.explicitStorageClass }}
{{- end }}
accessModes:
- ReadWriteOnce
resources:
requests:
storage: {{ .Values.etcd.persistence.size }}
{{- end }}