Load balancing the adapter in Kubernetes

Scott Ganyo edited this page Feb 9, 2021 · 3 revisions

In the sample configuration, a standard Kubernetes Service is defined to expose the gRPC service from the Apigee Adapter. However, it is important to be aware of how this affects load balancing.

A Kubernetes Service generally uses iptables with a pod selection rule to make it behave as a load balancer. This is not useful for services such as gRPC that use long-lived TCP/TLS connections, however, because the iptables rules are consulted only once, when the connection is established. All requests sent over that connection therefore go to the same workload.

This is generally not a problem. The number of Envoy proxies will usually be larger than the number of adapter instances, so each adapter instance receives a roughly proportional share of requests. Furthermore, the adapter is configured by default to terminate TCP/TLS connections after 10 minutes, so connections are redistributed every 10 minutes regardless.

However, when more dynamic load balancing is needed, it can be achieved by configuring client-side load balancing in Envoy. Add the field clusterIP: None to the spec of the adapter's Kubernetes Service definition. (Note that the clusterIP field is immutable, so you may need to delete any existing Service first.) Once this is done, the Kubernetes Service becomes headless: a DNS lookup of the Service name returns the list of pod IPs, and the Envoy client load balances across them.
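
As a minimal sketch of the change described above, a headless Service for the adapter might look like the following. The Service name, namespace, label selector, and port are assumptions chosen for illustration; use the values from your own sample configuration. The only line that matters here is clusterIP: None.

```yaml
# Sketch of a headless Service for the adapter.
# ASSUMPTION: the name, namespace, selector label, and port below are
# illustrative placeholders; copy the real values from your existing Service.
apiVersion: v1
kind: Service
metadata:
  name: apigee-remote-service-envoy
  namespace: apigee
spec:
  # Makes the Service headless: no virtual IP is allocated, and DNS
  # returns the IPs of the individual pods matched by the selector.
  clusterIP: None
  selector:
    app: apigee-remote-service-envoy
  ports:
    - name: grpc
      port: 5000
      targetPort: 5000
```

Because clusterIP cannot be changed on an existing Service, applying this over a Service that already has a cluster IP is expected to be rejected. In that case, delete the existing Service and then create it again from the updated manifest.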

More reading:

  1. Load balancing and scaling long-lived connections in Kubernetes
  2. Headless Kubernetes services