operator: API(v1beta1) changes to support per-container resources
Move the deployment API to `v1beta1` and add support for per-container
resources.

In Kubernetes, resource requirements are per-container. The existing
`ControllerResources` and `NodeResources` API fields do not suffice to
configure resources for sidecar containers. Hence this change adds new
fields to the deployment specification for per-container resources.

Also chose the default CPU and memory resource requests/limits for the
containers as suggested by the VPA (Vertical Pod Autoscaler).

TODOS:
- Supporting the deprecated `v1alpha1` API needs a conversion webhook in
the cluster to handle conversions between the different API versions.

FIXES: intel#616
avalluri committed Dec 2, 2020
1 parent 3106c9d commit edcd09f
Showing 26 changed files with 1,060 additions and 61 deletions.
12 changes: 12 additions & 0 deletions deploy/common/pmem-csi.intel.com_v1beta1_deployment_cr.yaml
@@ -0,0 +1,12 @@
apiVersion: pmem-csi.intel.com/v1beta1
kind: Deployment
metadata:
  name: pmem-csi.intel.com
spec:
  deviceMode: "lvm"
  nodeSelector:
    # When using Node Feature Discovery (NFD):
    feature.node.kubernetes.io/memory-nv.dax: "true"
    # When using manual node labeling with that label:
    # storage: pmem
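
The sample above leaves all container resources at the operator's defaults. As a rough sketch, the per-container fields introduced by the `v1beta1` schema in this commit could be set as shown below. The field names are taken from the CRD in this commit; the CPU and memory values are placeholders chosen for illustration and are not the operator's actual defaults.

```yaml
apiVersion: pmem-csi.intel.com/v1beta1
kind: Deployment
metadata:
  name: pmem-csi.intel.com
spec:
  deviceMode: "lvm"
  # Driver container running on the master node.
  controllerDriverResources:
    requests:
      cpu: 50m        # placeholder, not the operator default
      memory: 64Mi    # placeholder
    limits:
      cpu: 100m       # placeholder
      memory: 128Mi   # placeholder
  # CSI provisioner sidecar container.
  provisionerResources:
    requests:
      cpu: 20m        # placeholder
      memory: 32Mi    # placeholder
  # Driver container running on worker nodes.
  nodeDriverResources:
    requests:
      cpu: 50m        # placeholder
      memory: 64Mi    # placeholder
  # CSI node driver registrar sidecar container.
  nodeRegistrarResources:
    requests:
      cpu: 10m        # placeholder
      memory: 16Mi    # placeholder
```

Each block follows the usual Kubernetes `requests`/`limits` layout, so any of the four fields can be omitted to keep the default for that container.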

287 changes: 287 additions & 0 deletions deploy/crd/pmem-csi.intel.com_deployments.yaml
@@ -246,6 +246,293 @@ spec:
            type: object
        type: object
    served: true
    storage: false
    subresources:
      status: {}
  - additionalPrinterColumns:
    - jsonPath: .spec.deviceMode
      name: DeviceMode
      type: string
    - jsonPath: .spec.nodeSelector
      name: NodeSelector
      type: string
    - jsonPath: .spec.image
      name: Image
      type: string
    - jsonPath: .status.phase
      name: Status
      type: string
    - jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    name: v1beta1
    schema:
      openAPIV3Schema:
        description: Deployment is the Schema for the deployments API
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            description: DeploymentSpec defines the desired state of Deployment
            properties:
              caCert:
                description: CACert encoded root certificate of the CA by which the
                  registry and node controller certificates are signed If not provided
                  operator uses a self-signed CA certificate
                format: byte
                type: string
              controllerDriverResources:
                description: ControllerDriverResources Compute resources required
                  by driver container running on master node
                properties:
                  limits:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Limits describes the maximum amount of compute resources
                      allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                  requests:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Requests describes the minimum amount of compute
                      resources required. If Requests is omitted for a container,
                      it defaults to Limits if that is explicitly specified, otherwise
                      to an implementation-defined value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                type: object
              deviceMode:
                description: DeviceMode to use to manage PMEM devices. One of lvm,
                  direct
                type: string
              image:
                description: PMEM-CSI driver container image
                type: string
              imagePullPolicy:
                description: PullPolicy image pull policy one of Always, Never, IfNotPresent
                type: string
              kubeletDir:
                description: KubeletDir kubelet's root directory path
                type: string
              labels:
                additionalProperties:
                  type: string
                description: Labels contains additional labels for all objects created
                  by the operator.
                type: object
              logLevel:
                description: LogLevel number for the log verbosity kubebuilder:default=3
                type: integer
              nodeControllerCert:
                description: NodeControllerCert encoded certificate signed by a CA
                  for node controller server authentication If not provided, provisioned
                  one by the operator using self-signed CA
                format: byte
                type: string
              nodeControllerKey:
                description: NodeControllerPrivateKey encoded private key used for
                  node controller server certificate If not provided, provisioned
                  one by the operator
                format: byte
                type: string
              nodeDriverResources:
                description: NodeDriverResources Compute resources required by driver
                  container running on worker nodes
                properties:
                  limits:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Limits describes the maximum amount of compute resources
                      allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                  requests:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Requests describes the minimum amount of compute
                      resources required. If Requests is omitted for a container,
                      it defaults to Limits if that is explicitly specified, otherwise
                      to an implementation-defined value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                type: object
              nodeRegistrarImage:
                description: NodeRegistrarImage CSI node driver registrar sidecar
                  image
                type: string
              nodeRegistrarResources:
                description: NodeRegistrarResources Compute resources required by
                  node registrar sidecar container
                properties:
                  limits:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Limits describes the maximum amount of compute resources
                      allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                  requests:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Requests describes the minimum amount of compute
                      resources required. If Requests is omitted for a container,
                      it defaults to Limits if that is explicitly specified, otherwise
                      to an implementation-defined value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                type: object
              nodeSelector:
                additionalProperties:
                  type: string
                description: NodeSelector node labels to use for selection of driver
                  node
                type: object
              pmemPercentage:
                description: PMEMPercentage represents the percentage of space to
                  be used by the driver in each PMEM region on every node. This is
                  only valid for driver in LVM mode. -kubebuilder:validation:Minimum=1
                  -kubebuilder:validation:Maximum=100 -kubebuilder:default=100
                type: integer
              provisionerImage:
                description: ProvisionerImage CSI provisioner sidecar image
                type: string
              provisionerResources:
                description: ProvisionerResources Compute resources required by provisioner
                  sidecar container
                properties:
                  limits:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Limits describes the maximum amount of compute resources
                      allowed. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                  requests:
                    additionalProperties:
                      anyOf:
                      - type: integer
                      - type: string
                      pattern: ^(\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))(([KMGTPE]i)|[numkMGTPE]|([eE](\+|-)?(([0-9]+(\.[0-9]*)?)|(\.[0-9]+))))?$
                      x-kubernetes-int-or-string: true
                    description: 'Requests describes the minimum amount of compute
                      resources required. If Requests is omitted for a container,
                      it defaults to Limits if that is explicitly specified, otherwise
                      to an implementation-defined value. More info: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/'
                    type: object
                type: object
              registryCert:
                description: RegistryCert encoded certificate signed by a CA for registry
                  server authentication If not provided, provisioned one by the operator
                  using self-signed CA
                format: byte
                type: string
              registryKey:
                description: RegistryPrivateKey encoded private key used for registry
                  server certificate If not provided, provisioned one by the operator
                format: byte
                type: string
            type: object
          status:
            description: DeploymentStatus defines the observed state of Deployment
            properties:
              conditions:
                description: Conditions
                items:
                  description: DeploymentCondition type definition for driver deployment
                    status conditions
                  properties:
                    lastUpdateTime:
                      description: Last time the condition was probed.
                      format: date-time
                      type: string
                    reason:
                      description: Message human readable text that explain why this
                        condition is in this state
                      type: string
                    status:
                      description: Status of the condition, one of True, False, Unknown.
                      type: string
                    type:
                      description: Type of condition.
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              driverComponents:
                items:
                  description: DriverStatus type definition for representing deployed
                    driver status
                  properties:
                    component:
                      description: 'DriverComponent represents type of the driver:
                        controller or node'
                      type: string
                    lastUpdated:
                      description: LastUpdated time of the driver status
                      format: date-time
                      type: string
                    reason:
                      description: Reason represents the human readable text that
                        explains why the driver is in this state.
                      type: string
                    status:
                      description: Status represents the state of the component; one
                        of `Ready` or `NotReady`. Component becomes `Ready` if all
                        the instances(Pods) of the driver component are in running
                        state. Otherwise, `NotReady`.
                      type: string
                  required:
                  - component
                  - reason
                  - status
                  type: object
                type: array
              lastUpdated:
                description: LastUpdated time of the deployment status
                format: date-time
                type: string
              phase:
                description: Phase indicates the state of the deployment
                type: string
              reason:
                type: string
            type: object
        type: object
    served: true
    storage: true
    subresources:
      status: {}
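
The CRD above serves both versions and makes `v1beta1` the storage version. The commit message lists a conversion webhook as a TODO, so nothing in this commit configures one. A minimal sketch of what such a stanza in the CRD could look like follows, using the standard `apiextensions.k8s.io/v1` conversion fields; the service namespace, name, path and port are hypothetical and would have to match whatever webhook server is eventually deployed.

```yaml
# Fragment of a CustomResourceDefinition (apiextensions.k8s.io/v1).
# Illustrative only: this commit does not add a conversion webhook.
spec:
  conversion:
    strategy: Webhook
    webhook:
      # ConversionReview API versions the webhook is expected to understand.
      conversionReviewVersions: ["v1"]
      clientConfig:
        service:
          namespace: pmem-csi              # hypothetical namespace
          name: pmem-csi-operator-webhook  # hypothetical service name
          path: /convert                   # hypothetical endpoint path
          port: 443
        # caBundle: base64-encoded CA certificate that signed the
        # webhook's serving certificate (omitted in this sketch)
```

With no `conversion` stanza the strategy defaults to `None`, which only rewrites the `apiVersion` field and does not translate fields between versions.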
1 change: 1 addition & 0 deletions deploy/kustomize/olm-catalog/kustomization.yaml
@@ -3,6 +3,7 @@ bases:

resources:
- ../../common/pmem-csi.intel.com_v1alpha1_deployment_cr.yaml
- ../../common/pmem-csi.intel.com_v1beta1_deployment_cr.yaml
- ../../crd/pmem-csi.intel.com_deployments.yaml

images:
9 changes: 9 additions & 0 deletions docs/DEVELOPMENT.md
@@ -287,6 +287,12 @@ Resource requirements depend on the workload. To generate some load, run
make test_e2e TEST_E2E_FOCUS=lvm-production.*late.binding.*stress.test
```

Alternatively, run the [`hack/stress-driver.sh`](hack/stress-driver.sh)
helper script to generate load on the driver:
```console
ROUNDS=500 VOL_COUNT=5 ./hack/stress-driver.sh
```

Now resource recommendations can be retrieved with:

```console
@@ -295,6 +301,9 @@ kubectl describe vpa
kubectl get vpa pmem-csi-node -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{":\n\tRequests: "}{.lowerBound}{"\n\tLimits: "}{.upperBound}{"\n"}{end}'
```

The default resource requirements used for the driver deployments by the operator
are chosen from the VPA recommendations described in this section.
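
For orientation, a recommendation-only VPA object of the kind queried above might look like the following sketch. It uses the standard `autoscaling.k8s.io/v1` API; the VPA name `pmem-csi-node` comes from the `kubectl get vpa` command above, while the target `DaemonSet` name is an assumption and must match the actual node driver object in the cluster.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: pmem-csi-node
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet
    name: pmem-csi-node   # assumed name of the node driver DaemonSet
  updatePolicy:
    # "Off" makes the VPA compute recommendations without evicting
    # or modifying running pods.
    updateMode: "Off"
```

With `updateMode: "Off"` the recommendations only appear under `.status.recommendation`, which is what the `jsonpath` query above reads.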

## Switching device mode

If device mode is switched between LVM and direct(aka ndctl), please keep
2 changes: 1 addition & 1 deletion docs/design.md
@@ -421,7 +421,7 @@ tools and APIs.

The driver deployment is controlled by a cluster-scoped [custom resource](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)
named [`Deployment`](./install.md#pmem-csi-deployment-crd) in the
-`pmem-csi.intel.com/v1alpha1` API group. The operator runs inside the cluster
+`pmem-csi.intel.com/v1beta1` API group. The operator runs inside the cluster
and listens for deployment changes. It makes sure that the required Kubernetes
objects are created for a driver deployment.
Refer to [Deployment CRD](./install.md#deployment) for details.