Configuring the Kubewarden stack for production
Kubewarden provides features for reliability and correct scheduling of its components in a Kubernetes cluster. Some of the hints on this page come from Kubewarden community members using Kubewarden at scale.
If you want to see a real example of running Kubewarden at scale, check out the Kubewarden in a Large-Scale Environment documentation page.
Configuring Tolerations and Affinity/Anti-Affinity
By using the `tolerations` and `affinity` fields, operators can fine-tune the scheduling and reliability of the Kubewarden stack to meet their specific deployment needs and constraints. For more details on the exact fields and their configurations, refer to the Kubernetes documentation on Taints and Tolerations and Affinity and Anti-Affinity.
Starting from version 1.15 of the Kubewarden stack, the Kubewarden Helm charts ship with two new values:
- `.global.tolerations`
- `.global.affinity`
These Helm chart values allow users to define Kubernetes tolerations and affinity/anti-affinity settings for the Kubewarden stack, including the controller deployment, audit scanner cronjob, and the default `PolicyServer` custom resource.
Tolerations
The `tolerations` value is an array where users can specify Kubernetes tolerations for the Kubewarden components. Tolerations allow pods to be scheduled on nodes with matching taints. This is useful for managing where pods can be scheduled, especially in scenarios involving node maintenance, dedicated workloads, or specific hardware requirements:
global:
  tolerations:
    - key: "key1"
      operator: "Equal"
      value: "value1"
      effect: "NoSchedule"
    - key: "key2"
      operator: "Equal"
      value: "value2"
      effect: "NoExecute"
In this example, the tolerations defined are applied to the controller deployment, audit scanner cronjob, and the default `PolicyServer` custom resource.
Affinity/Anti-Affinity
The `affinity` value allows users to define Kubernetes affinity and anti-affinity rules for the Kubewarden components. Affinity rules constrain pods to specific nodes, while anti-affinity rules prevent pods from being scheduled on certain nodes or in close proximity to other pods. These settings are useful for ensuring high availability, fault tolerance, and optimized resource usage in a cluster.
global:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: security
                operator: In
                values:
                  - S1
          topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: security
                  operator: In
                  values:
                    - S2
            topologyKey: topology.kubernetes.io/zone
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: kubernetes.io/os
                operator: In
                values:
                  - linux
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: label-1
                operator: In
                values:
                  - key-1
        - weight: 50
          preference:
            matchExpressions:
              - key: label-2
                operator: In
                values:
                  - key-2
In this example, the affinity rules will be applied to the controller deployment, audit scanner cronjob, and the default `PolicyServer` custom resource.
The previous affinity configuration available in the `kubewarden-default` Helm chart, which was used to define the affinity configuration for the `PolicyServer` only, has been removed in favor of the global `affinity` value. This change simplifies the configuration process by providing a single approach to defining affinity and anti-affinity rules for all Kubewarden components.
Configuring priorityClasses
By using PriorityClasses, operators can enforce a scheduling priority for the workload pods of the Kubewarden stack. This ensures the Kubewarden workload is prioritized over other workloads, preventing eviction and ensuring service reliability. For more information, refer to the Kubernetes documentation on PriorityClasses.
Starting from version 1.25 of the Kubewarden stack, the Kubewarden Helm charts ship with a new value:
- `.global.priorityClassName`
The PriorityClass defined by name in this value is applied to the controller deployment pods and the pods of the default `PolicyServer` custom resource.
The `.global.priorityClassName` value expects the name of an existing PriorityClass. As an example, we could use:
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: kubewarden-high-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for XYZ service pods only."
Kubernetes already ships with two PriorityClasses that are good candidates: `system-cluster-critical` and `system-node-critical`. These are common classes and are used to ensure that critical components are always scheduled first.
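For instance, once a PriorityClass such as the `kubewarden-high-priority` example above exists in the cluster, it can be referenced from the Helm values. A minimal sketch using the `.global.priorityClassName` value described in this section:

global:
  priorityClassName: kubewarden-high-priority # name of an existing PriorityClass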
If you delete a PriorityClass, existing Pods that reference the deleted PriorityClass remain unchanged, but new Pods that reference it will not be created by Kubernetes.
PolicyServer production configuration
`PolicyServers` are critical to the cluster. Their reliability is important as they process admission requests destined for the Kubernetes API via the validating and mutating webhooks.
As with other Dynamic Admission Controllers, this process happens before requests reach the Kubernetes API server. Latency or service delays by the Dynamic Admission Controller may introduce cluster inconsistency, Denial of Service, or deadlock.
Kubewarden provides several ways to increase the reliability of `PolicyServers`. Production deployments can vary a great deal, so it is up to the operator to configure the deployment for their needs.
PodDisruptionBudgets
The Kubewarden controller can create a PodDisruptionBudget (PDB) for the `PolicyServer` Pods. This controls the range of `PolicyServer` Pod replicas associated with the `PolicyServer`, ensuring high availability and controlled eviction in case of node maintenance, scaling operations, or cluster upgrades.
This is achieved by setting `spec.minAvailable` or `spec.maxUnavailable` of the `PolicyServer` resource:
- `minAvailable`: specifies the minimum number of `PolicyServer` Pods that must be available at all times. Can be an integer or a percentage. Useful for maintaining the operational integrity of the `PolicyServer`, ensuring that policies are continuously enforced without interruption.
- `maxUnavailable`: specifies the maximum number of `PolicyServer` Pods that can be unavailable at any given time. Can be an integer or a percentage. Useful for performing rolling updates or partial maintenance without fully halting the policy enforcement mechanism.
You can specify only one of `minAvailable` or `maxUnavailable`, not both.
Configuring minAvailable or maxUnavailable
apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  minAvailable: 2 # ensure at least two policy-server Pods are available at all times

Or, using `maxUnavailable`:

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  maxUnavailable: "30%" # ensure no more than 30% of policy-server Pods are unavailable at any time
Affinity / Anti-affinity
The Kubewarden controller can set the affinity of `PolicyServer` Pods. This allows constraining Pods to specific nodes, or scheduling Pods relative to other Pods. For more information on affinity, see the Kubernetes docs.
Kubernetes affinity configuration allows constraining Pods to nodes (via `spec.affinity.nodeAffinity`) or constraining Pods with regard to other Pods (via `spec.affinity.podAffinity`). Affinity can be set as a soft constraint (with `preferredDuringSchedulingIgnoredDuringExecution`) or a hard one (with `requiredDuringSchedulingIgnoredDuringExecution`).
Affinity / anti-affinity matches against specific labels, be it node labels (e.g. `topology.kubernetes.io/zone` set to `antarctica-east1`) or Pod labels. Pods created from `PolicyServer` definitions have a label `kubewarden/policy-server` set to the name of the `PolicyServer` (e.g. `kubewarden/policy-server: default`).
Inter-pod affinity/anti-affinity require substantial amounts of processing and are not recommended in clusters larger than several hundred nodes.
To configure affinity for a `PolicyServer`, set its `spec.affinity` field. This field accepts a YAML object matching the contents of a Pod's `spec.affinity`.
Configuring Affinity / Anti-affinity
`PolicyServer` Pods across zones and hostnames:

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: kubewarden/policy-server
                operator: In
                values:
                  - your-policy-server
          topologyKey: topology.kubernetes.io/zone
        - labelSelector:
            matchExpressions:
              - key: kubewarden/policy-server
                operator: In
                values:
                  - your-policy-server
          topologyKey: kubernetes.io/hostname
`PolicyServer` Pods in control-plane nodes:

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: kubewarden/policy-server
                operator: In
                values:
                  - your-policy-server
          topologyKey: node-role.kubernetes.io/control-plane
Limits and Requests
The Kubewarden controller can set the resource limits and requests of `PolicyServer` Pods. This specifies how much of each resource each of the containers associated with the `PolicyServer` Pods needs. For `PolicyServers`, only `cpu` and `memory` resources are relevant. See the Kubernetes docs on resource units for more information.
This is achieved by setting the following `PolicyServer` resource fields:
- `spec.limits`: Limits on resources, enforced by the container runtime. Different runtimes can have different ways to implement the restrictions.
- `spec.requests`: Amount of resources to reserve for each container. It is possible and allowed for a container to use more resources than its `request`. If omitted, it defaults to `spec.limits` if that is set (unless `spec.requests` of containers is set to some defaults via an admission mechanism).
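As an illustration, both fields can be set directly on the `PolicyServer` resource. The numbers below are placeholders, not a sizing recommendation; adjust them to your measured workload:

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  requests:
    cpu: 500m     # placeholder value
    memory: 256Mi # placeholder value
  limits:
    cpu: "1"      # placeholder value
    memory: 512Mi # placeholder value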
Undercommitting resources of `PolicyServer` Pods can lead to CPU throttling or out-of-memory kills, which in turn cause admission request timeouts.
PriorityClasses
The Kubewarden controller can set the PriorityClass used for the pods of `PolicyServers`. This means `PolicyServer` workloads are scheduled with priority, preventing eviction and ensuring service reliability. See the Kubernetes docs for more information.
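A minimal sketch of this, assuming your version of the `PolicyServer` custom resource exposes a `priorityClassName` field in its spec (verify against the installed CRD):

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  priorityClassName: kubewarden-high-priority # assumed field; reuses the PriorityClass example above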
If you delete a PriorityClass, existing Pods that reference the deleted PriorityClass remain unchanged, but new Pods that reference it will not be created by Kubernetes.
Isolate Policy Workloads
To ensure stability and high performance at scale, users can run separate `PolicyServer` deployments to isolate different workloads.
- Dedicate one `PolicyServer` to Context-Aware Policies: These policies are more resource-intensive because they query the Kubernetes API server or other external services such as Sigstore or OCI registries. Isolating them prevents a slow policy from creating a bottleneck for other, faster policies.
- Use another `PolicyServer` for All Other Policies: Run regular, self-contained policies on a separate server to ensure low latency for the most common admission requests.
You can also consider splitting the workload even further. For example, if you have some policies that are slow and require a bigger execution timeout, consider moving them into a dedicated `PolicyServer`. This way you ensure that slow policies do not block the workers from evaluating other requests.
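For example, the split described above can be expressed by pinning each policy to a server through its `spec.policyServer` field. The server name, module URL, and rules below are illustrative placeholders:

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: context-aware # illustrative: server dedicated to context-aware policies
spec:
  image: ghcr.io/kubewarden/policy-server:latest # pin a specific tag in production
  replicas: 3
---
apiVersion: policies.kubewarden.io/v1
kind: ClusterAdmissionPolicy
metadata:
  name: example-context-aware-policy # illustrative name
spec:
  policyServer: context-aware # run this policy on the dedicated server
  module: registry://ghcr.io/kubewarden/policies/example-policy:v1.0.0 # placeholder module
  rules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      resources: ["pods"]
      operations: ["CREATE"]
  mutating: false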
Resource Allocation and Scaling
To handle high traffic and ensure availability, provide sufficient resources and scale your replicas.
- Allocate Sufficient Resources: In high-traffic environments, allocate generous resources to each replica. Do not starve the `PolicyServers`, as insufficient CPU or memory is a primary cause of request timeouts. Remember that `PolicyServers` receive requests from both the control plane and the Kubewarden audit scanner.
- Scale for High Availability: For deployments handling hundreds of requests per second, run a high number of replicas. This distributes the load effectively and ensures that the failure of a few pods does not impact the cluster's operation.
Start with a baseline of 3-5 replicas and monitor CPU and memory usage. Scale the replica count as needed.
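For reference, the replica count is set directly on the `PolicyServer` resource; the value below matches the suggested starting baseline:

apiVersion: policies.kubewarden.io/v1
kind: PolicyServer
metadata:
  name: your-policy-server
spec:
  # Other configuration fields
  replicas: 3 # baseline; scale up based on observed CPU and memory usage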
Effective Auditing at Scale
To run audits efficiently on large clusters, fine-tune the audit scanner for performance and parallelism.
- Schedule Audits Periodically: Running the scan on a periodic schedule, rather than continuously, can be a good balance between catching configuration drift and minimizing load on the API server.
- Tune Parallelism Aggressively: The key to fast audits is parallelization. With high-parallelism settings, you can reduce audit times on massive clusters to just over an hour.
It's important to remember that the audit scanner sends its requests to the `PolicyServers`, so more aggressive parallelism also increases their load.
Set `disableStore: true` to reduce load if you consume audit results from logs and do not require the `PolicyReport` custom resources.
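A hedged sketch of the related Helm values; the exact keys (for example `auditScanner.cronJob.schedule` and `auditScanner.disableStore`) are assumptions here, so check the values file of your chart version:

auditScanner:
  cronJob:
    schedule: "0 */6 * * *" # assumed key: run the audit scan every six hours
  disableStore: true        # assumed key: skip persisting results when logs are enough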