Skip to content

Improving the default PDB implementation. #8780

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 65 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 58 commits
Commits
Show all changes
65 commits
Select commit Hold shift + click to select a range
280634c
Nearly functional implementation.
naemono Jul 23, 2025
50c1aaa
Adding additional tests.
naemono Jul 23, 2025
85c661a
Move to dfs.
naemono Jul 24, 2025
e8367cb
Restore old disruption behavior.
naemono Jul 25, 2025
0d65a39
Fix get most conservative role
naemono Jul 25, 2025
657121e
Optimization
naemono Jul 25, 2025
d9dcc1e
Adding additional unit tests.
naemono Jul 29, 2025
4fa170a
Simplify the sorting.
naemono Jul 29, 2025
b3a166b
Simplify further.
naemono Jul 29, 2025
a4951be
Comments update; wrap the error.
naemono Jul 29, 2025
ec17cd3
Remove some comments.
naemono Jul 29, 2025
33fe7d0
Optimizations
naemono Jul 30, 2025
fc7059d
Break the dfs tasks into smaller funcs
naemono Jul 31, 2025
51aab4b
revert license adjustment
naemono Jul 31, 2025
30e223f
remove comment
naemono Jul 31, 2025
b59475a
adjust var name
naemono Jul 31, 2025
f6aa60e
updating comments.
naemono Jul 31, 2025
6d70499
comment update.
naemono Jul 31, 2025
2d15efd
remove tab
naemono Jul 31, 2025
ac580ec
pre-allocate empty slices of slices.
naemono Jul 31, 2025
a4aad89
fix lint issues.
naemono Jul 31, 2025
eb25e46
Update CRD comments/docs
naemono Jul 31, 2025
08b882e
Adjust some wording according to review notes.
naemono Aug 4, 2025
cc71d0e
Restore old behavior for single pdb for a whole cluster.
naemono Aug 4, 2025
b021c97
Use single data roles grouping
naemono Aug 4, 2025
27cc308
Ensure pdbs that should not exist are deleted.
naemono Aug 4, 2025
43a0e7f
Ensure checks are role-specific.
naemono Aug 5, 2025
675bd2e
Just build coord logic into the func itself
naemono Aug 5, 2025
3fdc4dc
naming
naemono Aug 5, 2025
2f15c01
wip migrating to different algorithm
naemono Aug 5, 2025
a65b56a
fixing unit tests
naemono Aug 5, 2025
f0c0b1a
Fixing all unit tests
naemono Aug 5, 2025
e755bc1
Merge branch 'main' into 2936/default-pdb-improvements
naemono Aug 5, 2025
3a0d3d0
Fix the ordering issue
naemono Aug 5, 2025
27277a1
revert test name change.
naemono Aug 5, 2025
49a9faf
Comments fix
naemono Aug 6, 2025
bcc802c
renaming files.
naemono Aug 6, 2025
1b6adff
rename again
naemono Aug 6, 2025
0a3699c
rename test file also.
naemono Aug 6, 2025
f53346c
Nearly fixed all unit tests
naemono Aug 7, 2025
83d674b
Unit tests passing.
naemono Aug 7, 2025
982efef
use nodeRoles not strings
naemono Aug 7, 2025
4b2d695
naming
naemono Aug 7, 2025
41ecaed
comments
naemono Aug 7, 2025
96e3e46
Handle both v1 and v1beta1 PDB objects.
naemono Aug 7, 2025
5af35a4
Use sets instead
naemono Aug 7, 2025
2bb18a4
Create a coordinating nodeRole and use it.
naemono Aug 7, 2025
0176f96
Use existing ownerref func
naemono Aug 8, 2025
6361851
Move the naming of the pdb func.
naemono Aug 8, 2025
8466261
expired license comment.
naemono Aug 8, 2025
091ce74
Fix unit tests
naemono Aug 8, 2025
254b60c
Move priority slice to Noderole slices
naemono Aug 8, 2025
30966c8
Fix some lint issues
naemono Aug 8, 2025
cc76108
make generate
naemono Aug 8, 2025
6ddc18c
Fix some linting issues
naemono Aug 8, 2025
4954437
Fix notice
naemono Aug 8, 2025
dcbb908
Move to using types, not strings.
naemono Aug 8, 2025
c8d8099
uncomment a test
naemono Aug 8, 2025
512fb2e
Removing v1b1 pdb logic.
naemono Aug 13, 2025
d37c3d0
Fix unit tests
naemono Aug 14, 2025
f81768e
Fix issue when defining max disruptions allowed.
naemono Aug 14, 2025
357be7c
fixing unit tests
naemono Aug 14, 2025
b6a45c4
Ensure all roles vs coordinating roles is treated properly
naemono Aug 15, 2025
5e1d2f5
make generate
naemono Aug 15, 2025
8bd29c0
remove unneeded version file.
naemono Aug 15, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 11 additions & 3 deletions config/crds/v1/all-crds.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -4661,9 +4661,17 @@ spec:
type: array
podDisruptionBudget:
description: |-
PodDisruptionBudget provides access to the default Pod disruption budget for the Elasticsearch cluster.
The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
PodDisruptionBudget provides access to the default Pod disruption budget(s) for the Elasticsearch cluster.
The behavior depends on the license level.
With a Basic license:
The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
With an Enterprise license:
The default budget is optionally split into multiple budgets, each targeting a specific node role types allowing additional disruptions
for certain roles according to the health status of the cluster.
Example:
All data roles (excluding frozen): allows disruptions only when the cluster is green.
All other roles: allows disruptions only when the cluster is yellow or green.
To disable, set `PodDisruptionBudget` to the empty value (`{}` in YAML).
properties:
metadata:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -9239,9 +9239,17 @@ spec:
type: array
podDisruptionBudget:
description: |-
PodDisruptionBudget provides access to the default Pod disruption budget for the Elasticsearch cluster.
The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
PodDisruptionBudget provides access to the default Pod disruption budget(s) for the Elasticsearch cluster.
The behavior depends on the license level.
With a Basic license:
The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
With an Enterprise license:
The default budget is optionally split into multiple budgets, each targeting a specific node role types allowing additional disruptions
for certain roles according to the health status of the cluster.
Example:
All data roles (excluding frozen): allows disruptions only when the cluster is green.
All other roles: allows disruptions only when the cluster is yellow or green.
To disable, set `PodDisruptionBudget` to the empty value (`{}` in YAML).
properties:
metadata:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4703,9 +4703,17 @@ spec:
type: array
podDisruptionBudget:
description: |-
PodDisruptionBudget provides access to the default Pod disruption budget for the Elasticsearch cluster.
The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
PodDisruptionBudget provides access to the default Pod disruption budget(s) for the Elasticsearch cluster.
The behavior depends on the license level.
With a Basic license:
The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
With an Enterprise license:
The default budget is optionally split into multiple budgets, each targeting a specific node role types allowing additional disruptions
for certain roles according to the health status of the cluster.
Example:
All data roles (excluding frozen): allows disruptions only when the cluster is green.
All other roles: allows disruptions only when the cluster is yellow or green.
To disable, set `PodDisruptionBudget` to the empty value (`{}` in YAML).
properties:
metadata:
Expand Down
2 changes: 1 addition & 1 deletion docs/reference/api-docs.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,4 @@ applies_to:

# {{eck}} API Reference (moved)

This page has moved [here](./api-reference/index.md).
This page has moved [here](./api-reference/index.md).
2 changes: 1 addition & 1 deletion docs/reference/api-reference/main.md
Original file line number Diff line number Diff line change
Expand Up @@ -1093,7 +1093,7 @@ ElasticsearchSpec holds the specification of an Elasticsearch cluster.
| *`transport`* __[TransportConfig](#transportconfig)__ | Transport holds transport layer settings for Elasticsearch. |
| *`nodeSets`* __[NodeSet](#nodeset) array__ | NodeSets allow specifying groups of Elasticsearch nodes sharing the same configuration and Pod templates. |
| *`updateStrategy`* __[UpdateStrategy](#updatestrategy)__ | UpdateStrategy specifies how updates to the cluster should be performed. |
| *`podDisruptionBudget`* __[PodDisruptionBudgetTemplate](#poddisruptionbudgettemplate)__ | PodDisruptionBudget provides access to the default Pod disruption budget for the Elasticsearch cluster.<br>The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.<br>In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.<br>To disable, set `PodDisruptionBudget` to the empty value (`{}` in YAML). |
| *`podDisruptionBudget`* __[PodDisruptionBudgetTemplate](#poddisruptionbudgettemplate)__ | PodDisruptionBudget provides access to the default Pod disruption budget(s) for the Elasticsearch cluster.<br>The behavior depends on the license level.<br>With a Basic license:<br> The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.<br> In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.<br>With an Enterprise license:<br> The default budget is optionally split into multiple budgets, each targeting a specific node role types allowing additional disruptions<br> for certain roles according to the health status of the cluster.<br> Example:<br> All data roles (excluding frozen): allows disruptions only when the cluster is green.<br> All other roles: allows disruptions only when the cluster is yellow or green.<br>To disable, set `PodDisruptionBudget` to the empty value (`{}` in YAML). |
| *`auth`* __[Auth](#auth)__ | Auth contains user authentication and authorization security settings for Elasticsearch. |
| *`secureSettings`* __[SecretSource](#secretsource) array__ | SecureSettings is a list of references to Kubernetes secrets containing sensitive configuration options for Elasticsearch. |
| *`serviceAccountName`* __string__ | ServiceAccountName is used to check access from the current resource to a resource (for ex. a remote Elasticsearch cluster) in a different namespace.<br>Can only be used if ECK is enforcing RBAC on references. |
Expand Down
3 changes: 3 additions & 0 deletions pkg/apis/elasticsearch/v1/elasticsearch_config.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@ import (
type NodeRole string

const (
CoordinatingRole NodeRole = ""
DataColdRole NodeRole = "data_cold"
DataContentRole NodeRole = "data_content"
DataFrozenRole NodeRole = "data_frozen"
Expand Down Expand Up @@ -129,6 +130,8 @@ func (n *Node) IsConfiguredWithRole(role NodeRole) bool {
return ptr.Deref(n.Transform, n.IsConfiguredWithRole(DataRole))
case VotingOnlyRole:
return ptr.Deref(n.VotingOnly, false)
case CoordinatingRole:
return len(n.Roles) == 0
}

// This point should never be reached. The default is to assume that a node has all roles except voting_only.
Expand Down
14 changes: 11 additions & 3 deletions pkg/apis/elasticsearch/v1/elasticsearch_types.go
Original file line number Diff line number Diff line change
Expand Up @@ -103,9 +103,17 @@ type ElasticsearchSpec struct {
// +kubebuilder:validation:Optional
UpdateStrategy UpdateStrategy `json:"updateStrategy,omitempty"`

// PodDisruptionBudget provides access to the default Pod disruption budget for the Elasticsearch cluster.
// The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
// In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
// PodDisruptionBudget provides access to the default Pod disruption budget(s) for the Elasticsearch cluster.
// The behavior depends on the license level.
// With a Basic license:
// The default budget doesn't allow any Pod to be removed in case the cluster is not green or if there is only one node of type `data` or `master`.
// In all other cases the default PodDisruptionBudget sets `minUnavailable` equal to the total number of nodes minus 1.
// With an Enterprise license:
// The default budget is optionally split into multiple budgets, each targeting a specific node role types allowing additional disruptions
// for certain roles according to the health status of the cluster.
// Example:
// All data roles (excluding frozen): allows disruptions only when the cluster is green.
// All other roles: allows disruptions only when the cluster is yellow or green.
// To disable, set `PodDisruptionBudget` to the empty value (`{}` in YAML).
// +kubebuilder:validation:Optional
PodDisruptionBudget *commonv1.PodDisruptionBudgetTemplate `json:"podDisruptionBudget,omitempty"`
Expand Down
10 changes: 10 additions & 0 deletions pkg/apis/elasticsearch/v1/name.go
Original file line number Diff line number Diff line change
Expand Up @@ -200,3 +200,13 @@ func StackConfigAdditionalSecretName(esName string, secretName string) string {
secretNameHash := hash.HashObject(secretName)
return ESNamer.Suffix(esName, "scp", secretNameHash)
}

// PodDisruptionBudgetNameForRole returns the name of the PodDisruptionBudget for a given Elasticsearch cluster name and role.
func PodDisruptionBudgetNameForRole(esName string, role string) string {
name := DefaultPodDisruptionBudget(esName) + "-" + role
// For coordinating nodes (no roles), append "coordinating" to the name
if role == "" {
name += "coordinating"
}
return name
}
78 changes: 55 additions & 23 deletions pkg/controller/common/statefulset/fixtures.go
Original file line number Diff line number Diff line change
Expand Up @@ -16,16 +16,24 @@ import (
)

type TestSset struct {
Namespace string
Name string
ClusterName string
Version string
Replicas int32
Master bool
Data bool
Ingest bool
Status appsv1.StatefulSetStatus
ResourceVersion string
Namespace string
Name string
ClusterName string
Version string
Replicas int32
Master bool
Data bool
Ingest bool
ML bool
Transform bool
RemoteClusterClient bool
DataHot bool
DataWarm bool
DataCold bool
DataContent bool
DataFrozen bool
Status appsv1.StatefulSetStatus
ResourceVersion string
}

func (t TestSset) Pods() []client.Object {
Expand Down Expand Up @@ -54,6 +62,14 @@ func (t TestSset) Build() appsv1.StatefulSet {
label.NodeTypesMasterLabelName.Set(t.Master, labels)
label.NodeTypesDataLabelName.Set(t.Data, labels)
label.NodeTypesIngestLabelName.Set(t.Ingest, labels)
label.NodeTypesMLLabelName.Set(t.ML, labels)
label.NodeTypesTransformLabelName.Set(t.Transform, labels)
label.NodeTypesRemoteClusterClientLabelName.Set(t.RemoteClusterClient, labels)
label.NodeTypesDataHotLabelName.Set(t.DataHot, labels)
label.NodeTypesDataWarmLabelName.Set(t.DataWarm, labels)
label.NodeTypesDataColdLabelName.Set(t.DataCold, labels)
label.NodeTypesDataContentLabelName.Set(t.DataContent, labels)
label.NodeTypesDataFrozenLabelName.Set(t.DataFrozen, labels)
statefulSet := appsv1.StatefulSet{
ObjectMeta: metav1.ObjectMeta{
Name: t.Name,
Expand Down Expand Up @@ -86,19 +102,27 @@ func (t TestSset) BuildPtr() *appsv1.StatefulSet {
}

type TestPod struct {
Namespace string
Name string
ClusterName string
StatefulSetName string
Version string
Revision string
Master bool
Data bool
Ingest bool
Ready bool
RestartCount int32
Phase corev1.PodPhase
ResourceVersion string
Namespace string
Name string
ClusterName string
StatefulSetName string
Version string
Revision string
Master bool
Data bool
Ingest bool
ML bool
Transform bool
RemoteClusterClient bool
DataHot bool
DataWarm bool
DataCold bool
DataContent bool
DataFrozen bool
Ready bool
RestartCount int32
Phase corev1.PodPhase
ResourceVersion string
}

func (t TestPod) Build() corev1.Pod {
Expand All @@ -111,6 +135,14 @@ func (t TestPod) Build() corev1.Pod {
label.NodeTypesMasterLabelName.Set(t.Master, labels)
label.NodeTypesDataLabelName.Set(t.Data, labels)
label.NodeTypesIngestLabelName.Set(t.Ingest, labels)
label.NodeTypesMLLabelName.Set(t.ML, labels)
label.NodeTypesTransformLabelName.Set(t.Transform, labels)
label.NodeTypesRemoteClusterClientLabelName.Set(t.RemoteClusterClient, labels)
label.NodeTypesDataHotLabelName.Set(t.DataHot, labels)
label.NodeTypesDataWarmLabelName.Set(t.DataWarm, labels)
label.NodeTypesDataColdLabelName.Set(t.DataCold, labels)
label.NodeTypesDataContentLabelName.Set(t.DataContent, labels)
label.NodeTypesDataFrozenLabelName.Set(t.DataFrozen, labels)

status := corev1.PodStatus{
// assume Running by default
Expand Down
2 changes: 1 addition & 1 deletion pkg/controller/elasticsearch/driver/nodes.go
Original file line number Diff line number Diff line change
Expand Up @@ -133,7 +133,7 @@ func (d *defaultDriver) reconcileNodeSpecs(
}

// Update PDB to account for new replicas.
if err := pdb.Reconcile(ctx, d.Client, d.ES, actualStatefulSets, meta); err != nil {
if err := pdb.Reconcile(ctx, d.Client, d.ES, actualStatefulSets, expectedResources, meta); err != nil {
return results.WithError(err)
}

Expand Down
7 changes: 7 additions & 0 deletions pkg/controller/elasticsearch/elasticsearch_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@ import (
"go.elastic.co/apm/v2"
appsv1 "k8s.io/api/apps/v1"
corev1 "k8s.io/api/core/v1"
policyv1 "k8s.io/api/policy/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/types"
"k8s.io/client-go/tools/record"
Expand Down Expand Up @@ -108,6 +109,12 @@ func addWatches(mgr manager.Manager, c controller.Controller, r *ReconcileElasti
return err
}

// Watch PodDisruptionBudgets
if err := c.Watch(
source.Kind(mgr.GetCache(), &policyv1.PodDisruptionBudget{}, handler.TypedEnqueueRequestForOwner[*policyv1.PodDisruptionBudget](mgr.GetScheme(), mgr.GetRESTMapper(), &esv1.Elasticsearch{}, handler.OnlyControllerOwner()))); err != nil {
return err
}

// Watch owned and soft-owned secrets
if err := c.Watch(source.Kind(mgr.GetCache(), &corev1.Secret{}, r.dynamicWatches.Secrets)); err != nil {
return err
Expand Down
2 changes: 1 addition & 1 deletion pkg/controller/elasticsearch/label/label.go
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ func IsMasterNodeSet(statefulSet appsv1.StatefulSet) bool {

// IsDataNodeSet returns true if the given StatefulSet specifies data nodes.
func IsDataNodeSet(statefulSet appsv1.StatefulSet) bool {
return NodeTypesDataLabelName.HasValue(true, statefulSet.Spec.Template.Labels)
return NodeTypesDataLabelName.HasValue(true, statefulSet.Spec.Template.Labels) || NodeTypesDataHotLabelName.HasValue(true, statefulSet.Spec.Template.Labels) || NodeTypesDataColdLabelName.HasValue(true, statefulSet.Spec.Template.Labels) || NodeTypesDataContentLabelName.HasValue(true, statefulSet.Spec.Template.Labels) || NodeTypesDataWarmLabelName.HasValue(true, statefulSet.Spec.Template.Labels)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was required as these other data* node role types should be considered in this function, right?

}

// IsIngestNodeSet returns true if the given StatefulSet specifies ingest nodes.
Expand Down
Loading