Commit 2bf7791

feat: Addressed Review comments
Addressed Review comments Signed-off-by: Aswin A <aswin6303@gmail.com>
1 parent e5cf3d0 commit 2bf7791

1 file changed

+63
-55
lines changed

docs/entityoperator.md

Lines changed: 63 additions & 55 deletions
@@ -1,45 +1,46 @@
1-
# Impact of Entity Operator Availability in a Stretch Kafka Cluster
1+
# Impact of Entity Operator Availability in a Stretched Kafka Cluster
22

3-
The Entity Operator in Strimzi is responsible for managing Kafka users and topics. It automates the creation, configuration, and security settings of these entities, ensuring smooth integration with Kafka clusters deployed via Strimzi. This document explains how its availability affects topic and user management when deployed in a multi-cluster Kafka setup.
3+
This document outlines the role of the Strimzi Entity Operator in managing Kafka users and topics and explains how its availability affects operations within a multi-cluster Kafka deployment where the Entity Operator resides in a central Kubernetes cluster.
44

5-
## Key Components of Entity Operator
5+
## Key Components of the Entity Operator
66

7-
The Entity Operator consists of two main sub-components:
7+
The Entity Operator in Strimzi comprises two primary sub-components:
88

99
### Topic Operator
1010

11-
- Watches for KafkaTopic CRs in Kubernetes.
12-
- Automatically creates, updates, and deletes topics in Kafka based on KafkaTopic CR definitions.
13-
- Keeps Kubernetes and Kafka topic configurations in sync.
14-
- Ensures desired state consistency between Kubernetes and Kafka.
11+
- Monitors Kubernetes for `KafkaTopic` CRs.
12+
- Automatically creates, updates, and deletes Kafka topics within the Kafka cluster based on the definitions in the `KafkaTopic` CRs.
13+
- Ensures synchronization between the desired topic configurations in Kubernetes and the actual topic configurations in Kafka.
1514

1615
### User Operator
1716

18-
- Watches for KafkaUser CRs in Kubernetes.
19-
- Manages security credentials (TLS certificates, SASL credentials).
20-
- Ensures user permissions and authentication are correctly configured.
17+
- Monitors Kubernetes for `KafkaUser` CRs.
18+
- Manages security credentials (e.g., TLS certificates, SASL credentials) and configures user permissions and authentication within the Kafka cluster.
19+
- Automates the provisioning and synchronization of Kafka user authentication and authorization settings.
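
As an illustration of where these two sub-components come from, the sketch below shows the `entityOperator` section of a Strimzi `Kafka` CR. The cluster name `my-cluster` is an assumption, and the rest of the spec is omitted, so this fragment is not a complete resource on its own.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: my-cluster        # assumed cluster name
spec:
  # The kafka section and other required settings are omitted for brevity.
  entityOperator:
    topicOperator: {}     # deploys the Topic Operator with default settings
    userOperator: {}      # deploys the User Operator with default settings
```

Leaving out either entry deploys the Entity Operator without that sub-component.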
2120

22-
## Why is the Entity Operator Useful?
21+
## Why is the Entity Operator Essential?
2322

24-
- Eliminates the need for manual topic and user management.
25-
- Ensures Kafka users have appropriate authentication and authorization settings.
26-
- Enables declarative management using Kubernetes CRs.
27-
- Keeps configurations between Kubernetes and Kafka in sync.
23+
The Entity Operator provides several key benefits:
2824

29-
## How Client Applications Use KafkaTopic and KafkaUser CRs in Strimzi
25+
- Eliminates the need for manual topic and user management through Kafka's administrative tools.
26+
- Ensures Kafka users are configured with appropriate authentication and authorization settings as defined in Kubernetes.
27+
- Enables the management of Kafka resources using Kubernetes-native Custom Resources, promoting a declarative approach.
28+
- Maintains consistency between the desired state in Kubernetes and the actual state of topics and users in Kafka.
3029

31-
The client applications interact with Kafka topics and users in Strimzi using Kubernetes native resources
30+
## How Client Applications Utilize KafkaTopic and KafkaUser CRs in Strimzi
3231

33-
- KafkaTopic CRs define and manage Kafka topics.
34-
- KafkaUser CRs define users and security credentials for authentication & authorization.
32+
Client applications interact with Kafka topics and users in Strimzi using Kubernetes Custom Resources:
3533

36-
## How Applications Use KafkaTopic CRs
34+
- **`KafkaTopic` CRs:** Define and manage Kafka topics, specifying parameters like partitions, replication factor, and configuration.
35+
- **`KafkaUser` CRs:** Define users and their security configurations for authentication and authorization.
36+
37+
## How Applications Utilize `KafkaTopic` CRs
3738

3839
### Creating a Topic
3940

40-
Developers define a topic declaratively using a KafkaTopic CR. The Topic Operator ensures this topic is created in Kafka.
41+
Developers define Kafka topics declaratively using `KafkaTopic` CRs. The Topic Operator ensures the creation of these topics within the Kafka cluster.
4142

42-
**Example KafkaTopic CR**
43+
**Example `KafkaTopic` CR:**
4344

4445
```yaml
4546
apiVersion: kafka.strimzi.io/v1beta2
@@ -52,19 +53,19 @@ spec:
5253
partitions: 3
5354
replicas: 2
5455
config:
55-
retention.ms: 86400000 # Data retention for 1 day
56-
segment.bytes: 1073741824 # 1GB segment size
56+
retention.ms: 86400000 # Data retention for 1 day
57+
segment.bytes: 1073741824 # 1GB segment size
5758
```
5859
59-
**How clients use it**
60+
**How Clients Use It**
6061
61-
Once the topic is created, client applications (producers & consumers) can publish and read messages from `my-topic` like any regular Kafka topic.
62+
Once `my-topic` is created by the Topic Operator, client applications (producers and consumers) can publish to and read from it as they would with any regular Kafka topic, provided they have the necessary permissions.
6263

63-
## How Applications Use KafkaUser CRs
64+
## How Applications Utilize `KafkaUser` CRs
6465

6566
### Creating a User for Authentication & Authorization
6667

67-
Client applications need a Kafka user to authenticate and communicate securely. A KafkaUser CR defines the user, authentication method (TLS/SCRAM-SHA), and permissions.
68+
Client applications require a Kafka user to authenticate and communicate securely. A `KafkaUser` CR defines the user, the authentication method (e.g., TLS or SCRAM-SHA), and the permissions the user should have.
6869

6970
```yaml
7071
apiVersion: kafka.strimzi.io/v1beta2
@@ -155,6 +156,18 @@ KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
155156
156157
```
157158

159+
160+
**Impact on Clients:**
161+
162+
Because the User Operator manages credentials and ACLs through Kafka's standard mechanisms, its availability is crucial for:
163+
164+
- Creating and managing new user identities within Kafka.
165+
- Ensuring that the correct authentication credentials are in place and accessible.
166+
- Defining and enforcing authorization rules for user access to topics.
167+
168+
When the central cluster (and thus the User Operator) is unavailable, these management tasks cannot be performed, which affects any client that depends on new or changed credentials or permissions. The authentication and authorization capabilities of the Kafka brokers remain in place, but management and provisioning through the Kubernetes control plane are disrupted: administrators cannot create, update, or delete Kafka users and topics, rotate credentials, or update ACLs.
169+
170+
158171
### Summary of How Applications Use KafkaTopic & KafkaUser CRs
159172

160173
| Action | Operator Responsible |
@@ -174,41 +187,36 @@ In stretch Kafka deployment, where
174187
✅ The Entity Operator (managing users & topics) runs in the central cluster.
175188

176189

177-
## What About Entity Operator Functions?
178-
The Entity Operator becomes unavailable when the central cluster goes down. However, this does not impact existing Kafka clients directly because
179-
180-
- Kafka clients do not interact with the Entity Operator at runtime.
181-
- User authentication still works as long as secrets (TLS/SCRAM) were distributed to all clusters.
182-
- Topics and ACLs remain intact but cannot be updated or created until the central cluster recovers.
183-
184-
## What Happens If No Cluster Has KafkaUser and KafkaTopic CRs?
190+
The failure of the central Kubernetes cluster will render the Entity Operator unavailable. This has the following implications for Kafka clients:
185191

186-
If the central cluster is the only one hosting KafkaUser and KafkaTopic CRs, then when it goes down:
192+
#### Authentication
187193

188-
1. User Authentication Risks
194+
- Kafka brokers in the surviving member clusters rely on the configured authentication mechanisms and the presence of valid credentials for client authentication.
195+
- If secrets containing authentication credentials (TLS certificates or SCRAM passwords) are not replicated across all clusters, new client deployments and credential updates will fail. However, existing clients with valid credentials will continue functioning until their credentials expire or require rotation.
196+
- Existing client connections that were authenticated before the central cluster failure might remain active for a period, but they will eventually be disconnected due to session timeouts or other factors, and they will fail to re-establish connections without valid authentication.
197+
- Crucially, the management of credentials (e.g., rotation) through the User Operator will be unavailable.
189198

190-
- Kafka brokers in surviving clusters rely on existing secrets for authentication.
191-
- If KafkaUser secrets were only stored in the central cluster and not replicated, brokers in other clusters will be unable to authenticate client requests.
192-
- New client connections will fail since brokers cannot verify credentials.
193-
- Existing client connections may remain active if they were authenticated before the central cluster failure, but they will eventually be disconnected when session timeouts occur.
199+
#### Authorization
194200

195-
2. Topic Management Limitations
201+
- The ACLs defined in KafkaUser CRs are configured on the Kafka brokers. These ACLs will generally remain in place.
202+
- However, any new authorization rules or modifications to existing ones defined in KafkaUser CRs cannot be applied because the User Operator is down.
203+
- TLS certificates used for authentication expire and rotate periodically. Without the User Operator, expired certificates cannot be renewed, leading to eventual authentication failures.
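
To make the ACL point concrete, the following sketch shows how permissions are typically declared in a `KafkaUser` CR. The names `my-user`, `my-cluster`, and `my-topic` are placeholders rather than values from this deployment, and older Strimzi releases use a single `operation` field instead of the `operations` list.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: my-user                       # placeholder user name
  labels:
    strimzi.io/cluster: my-cluster    # placeholder Kafka cluster name
spec:
  authentication:
    type: tls                         # certificate issued by the User Operator
  authorization:
    type: simple
    acls:
      - resource:
          type: topic
          name: my-topic              # placeholder topic name
          patternType: literal
        operations:                   # rules already applied stay on the brokers
          - Read
          - Describe
        host: "*"
```

Changes to the `acls` list are not applied to the brokers until the User Operator is running again.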
196204

197-
- Topics that were already created will continue to exist and function normally.
198-
- Clients can still produce and consume messages only if they are already authenticated before the central cluster failure.
199-
- No new topics can be created or updated since the KafkaTopic CRs and Entity Operator are unavailable.
205+
#### Topic Management
200206

201-
### Mitigation Strategies
207+
- Topics that were already created will continue to exist and function normally.
208+
- Clients can continue to produce and consume messages on existing topics if they remain authenticated and authorized.
209+
- Administrators cannot modify the configuration of existing topics or delete them through Kubernetes.
210+
- No new topics can be created through `KafkaTopic` CRs because the Topic Operator is unavailable.
202211

203-
To ensure Kafka clients remain functional even when the central cluster goes down, we should implement the following best practices
212+
### Why the Entity Operator's Absence Impacts Clients
204213

205-
✅ Replicate KafkaUser secrets across all clusters where Kafka brokers exist.
214+
As outlined in the 'Impact of Central Cluster Failure' section, the unavailability of the Entity Operator disrupts the declarative management of critical aspects like user authentication and topic lifecycle within your Kubernetes environment. This loss of control directly affects the ability of clients to authenticate, access new resources, and manage their connections effectively.
206215

207-
- This ensures authentication remains functional even if the central cluster is unavailable.
216+
### Mitigation Strategies to Enhance Client Functionality During Central Cluster Failure
208217

209-
✅ Ensure Kafka brokers cache authentication data where possible(This needs verification).
218+
To enhance the resilience of Kafka clients in the event of a central cluster failure, the following best practices are recommended:
210219

211-
- Some authentication mechanisms (like SCRAM) allow brokers to cache credentials temporarily.
212-
- This can help avoid immediate authentication failures if the central cluster is temporarily down.
220+
✅ Replicate `KafkaUser` Secrets: Ensure that the Kubernetes Secrets containing authentication credentials (TLS certificates or SCRAM passwords) are replicated across all Kubernetes clusters where Kafka brokers are running. This keeps the credentials available in surviving clusters, so clients there can continue to authenticate with them.
213221

214-
Alternatively we can Explore options like KafkaAccess Operator. This reduces dependency on a single cluster for authentication.
222+
✅ Explore Alternative Authentication and Authorization Solutions: Consider solutions like the Kafka Access Operator, which might offer more distributed control over authentication and authorization, reducing the dependency on a single central cluster for these critical functions.
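
As a sketch of the Secret replication strategy, a Secret generated by the User Operator for a TLS-authenticated user can be copied to each member cluster. The user name `my-user`, the namespace `kafka`, and the placeholder values are assumptions; the mechanism used to copy the Secret between clusters is not covered here.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-user            # assumed: matches the KafkaUser name
  namespace: kafka         # assumed namespace in the member cluster
type: Opaque
data:
  ca.crt: "<base64-encoded clients CA certificate>"
  user.crt: "<base64-encoded user certificate>"
  user.key: "<base64-encoded user private key>"
```

A copied Secret is a point-in-time snapshot: its certificate is not renewed while the User Operator is unavailable.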
