
Commit 44645ef

feat: Addressed Review comments
Addressed Review comments Signed-off-by: Aswin A <aswin6303@gmail.com>
1 parent e5cf3d0 commit 44645ef

1 file changed

+62
-54
lines changed

docs/entityoperator.md

Lines changed: 62 additions & 54 deletions
@@ -1,45 +1,46 @@
1-
# Impact of Entity Operator Availability in a Stretch Kafka Cluster
1+
# Impact of Entity Operator Availability in a Stretched Kafka Cluster
22

3-
The Entity Operator in Strimzi is responsible for managing Kafka users and topics. It automates the creation, configuration, and security settings of these entities, ensuring smooth integration with Kafka clusters deployed via Strimzi. This document explains how its availability affects topic and user management when deployed in a multi-cluster Kafka setup.
3+
This document outlines the role of the Strimzi Entity Operator in managing Kafka users and topics and explains how its availability affects operations within a multi-cluster Kafka deployment where the Entity Operator resides in a central Kubernetes cluster.
44

5-
## Key Components of Entity Operator
5+
## Key Components of the Entity Operator
66

7-
The Entity Operator consists of two main sub-components:
7+
The Entity Operator in Strimzi comprises two primary sub-components:
88

99
### Topic Operator
1010

11-
- Watches for KafkaTopic CRs in Kubernetes.
12-
- Automatically creates, updates, and deletes topics in Kafka based on KafkaTopic CR definitions.
13-
- Keeps Kubernetes and Kafka topic configurations in sync.
14-
- Ensures desired state consistency between Kubernetes and Kafka.
11+
- Monitors Kubernetes for `KafkaTopic` CRs.
12+
- Automatically creates, updates, and deletes Kafka topics within the Kafka cluster based on the definitions in the `KafkaTopic` CRs.
13+
- Ensures synchronization between the desired topic configurations in Kubernetes and the actual topic configurations in Kafka.
1514

1615
### User Operator
1716

18-
- Watches for KafkaUser CRs in Kubernetes.
19-
- Manages security credentials (TLS certificates, SASL credentials).
20-
- Ensures user permissions and authentication are correctly configured.
17+
- Monitors Kubernetes for `KafkaUser` CRs.
18+
- Manages security credentials (e.g., TLS certificates, SASL credentials) and configures user permissions and authentication within the Kafka cluster.
19+
- Simplifies and automates the management of user access and security settings.
2120

22-
## Why is the Entity Operator Useful?
21+
## Why is the Entity Operator Essential?
2322

24-
- Eliminates the need for manual topic and user management.
25-
- Ensures Kafka users have appropriate authentication and authorization settings.
26-
- Enables declarative management using Kubernetes CRs.
27-
- Keeps configurations between Kubernetes and Kafka in sync.
23+
The Entity Operator provides several key benefits:
2824

29-
## How Client Applications Use KafkaTopic and KafkaUser CRs in Strimzi
25+
- Eliminates the need for manual topic and user management through Kafka's administrative tools.
26+
- Ensures Kafka users are configured with appropriate authentication and authorization settings as defined in Kubernetes.
27+
- Enables the management of Kafka resources using Kubernetes-native Custom Resources, promoting a declarative approach.
28+
- Maintains consistency between the desired state in Kubernetes and the actual state of topics and users in Kafka.
3029

31-
The client applications interact with Kafka topics and users in Strimzi using Kubernetes native resources
30+
## How Client Applications Utilize KafkaTopic and KafkaUser CRs in Strimzi
3231

33-
- KafkaTopic CRs define and manage Kafka topics.
34-
- KafkaUser CRs define users and security credentials for authentication & authorization.
32+
Client applications interact with Kafka topics and users in Strimzi using Kubernetes Custom Resources:
3533

36-
## How Applications Use KafkaTopic CRs
34+
- **`KafkaTopic` CRs:** Define and manage Kafka topics, specifying parameters like partitions, replication factor, and configuration.
35+
- **`KafkaUser` CRs:** Define users and their security configurations for authentication and authorization.
36+
37+
## How Applications Utilize `KafkaTopic` CRs
3738

3839
### Creating a Topic
3940

40-
Developers define a topic declaratively using a KafkaTopic CR. The Topic Operator ensures this topic is created in Kafka.
41+
Developers define Kafka topics declaratively using `KafkaTopic` CRs. The Topic Operator ensures the creation of these topics within the Kafka cluster.
4142

42-
**Example KafkaTopic CR**
43+
**Example `KafkaTopic` CR:**
4344

4445
```yaml
4546
apiVersion: kafka.strimzi.io/v1beta2
@@ -52,19 +53,19 @@ spec:
5253
partitions: 3
5354
replicas: 2
5455
config:
55-
retention.ms: 86400000 # Data retention for 1 day
56-
segment.bytes: 1073741824 # 1GB segment size
56+
retention.ms: 86400000 # Data retention for 1 day
57+
segment.bytes: 1073741824 # 1GB segment size
5758
```
5859
59-
**How clients use it**
60+
**How Clients Use It**
6061
61-
Once the topic is created, client applications (producers & consumers) can publish and read messages from `my-topic` like any regular Kafka topic.
62+
Once `my-topic` is created by the Topic Operator, client applications (producers and consumers) can publish and read messages from it as they would with any regular Kafka topic, provided they have the necessary permissions.
6263

63-
## How Applications Use KafkaUser CRs
64+
## How Applications Utilize `KafkaUser` CRs
6465

6566
### Creating a User for Authentication & Authorization
6667

67-
Client applications need a Kafka user to authenticate and communicate securely. A KafkaUser CR defines the user, authentication method (TLS/SCRAM-SHA), and permissions.
68+
Client applications require a Kafka user to authenticate and communicate securely. A `KafkaUser` CR defines the user, the authentication method (e.g., TLS or SCRAM-SHA), and the permissions granted to that user.
6869

6970
```yaml
7071
apiVersion: kafka.strimzi.io/v1beta2
@@ -155,6 +156,18 @@ KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
155156
156157
```
157158

159+
160+
**Impact on Clients:**
161+
162+
Because the User Operator manages credentials and ACLs through Kafka's standard mechanisms, its availability is crucial for:
163+
164+
- **Creating and managing new user identities within Kafka.**
165+
- **Ensuring that the correct authentication credentials are in place and accessible.**
166+
- **Defining and enforcing authorization rules for user access to topics.**
167+
168+
When the central cluster (and thus the User Operator) is unavailable, these management tasks cannot be performed: new users, credential changes, and ACL updates are not provisioned until it recovers, which affects any client that depends on them. The underlying Kafka authentication and authorization capabilities still exist within the brokers; only the management and provisioning through the Kubernetes control plane are disrupted.
169+
170+
158171
### Summary of How Applications Use KafkaTopic & KafkaUser CRs
159172

160173
| Action | Operator Responsible |
@@ -174,41 +187,36 @@ In stretch Kafka deployment, where
174187
✅ The Entity Operator (managing users & topics) runs in the central cluster.
175188

176189

177-
## What About Entity Operator Functions?
178-
The Entity Operator becomes unavailable when the central cluster goes down. However, this does not impact existing Kafka clients directly because
179-
180-
- Kafka clients do not interact with the Entity Operator at runtime.
181-
- User authentication still works as long as secrets (TLS/SCRAM) were distributed to all clusters.
182-
- Topics and ACLs remain intact but cannot be updated or created until the central cluster recovers.
190+
The failure of the central Kubernetes cluster will render the Entity Operator unavailable. This has the following implications for Kafka clients:
183191

184-
## What Happens If No Cluster Has KafkaUser and KafkaTopic CRs?
192+
#### Authentication
185193

186-
If the central cluster is the only one hosting KafkaUser and KafkaTopic CRs, then when it goes down:
194+
- Kafka brokers in the surviving member clusters rely on the configured authentication mechanisms and the presence of valid credentials for client authentication.
195+
- If the secrets containing authentication credentials (TLS certificates or SCRAM passwords) are not replicated to the member clusters, brokers in those clusters will be unable to authenticate new client connections.
196+
- Existing client connections that were authenticated before the central cluster failure might remain active for a period, but they will eventually be disconnected due to session timeouts or other factors, and they will fail to re-establish connections without valid authentication.
197+
- Crucially, the management of credentials (e.g., rotation) through the User Operator will be unavailable.
187198

188-
1. User Authentication Risks
199+
#### Authorization
189200

190-
- Kafka brokers in surviving clusters rely on existing secrets for authentication.
191-
- If KafkaUser secrets were only stored in the central cluster and not replicated, brokers in other clusters will be unable to authenticate client requests.
192-
- New client connections will fail since brokers cannot verify credentials.
193-
- Existing client connections may remain active if they were authenticated before the central cluster failure, but they will eventually be disconnected when session timeouts occur.
201+
- The ACLs defined in KafkaUser CRs are configured on the Kafka brokers. These ACLs will generally remain in place.
202+
- However, any new authorization rules or modifications to existing ones defined in KafkaUser CRs cannot be applied because the User Operator is down.
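
As an illustration of the kind of change that stays pending, the following is a minimal sketch of a `KafkaUser` CR that grants read access to `my-topic`. The user name `my-user` and the cluster label `my-cluster` are hypothetical, and the ACL field names should be checked against the Strimzi version in use.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: my-user                      # hypothetical user name
  labels:
    strimzi.io/cluster: my-cluster   # hypothetical Kafka cluster name
spec:
  authentication:
    type: tls
  authorization:
    type: simple
    acls:
      # New rule: while the User Operator is down, this ACL is not
      # written to Kafka, so the brokers keep enforcing the previous set.
      - resource:
          type: topic
          name: my-topic
          patternType: literal
        operations:
          - Read
          - Describe
        host: "*"
```

ACLs that were reconciled before the outage remain in Kafka. A rule like the one above takes effect only after the User Operator is running again.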
194203

195-
2. Topic Management Limitations
204+
#### Topic Management
196205

197-
- Topics that were already created will continue to exist and function normally.
198-
- Clients can still produce and consume messages only if they are already authenticated before the central cluster failure.
199-
- No new topics can be created or updated since the KafkaTopic CRs and Entity Operator are unavailable.
206+
- Topics that were already created will continue to exist and function normally.
207+
- Clients can continue to produce and consume messages on existing topics if they remain authenticated and authorized.
208+
- No new topics can be created or updated through the Kubernetes-managed KafkaTopic CRs since the Topic Operator is unavailable.
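
For example, the following sketch raises the partition count of the earlier `my-topic` example from 3 to 6. The cluster label `my-cluster` is an assumption. If only the Topic Operator is down, the edited CR is stored but not reconciled. If the central cluster's API server is also unreachable, the change cannot be submitted at all.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster   # assumed Kafka cluster name
spec:
  partitions: 6        # raised from 3; Kafka keeps 3 until the Topic Operator reconciles
  replicas: 2
  config:
    retention.ms: 86400000       # Data retention for 1 day
    segment.bytes: 1073741824    # 1GB segment size
```

In either case, the topic in Kafka keeps its existing partition count and configuration until the Topic Operator recovers.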
200209

201-
### Mitigation Strategies
210+
### Why the Entity Operator's Absence Impacts Clients
202211

203-
To ensure Kafka clients remain functional even when the central cluster goes down, we should implement the following best practices
212+
While Kafka brokers have their own internal mechanisms for authentication and authorization, the Entity Operator is the component that automates the configuration and management of these features within a Kubernetes environment using KafkaUser and KafkaTopic CRs. When the Entity Operator is unavailable, the declarative management of these critical aspects is lost.
204213

205-
✅ Replicate KafkaUser secrets across all clusters where Kafka brokers exist.
214+
### Mitigation Strategies to Ensure Client Functionality During Central Cluster Failure
206215

207-
- This ensures authentication remains functional even if the central cluster is unavailable.
216+
To enhance the resilience of Kafka clients in the event of a central cluster failure, the following best practices are recommended:
208217

209-
✅ Ensure Kafka brokers cache authentication data where possible(This needs verification).
218+
✅ Replicate KafkaUser Secrets: Ensure that the Kubernetes Secrets containing authentication credentials (TLS certificates or SCRAM passwords) are replicated across all Kubernetes clusters where Kafka brokers are running. This allows brokers in surviving clusters to authenticate clients using the known credentials.
210219

211-
- Some authentication mechanisms (like SCRAM) allow brokers to cache credentials temporarily.
212-
- This can help avoid immediate authentication failures if the central cluster is temporarily down.
220+
✅ Consider Caching Mechanisms (Verification Needed): While not a direct Strimzi feature, exploring if and how Kafka brokers might cache authentication data (e.g., for SCRAM) could potentially mitigate immediate authentication failures during a short central cluster outage. This requires further investigation and verification of Kafka's internal behavior.
213221

214-
Alternatively we can Explore options like KafkaAccess Operator. This reduces dependency on a single cluster for authentication.
222+
✅ Explore Alternative Authentication and Authorization Solutions: Consider solutions like the Kafka Access Operator, which might offer more distributed control over authentication and authorization, reducing the dependency on a single central cluster for these critical functions.
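
As a rough sketch of the secret replication strategy, the manifest below shows what a copy of a TLS user's Secret might look like in a member cluster. The name `my-user` and the namespace `kafka` are hypothetical, and the data values are placeholders. The keys shown are those the User Operator generates for a TLS user. How the copy is created and kept in sync (manual export, GitOps, or a secret-replication tool) is left open here.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: my-user        # hypothetical; Strimzi names the Secret after the KafkaUser
  namespace: kafka     # hypothetical namespace in the member cluster
type: Opaque
data:
  # Placeholders only: copy the base64-encoded values from the
  # Secret generated by the User Operator in the central cluster.
  ca.crt: REPLACE_WITH_BASE64_CA_CERT
  user.crt: REPLACE_WITH_BASE64_USER_CERT
  user.key: REPLACE_WITH_BASE64_USER_KEY
```

A copy made this way is static. If the User Operator renews the certificate in the central cluster, the copies in the member clusters must be refreshed as well.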
