Commit ec2a106

author
vijay-stephen
committed
Merge pull request #1 from sourcefuse/feature/aws-terraform-msk-module
Feature/aws terraform msk module
1 parent 5babb62 commit ec2a106

File tree

1 file changed

+257
-0
lines changed
  • docs/arc-iac-docs/modules/terraform-aws-arc-msk/docs/module-usage-guide

@@ -0,0 +1,257 @@
# Terraform AWS ARC MSK Module Usage Guide

## Introduction

### Purpose of the Document

This document provides guidelines and instructions for users looking to implement the Terraform AWS ARC MSK (Managed Streaming for Apache Kafka) module.

### Module Overview

The Terraform AWS ARC MSK module provides a secure and modular foundation for deploying Amazon MSK (Managed Streaming for Apache Kafka) clusters on AWS. It supports both standard and serverless MSK clusters with comprehensive configuration options for encryption, authentication, monitoring, and logging.
In addition, this module supports configuring MSK Connect connectors to integrate data sources like Amazon Aurora PostgreSQL and destinations like Amazon S3, enabling real-time data streaming pipelines using custom Kafka Connect plugins.

### Prerequisites

Before using this module, ensure you have the following:

- AWS credentials configured.
- Terraform installed (version > 1.4, < 2.0.0).
- A working knowledge of AWS VPC, Apache Kafka, MSK, and Terraform concepts.

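The Terraform version requirement above can be enforced in configuration instead of relying on each user's local install. A minimal sketch, assuming the version range from the prerequisites list; the region is a placeholder, and the AWS provider is left unpinned because this guide does not state a provider version:

```hcl
terraform {
  # Version range copied from the prerequisites list in this guide.
  required_version = "> 1.4, < 2.0.0"

  required_providers {
    aws = {
      source = "hashicorp/aws"
      # Assumption: no provider version constraint is set here because the
      # guide does not specify one. Pin it to match your environment.
    }
  }
}

provider "aws" {
  # Assumption: placeholder region. Credentials are resolved from the
  # environment or a shared profile, as the prerequisites expect.
  region = "us-east-1"
}
```

With this block in place, `terraform init` fails early when the installed Terraform version falls outside the stated range.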
## Getting Started

### Module Source

To use the module in your Terraform configuration, include the following module block:

```hcl
module "msk" {
  source  = "sourcefuse/arc-msk/aws"
  version = "0.0.1"

  cluster_type           = "provisioned"
  cluster_name           = "example-msk-cluster"
  kafka_version          = "3.6.0"
  number_of_broker_nodes = 2
  broker_instance_type   = "kafka.m5.large"
  client_subnets         = data.aws_subnets.public.ids
  security_groups        = [module.security_group.id]
  broker_storage = {
    volume_size = 150
  }

  client_authentication = {
    sasl_scram_enabled           = true # When set to true, this will create secrets in AWS Secrets Manager.
    allow_unauthenticated_access = false
  }

  # Enable CloudWatch logging
  logging_info = {
    cloudwatch_logs_enabled = true
  }

  # Enable monitoring
  monitoring_info = {
    jmx_exporter_enabled  = true
    node_exporter_enabled = true
  }

  tags = module.tags.tags
}
```
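
The module block above references `data.aws_subnets.public`, `module.security_group`, and `module.tags` without defining them. A minimal sketch of the subnet lookup, assuming the VPC and its subnets can be found by `Name` tags; the tag values are hypothetical placeholders, and the security group and tags modules are whatever your configuration already provides:

```hcl
# Assumption: the VPC is tagged Name = "example-vpc". Replace with your own tag value.
data "aws_vpc" "this" {
  filter {
    name   = "tag:Name"
    values = ["example-vpc"]
  }
}

# Assumption: the target subnets share an "example-public-*" Name tag prefix.
data "aws_subnets" "public" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.this.id]
  }

  filter {
    name   = "tag:Name"
    values = ["example-public-*"]
  }
}
```

The `ids` attribute of `data.aws_subnets.public` is the list passed to `client_subnets`.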
### MSK Connect Data Sink: Aurora PostgreSQL to Amazon S3

This Terraform example provisions MSK Connect components that enable data ingestion from an Amazon Aurora PostgreSQL database into Amazon S3, using Kafka Connect and Confluent plugins.

Prerequisites:

Before running the Terraform example in [examples/msk-connect](https://github.com/sourcefuse/terraform-aws-arc-msk/blob/feature/fix-docs/examples/msk-connect/main.tf), ensure the following components are pre-configured in your AWS environment:

Aurora PostgreSQL Setup

- An Aurora PostgreSQL cluster is already created.
- A database named myapp is created within the cluster.
- A sample table named users is present under schema public with sample data inserted.

VPC Configuration

- A VPC Endpoint for S3 (Gateway type) is created to allow private communication between MSK Connect and S3.

Plugins Downloaded and Uploaded to S3

Download the required Kafka Connect plugins and upload them to the appropriate S3 bucket:

JDBC Source Plugin

- Plugin: [confluentinc-kafka-connect-jdbc-10.6.6.zip](https://www.confluent.io/hub/confluentinc/kafka-connect-jdbc)

S3 Sink Plugin

- Plugin: [confluentinc-kafka-connect-s3-10.6.6.zip](https://www.confluent.io/hub/confluentinc/kafka-connect-s3)

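The VPC endpoint and plugin upload prerequisites can themselves be managed with Terraform. A minimal sketch, assuming hypothetical `aws_region`, `connect_vpc_id`, and `connect_route_table_ids` inputs, a plugin archive already downloaded into a local `plugins/` folder, and the `module.s3` bucket module that the connector example refers to:

```hcl
# Assumption: these three variables are hypothetical inputs describing the
# VPC that hosts MSK Connect. They are not inputs of the MSK module.
variable "aws_region" {
  type = string
}

variable "connect_vpc_id" {
  type = string
}

variable "connect_route_table_ids" {
  type = list(string)
}

# Gateway endpoint so MSK Connect can reach S3 over private routing.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.connect_vpc_id
  service_name      = "com.amazonaws.${var.aws_region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.connect_route_table_ids
}

# Upload the JDBC plugin archive. The local path is an assumption; the key
# matches the plugin_s3_file_key used by the source connector example.
resource "aws_s3_object" "jdbc_plugin" {
  bucket = module.s3.bucket_id
  key    = "confluentinc-kafka-connect-jdbc-10.6.6.zip"
  source = "${path.module}/plugins/confluentinc-kafka-connect-jdbc-10.6.6.zip"
}
```

A second `aws_s3_object` resource with the S3 sink archive name would cover the other plugin in the same way.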
#### Deploying the Example

Once the above prerequisites are met, you can deploy the Terraform example to configure the data pipeline using:

```hcl
# Source Connector

module "msk_connect" {
  source  = "sourcefuse/arc-msk/aws"
  version = "0.0.1"

  # Enables MSK Connect components and plugins for source
  create_msk_components       = true
  create_custom_plugin        = true
  create_worker_configuration = false
  create_connector            = true

  # Plugin and connector configurations
  plugin_name          = "jdbc-pg-plugin"
  plugin_content_type  = "ZIP"
  plugin_description   = "Custom plugin for MSK Connect"
  plugin_s3_bucket_arn = module.s3.bucket_arn
  plugin_s3_file_key   = "confluentinc-kafka-connect-jdbc-10.6.6.zip"

  connector_name       = "msk-pg-connector"
  kafkaconnect_version = "2.7.1"

  connector_configuration = {
    "connector.class" : "io.confluent.connect.jdbc.JdbcSourceConnector",
    # ...
    "connection.url" : "jdbc:postgresql://${data.aws_ssm_parameter.db_endpoint.value}:5432/myapp"
  }

  # ...
}

# Sink Connector

module "msk_s3_sink" {
  source  = "sourcefuse/arc-msk/aws"
  version = "0.0.1"

  # Enables MSK Connect components and plugins for destination
  create_msk_components       = true
  create_custom_plugin        = true
  create_worker_configuration = false
  create_connector            = true

  plugin_name          = "s3-sink-plugin"
  plugin_content_type  = "ZIP"
  plugin_description   = "Custom plugin for MSK Connect"
  plugin_s3_bucket_arn = module.s3.bucket_arn
  plugin_s3_file_key   = "confluentinc-kafka-connect-s3-10.6.6.zip"

  connector_name       = "msk-s3-sink-connector"
  kafkaconnect_version = "2.7.1"

  connector_configuration = {
    "connector.class" : "io.confluent.connect.s3.S3SinkConnector",
    # ...
    "s3.bucket.name" : module.s3.bucket_id
  }

  # ...
}
```
These modules create the MSK Connect plugins and connectors, streaming data from PostgreSQL (the public.users table) through the cdc_aurora_users topic to S3.

Refer to the [Terraform Registry](https://registry.terraform.io/modules/sourcefuse/arc-msk/aws/latest) for the latest version.

### Integration with Existing Terraform Configurations

To integrate the module with your existing Terraform monorepo configuration, follow the steps below:

- Create a new folder in `terraform/` named `msk`.
- Create the required files, using the examples as a starting point.
- Configure your backend:
  - Create the environment backend configuration file: `config.<environment>.hcl`
    - `region`: Region where the backend resides
    - `key`: `<working_directory>/terraform.tfstate`
    - `bucket`: Name of the bucket where the Terraform state will reside
    - `dynamodb_table`: Lock table that prevents concurrent runs from modifying the same state
    - `encrypt`: Enable server-side encryption of the state file
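
The backend keys listed above map directly onto a partial configuration file for the S3 backend. A minimal sketch, assuming a hypothetical `dev` environment; every value is a placeholder to be replaced with your own:

```hcl
# config.dev.hcl (hypothetical file name following the config.<environment>.hcl pattern)
region         = "us-east-1"                       # assumption: placeholder region
key            = "terraform/msk/terraform.tfstate" # <working_directory>/terraform.tfstate
bucket         = "example-terraform-state-dev"     # assumption: placeholder bucket name
dynamodb_table = "example-terraform-lock-dev"      # assumption: placeholder lock table name
encrypt        = true
```

For this file to take effect, the working directory needs an empty `backend "s3" {}` block inside its `terraform` block, and the file is supplied at initialization with `terraform init -backend-config=config.dev.hcl`.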

### Required AWS Permissions

Ensure that the AWS credentials used to execute Terraform have the necessary permissions to create, list and modify:

- Amazon MSK clusters and configurations
- IAM roles and policies
- KMS keys (if encryption is enabled)
- CloudWatch logs and metrics
- Security groups and VPC resources
- Secrets Manager resources (for SASL/SCRAM authentication)

## Module Configuration

### Input Variables

For a list of input variables, see the README [Inputs](https://github.com/sourcefuse/terraform-aws-arc-msk#inputs) section.

### Output Values

For a list of outputs, see the README [Outputs](https://github.com/sourcefuse/terraform-aws-arc-msk#outputs) section.

## Module Usage

### Basic Usage

For basic usage, see the [example](https://github.com/sourcefuse/terraform-aws-arc-msk/tree/main/examples/simple) folder.

This example will create:

- An MSK cluster with customizable broker configuration
- Client authentication with SASL/SCRAM
- CloudWatch logging
- Prometheus monitoring with JMX and Node exporters

### Tips and Recommendations

- The module focuses on provisioning secure and scalable MSK clusters. The convention-based approach enables downstream services to easily connect to the Kafka cluster. Adjust the configuration parameters as needed for your specific use case.
- Consider using the storage autoscaling feature for production workloads to handle growing data volumes.
- For high availability, deploy the MSK cluster across multiple availability zones.
- Use appropriate authentication methods (SASL/SCRAM, IAM, TLS) based on your security requirements.
- Enable monitoring and logging for better observability and troubleshooting.
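
For the multi-AZ recommendation, note that MSK expects the broker count to be a multiple of the number of client subnets, so the count can be derived rather than hard-coded. A minimal sketch, assuming a hypothetical `private_subnet_ids` input with one subnet per Availability Zone; `number_of_broker_nodes` and `client_subnets` are the module inputs shown in the Module Source example:

```hcl
# Assumption: hypothetical input holding one private subnet per Availability Zone.
variable "private_subnet_ids" {
  type = list(string)

  validation {
    condition     = length(var.private_subnet_ids) >= 2
    error_message = "Provide at least two subnets in different Availability Zones."
  }
}

locals {
  # Assumption: one broker per zone. Raise this to scale out evenly.
  brokers_per_az         = 1
  number_of_broker_nodes = length(var.private_subnet_ids) * local.brokers_per_az
}

# In the module block, these values would replace the literals:
#   number_of_broker_nodes = local.number_of_broker_nodes
#   client_subnets         = var.private_subnet_ids
```

Deriving the count this way keeps brokers evenly spread when a subnet is added or removed.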

## Troubleshooting

### Reporting Issues

If you encounter a bug or issue, please report it on the [GitHub repository](https://github.com/sourcefuse/terraform-aws-arc-msk/issues).

## Security Considerations

### AWS VPC

Understand the security considerations related to MSK on AWS when using this module:

- MSK clusters should be deployed in private subnets with appropriate security groups.
- Use encryption in transit and at rest for sensitive data.
- Implement proper authentication mechanisms (SASL/SCRAM, IAM, TLS).
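
The network isolation point above can be expressed as a security group that admits only known clients on the broker port that matches the chosen authentication method. A minimal sketch, assuming hypothetical `msk_vpc_id` and `client_security_group_id` inputs and SASL/SCRAM clients; MSK listens on port 9096 for SASL/SCRAM, 9094 for TLS, and 9098 for IAM authentication:

```hcl
# Assumption: both variables are hypothetical inputs, not inputs of the MSK module.
variable "msk_vpc_id" {
  type = string
}

variable "client_security_group_id" {
  type = string
}

resource "aws_security_group" "msk" {
  name        = "example-msk-brokers" # hypothetical name
  description = "Restricts Kafka broker access to known clients"
  vpc_id      = var.msk_vpc_id
}

# Allow only the application security group to reach the SASL/SCRAM listener.
resource "aws_vpc_security_group_ingress_rule" "sasl_scram" {
  security_group_id            = aws_security_group.msk.id
  referenced_security_group_id = var.client_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 9096
  to_port                      = 9096
  description                  = "Kafka SASL/SCRAM from application clients"
}
```

The resulting `aws_security_group.msk.id` could then be passed in the `security_groups` list of the module block.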

### Best Practices for AWS MSK

Follow best practices to ensure secure MSK configurations:

- [AWS MSK Security Best Practices](https://docs.aws.amazon.com/msk/latest/developerguide/security-best-practices.html)
- Enable encryption in transit and at rest
- Use IAM authentication or SASL/SCRAM for client authentication
- Implement proper network isolation using security groups
- Regularly update Kafka versions to benefit from security patches

## Contributing and Community Support

### Contributing Guidelines

Contribute to the module by following the guidelines outlined in the [CONTRIBUTING.md](https://github.com/sourcefuse/terraform-aws-arc-msk/blob/main/CONTRIBUTING.md) file.

### Reporting Bugs and Issues

If you find a bug or issue, report it on the [GitHub repository](https://github.com/sourcefuse/terraform-aws-arc-msk/issues).

## License

### License Information

This module is licensed under the Apache 2.0 license. Refer to the [LICENSE](https://github.com/sourcefuse/terraform-aws-arc-msk/blob/main/LICENSE) file for more details.

### Open Source Contribution

Contribute to open source by using and enhancing this module. Your contributions are welcome!