Commit ec0eca6

Authored by rrywhen, shelby-willis-lz, ernaguro, and gmackeig
Release 1.0.1 (#2)
* Initial commit
* Lanz-3565
* optional load balancer
* Add compute instance and cloudinit shell script
* updates to cloudinit.sh
* use compatible compute image for GPU shapes
* schema created
* networking updates
* schema update
* update instance with new marketplace image
* update instance with fault domain
* update default name
* update gpu image and hostname
* LANZ-3568_doco complete
* Corrected a typo in DEPLOYMENT-GUIDE.md
* Release date is To Be Determined
* update type
* fix typo
* LANZ-3568 doco for add-nsg-option
* Update Documentation
* Update Tagging
* Update tag namespace
* update release notes
* LANZ-3781: added DEPLOYMENT-GUIDE.md
* included bastion image info
* update release notes and examples deploy to click link
* update deploy link in deployment guide
* fix comments
* added known issue for Terraform Fault Domain error
* fix release notes

---------

Co-authored-by: shelby_willis <shelby.willis@oracle.com>
Co-authored-by: erna_guerrero <erna.guerrero@oracle.com>
Co-authored-by: Gregg MacKeigan <gregg.mackeigan@oracle.com>
1 parent 8982547 commit ec0eca6

12 files changed

+186
-53
lines changed

README.md

Lines changed: 8 additions & 8 deletions
@@ -1,6 +1,6 @@
 # OCI Landing Zones AI Workloads
 
-![Landing Zone logo](ai_transaction_monitoring_workload/images/landing_zone_300.png)
+![Landing Zone logo](images/landing_zone_300.png)
 
 Welcome to the [OCI Landing Zones (OLZ) Community](https://github.com/oci-landing-zones)! OCI Landing Zones simplify onboarding and running on OCI by providing design guidance, best practices, and pre-configured Terraform deployment templates for various architectures and use cases. These enable customers to easily provision a secure tenancy foundation in the cloud along with all required services, and reliably scale as workloads expand.
 
@@ -11,12 +11,12 @@ This repository contains Terraform modules for managing AI workload resources in
 ## CIS OCI Foundations Benchmark Modules Collection
 
 This repository is part of a broader collection of repositories containing modules that help customers align their OCI implementations with the CIS OCI Foundations Benchmark recommendations:
-- [Identity & Access Management](https://github.com/oracle-quickstart/terraform-oci-cis-landing-zone-iam)
-- [Networking](https://github.com/oracle-quickstart/terraform-oci-cis-landing-zone-networking)
-- [Governance](https://github.com/oracle-quickstart/terraform-oci-cis-landing-zone-governance)
-- [Security](https://github.com/oracle-quickstart/terraform-oci-cis-landing-zone-security)
-- [Observability & Monitoring](https://github.com/oracle-quickstart/terraform-oci-cis-landing-zone-observability)
-- [Secure Workloads](https://github.com/oracle-quickstart/terraform-oci-secure-workloads)
+- [Identity & Access Management](https://github.com/oci-landing-zones/terraform-oci-modules-iam)
+- [Networking](https://github.com/oci-landing-zones/terraform-oci-modules-networking)
+- [Governance](https://github.com/oci-landing-zones/terraform-oci-modules-governance)
+- [Security](https://github.com/oci-landing-zones/terraform-oci-modules-security)
+- [Observability & Monitoring](https://github.com/oci-landing-zones/terraform-oci-modules-observability)
+- [Secure Workloads](https://github.com/oci-landing-zones/terraform-oci-modules-workloads)
 
 The modules in this collection are designed for flexibility, are straightforward to use, and enforce CIS OCI Foundations Benchmark recommendations when possible.
 
@@ -37,4 +37,4 @@ Please consult the [security guide](SECURITY.md) for our responsible security vu
 ## License
 
 Copyright (c) 2025 Oracle and/or its affiliates.
-Released under the Universal Permissive License v1.0 as shown at <https://oss.oracle.com/licenses/upl/>.
+Released under the Universal Permissive License v1.0 as shown at <https://oss.oracle.com/licenses/upl/>.
Lines changed: 95 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,95 @@
1+
# AI Transaction Monitoring Workload Deployment Guide
2+
3+
This template shows how to deploy an AI Transaction Monitoring Workload using an OCI Core Landing Zone configuration.
4+
5+
In this template, a single GPU-based compute instance is deployed, optionally with a dedicated application load balancer and backend set.
6+
The following prerequisite resources are assumed to exist prior to deploying this workload:
7+
8+
- a workload compartment for holding the compute instance and block volume
9+
- an application compartment with a private application subnet with a VCN, including a DRG and NAT gateway for outbound access to the internet.
10+
- optionally, a public web subnet for holding a load balancer used in a mesh network environment.
11+
12+
Because this module deploys a Graphical Processing Unit (GPU) based compute instance, the first task is to enable that capacity in your tenancy. See the *Deploy* section of the companion [README](README.md) file for instructions on increasing GPU Service Limits before attempting to deploy.
13+
14+
## Architectural Overlay
15+
16+
Before you can deploy this module, there must be existing tenancy resources (compartments, VCNs, subnets, etc.) that provide a foundation for the workload or application. The information here assumes you will use [Core Landing Zone](https://github.com/oci-landing-zones/terraform-oci-core-landingzone) for the base deployment. Alternatively, you may choose to use [Operating Entities Landing Zone](https://github.com/oci-landing-zones/oci-landing-zone-operating-entities) depending on your organization and workload size or complexity.
17+
18+
The diagram below shows the AI Transaction Monitoring Workload deployed resources as indicated with a yellow background. All other resources are deployed as a prerequisite by Core Landing Zone with the following user supplied options:
19+
20+
- Networking topology: Hub and Spoke model (Hub/DMZ VCN with a DRG), which includes subnets and NSGs for web, indoor firewall and jump host uses. This option also provides a NAT gateway, which is needed for externally available code installed on the compute instance. Those destination URLs are:
21+
- https://download.docker.com/linux/centos/docker-ce.repo
22+
- https://github.com/NVIDIA/NVFlare.git
23+
- https://data.pyg.org/whl/torch-2.4.0+cpu.html
24+
- https://objectstorage.us-ashburn-1.oraclecloud.com/n/ocisateam/b/EllipticPlusPlusDataset/o/TransactionsDataset.zip
25+
- https://github.com/adinadiana1234/transactionmonitoring_notebooks.git
26+
- OCI Firewall in the Hub/DMZ VCN, required for proper routing within the landing zone.
27+
- Bastion jump host, with Oracle Linux 8 STIG image, in the Hub/DMZ VCN, required for SSH access to the compute instance on a private subnet.
28+
- A single Three Tier VCN (spoke) attached to the DRG, which includes an application subnet and its associated NSG.
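The destination URLs listed above imply a set of hosts that the NAT gateway route and any firewall policy must allow the compute instance to reach. The following is a minimal sketch, assuming you want to derive that host list programmatically; the `egress_hosts` helper is hypothetical and not part of this module, and it does not discover hosts reached through HTTP redirects.

```python
from urllib.parse import urlsplit

# Destination URLs copied from the list in the deployment guide.
DESTINATION_URLS = [
    "https://download.docker.com/linux/centos/docker-ce.repo",
    "https://github.com/NVIDIA/NVFlare.git",
    "https://data.pyg.org/whl/torch-2.4.0+cpu.html",
    "https://objectstorage.us-ashburn-1.oraclecloud.com/n/ocisateam/b/EllipticPlusPlusDataset/o/TransactionsDataset.zip",
    "https://github.com/adinadiana1234/transactionmonitoring_notebooks.git",
]


def egress_hosts(urls):
    """Return sorted, de-duplicated (host, port) pairs for the given URLs."""
    pairs = set()
    for url in urls:
        parts = urlsplit(url)
        # Fall back to the scheme's well-known port when none is given.
        port = parts.port or (443 if parts.scheme == "https" else 80)
        pairs.add((parts.hostname, port))
    return sorted(pairs)


if __name__ == "__main__":
    for host, port in egress_hosts(DESTINATION_URLS):
        print(f"{host}:{port}")
```

The two GitHub URLs collapse into a single entry, so the resulting list is shorter than the URL list.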
+
+A separate application compartment is deployed automatically by Core Landing Zone. OCID values from these prerequisite resources are used as input to this module for placement of the compute instance, the associated block storage and, if included, the application load balancer.
+
+![AI-TMS-arch](../images/AI-TMS-arch.png)
+
+The decision on whether or not to use a public load balancer is based on the complexity of your network. Normally, a load balancer is only used for local scaling and to allow internet traffic. With a Hub and Spoke topology, the load balancer facilitates routing to the application in the spoke VCN.
+
+## Default Values
+
+This template has the following parameters set:
+
+| Variable Name | Description | Value | Options |
+|---|---|---|---|
+| workload\_name | Name of the workload. Default name is TMS. | TMS | Any string |
+| workload\_compartment\_ocid | OCID of the existing Workload Compartment. | null | User input |
+| app\_subnet\_compartment\_ocid | OCID of the existing Network Compartment. | null | User input |
+| app\_subnet\_ocid | OCID of the existing App Subnet. | null | User input |
+| app\_nsg\_ocid | OCID of the existing App NSG. | null | User input |
+| add\_lb | Whether to deploy a load balancer. If set to true, a load balancer will be deployed and the compute instance will be attached to its backend set. If set to false, the load balancer and backend set will not be created. | false | true, false |
+| lb\_subnet\_compartment\_ocid | OCID of the Load Balancer compartment. | null | User input |
+| lb\_subnet\_ocid | OCID of the existing LB Subnet. | null | User input |
+| compute\_shape | GPU-based shape of the compute instance. | VM.GPU.A10.1 | VM.GPU.A10.1, VM.GPU.A10.2, BM.GPU.A10.4, VM.GPU2.1, BM.GPU2.2, VM.GPU3.1, VM.GPU3.2, VM.GPU3.4, BM.GPU3.8 |
+| compute\_boot\_volume\_size | Boot volume size (in GBs) of the compute instance. | 250 | >= 250 GB |
+| compute\_ssh\_public\_key | Public SSH Key used to access the compute instance. | null | User input |
+| compute\_availability\_domain | Availability domain where the compute instance will be deployed. Default is AD-1. | 1 | tenancy AD count (1 or 3) |
+| compute\_fault\_domain | Fault domain where the compute instance will be deployed. Default is FD-1. | 1 | tenancy FD count (2) |
+| block\_volume\_size | Block volume size (in GBs) to be attached to the compute instance. | 200 | >= 200 GB |
+
+For a detailed description of all variables that can be used, see the [SPEC.md](SPEC.md) documentation.
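The defaults table can be turned into a pre-flight check that runs before any stack is created. The following is a rough sketch, not the module's own validation: `check_inputs` is a hypothetical helper, the set of always-required inputs is an assumption taken from the review list later in the guide, and the size limits are assumed to be inclusive minimums because the defaults sit exactly on them.

```python
# Defaults and options transcribed from the table in the deployment guide.
DEFAULTS = {
    "workload_name": "TMS",
    "workload_compartment_ocid": None,
    "app_subnet_compartment_ocid": None,
    "app_subnet_ocid": None,
    "app_nsg_ocid": None,
    "add_lb": False,
    "lb_subnet_compartment_ocid": None,
    "lb_subnet_ocid": None,
    "compute_shape": "VM.GPU.A10.1",
    "compute_boot_volume_size": 250,
    "compute_ssh_public_key": None,
    "compute_availability_domain": 1,
    "compute_fault_domain": 1,
    "block_volume_size": 200,
}

GPU_SHAPES = {
    "VM.GPU.A10.1", "VM.GPU.A10.2", "BM.GPU.A10.4",
    "VM.GPU2.1", "BM.GPU2.2",
    "VM.GPU3.1", "VM.GPU3.2", "VM.GPU3.4", "BM.GPU3.8",
}

# Assumption: inputs the user must always supply (no usable default).
ALWAYS_REQUIRED = (
    "workload_compartment_ocid",
    "app_subnet_compartment_ocid",
    "app_subnet_ocid",
    "compute_ssh_public_key",
)
# Assumption: only needed when a load balancer is requested.
LB_REQUIRED = ("lb_subnet_compartment_ocid", "lb_subnet_ocid")


def check_inputs(overrides):
    """Return a list of human-readable problems; an empty list means OK."""
    values = {**DEFAULTS, **overrides}
    required = ALWAYS_REQUIRED + (LB_REQUIRED if values["add_lb"] else ())
    problems = [f"{name} must be set" for name in required if not values[name]]
    if values["compute_shape"] not in GPU_SHAPES:
        problems.append(f"unsupported compute_shape: {values['compute_shape']}")
    # Assumption: the table's size limits are inclusive minimums.
    for name in ("compute_boot_volume_size", "block_volume_size"):
        if values[name] < DEFAULTS[name]:
            problems.append(f"{name} must be at least {DEFAULTS[name]} GB")
    return problems
```

Calling `check_inputs({})` reports the inputs that still need values, which makes the sketch usable as a checklist before filling in the stack variables.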
+
+This template can be deployed using OCI Resource Manager Service (RMS) or Terraform CLI:
+
+## OCI RMS Deployment
+
+By clicking the button below, you are redirected to an OCI RMS Stack with variables pre-assigned for deployment.
+
+[![Deploy_To_OCI](../images/DeployToOCI.svg)](https://cloud.oracle.com/resourcemanager/stacks/create?zipUrl=https://github.com/oci-landing-zones/terraform-oci-workloads-ai/archive/refs/heads/main.zip&zipUrlVariables={"workload_name":"TMS","workload_compartment_ocid":"","app_subnet_compartment_ocid":"","app_subnet_ocid":"","app_nsg_ocid":"","add_lb":false,"lb_subnet_compartment_ocid":"","lb_subnet_ocid":"","compute_shape":"VM.GPU.A10.1","compute_boot_volume_size":"250","compute_ssh_public_key":"","compute_availability_domain":"1","compute_fault_domain":"1","block_volume_size":"200"})
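The button link above is a stack-creation URL carrying a `zipUrl` and a JSON object in `zipUrlVariables`. The following is a minimal sketch of generating such a link with different pre-assigned values. The helpers `deploy_link` and `variables_from_link` are hypothetical, and percent-encoding the JSON payload is an assumption about what the console accepts (the link in the guide embeds the JSON unencoded).

```python
import json
from urllib.parse import parse_qs, quote, urlsplit

# Both URLs are taken from the deploy button in the guide.
RMS_CREATE_URL = "https://cloud.oracle.com/resourcemanager/stacks/create"
ZIP_URL = (
    "https://github.com/oci-landing-zones/terraform-oci-workloads-ai"
    "/archive/refs/heads/main.zip"
)


def deploy_link(variables):
    """Build a stack-creation link with the given variables pre-assigned."""
    payload = json.dumps(variables, separators=(",", ":"))
    # Assumption: the JSON payload is percent-encoded for safe transport.
    return f"{RMS_CREATE_URL}?zipUrl={ZIP_URL}&zipUrlVariables={quote(payload, safe='')}"


def variables_from_link(link):
    """Decode the zipUrlVariables payload back into a dictionary."""
    query = parse_qs(urlsplit(link).query)
    return json.loads(query["zipUrlVariables"][0])


if __name__ == "__main__":
    print(deploy_link({"workload_name": "TMS", "add_lb": False,
                       "compute_shape": "VM.GPU.A10.1"}))
```

Decoding the link again with `variables_from_link` is a quick way to confirm that the values survive the round trip unchanged.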
+
+You are required to review/adjust the following variable settings:
+
+- Provide existing OCIDs for the *workload\_compartment\_ocid*, *app\_subnet\_compartment\_ocid*, and *app\_subnet\_ocid* fields and, if you opt for a load balancer, the *lb\_subnet\_compartment\_ocid* and *lb\_subnet\_ocid* fields.
+- Check the *add\_lb* option if you want a load balancer.
+- Make sure to set the *compute\_ssh\_public\_key* variable to a public SSH key for the compute instance.
+- Be sure to adjust *compute\_availability\_domain* and *compute\_fault\_domain* to match your GPU shape availability.
+
+**NOTE:** Terraform Apply will fail if the GPU capacity is not in the indicated Fault Domain. The Fault Domain value is required, but there is no way to verify that value beforehand in the OCI console. You may have to retry the Apply after incrementing the Fault Domain value.
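The retry described in the note can be automated when deploying from the Terraform CLI. The following is a sketch under stated assumptions: the root configuration is assumed to expose `compute_fault_domain` as an input variable that can be set with `-var`, the number of fault domains is supplied by the caller, and any failed apply is treated as a capacity failure, which may not be true in practice. The function names are hypothetical.

```python
import subprocess


def terraform_apply(fault_domain):
    """Run `terraform apply` for one fault domain; return True on success."""
    result = subprocess.run(
        ["terraform", "apply", "-auto-approve",
         "-var", f"compute_fault_domain={fault_domain}"],
        check=False,
    )
    return result.returncode == 0


def apply_with_fault_domain_retry(start, fault_domain_count, apply=terraform_apply):
    """Try each fault domain once, beginning at `start` and wrapping around.

    Returns the fault domain that succeeded, or None if all of them failed.
    """
    for offset in range(fault_domain_count):
        # Fault domains are numbered from 1, so wrap within 1..count.
        candidate = (start - 1 + offset) % fault_domain_count + 1
        if apply(candidate):
            return candidate
    return None
```

Passing the apply step in as a parameter keeps the retry order testable without invoking Terraform. In RMS the equivalent is to edit the stack variable and rerun the Apply job manually.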
+
+With the stack created, perform a Plan, followed by an Apply using the RMS UI.
+
+## Terraform CLI Deployment
+
+1. Rename file *main.tf.template* to *main.tf*.
+2. Provide/review the variable assignments in *main.tf*.
+3. In this folder, execute the typical Terraform workflow:
+
+```
+terraform init
+```
+
+```
+terraform plan
+```
+
+```
+terraform apply
+```
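The three CLI steps can also be scripted. The following is a minimal sketch, assuming Terraform is on the `PATH` and the script is pointed at the folder containing *main.tf.template*; `prepare_main_tf` and `run_workflow` are hypothetical helpers, and step 2 (reviewing the variable assignments) still has to be done by hand before the workflow is run.

```python
import subprocess
from pathlib import Path

# The workflow from step 3, in order. `terraform apply` prompts for approval.
WORKFLOW = (
    ["terraform", "init"],
    ["terraform", "plan"],
    ["terraform", "apply"],
)


def prepare_main_tf(folder):
    """Step 1: rename main.tf.template to main.tf unless main.tf exists."""
    template = Path(folder) / "main.tf.template"
    target = Path(folder) / "main.tf"
    if template.exists() and not target.exists():
        template.rename(target)
    return target


def run_workflow(folder, run=subprocess.run):
    """Run each command in `folder`; return the first failing command or None."""
    for command in WORKFLOW:
        completed = run(command, cwd=folder, check=False)
        if completed.returncode != 0:
            return command
    return None
```

Stopping at the first non-zero exit code avoids running an apply after a failed plan.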
+