
Conversation

Member

@mk-armah mk-armah commented Jul 22, 2025

User description

Description

What - Added Support For AWS S3 Exporter

Why -

How -

Type of change

Please leave one option from the following and delete the rest:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • New Integration (non-breaking change which adds a new integration)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Non-breaking change (fix of existing functionality that will not change current behavior)
  • Documentation (added/updated documentation)

All tests should be run against the Port production environment (using a testing org).

Core testing checklist

  • Integration able to create all default resources from scratch
  • Resync finishes successfully
  • Resync able to create entities
  • Resync able to update entities
  • Resync able to detect and delete entities
  • Scheduled resync able to abort existing resync and start a new one
  • Tested with at least 2 integrations from scratch
  • Tested with Kafka and Polling event listeners
  • Tested deletion of entities that don't pass the selector

Integration testing checklist

  • Integration able to create all default resources from scratch
  • Resync able to create entities
  • Resync able to update entities
  • Resync able to detect and delete entities
  • Resync finishes successfully
  • If new resource kind is added or updated in the integration, add example raw data, mapping and expected result to the examples folder in the integration directory.
  • If resource kind is updated, run the integration with the example data and check if the expected result is achieved
  • If new resource kind is added or updated, validate that live-events for that resource are working as expected
  • Docs PR link here

Preflight checklist

  • Handled rate limiting
  • Handled pagination
  • Implemented the code in async
  • Support Multi account

Screenshots

Include screenshots from your environment showing how the resources of the integration will look.

API Documentation

Provide links to the API documentation used for this integration.


PR Type

Enhancement


Description

  • Add S3 resource exporter with standalone architecture

  • Replace CloudControl sync with declarative exporters

  • Implement S3 bucket inspection with detailed attributes

  • Refactor core utilities and interfaces


Diagram Walkthrough

flowchart LR
  A["CloudControl Sync"] --> B["Declarative Exporters"]
  B --> C["S3 Bucket Exporter"]
  C --> D["S3 Inspector"]
  D --> E["S3 Actions"]
  F["Core Utils"] --> G["Helper Utils"]
  H["Interfaces"] --> I["Action Interface"]
  H --> J["Exporter Interface"]

File Walkthrough

Relevant files

@mk-armah mk-armah changed the title [Integration[AWS] Add Support For S3 Resource As Standalone Exporter [Integration][AWS] Add Support For S3 Resource As Standalone Exporter Jul 22, 2025
@qodo-merge-pro qodo-merge-pro bot changed the title [Integration][AWS] Add Support For S3 Resource As Standalone Exporter [Integration[AWS] Add Support For S3 Resource As Standalone Exporter Jul 22, 2025
Contributor

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Error Handling

The _run_action method catches all exceptions and returns an empty dict, which could mask important errors and make debugging difficult. Consider more specific exception handling or at least logging the exception details.

async def _run_action(self, action: IAction, bucket_name: str) -> Dict[str, Any]:
    try:
        data = await action.execute(bucket_name)
    except Exception as e:
        logger.warning(f"{action.__class__.__name__} failed: {e}")
        return {}
    return data
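For reference, a minimal sketch of the narrower handling the bot is hinting at, assuming botocore's ClientError and the same IAction/logger setup as the snippet above (the imports and exact signature here are assumptions, not the PR's code):

from typing import Any, Dict

from botocore.exceptions import ClientError
from loguru import logger  # assumed logging setup; the integration's logger may differ


async def _run_action(self, action: "IAction", bucket_name: str) -> Dict[str, Any]:
    try:
        return await action.execute(bucket_name)
    except ClientError as e:
        # Expected AWS-side failures: log the error code so the cause is visible in the logs.
        code = e.response.get("Error", {}).get("Code", "Unknown")
        logger.warning(f"{action.__class__.__name__} failed for '{bucket_name}': {code} - {e}")
        return {}
    # Anything else propagates instead of being silently swallowed.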
Logic Issue

The function returns after the first successful region in the loop, but continues to the next region on access denied. This could lead to incomplete resource collection if some regions succeed and others fail with different errors.

    return
except Exception as e:
    if is_access_denied_exception(e):
        logger.warning(
            f"Access denied in region '{region}' for kind '{kind}', skipping."
        )
        continue
    else:
        raise e
Resource Leak

The __aexit__ method calls __aexit__ on the client but doesn't properly handle the context manager cleanup. The client context manager should be properly exited using the stored context manager reference.

async def __aexit__(self, exc_type: Any, exc: Any, tb: Any) -> None:
    if self._base_client:
        await self._base_client.__aexit__(exc_type, exc, tb)
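For context, a minimal sketch of the pairing this observation (and the suggestion below) points at, assuming an aioboto3 session; the class name and attributes are illustrative, not the actual aws-v3 code:

from typing import Any

import aioboto3


class AioBaseClientProxy:
    """Illustrative proxy that keeps hold of the client context manager it opened."""

    def __init__(self, session: aioboto3.Session, service: str, region: str) -> None:
        self._session = session
        self._service = service
        self._region = region

    async def __aenter__(self) -> "AioBaseClientProxy":
        # Store the context manager itself, not only the client it yields,
        # so it can be closed symmetrically in __aexit__.
        self._client_cm = self._session.client(self._service, region_name=self._region)
        self._base_client = await self._client_cm.__aenter__()
        return self

    async def __aexit__(self, exc_type: Any, exc: Any, tb: Any) -> None:
        if hasattr(self, "_client_cm"):
            await self._client_cm.__aexit__(exc_type, exc, tb)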

Contributor

qodo-merge-pro bot commented Jul 22, 2025

PR Code Suggestions ✨

Explore these optional code suggestions:

Category    Suggestion    Impact
Possible issue
Fix context manager cleanup

The __aexit__ method should call the context manager's __aexit__ method instead
of the client's. The client context manager is stored in self._client_cm and
should be properly closed.

integrations/aws-v3/aws/core/client/proxy.py [31-33]

 async def __aexit__(self, exc_type: Any, exc: Any, tb: Any) -> None:
-    if self._base_client:
-        await self._base_client.__aexit__(exc_type, exc, tb)
+    if hasattr(self, '_client_cm'):
+        await self._client_cm.__aexit__(exc_type, exc, tb)
Suggestion importance[1-10]: 9


Why: The suggestion correctly identifies a resource leak bug where __aexit__ is called on the client object instead of the client context manager, preventing proper resource cleanup.

High
General
Add warning for inaccessible regions

The function returns after the first successful region, but if all regions fail
with access denied, no warning is logged. Add a final check to warn when no
regions are accessible.

integrations/aws-v3/main.py [16-35]

 async def _handle_global_resource_resync(
     kind: str,
     regions: List[str],
     options_factory: Callable[[str], Any],
     exporter: IResourceExporter,
 ) -> ASYNC_GENERATOR_RESYNC_TYPE:
     for region in regions:
         try:
             options = options_factory(region)
             async for batch in exporter.get_paginated_resources(options):
                 yield batch
             return
         except Exception as e:
             if is_access_denied_exception(e):
                 logger.warning(
                     f"Access denied in region '{region}' for kind '{kind}', skipping."
                 )
                 continue
             else:
                 raise e
+    logger.warning(f"No accessible regions found for kind '{kind}'")
Suggestion importance[1-10]: 6


Why: The suggestion improves observability by adding a warning log if all regions fail due to access issues, which is a useful addition for debugging permissions.

Low

@mk-armah mk-armah changed the title [Integration[AWS] Add Support For S3 Resource As Standalone Exporter [Integration][AWS] Add Support For S3 Resource As Standalone Exporter Jul 22, 2025
Contributor

@shalev007 shalev007 left a comment


Looks beautiful! I really like the OOP approach, but I'm missing the tests, since we already implemented DI for all the classes.

Comment on lines 44 to 46
async def inspect_bucket(bucket_name: str) -> dict[str, Any]:
    s3_bucket: S3Bucket = await inspector.inspect(bucket_name)
    return s3_bucket.dict()
Contributor


Why is this an internal function?

Comment on lines 5 to 15
class S3BucketBuilder:
    def __init__(self, name: str) -> None:
        self._bucket = S3Bucket(Identifier=name, Properties=S3BucketProperties())

    def with_data(self, data: Dict[str, Any]) -> Self:
        for k, v in data.items():
            setattr(self._bucket.Properties, k, v)
        return self

    def build(self) -> S3Bucket:
        return self._bucket
Contributor


Feels kinda general; might be a good idea to abstract it so we won't have to repeat it on every resource (something like the sketch below).
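As a rough illustration only, one possible shared base; ResourceBuilder and the Identifier/Properties model shape are assumptions taken from the snippet above, not code from this PR:

from typing import Any, Dict, Generic, TypeVar

from pydantic import BaseModel

TModel = TypeVar("TModel", bound=BaseModel)


class ResourceBuilder(Generic[TModel]):
    """Generic builder for models that expose a Properties sub-model (assumed shape)."""

    def __init__(self, model: TModel) -> None:
        self._model = model

    def with_data(self, data: Dict[str, Any]) -> "ResourceBuilder[TModel]":
        # Copy raw API attributes onto the model's Properties object.
        for key, value in data.items():
            setattr(self._model.Properties, key, value)
        return self

    def build(self) -> TModel:
        return self._model


# Usage for S3 would then reduce to something like:
# bucket = ResourceBuilder(S3Bucket(Identifier=name, Properties=S3BucketProperties())).with_data(data).build()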

from typing import Optional, Dict, Any, List
from pydantic import BaseModel, Field

# https://docs.aws.amazon.com/AWSCloudFormation/latest/TemplateReference/aws-resource-s3-bucket.html#aws-resource-s3-bucket-syntax
Contributor


Are we strictly following the CloudControl properties?

Member Author


We are trying to follow CloudFormation's recommendation for how packages implementing API composition (e.g. Cloud Control) should structure their models.
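To make that concrete, a sketch of what CloudFormation-shaped models could look like for S3. The property names come from the AWS::S3::Bucket resource reference linked above; the exact subset modelled in this PR is an assumption:

from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field


class S3BucketProperties(BaseModel):
    # Names mirror AWS::S3::Bucket properties from the CloudFormation resource reference.
    BucketName: Optional[str] = None
    VersioningConfiguration: Optional[Dict[str, Any]] = None
    PublicAccessBlockConfiguration: Optional[Dict[str, Any]] = None
    Tags: List[Dict[str, str]] = Field(default_factory=list)


class S3Bucket(BaseModel):
    Type: str = "AWS::S3::Bucket"
    Identifier: str
    Properties: S3BucketProperties = Field(default_factory=S3BucketProperties)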

Comment on lines +6 to +29
def is_access_denied_exception(e: Exception) -> bool:
    access_denied_error_codes = [
        "AccessDenied",
        "AccessDeniedException",
        "UnauthorizedOperation",
    ]
    response = getattr(e, "response", None)
    if isinstance(response, dict):
        error_code = response.get("Error", {}).get("Code")
        return error_code in access_denied_error_codes
    return False


def is_resource_not_found_exception(e: Exception) -> bool:
    resource_not_found_error_codes = [
        "ResourceNotFoundException",
        "ResourceNotFound",
        "ResourceNotFoundFault",
    ]
    response = getattr(e, "response", None)
    if isinstance(response, dict):
        error_code = response.get("Error", {}).get("Code")
        return error_code in resource_not_found_error_codes
    return False
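As a quick illustration of why the getattr(e, "response", None) check works: botocore's ClientError carries the raw error payload on .response. (This usage example is not part of the PR.)

from botocore.exceptions import ClientError

err = ClientError({"Error": {"Code": "AccessDenied", "Message": "denied"}}, "ListBuckets")
assert is_access_denied_exception(err)
assert not is_resource_not_found_exception(err)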
Contributor


Like

@github-actions github-actions bot added size/XXL and removed size/XL labels Aug 7, 2025
@github-actions github-actions bot added size/XL and removed size/XXL labels Aug 8, 2025
@mk-armah mk-armah changed the title [Integration][AWS] Add Support For S3 Resource As Standalone Exporter [Integration][AWS-V2] Add Support For S3 Resource As Standalone Exporter Aug 8, 2025
Comment on lines 3 to 10


class ExporterOptions(BaseModel):
    region: str = Field(..., description="The AWS region to export resources from")
    include: List[str] = Field(
        default_factory=list,
        description="The resources to include in the export",
    )
Contributor


Shouldn't this be a global thing?

Member Author


Right!

Comment on lines 22 to 23
defaults: List[Type[IAction]]
optional: List[Type[IAction]]
Contributor


Why do we separate them if we merge them when used?

Comment on lines 11 to 12
class S3BucketInspector:
    """A Facade for inspecting S3 buckets."""
Contributor


Looks like it should be a global thing / inherit an abstract or something
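Roughly what that could look like, as a sketch only; the later refactor does introduce a generic ResourceInspector (see the summary further down), but not necessarily this exact shape:

import asyncio
from typing import Any, Dict, List, Protocol


class Action(Protocol):
    async def execute(self, identifier: str) -> Dict[str, Any]: ...


class ResourceInspector:
    """Base inspector: fan out to a set of actions and merge their results."""

    def __init__(self, actions: List[Action]) -> None:
        self._actions = actions

    async def inspect(self, identifier: str) -> Dict[str, Any]:
        results = await asyncio.gather(*(action.execute(identifier) for action in self._actions))
        merged: Dict[str, Any] = {}
        for result in results:
            merged.update(result)
        return merged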

Comment on lines 41 to 43
except Exception as e:
    logger.warning(f"{action.__class__.__name__} failed: {e}")
    return {}
Contributor


💡 Maybe it'll be worthwhile to save the errors for later as well; that'll give us more control over what to do with them later (showing them to clients in the logs, reporting back to Port, remediations, etc.).
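A sketch of that idea with hypothetical names (InspectionResult and run_actions are not in the PR): keep the failures alongside the data instead of dropping them.

from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class InspectionResult:
    data: Dict[str, Any] = field(default_factory=dict)
    errors: List[str] = field(default_factory=list)


async def run_actions(actions: List["IAction"], bucket_name: str) -> InspectionResult:
    result = InspectionResult()
    for action in actions:
        try:
            result.data.update(await action.execute(bucket_name))
        except Exception as e:
            # Keep the failure; downstream code decides whether to log, report to Port, or retry.
            result.errors.append(f"{action.__class__.__name__}: {e}")
    return result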

Comment on lines 31 to 35
if is_access_denied_exception(e):
    logger.warning(
        f"Access denied in region '{region}' for kind '{kind}', skipping."
    )
    continue
Contributor


Maybe I'm just confused by all the code, but I don't think errors will end up here, since they're being caught internally in the inspector.

Member Author


You are right

description = "AWS"
authors = ["Shariff Mohammed <mohammed.s@getport.io>"]
authors = ["Shariff Mohammed <mohammed.s@getport.io>", "Michael Armah <mikeyarmah@gmail.com>"]
Contributor


Nice

@github-actions github-actions bot added size/XXL and removed size/XL labels Aug 15, 2025
- Replace `IAction` with `Action` interface
- Update from S3-specific classes to generic `ResourceInspector` and models
- Fix imports and class names in all test files
- Delete redundant test files (`test_bucket_builder.py`, `test_bucket_inspector.py`)
- Restore comprehensive model testing in `test_options_and_models.py`

All tests now passing with new architecture.