[Integration][GCP] gRPC Fork Deadlock Prevention #2027
Conversation
LGTM
integrations/gcp/main.py
default_quota = int(
    ocean.integration_config.get("search_all_resources_per_minute_quota", 400)
)
effective_quota = max(int(default_quota * 0.8), 1)

PROJECT_V3_GET_REQUESTS_RATE_LIMITER = PersistentAsyncLimiter.get_limiter(
    max_rate=effective_quota
)
PROJECT_V3_GET_REQUESTS_BOUNDED_SEMAPHORE = asyncio.BoundedSemaphore(
    effective_quota
)
Nice work! However, I think changing this to always use the static quota declared in the env might not align well with the GCP integration. In general we use the dynamic quota for GCP and only fall back to this config.
I would prefer that we either make the integration use a separate gRPC instance if possible, or defer setting the global persistent limiter until the first live event comes in and it is actually needed, though I still think we might run into the same problems again.
Let's try to have a separate gRPC instance make the call that happens on the main process for the quota used in live events; that way we don't have to fork across processes, which should be safe. I don't think there is an issue across subprocesses, so that might solve the problem.
Let me know your thoughts
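The deferral idea can be sketched as lazy construction: build the limiter the first time a live event needs it rather than at import time, so no event-loop or limiter state exists yet when the process forks. This is only an illustration; `get_rate_limiter` is a hypothetical helper, and the plain `asyncio.Semaphore` stands in for the integration's `PersistentAsyncLimiter`.

```python
import asyncio
from typing import Optional

_rate_limiter: Optional[asyncio.Semaphore] = None


def get_rate_limiter(max_rate: int = 320) -> asyncio.Semaphore:
    """Create the limiter on first use instead of at import time.

    Deferring construction means the module holds no limiter state at
    fork time, so child processes never inherit locked internals.
    """
    global _rate_limiter
    if _rate_limiter is None:
        _rate_limiter = asyncio.Semaphore(max_rate)
    return _rate_limiter
```

The first caller pays the construction cost; every later caller gets the same instance.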
This is sound advice, @oiadebayo.
My proposed resolution is to make the initial call to the quota API using the Ocean http_sync_client in the main process, and let the rest of the resync and live events use the gRPC forks just like they used to.
The issue doesn't occur on resyncs or on live-event ingestion.
Sounds good?
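The proposed flow can be sketched as follows. `fetch_quota_over_http` is a hypothetical stand-in for the real quota API call made with the http_sync_client in the main process, before any gRPC channel (and its fork-unsafe state) exists; the 0.8 safety factor mirrors the snippet above.

```python
def fetch_quota_over_http() -> int:
    """Placeholder for the synchronous quota API call made pre-fork.

    In the real integration this would go through Ocean's HTTP client;
    here it just returns the assumed config default of 400.
    """
    return 400


def resolve_effective_quota(safety_factor: float = 0.8) -> int:
    """Resolve the quota in the main process, then apply a safety margin."""
    quota = fetch_quota_over_http()
    # Never drop below 1, even with an aggressive safety factor.
    return max(int(quota * safety_factor), 1)
```

Because the quota is resolved before any fork, child processes only read the computed integer and never touch gRPC during startup.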
PR Type: Bug fix

Description
- Fix gRPC fork deadlock in GCP integration multi-process mode
- Replace dynamic quota discovery with conservative defaults
- Defer gRPC initialization to child processes
- Maintain backward compatibility with configurable settings
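The "defer gRPC initialization to child processes" point boils down to a per-process lazy client: detect that the PID changed after a fork and rebuild the client rather than reuse inherited channel state. A minimal sketch under that assumption; `get_client` is illustrative, and the dict stands in for a real gRPC channel.

```python
import os
from typing import Optional

_client: Optional[dict] = None
_client_pid: Optional[int] = None


def get_client() -> dict:
    """Return a per-process client, rebuilding it after a fork.

    A gRPC channel inherited across fork() can hold locked internal
    state and deadlock the child. Checking os.getpid() ensures each
    process constructs its own client instead of reusing the parent's.
    """
    global _client, _client_pid
    if _client is None or _client_pid != os.getpid():
        _client = {"owner_pid": os.getpid()}  # stand-in for a real channel
        _client_pid = os.getpid()
    return _client
```

Within one process the client is created once and cached; a forked child sees a stale PID and builds its own.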
File Walkthrough
main.py (integrations/gcp/main.py): Defer gRPC initialization with conservative defaults
CHANGELOG.md (integrations/gcp/CHANGELOG.md): Add changelog entry for gRPC fork fix
pyproject.toml (integrations/gcp/pyproject.toml): Version bump to 0.1.176