-
Notifications
You must be signed in to change notification settings - Fork 596
[Feature] add dealer manager to reuse the connection #3471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
Thanks for your contribution! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces a dealer connection manager to enable connection reuse, reducing file handle consumption in dealer-type connections. The implementation includes connection pooling with load balancing, environment variable configuration for connection limits, and integration into the OpenAI serving endpoints.
Key changes:
- Added
DealerConnectionManager
class with connection pooling and load balancing - Updated environment configuration to support configurable connection limits
- Integrated connection manager into chat and completion serving endpoints
Reviewed Changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
Show a summary per file
File | Description |
---|---|
test/entrypoints/openai/test_dealer_connection_manager.py |
Comprehensive test suite for the new connection manager |
fastdeploy/envs.py |
Made FD_SUPPORT_MAX_CONNECTIONS configurable via environment variable |
fastdeploy/entrypoints/openai/utils.py |
Core connection manager implementation with pooling and load balancing |
fastdeploy/entrypoints/openai/serving_completion.py |
Integration of connection manager into completion endpoints |
fastdeploy/entrypoints/openai/serving_chat.py |
Integration of connection manager into chat endpoints |
fastdeploy/entrypoints/openai/api_server.py |
Added connection manager cleanup in application lifespan |
fastdeploy/entrypoints/engine_client.py |
Initialization of connection manager and semaphore |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.
async with self.lock: | ||
if request_id in self.request_map: | ||
await self.request_map[request_id].put(response) | ||
if response[-1]["finished"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Incorrect access to request_id. Based on the test at line 79 in the test file, the response structure should be accessed as response["request_id"]
not response[-1]["request_id"]
.
if response[-1]["finished"]: | |
request_id = response["request_id"] | |
async with self.lock: | |
if request_id in self.request_map: | |
await self.request_map[request_id].put(response) | |
if response["finished"]: |
Copilot uses AI. Check for mistakes.
dealer = await aiozmq.create_zmq_stream(zmq.DEALER, connect=f"ipc:///dev/shm/router_{self.pid}.ipc") | ||
await self._ensure_connection_manager() | ||
dealer, response_queue = await self.engine_client.connection_manager.get_connection(request_id) | ||
dealer.write([b"", request_id.encode("utf-8")]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This write operation appears to be misplaced. The write should happen after the loop that creates multiple request IDs (lines 284-290), not before it.
dealer.write([b"", request_id.encode("utf-8")]) |
Copilot uses AI. Check for mistakes.
@@ -90,6 +92,11 @@ def __init__( | |||
suffix=pid, | |||
create=False, | |||
) | |||
self.semaphore = StatefulSemaphore((envs.FD_SUPPORT_MAX_CONNECTIONS + workers - 1) // workers) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The workers
parameter is not defined in the init method. This will cause a NameError when initializing the EngineClient.
Copilot uses AI. Check for mistakes.
Description:
Added Dealer Connection Manager to support connection reuse for dealers, reducing the number of open files.
2.Environment Variables
FD_SUPPORT_MAX_CONNECTIONS
: Controls the actual handle connections of the service (default: 1024).FD_DEALER_CONNECTIONS
: Sets the dealer-specific initial connection pool size (default: 50).