
LLM parameters added to json #107


Merged
merged 1 commit on May 17, 2025
18 changes: 17 additions & 1 deletion src/main.py
@@ -462,7 +462,23 @@ async def respond_with_llm_message(update):
                json={
                    "prompt": prompt,
                    "n_predict": 1024,
-                   "temperature": 0.7,
+                   "temperature": 0.8,
+                   "top_k": 40,
+                   "top_p": 0.95,
+                   "min_p": 0.05,
+                   "dynatemp_range": 0,
+                   "dynatemp_exponent": 1,
+                   "typical_p": 1,
+                   "xtc_probability": 0,
+                   "xtc_threshold": 0.1,
+                   "repeat_last_n": 64,
+                   "repeat_penalty": 1,
+                   "presence_penalty": 0,
+                   "frequency_penalty": 0,
+                   "dry_multiplier": 0,
+                   "dry_base": 1.75,
+                   "dry_allowed_length": 2,
+                   "dry_penalty_last_n": -1,
Comment on lines +465 to +481
Contributor

🛠️ Refactor suggestion

Consider adding documentation and making LLM parameters configurable.

The addition of these LLM parameters provides more fine-grained control over text generation, which is good. However, these values are hardcoded with no explanation of their purpose or why these specific values were chosen. This could make maintenance challenging for future developers.

Consider the following improvements:

  1. Add comments explaining what each parameter does and why these values were selected
  2. Make critical parameters configurable through environment variables (similar to how you handle LLM_API_ADDR)
async def respond_with_llm_message(update):
    """Handle LLM responses when bot is mentioned."""
    message_text = update.message.text
    # Remove bot mention and any punctuation after it
    prompt = re.sub(r'ботяра[^\w\s]*', '', message_text.lower()).strip()

+   # Load LLM parameters from environment variables or use defaults
+   temperature = float(os.getenv("LLM_TEMPERATURE", "0.8"))
+   top_k = int(os.getenv("LLM_TOP_K", "40"))
+   top_p = float(os.getenv("LLM_TOP_P", "0.95"))

    try:
        async with aiohttp.ClientSession() as session:
            async with session.post(
                f"{LLM_API_ADDR}/completion",
                json={
                    "prompt": prompt,
                    "n_predict": 1024,
-                   "temperature": 0.8,
-                   "top_k": 40,
-                   "top_p": 0.95,
+                   "temperature": temperature,  # Controls randomness in text generation
+                   "top_k": top_k,  # Limits token selection to the k most likely tokens
+                   "top_p": top_p,  # Nucleus sampling threshold
                    "min_p": 0.05,
                    "dynatemp_range": 0,
                    "dynatemp_exponent": 1,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-                   "temperature": 0.8,
-                   "top_k": 40,
-                   "top_p": 0.95,
-                   "min_p": 0.05,
-                   "dynatemp_range": 0,
-                   "dynatemp_exponent": 1,
-                   "typical_p": 1,
-                   "xtc_probability": 0,
-                   "xtc_threshold": 0.1,
-                   "repeat_last_n": 64,
-                   "repeat_penalty": 1,
-                   "presence_penalty": 0,
-                   "frequency_penalty": 0,
-                   "dry_multiplier": 0,
-                   "dry_base": 1.75,
-                   "dry_allowed_length": 2,
-                   "dry_penalty_last_n": -1,
+async def respond_with_llm_message(update):
+    """Handle LLM responses when bot is mentioned."""
+    message_text = update.message.text
+    # Remove bot mention and any punctuation after it
+    prompt = re.sub(r'ботяра[^\w\s]*', '', message_text.lower()).strip()
+
+    # Load LLM parameters from environment variables or use defaults
+    temperature = float(os.getenv("LLM_TEMPERATURE", "0.8"))
+    top_k = int(os.getenv("LLM_TOP_K", "40"))
+    top_p = float(os.getenv("LLM_TOP_P", "0.95"))
+
+    try:
+        async with aiohttp.ClientSession() as session:
+            async with session.post(
+                f"{LLM_API_ADDR}/completion",
+                json={
+                    "prompt": prompt,
+                    "n_predict": 1024,
+                    "temperature": temperature,  # Controls randomness in text generation
+                    "top_k": top_k,  # Limits token selection to the k most likely tokens
+                    "top_p": top_p,  # Nucleus sampling threshold
+                    "min_p": 0.05,
+                    "dynatemp_range": 0,
+                    "dynatemp_exponent": 1,
+                    "typical_p": 1,
+                    "xtc_probability": 0,
+                    "xtc_threshold": 0.1,
+                    "repeat_last_n": 64,
+                    "repeat_penalty": 1,
+                    "presence_penalty": 0,
+                    "frequency_penalty": 0,
+                    "dry_multiplier": 0,
+                    "dry_base": 1.75,
+                    "dry_allowed_length": 2,
+                    "dry_penalty_last_n": -1,
+                },
+            ) as resp:
+                # …
🤖 Prompt for AI Agents
In src/main.py around lines 465 to 481, the LLM parameters are hardcoded without
any comments explaining their purpose or rationale for chosen values, which
hinders maintainability. Add inline comments for each parameter describing its
function and why the specific value was selected. Additionally, refactor the
code to load critical parameters from environment variables with sensible
defaults, similar to the existing LLM_API_ADDR handling, to make them
configurable without code changes.

"stop": ["</s>", "User:", "Assistant:"],
},
) as response:
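
As a side note (not part of the merged diff): a minimal sketch of how the env-var idea from the suggestion above could be extended to the rest of the sampling parameters. The helper function and the LLM_* variable names other than LLM_TEMPERATURE, LLM_TOP_K, and LLM_TOP_P are assumptions for illustration; the defaults mirror the values hardcoded in this PR.

import os

# Sketch only (not in this PR): collect the sampling parameters in one place so
# the request body in respond_with_llm_message stays readable. The LLM_* env var
# names are hypothetical; the defaults mirror the values hardcoded in the diff.
def llm_sampling_params() -> dict:
    return {
        "temperature": float(os.getenv("LLM_TEMPERATURE", "0.8")),      # higher values = more random output
        "top_k": int(os.getenv("LLM_TOP_K", "40")),                     # keep only the k most likely tokens
        "top_p": float(os.getenv("LLM_TOP_P", "0.95")),                 # nucleus sampling threshold
        "min_p": float(os.getenv("LLM_MIN_P", "0.05")),                 # drop tokens below this relative probability
        "repeat_last_n": int(os.getenv("LLM_REPEAT_LAST_N", "64")),     # window considered for repetition penalty
        "repeat_penalty": float(os.getenv("LLM_REPEAT_PENALTY", "1")),  # 1 disables the penalty
    }

# Possible use inside the request body:
#     json={"prompt": prompt, "n_predict": 1024, **llm_sampling_params(), ...}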