This Python script connects to a Galaxy instance using the BioBlend library. It retrieves available tools and outputs their information in JSON format. The connection credentials (URL and API key) are securely managed using a .env
file.
✅ Connect to a Galaxy instance
✅ Retrieve and list available tools
✅ Securely manage API credentials via environment variables
✅ Extendable to use SBERT for further processing
- Python 3.7+
- Access to a Galaxy instance with an API key
- Clone this repository
git clone https://github.com/johnnas12/galaxy_tool_suggestion.git cd galaxy_tool_suggestion
- Create and activate a virtual environment (optional but recommended)
python -m venv venv source venv/bin/activate # On Windows, use venv\Scripts\activate
- Install the required packages
pip install -r requirements.txt
- Create a .env file in the project root directory.
GALAXY_URL=https://usegalaxy.org GALAXY_API_KEY=your_api_key_here
- Usage
python fetch_tools.py
A FastAPI-powered AI agent for suggesting relevant Galaxy tools based on natural language queries using sentence embeddings and cosine similarity.
After cloning this repository go to agents
cd agents
Open configs/config.py and set the following paths:
from pydantic_settings import BaseSettings
class Settings(BaseSettings):
data_to_encode_path: str = "data/tools_metadata.json" # raw data (JSON format)
embeddings_path: str = "embeddings/galaxy_embeddings.npy"
metadata_path: str = "embeddings/galaxy_metadata.json"
After setting your config, run the embedding script to encode your tools' metadata JSON into embeddings (.npy) and save metadata as a JSON file for lookup.
python agents/text_embedding.py
This will:
- Load your raw JSON metadata
- Generate sentence embeddings
- Save embeddings to embeddings/galaxy_embeddings.npy
- Save clean metadata to embeddings/galaxy_metadata.json
Start your API server with:
uvicorn app:app --reload
🔍 Health Check
GET http://127.0.0.1:8000/health
Response
{"status": "ok"}
POST http://127.0.0.1:8000/suggest
Request Body
{
"query": "align sequences to a reference genome",
"top_k": 5
}
Response
{
"results": [
{
"name": "Tool A",
"description": "Performs sequence alignment",
"help": "long help section",
"category": "Alignment",
"score": 0.92
}
]
}