D. Sigdel edited this page May 13, 2025
These agents are biologically distinct and non-redundant, and each contributes directly to the reinforcement-learning dynamics. Together they form a minimum viable yet powerful set for multimodal PTM prediction.
| Agent | Role |
|---|---|
| Sequence Agent | Detects PTM sequence motifs via protein language models (e.g., ESM-2). |
| Structure Agent | Evaluates PTM feasibility via 3D structural context (RSA, pLDDT). |
| Graph Agent | Embeds protein-pathway interactions from biological graphs. |
| Expression Agent | Adds tissue/disease-specific regulation context via transcriptomics. |
| Proteoform Agent | Handles isoform-specific domains and PTM accessibility. |
| PTM Agent | Aggregates all agent outputs into the final PTM prediction. |
| Reward Agent | Generates biologically informed rewards for all agents. |
We intentionally omit auxiliary agents (such as co-evolution, domain-shift, or LLM agents) from the core learning loop, keeping the system streamlined and scalable.
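The table above suggests a shared interface: each agent maps its own modality's view of a residue to a PTM vote, and the PTM Agent aggregates those votes. The sketch below is illustrative only; all class and method names (`Observation`, `Agent.act`, `PTMAgent.predict`) are hypothetical, and the scoring logic is a placeholder, not the actual model.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    residue_index: int
    features: list  # modality-specific features for this residue

class Agent:
    """Hypothetical base class: maps a modality-specific observation to a PTM vote in [0, 1]."""
    def act(self, obs: Observation) -> float:
        raise NotImplementedError

class SequenceAgent(Agent):
    def act(self, obs: Observation) -> float:
        # Placeholder for a score derived from ESM-2 motif embeddings.
        return sum(obs.features) / max(len(obs.features), 1)

class PTMAgent:
    """Meta-policy sketch: averages per-agent votes into a final PTM call."""
    def __init__(self, agents):
        self.agents = agents
    def predict(self, obs: Observation, threshold: float = 0.5) -> bool:
        votes = [a.act(obs) for a in self.agents]
        return sum(votes) / len(votes) >= threshold
```

In the full system each subclass would wrap its own model (language model, GNN, etc.); the point here is only the common vote-and-aggregate contract.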
The project outline below is organized around this refined architecture.
- Motivation for PTM prediction in disease and signaling.
- Limitations of single-modality or black-box models.
- Proposal: biologically specialized agents coordinated via MARL.
- Overview of agents and their modality.
- Diagram of the system showing data → agents → PTM integration → reward loop.
- Justification for agent selection (why these 5 modalities?).
- UniProt sequences → ESM-2 embeddings.
- AlphaFold/PDB → RSA, pLDDT scores (via DSSP).
- STRING, Reactome → graph construction + GNN encoding.
- GTEx/TCGA → PCA + WGCNA modules.
- Isoform extraction via UniProt → proteoform domains.
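For the structural features above, RSA is typically computed by normalizing DSSP's absolute accessible surface area (ASA) against a per-residue maximum. The sketch below is a minimal illustration, not the project's pipeline; the max-ASA table covers only a few residues (values are the commonly cited theoretical maxima of Tien et al., 2013) and the 0.25 exposure cutoff is a conventional choice, not one taken from this project.

```python
# Partial max-ASA table (Å²); a real pipeline would cover all 20 amino acids.
MAX_ASA = {"A": 129.0, "G": 104.0, "S": 155.0, "K": 236.0, "Y": 263.0}

def rsa(residue: str, asa: float) -> float:
    """Relative solvent accessibility, clamped to [0, 1].

    DSSP can occasionally report ASA slightly above the theoretical maximum,
    hence the clamp.
    """
    return min(asa / MAX_ASA[residue], 1.0)

def is_exposed(residue: str, asa: float, cutoff: float = 0.25) -> bool:
    """PTM sites generally need some surface exposure to be chemically accessible."""
    return rsa(residue, asa) >= cutoff
```

These per-residue RSA values, together with AlphaFold pLDDT scores, form the Structure Agent's input features.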
- Sequence Agent (DQN on motif embeddings).
- Structure Agent (RL on residue-level RSA/pLDDT).
- Graph Agent (policy over node embeddings from pathway graphs).
- Expression Agent (reward shaping via expression activation).
- Proteoform Agent (isoform gating via match-level reward).
- PTM Agent (meta-policy: integrates predictions).
- Reward Agent (context-aware, modular reward matrix from table).
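The Reward Agent's "modular reward matrix" could take a form like the sketch below. The numeric entries are purely illustrative assumptions (they are not taken from the project): each agent is rewarded based on whether the final PTM call was correct and whether that agent's own vote agreed with it.

```python
# Hypothetical context-aware reward matrix; values are illustrative.
REWARD_MATRIX = {
    # (final_call_correct, agent_agreed_with_final_call): reward
    (True, True): 1.0,     # correct call, agent contributed
    (True, False): 0.2,    # correct call despite this agent's dissent
    (False, True): -1.0,   # wrong call, agent reinforced the error
    (False, False): -0.2,  # wrong call, agent dissented (mild penalty)
}

def agent_reward(correct: bool, agreed: bool) -> float:
    """Per-agent shaped reward emitted by the Reward Agent each step."""
    return REWARD_MATRIX[(correct, agreed)]
```

A context-aware version would further modulate these base values with biological signals, e.g. scaling the Structure Agent's reward by residue exposure.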
- Environment definition (states, actions, rewards).
- Training loop with multi-agent coordination.
- Agent-specific replay buffers and target networks.
- Convergence monitoring (Q-value stabilization + validation metrics).
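The per-agent replay buffers and target networks named above follow the standard DQN pattern. The sketch below shows that machinery over a toy tabular Q-function so it runs without a deep-learning framework; in the actual system `q` and `target_q` would be neural networks, and all hyperparameters here are placeholder values.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity buffer of (state, action, reward, next_state) transitions."""
    def __init__(self, capacity: int = 10_000):
        self.buf = deque(maxlen=capacity)
    def push(self, transition):
        self.buf.append(transition)
    def sample(self, batch_size: int):
        return random.sample(self.buf, min(batch_size, len(self.buf)))

def train_step(q, target_q, buffer, alpha=0.1, gamma=0.9, batch_size=32):
    """One TD update per sampled transition, bootstrapping from the target network."""
    for s, a, r, s2 in buffer.sample(batch_size):
        best_next = max(target_q.get((s2, a2), 0.0) for a2 in (0, 1))  # actions: {no-PTM, PTM}
        td_target = r + gamma * best_next
        q[(s, a)] = q.get((s, a), 0.0) + alpha * (td_target - q.get((s, a), 0.0))

def sync_target(q, target_q):
    """Hard update of the target network every N steps, as in standard DQN."""
    target_q.clear()
    target_q.update(q)
```

Each agent would own one buffer and one target network; convergence monitoring then tracks the stabilization of these Q-values alongside validation metrics.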
- Benchmarks: F1-score, AUROC, Precision-Recall on held-out test set.
- Baselines: Sequence CNN, GNN-only model, deep fusion model.
- Ablation studies: turn off one agent at a time.
- Interpretability: attention weights, attribution per agent.
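For reference, the core benchmark metrics reduce to simple counts over the held-out test set. The sketch below is dependency-free; in practice these would come from scikit-learn.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary PTM-site labels (1 = modified)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

Running the same computation with one agent disabled at a time gives the per-agent deltas for the ablation study.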
- Performance metrics (with charts).
- Case study: Tau, Alpha-Synuclein (Alzheimer’s/Parkinson’s proteins).
- Example visualizations of residue-wise predictions with context highlights.
- Optional agent expansion (e.g., co-evolution, phospho-specific kinases).
- Integration of LLM for reward shaping and interpretability.
- Scaling to full proteome or single-cell proteomics.