Yaxin9Luo/README.md

My long-term goal is to develop intelligent machines capable of perceiving, understanding, and creating multimodal content, such as videos.

Languages and Tools:

Python, PyTorch, Git, Hugging Face, DeepSpeed, Transformers, Diffusers, OpenAI API, MATLAB

📫 How to reach me:

LinkedIn, Twitter, Email, Google Scholar, Personal Page

Pinned

  1. MetaAgentX/OpenCaptchaWorld (Public)

    The first web-based benchmark and platform to evaluate the visual reasoning and interaction capabilities of MLLM-powered agents through diverse and dynamic CAPTCHA puzzles.

    JavaScript 36

  2. Gamma-MOD (Public)

    [ICLR2025] γ-MOD: Mixture-of-Depth Adaptation for Multimodal Large Language Models

    Python 38 3

  3. APL (Public)

    Python 4 1

  4. De-Diffusion (Public)

    My implementation of the model from the paper "De-Diffusion Makes Text a Strong Cross-Modal Interface".

    Python 9