A brief introduction to our work, Muddit: #8

@Shi-qingyu

Thanks for your great work!

We would like to briefly introduce our work:
Muddit is a 1B-parameter unified discrete-diffusion model that processes text and image tokens in a single MM-DiT backbone. One checkpoint covers text-to-image generation, image captioning, and VQA, using 64-step parallel sampling that naturally supports in-painting and partial conditioning. Muddit is competitive among discrete-token models and matches systems 10x larger.
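
To illustrate why parallel sampling over masked tokens lends itself to in-painting and partial conditioning, here is a minimal sketch of a generic confidence-based unmasking loop (MaskGIT-style, with a cosine schedule). It is an assumption for illustration, not Muddit's actual sampler: `MASK`, `parallel_unmask`, and `toy_predict` are hypothetical names, and `toy_predict` is a stand-in for a real model.

```python
import math

MASK = -1  # hypothetical id marking a position that still has to be generated


def parallel_unmask(tokens, predict, steps=64, mask_id=MASK):
    """Fill masked positions over several steps, most confident first.

    `predict(tokens)` is a hypothetical stand-in for the model: it returns
    one (token_id, confidence) pair per position. Positions that are not
    masked on input are never modified, which is what makes in-painting
    and partial conditioning fall out of the same loop.
    """
    tokens = list(tokens)
    total_masked = sum(1 for t in tokens if t == mask_id)
    for step in range(1, steps + 1):
        masked = [i for i, t in enumerate(tokens) if t == mask_id]
        if not masked:
            break
        # Cosine schedule: number of positions that stay masked after this step.
        keep_masked = math.floor(total_masked * math.cos(math.pi / 2 * step / steps))
        # Always reveal at least one position per step.
        keep_masked = min(keep_masked, len(masked) - 1)
        preds = predict(tokens)  # all positions are predicted in parallel
        ranked = sorted(masked, key=lambda i: preds[i][1], reverse=True)
        for i in ranked[: len(masked) - keep_masked]:
            tokens[i] = preds[i][0]
    return tokens


def toy_predict(tokens):
    # Toy stand-in for a model: token is the position index modulo 7,
    # and confidence decreases with position.
    return [(i % 7, 1.0 / (1 + i)) for i in range(len(tokens))]


# Positions 0 and 3 are given (the conditioning); the rest are generated.
result = parallel_unmask([5, MASK, MASK, 9, MASK], toy_predict, steps=4)
# result == [5, 1, 2, 9, 4]
```

In this sketch, conditioning on part of an image or on text just means passing those tokens in unmasked; only the masked positions are sampled.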

Code: https://github.com/M-E-AGI-Lab/Muddit

Hugging Face: https://huggingface.co/MeissonFlow/Muddit
