+

[](https://opensource.org/licenses/Apache-2.0)
[](https://arxiv.org/abs/2405.06001)
@@ -11,7 +10,7 @@
[](https://discord.com/invite/NfJzbkK3jY)
[](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)
[](https://llmc-en.readthedocs.io/en/latest/)
-[](https://llmc-zhcn.readthedocs.io/en/latest/)
+[](https://llmc-zhcn.readthedocs.io/en/latest/)
**\[ English | [中文](README_zh.md) | [日本語](README_ja.md) \]**
@@ -27,21 +26,15 @@ docker pull llmcompression/llmc:pure-latest
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-latest
```
-**Community**:
-
-- [Discord Server](https://discord.com/invite/NfJzbkK3jY)
-- [Tencent QQ Group](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)
+**Community**: [Discord Server](https://discord.com/invite/NfJzbkK3jY), [Tencent QQ Group](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592).
-**Docs**:
+**Docs**: [English](https://llmc-en.readthedocs.io/en/latest/), [Chinese](https://llmc-zhcn.readthedocs.io/en/latest/).
-- [English](https://llmc-en.readthedocs.io/en/latest/)
-- [Chinese](https://llmc-zhcn.readthedocs.io/en/latest/)
-
-## Latest News
+## :fire: Latest News
- **May 12, 2025:** 🔥 We now fully support quantization for the **`Wan2.1`** series of video generation models and provide export of truly quantized **INT8/FP8** weights, compatible with the [lightx2v](https://github.com/ModelTC/lightx2v) inference framework. For details, please refer to the [lightx2v documentation](https://llmc-en.readthedocs.io/en/latest/backend/lightx2v.html).
-- **Feb 7, 2025:** 🔥 We now fully support quantization of large-scale **`MOE`** models like **`DeepSeekv3`**, **`DeepSeek-R1`**, and **`DeepSeek-R1-zero`** with **`671B`** parameters. You can now directly load FP8 weights without any extra conversion. AWQ and RTN quantization can run on a single 80GB GPU, and we also support the export of true quantized **INT4/INT8** weights.
+- **Feb 07, 2025:** 🔥 We now fully support quantization of large-scale **`MOE`** models like **`DeepSeekv3`**, **`DeepSeek-R1`**, and **`DeepSeek-R1-zero`** with **`671B`** parameters. You can now directly load FP8 weights without any extra conversion. AWQ and RTN quantization can run on a single 80GB GPU, and we also support the export of true quantized **INT4/INT8** weights.
- **Nov 20, 2024:** 🔥 We now fully support the quantization of ✨`DeepSeekv2(2.5)` and other `MOE` models, as well as ✨`Qwen2VL`, `Llama3.2`, and other `VLM` models. Supported quantization methods include ✅integer quantization, ✅floating-point quantization, and advanced algorithms like ✅AWQ, ✅GPTQ, ✅SmoothQuant, and ✅Quarot.
@@ -49,14 +42,17 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
- **Sep 26, 2024:** 🔥 We now support exporting 💥`FP8 quantized(E4M3, E5M2)` models from 🚀`LLMC` to advanced inference backends such as [VLLM](https://github.com/vllm-project/vllm) and [SGLang](https://github.com/sgl-project/sglang). For detailed usage, please refer to the [VLLM documentation](https://llmc-en.readthedocs.io/en/latest/backend/vllm.html) and [SGLang documentation](https://llmc-en.readthedocs.io/en/latest/backend/sglang.html).
+
+Previous News
+
- **Sep 24, 2024:** 🔥 We have officially released ✅INT4 and ✅INT8 models of ✨`Llama-3.1-405B`, quantized using 🚀`LLMC` in `save_lightllm` mode. You can download the model parameters [here](https://huggingface.co/Dongz/llama31-405b-quant).
- **Sep 23, 2024:** 🔥 We now support exporting ✨`real quantized(INT4, INT8)` models from 🚀`LLMC` to advanced inference backends such as [VLLM](https://github.com/vllm-project/vllm), [SGLang](https://github.com/sgl-project/sglang), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), and [MLC-LLM](https://github.com/mlc-ai/mlc-llm) for quantized inference deployment, enabling ✨`reduced memory usage` and ✨`faster inference speeds`.
For detailed usage, please refer to the [VLLM documentation](https://llmc-en.readthedocs.io/en/latest/backend/vllm.html), [SGLang documentation](https://llmc-en.readthedocs.io/en/latest/backend/sglang.html), [AutoAWQ documentation](https://llmc-en.readthedocs.io/en/latest/backend/autoawq.html), and [MLC-LLM documentation](https://llmc-en.readthedocs.io/en/latest/backend/mlcllm.html).
-- **Sep 9, 2024:** 🔥 We provide some configs of our best practice towards superior performance (see Best Practice [here](https://llmc-en.readthedocs.io/en/latest/)).
+- **Sep 09, 2024:** 🔥 We provide configs reflecting our best practices for superior performance (see Best Practices [here](https://llmc-en.readthedocs.io/en/latest/)).
-* **Sep 3, 2024:** 🔥 We support [opencompass](https://github.com/open-compass/opencompass) 🤗 to eval 🚀`LLMC` model. Follow this [doc](https://llmc-en.readthedocs.io/en/latest/advanced/model_test_v2.html) and have a try!
+* **Sep 03, 2024:** 🔥 We support [opencompass](https://github.com/open-compass/opencompass) 🤗 to evaluate 🚀`LLMC` models. Follow this [doc](https://llmc-en.readthedocs.io/en/latest/advanced/model_test_v2.html) and give it a try!
* **Aug 22, 2024:** 🔥We support lots of small language models, including current SOTA [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)(see [Supported Model List](#supported-model-list)).
@@ -70,9 +66,6 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
(\* denotes equal contribution, 📧 denotes corresponding author.)
-
-Previous News
-
- **Jul 16, 2024:** 🔥We support Wanda/Naive(Magnitude) for llm sparsification and layer-wise mix bits quantization now!
- **Jul 14, 2024:** 🔥We support rotation based quantization QuaRot now!
@@ -95,11 +88,11 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
on the calibration data, algorithm pipeline, and quantization configuration selection. Based on the takeaways, a best practice for the LLM PTQ pipeline is designed, to achieve the best accuracy and efficiency performance balance
under various scenarios.
-- **Mar 7, 2024:** 🚀 We release the quantization part of a powerful and efficient LLM compression tool. Notably, our benchmark paper is coming soon😊.
+- **Mar 07, 2024:** 🚀 We release the quantization part of a powerful and efficient LLM compression tool. Notably, our benchmark paper is coming soon😊.
-## Highlight Feature
+## 🚀 Highlight Features
- 💥**Comprehensive Algorithm Support**: Provides a broad range of ✨`SOTA compression algorithms`, including ✅quantization, ✅mixed-precision quantization, and ✅sparsity, while maintaining accuracy consistent with the original repositories. ✨`Quantization best practices` (see 🚀`Best Practices` [here](https://llmc-en.readthedocs.io/en/latest/)) are also available to ensure optimal performance and efficiency.
@@ -111,175 +104,131 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
- 💥**Performance Efficiency**: Enables quantization of large LLMs, such as ✨`Llama3.1-405B` and ✨`DeepSeek-R1-671B`, with PPL evaluation on a `single A100/H100/H800 GPU`.
-## Usage
+## ⚙️ Usage
Please refer to the 🚀`Quick Start` section in the [documentation](https://llmc-en.readthedocs.io/en/latest/).
-## Supported Model List
-
-✅ [BLOOM](https://huggingface.co/bigscience/bloom)
-
-✅ [LLaMA](https://github.com/facebookresearch/llama)
-
-✅ [LLaMA V2](https://huggingface.co/meta-llama)
-
-✅ [StarCoder](https://github.com/bigcode-project/starcoder)
-
-✅ [OPT](https://huggingface.co/docs/transformers/model_doc/opt)
-
-✅ [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)
-
-✅ [InternLM2](https://huggingface.co/internlm)
-
-✅ [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)
-
-✅ [LLaMA V3](https://huggingface.co/meta-llama)
-
-✅ [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)
-
-✅ [Qwen V2](https://github.com/QwenLM/Qwen2)
-
-✅ [LLaVA](https://github.com/haotian-liu/LLaVA)
-
-✅ [InternLM2.5](https://huggingface.co/internlm)
-
-✅ [StableLM](https://github.com/Stability-AI/StableLM)
-
-✅ [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2)
-
-✅ [Phi2](https://huggingface.co/microsoft/phi-2)
-
-✅ [Phi 1.5](https://huggingface.co/microsoft/phi-1_5)
-
-✅ [MiniCPM](https://github.com/OpenBMB/MiniCPM)
-
-✅ [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)
-
-✅ [DeepSeekv2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5)
-
-✅ [LLaMA V3.2 Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)
+## :robot: Supported Model List
+
+- ✅ [BLOOM](https://huggingface.co/bigscience/bloom)
+- ✅ [LLaMA](https://github.com/facebookresearch/llama)
+- ✅ [LLaMA V2](https://huggingface.co/meta-llama)
+- ✅ [StarCoder](https://github.com/bigcode-project/starcoder)
+- ✅ [OPT](https://huggingface.co/docs/transformers/model_doc/opt)
+
+
+More Supported Models 
+
+- ✅ [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)
+- ✅ [InternLM2](https://huggingface.co/internlm)
+- ✅ [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)
+- ✅ [LLaMA V3](https://huggingface.co/meta-llama)
+- ✅ [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)
+- ✅ [Qwen V2](https://github.com/QwenLM/Qwen2)
+- ✅ [LLaVA](https://github.com/haotian-liu/LLaVA)
+- ✅ [InternLM2.5](https://huggingface.co/internlm)
+- ✅ [StableLM](https://github.com/Stability-AI/StableLM)
+- ✅ [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2)
+- ✅ [Phi2](https://huggingface.co/microsoft/phi-2)
+- ✅ [Phi 1.5](https://huggingface.co/microsoft/phi-1_5)
+- ✅ [MiniCPM](https://github.com/OpenBMB/MiniCPM)
+- ✅ [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)
+- ✅ [DeepSeekv2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5)
+- ✅ [LLaMA V3.2 Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)
+- ✅ [Qwen MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B)
+- ✅ [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)
+- ✅ [InternVL2](https://huggingface.co/OpenGVLab/InternVL2-2B)
-✅ [Qwen MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B)
-
-✅ [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)
-
-✅ [InternVL2](https://huggingface.co/OpenGVLab/InternVL2-2B)
+
You can add your own model type referring to files under `llmc/models/*.py`.
-## Supported Backend List
-
-✅ [VLLM](https://github.com/vllm-project/vllm)
+## :bus: Supported Backend List
-✅ [LightLLM](https://github.com/ModelTC/lightllm)
+- ✅ [VLLM](https://github.com/vllm-project/vllm)
+- ✅ [LightLLM](https://github.com/ModelTC/lightllm)
+- ✅ [Sglang](https://github.com/sgl-project/sglang)
+- ✅ [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
+- ✅ [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)
-✅ [Sglang](https://github.com/sgl-project/sglang)
-
-✅ [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
-
-✅ [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)
-
-## Supported Algorithm List
+## 💡 Supported Algorithm List
### Quantization
-✅ Naive
-
-✅ [AWQ](https://arxiv.org/abs/2306.00978)
-
-✅ [GPTQ](https://arxiv.org/abs/2210.17323)
-
-✅ [SmoothQuant](https://arxiv.org/abs/2211.10438)
-
-✅ [OS+](https://arxiv.org/abs/2304.09145)
-
-✅ [OmniQuant](https://arxiv.org/abs/2308.13137)
-
-✅ [NormTweaking](https://arxiv.org/abs/2309.02784)
-
-✅ [AdaDim](https://arxiv.org/pdf/2309.15531.pdf)
-
-✅ [QUIK](https://arxiv.org/abs/2310.09259)
+- ✅ Naive
+- ✅ [AWQ](https://arxiv.org/abs/2306.00978)
+- ✅ [GPTQ](https://arxiv.org/abs/2210.17323)
+- ✅ [SmoothQuant](https://arxiv.org/abs/2211.10438)
+- ✅ [OS+](https://arxiv.org/abs/2304.09145)
+
+
+More Supported Algorithms 
+
+- ✅ [OmniQuant](https://arxiv.org/abs/2308.13137)
+- ✅ [NormTweaking](https://arxiv.org/abs/2309.02784)
+- ✅ [AdaDim](https://arxiv.org/pdf/2309.15531.pdf)
+- ✅ [QUIK](https://arxiv.org/abs/2310.09259)
+- ✅ [SpQR](https://arxiv.org/abs/2306.03078)
+- ✅ [DGQ](https://arxiv.org/abs/2310.04836)
+- ✅ [OWQ](https://arxiv.org/abs/2306.02272)
+- ✅ [LLM.int8()](https://arxiv.org/abs/2208.07339)
+- ✅ [HQQ](https://mobiusml.github.io/hqq_blog/)
+- ✅ [QuaRot](https://arxiv.org/abs/2404.00456)
+- ✅ [SpinQuant](https://arxiv.org/abs/2405.16406) **([See this branch](https://github.com/ModelTC/llmc/tree/dev_spinquant))**
+- ✅ [TesseraQ](https://arxiv.org/abs/2410.19103)
-✅ [SpQR](https://arxiv.org/abs/2306.03078)
-
-✅ [DGQ](https://arxiv.org/abs/2310.04836)
-
-✅ [OWQ](https://arxiv.org/abs/2306.02272)
-
-✅ [LLM.int8()](https://arxiv.org/abs/2208.07339)
-
-✅ [HQQ](https://mobiusml.github.io/hqq_blog/)
-
-✅ [QuaRot](https://arxiv.org/abs/2404.00456)
-
-✅ [SpinQuant](https://arxiv.org/abs/2405.16406) **([See this branch](https://github.com/ModelTC/llmc/tree/dev_spinquant))**
-
-✅ [TesseraQ](https://arxiv.org/abs/2410.19103)
+
### Pruning
-✅ Naive(Magnitude)
+- ✅ Naive(Magnitude)
+- ✅ [Wanda](https://arxiv.org/abs/2306.11695)
+- ✅ [ShortGPT](https://arxiv.org/abs/2403.03853)
-✅ [Wanda](https://arxiv.org/abs/2306.11695)
+## 🤝 Acknowledgments
-✅ [ShortGPT](https://arxiv.org/abs/2403.03853)
+Our code is developed with reference to the following repositories:
-## Acknowledgments
+- [mit-han-lab/llm-awq](https://github.com/mit-han-lab/llm-awq)
+- [mit-han-lab/smoothquant](https://github.com/mit-han-lab/smoothquant)
+- [OpenGVLab/OmniQuant](https://github.com/OpenGVLab/OmniQuant)
+- [IST-DASLab/gptq](https://github.com/IST-DASLab/gptq)
+- [ModelTC/Outlier_Suppression_Plus](https://github.com/ModelTC/Outlier_Suppression_Plus)
+
+
+More Related Implementations 
+
+- [IST-DASLab/QUIK](https://github.com/IST-DASLab/QUIK)
+- [Vahe1994/SpQR](https://github.com/Vahe1994/SpQR)
+- [ilur98/DGQ](https://github.com/ilur98/DGQ)
+- [xvyaward/owq](https://github.com/xvyaward/owq)
+- [TimDettmers/bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
+- [mobiusml/hqq](https://github.com/mobiusml/hqq)
+- [spcl/QuaRot](https://github.com/spcl/QuaRot)
+- [locuslab/wanda](https://github.com/locuslab/wanda)
+- [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
+- [facebookresearch/SpinQuant](https://github.com/facebookresearch/SpinQuant)
+- [Intelligent-Computing-Lab-Yale/TesseraQ](https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ)
-We develop our code referring to the following repos:
+
-- https://github.com/mit-han-lab/llm-awq
-- https://github.com/mit-han-lab/smoothquant
-- https://github.com/OpenGVLab/OmniQuant
-- https://github.com/IST-DASLab/gptq
-- https://github.com/ModelTC/Outlier_Suppression_Plus
-- https://github.com/IST-DASLab/QUIK
-- https://github.com/Vahe1994/SpQR
-- https://github.com/ilur98/DGQ
-- https://github.com/xvyaward/owq
-- https://github.com/TimDettmers/bitsandbytes
-- https://github.com/mobiusml/hqq
-- [https://github.com/spcl/QuaRot](https://github.com/spcl/QuaRot)
-- [https://github.com/locuslab/wanda](https://github.com/locuslab/wanda)
-- [https://github.com/EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
-- [https://github.com/facebookresearch/SpinQuant](https://github.com/facebookresearch/SpinQuant)
-- [https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ](https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ)
-
-## Star History
+## 🌟 Star History
[](https://star-history.com/#ModelTC/llmc&Timeline)
-## Citation
+## ✏️ Citation
-If you find our LLM-QBench paper/llmc toolkit useful or relevant to your research, please kindly cite our paper:
+If you find our toolkit or paper useful for your research, please cite our work:
```
-@misc{llmc,
- author = {llmc contributors},
- title = {llmc: Towards Accurate and Efficient LLM Compression},
- year = {2024},
- publisher = {GitHub},
- journal = {GitHub repository},
- howpublished = {\url{https://github.com/ModelTC/llmc}},
-}
-
-@misc{gong2024llmqbench,
- title={LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models},
- author={Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Yunchen Zhang and Xianglong Liu and Dacheng Tao},
- year={2024},
- eprint={2405.06001},
- archivePrefix={arXiv},
- primaryClass={cs.LG}
-}
-
-@misc{gong2024llmcbenchmarkinglargelanguage,
- title={LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit},
- author={Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Chentao Lv and Yunchen Zhang and Xianglong Liu and Dacheng Tao},
- year={2024},
- eprint={2405.06001},
- archivePrefix={arXiv},
- primaryClass={cs.LG},
- url={https://arxiv.org/abs/2405.06001},
+@inproceedings{DBLP:conf/emnlp/GongYGHLZT024,
+  author    = {Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Chengtao Lv and Yunchen Zhang and Dacheng Tao and Xianglong Liu},
+  title     = {LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit},
+  booktitle = {EMNLP (Industry Track)},
+  year      = {2024},
+  pages     = {132--152},
+  url       = {https://aclanthology.org/2024.emnlp-industry.12}
}
```
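
For reference, a minimal sketch of starting a container from the image pulled above; the `--gpus` flag (which needs the NVIDIA Container Toolkit) and the mount layout are assumptions rather than documented behaviour of the `pure-latest` tag:

```shell
# Hypothetical invocation: mount the current checkout into the container and
# open an interactive shell; adjust paths and GPU flags to your setup.
docker run --gpus all -it --rm \
    -v "$(pwd)":/workspace \
    llmcompression/llmc:pure-latest \
    bash
```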
diff --git a/README_ja.md b/README_ja.md
index 6dead79f1..064079c58 100644
--- a/README_ja.md
+++ b/README_ja.md
@@ -1,8 +1,7 @@
-# LLMC: 正確で効率的なLLM圧縮に向けて
+
+
LLMC: 正確で効率的な LLM 圧縮に向けて
-

-
-
+

[](https://opensource.org/licenses/Apache-2.0)
[](https://arxiv.org/abs/2405.06001)
@@ -11,7 +10,7 @@
[](https://discord.com/invite/NfJzbkK3jY)
[](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)
[](https://llmc-en.readthedocs.io/en/latest/)
-[](https://llmc-zhcn.readthedocs.io/en/latest/)
+[](https://llmc-zhcn.readthedocs.io/en/latest/)
**\[ [English](README.md) | [中文](README_zh.md) | 日本語 \]**
@@ -20,24 +19,18 @@
**LLMC** は、大規模言語モデル(LLM)の圧縮を目的とした、最新の圧縮アルゴリズムを活用して、パフォーマンスを損なうことなく効率を向上させ、モデルサイズを削減するためのツールです。以下のコマンドを使用して、llmcを実行できるDockerイメージをダウンロードできます。中国大陸のユーザーは、阿里云Dockerを使用することを推奨します。
```shell
-# docker hub: https://hub.docker.com/r/llmcompression/llmc
+# Docker Hub: https://hub.docker.com/r/llmcompression/llmc
docker pull llmcompression/llmc:pure-latest
-# 阿里云Docker: registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:[tag]
+# Aliyun Docker: registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:[tag]
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-latest
```
-**コミュニティ**:
-
-- [Discordサーバー](https://discord.com/invite/NfJzbkK3jY)
-- [Tencent QQグループ](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)
+**コミュニティ**: [Discord サーバー](https://discord.com/invite/NfJzbkK3jY)、[Tencent QQ グループ](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)。
-**ドキュメント**:
+**ドキュメント**: [English](https://llmc-en.readthedocs.io/en/latest/)、[中文](https://llmc-zhcn.readthedocs.io/en/latest/)。
-- [英語](https://llmc-en.readthedocs.io/en/latest/)
-- [中国語](https://llmc-zhcn.readthedocs.io/en/latest/)
-
-## 最新情報
+## :fire: 最新ニュース
- **2025年5月12日:** 🔥 **`Wan2.1`** シリーズのビデオ生成モデルの量子化を完全にサポートし、実際に量子化された **INT8/FP8** 重みのエクスポートにも対応しました。これらは [lightx2v](https://github.com/ModelTC/lightx2v) 推論フレームワークと互換性があります。詳細は [lightx2v ドキュメント](https://llmc-en.readthedocs.io/en/latest/backend/lightx2v.html) をご参照ください。
@@ -49,6 +42,9 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
- **2024年9月26日:** 🔥 `LLMC`からの✨ `FP8量子化(E4M3、E5M2)`モデルを、VLLMやSGLangのような高度な推理バックエンドにエクスポートできるようになりました。🚀 詳細な使用方法については、[VLLMのドキュメント](https://llmc-en.readthedocs.io/en/latest/backend/vllm.html)と[SGLangのドキュメント](https://llmc-en.readthedocs.io/en/latest/backend/sglang.html)を参照してください。
+
+以前のニュース
+
- **2024年9月24日:** 🔥 私たちは正式に ✨`Llama-3.1-405B` の ✅INT4 と ✅INT8 モデルをリリースしました。これらは 🚀`LLMC` の `save_lightllm` モードを使用して量子化されています。モデルパラメータは[こちら](https://huggingface.co/Dongz/llama31-405b-quant)からダウンロードできます。
- **2024年9月23日:** 🔥 私たちは、🚀`LLMC` から ✨`実際の量子化された(INT4, INT8)` モデルを、 [VLLM](https://github.com/vllm-project/vllm), [SGLang](https://github.com/sgl-project/sglang), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), [MLC-LLM](https://github.com/mlc-ai/mlc-llm) などの高度な推論バックエンドにエクスポートするサポートを追加しました。これにより、✨`メモリ使用量の削減` と ✨`推論速度の向上` が可能になります。
@@ -70,9 +66,6 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
(\*は同等の貢献を示し、📧は対応する著者を示します。)
-
-過去のニュース
-
- **2024年7月16日:** 🔥私たちはLLMの疎化のためのWanda/Naive(マグニチュード)および層ごとの混合ビット量子化のサポートを追加しました!
- **2024年7月14日:** 🔥私たちは回転ベースの量子化 QuaRot のサポートを追加しました!
@@ -97,7 +90,7 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
-## 主要機能
+## 🚀 特徴
- 💥**包括的なアルゴリズムサポート**: 広範な ✨`SOTA圧縮アルゴリズム` をサポートし、✅量子化、✅混合精度量子化、✅疎性を含み、元のリポジトリと同じ精度を維持します。✨`量子化ベストプラクティス`(ベストプラクティスは[こちら](https://llmc-en.readthedocs.io/en/latest/)をご覧ください)も提供されており、最適なパフォーマンスと効率を確保します。
@@ -109,175 +102,129 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
- 💥**パフォーマンス効率**: ✨`Llama3.1-405B` や ✨`DeepSeek-R1-671B` などの大規模LLMの量子化をサポートし、`単一の A100/H100/H800 GPU` でPPL評価を可能にします。
-## 使用方法
+## ⚙️ 使い方
使用ガイドは 🚀`Quick Start`セクション[こちら](https://llmc-en.readthedocs.io/en/latest/)をご覧ください。
-## サポートされているモデルリスト
-
-✅ [BLOOM](https://huggingface.co/bigscience/bloom)
-
-✅ [LLaMA](https://github.com/facebookresearch/llama)
-
-✅ [LLaMA V2](https://huggingface.co/meta-llama)
-
-✅ [StarCoder](https://github.com/bigcode-project/starcoder)
-
-✅ [OPT](https://huggingface.co/docs/transformers/model_doc/opt)
-
-✅ [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)
-
-✅ [InternLM2](https://huggingface.co/internlm)
-
-✅ [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)
-
-✅ [LLaMA V3](https://huggingface.co/meta-llama)
-
-✅ [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)
-
-✅ [Qwen V2](https://github.com/QwenLM/Qwen2)
-
-✅ [LLaVA](https://github.com/haotian-liu/LLaVA)
-
-✅ [InternLM2.5](https://huggingface.co/internlm)
-
-✅ [StableLM](https://github.com/Stability-AI/StableLM)
-
-✅ [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2)
-
-✅ [Phi2](https://huggingface.co/microsoft/phi-2)
-
-✅ [Phi 1.5](https://huggingface.co/microsoft/phi-1_5)
-
-✅ [MiniCPM](https://github.com/OpenBMB/MiniCPM)
-
-✅ [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)
-
-✅ [DeepSeekv2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5)
-
-✅ [LLaMA V3.2 Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)
+## :robot: 対応モデル
+
+- ✅ [BLOOM](https://huggingface.co/bigscience/bloom)
+- ✅ [LLaMA](https://github.com/facebookresearch/llama)
+- ✅ [LLaMA V2](https://huggingface.co/meta-llama)
+- ✅ [StarCoder](https://github.com/bigcode-project/starcoder)
+- ✅ [OPT](https://huggingface.co/docs/transformers/model_doc/opt)
+
+
+その他のモデル
+
+- ✅ [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)
+- ✅ [InternLM2](https://huggingface.co/internlm)
+- ✅ [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)
+- ✅ [LLaMA V3](https://huggingface.co/meta-llama)
+- ✅ [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)
+- ✅ [Qwen V2](https://github.com/QwenLM/Qwen2)
+- ✅ [LLaVA](https://github.com/haotian-liu/LLaVA)
+- ✅ [InternLM2.5](https://huggingface.co/internlm)
+- ✅ [StableLM](https://github.com/Stability-AI/StableLM)
+- ✅ [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2)
+- ✅ [Phi2](https://huggingface.co/microsoft/phi-2)
+- ✅ [Phi 1.5](https://huggingface.co/microsoft/phi-1_5)
+- ✅ [MiniCPM](https://github.com/OpenBMB/MiniCPM)
+- ✅ [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)
+- ✅ [DeepSeekv2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5)
+- ✅ [LLaMA V3.2 Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)
+- ✅ [Qwen MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B)
+- ✅ [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)
+- ✅ [InternVL2](https://huggingface.co/OpenGVLab/InternVL2-2B)
-✅ [Qwen MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B)
-
-✅ [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)
-
-✅ [InternVL2](https://huggingface.co/OpenGVLab/InternVL2-2B)
-
-独自のモデルタイプを追加するには、`llmc/models/*.py` ファイルを参照してください。
-
-## サポートされているバックエンドリスト
-
-✅ [VLLM](https://github.com/vllm-project/vllm)
-
-✅ [LightLLM](https://github.com/ModelTC/lightllm)
+
-✅ [Sglang](https://github.com/sgl-project/sglang)
+独自モデルを追加する場合は `llmc/models/*.py` を参照してください。
-✅ [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
+## :bus: 対応バックエンド
-✅ [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)
+- ✅ [VLLM](https://github.com/vllm-project/vllm)
+- ✅ [LightLLM](https://github.com/ModelTC/lightllm)
+- ✅ [Sglang](https://github.com/sgl-project/sglang)
+- ✅ [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
+- ✅ [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)
-## サポートされているアルゴリズムリスト
+## 💡 対応アルゴリズム
### 量子化
-✅ Naive
-
-✅ [AWQ](https://arxiv.org/abs/2306.00978)
-
-✅ [GPTQ](https://arxiv.org/abs/2210.17323)
-
-✅ [SmoothQuant](https://arxiv.org/abs/2211.10438)
-
-✅ [OS+](https://arxiv.org/abs/2304.09145)
-
-✅ [OmniQuant](https://arxiv.org/abs/2308.13137)
-
-✅ [NormTweaking](https://arxiv.org/abs/2309.02784)
-
-✅ [AdaDim](https://arxiv.org/pdf/2309.15531.pdf)
+- ✅ Naive
+- ✅ [AWQ](https://arxiv.org/abs/2306.00978)
+- ✅ [GPTQ](https://arxiv.org/abs/2210.17323)
+- ✅ [SmoothQuant](https://arxiv.org/abs/2211.10438)
+- ✅ [OS+](https://arxiv.org/abs/2304.09145)
+
+
+その他のアルゴリズム
+
+- ✅ [OmniQuant](https://arxiv.org/abs/2308.13137)
+- ✅ [NormTweaking](https://arxiv.org/abs/2309.02784)
+- ✅ [AdaDim](https://arxiv.org/pdf/2309.15531.pdf)
+- ✅ [QUIK](https://arxiv.org/abs/2310.09259)
+- ✅ [SpQR](https://arxiv.org/abs/2306.03078)
+- ✅ [DGQ](https://arxiv.org/abs/2310.04836)
+- ✅ [OWQ](https://arxiv.org/abs/2306.02272)
+- ✅ [LLM.int8()](https://arxiv.org/abs/2208.07339)
+- ✅ [HQQ](https://mobiusml.github.io/hqq_blog/)
+- ✅ [QuaRot](https://arxiv.org/abs/2404.00456)
+- ✅ [SpinQuant](https://arxiv.org/abs/2405.16406) **([このブランチを参照してください](https://github.com/ModelTC/llmc/tree/dev_spinquant))**
+- ✅ [TesseraQ](https://arxiv.org/abs/2410.19103)
-✅ [QUIK](https://arxiv.org/abs/2310.09259)
-
-✅ [SpQR](https://arxiv.org/abs/2306.03078)
-
-✅ [DGQ](https://arxiv.org/abs/2310.04836)
-
-✅ [OWQ](https://arxiv.org/abs/2306.02272)
-
-✅ [LLM.int8()](https://arxiv.org/abs/2208.07339)
-
-✅ [HQQ](https://mobiusml.github.io/hqq_blog/)
-
-✅ [QuaRot](https://arxiv.org/abs/2404.00456)
-
-✅ [SpinQuant](https://arxiv.org/abs/2405.16406) **([このブランチを参照してください](https://github.com/ModelTC/llmc/tree/dev_spinquant))**
+
-✅ [TesseraQ](https://arxiv.org/abs/2410.19103)
+### プルーニング
-### プルーニング(剪定)
+- ✅ Naive(Magnitude)
+- ✅ [Wanda](https://arxiv.org/abs/2306.11695)
+- ✅ [ShortGPT](https://arxiv.org/abs/2403.03853)
-✅ Naive(マグニチュード)
+## 🤝 謝辞
-✅ [Wanda](https://arxiv.org/abs/2306.11695)
+本プロジェクトは以下のリポジトリを参考にしています:
-✅ [ShortGPT](https://arxiv.org/abs/2403.03853)
+- [mit-han-lab/llm-awq](https://github.com/mit-han-lab/llm-awq)
+- [mit-han-lab/smoothquant](https://github.com/mit-han-lab/smoothquant)
+- [OpenGVLab/OmniQuant](https://github.com/OpenGVLab/OmniQuant)
+- [IST-DASLab/gptq](https://github.com/IST-DASLab/gptq)
+- [ModelTC/Outlier_Suppression_Plus](https://github.com/ModelTC/Outlier_Suppression_Plus)
-## 謝辞
+
+その他の実装
-以下のリポジトリを参考にしてコードを開発しました:
+- [IST-DASLab/QUIK](https://github.com/IST-DASLab/QUIK)
+- [Vahe1994/SpQR](https://github.com/Vahe1994/SpQR)
+- [ilur98/DGQ](https://github.com/ilur98/DGQ)
+- [xvyaward/owq](https://github.com/xvyaward/owq)
+- [TimDettmers/bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
+- [mobiusml/hqq](https://github.com/mobiusml/hqq)
+- [spcl/QuaRot](https://github.com/spcl/QuaRot)
+- [locuslab/wanda](https://github.com/locuslab/wanda)
+- [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
+- [facebookresearch/SpinQuant](https://github.com/facebookresearch/SpinQuant)
+- [Intelligent-Computing-Lab-Yale/TesseraQ](https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ)
-- https://github.com/mit-han-lab/llm-awq
-- https://github.com/mit-han-lab/smoothquant
-- https://github.com/OpenGVLab/OmniQuant
-- https://github.com/IST-DASLab/gptq
-- https://github.com/ModelTC/Outlier_Suppression_Plus
-- https://github.com/IST-DASLab/QUIK
-- https://github.com/Vahe1994/SpQR
-- https://github.com/ilur98/DGQ
-- https://github.com/xvyaward/owq
-- https://github.com/TimDettmers/bitsandbytes
-- https://github.com/mobiusml/hqq
-- [https://github.com/spcl/QuaRot](https://github.com/spcl/QuaRot)
-- [https://github.com/locuslab/wanda](https://github.com/locuslab/wanda)
-- [https://github.com/EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
-- [https://github.com/facebookresearch/SpinQuant](https://github.com/facebookresearch/SpinQuant)
-- [https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ](https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ)
+
-## スター履歴
+## 🌟 Star 履歴
-[](https://star-history.com/#ModelTC/llmc&Timeline)
+[](https://star-history.com/#ModelTC/llmc&Timeline)
-## 引用
+## ✏️ 引用
-LLM-QBench論文/llmcツールキットが研究に役立つまたは関連している場合は、論文を引用してください:
+本ツールキットまたは論文が参考になった場合は、以下を引用してください:
```
-@misc{llmc,
- author = {llmc contributors},
- title = {llmc: Towards Accurate and Efficient LLM Compression},
- year = {2024},
- publisher = {GitHub},
- journal = {GitHub repository},
- howpublished = {\url{https://github.com/ModelTC/llmc}},
-}
-
-@misc{gong2024llmqbench,
- title={LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models},
- author={Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Yunchen Zhang and Xianglong Liu and Dacheng Tao},
- year={2024},
- eprint={2405.06001},
- archivePrefix={arXiv},
- primaryClass={cs.LG}
-}
-
-@misc{gong2024llmcbenchmarkinglargelanguage,
- title={LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit},
- author={Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Chentao Lv and Yunchen Zhang and Xianglong Liu and Dacheng Tao},
- year={2024},
- eprint={2405.06001},
- archivePrefix={arXiv},
- primaryClass={cs.LG},
- url={https://arxiv.org/abs/2405.06001},
+@inproceedings{DBLP:conf/emnlp/GongYGHLZT024,
+ author = {Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Chengtao Lv and Yunchen Zhang and Dacheng Tao and Xianglong Liu},
+ title = {LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit},
+ booktitle = {EMNLP (Industry Track)},
+ year = {2024},
+ pages = {132--152},
+ url = {https://aclanthology.org/2024.emnlp-industry.12}
}
```
diff --git a/README_zh.md b/README_zh.md
index ae2b3e5f6..9699fe275 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -1,8 +1,7 @@
-# LLMC: 准确高效的LLM压缩工具
+
+
LLMC:迈向准确且高效的大语言模型压缩
-

-
-
+

[](https://opensource.org/licenses/Apache-2.0)
[](https://arxiv.org/abs/2405.06001)
@@ -11,7 +10,7 @@
[](https://discord.com/invite/NfJzbkK3jY)
[](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)
[](https://llmc-en.readthedocs.io/en/latest/)
-[](https://llmc-zhcn.readthedocs.io/en/latest/)
+[](https://llmc-zhcn.readthedocs.io/en/latest/)
**\[ [English](README.md) | 中文 | [日本語](README_ja.md) \]**
@@ -20,24 +19,18 @@
**LLMC** 是一个开箱即用的工具,专为压缩LLM设计,利用最先进的压缩算法提高效率并减少模型体积,同时不影响预测精度。你可以通过以下命令下载可以运行llmc的docker镜像,中国大陆用户推荐使用阿里云docker。
```shell
-# docker hub: https://hub.docker.com/r/llmcompression/llmc
+# Docker Hub: https://hub.docker.com/r/llmcompression/llmc
docker pull llmcompression/llmc:pure-latest
-# 阿里云docker: registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:[tag]
+# 阿里云镜像: registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:[tag]
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-latest
```
-**社区**:
-
-- [Discord 服务器](https://discord.com/invite/NfJzbkK3jY)
-- [腾讯QQ群](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)
+**社区**: [Discord 服务器](https://discord.com/invite/NfJzbkK3jY)、[腾讯 QQ 群](http://qm.qq.com/cgi-bin/qm/qr?_wv=1027&k=I9IGPWWj8uuRXWH3_ELWjouf6gkIMgUl&authKey=GA3WbFAsm90ePJf%2FCbc7ZyXXq4ShQktlBaLxgqS5yuSPAsr3%2BDKMRdosUiLYoilO&noverify=0&group_code=526192592)。
-**文档**:
+**文档**: [English](https://llmc-en.readthedocs.io/en/latest/)、[中文](https://llmc-zhcn.readthedocs.io/en/latest/)。
-- [英文](https://llmc-en.readthedocs.io/en/latest/)
-- [中文](https://llmc-zhcn.readthedocs.io/en/latest/)
-
-## 最新消息
+## :fire: 最新动态
- **2025年5月12日:** 🔥 我们现已全面支持 **`Wan2.1`** 系列视频生成模型的量化,并支持导出真实量化的 **INT8/FP8** 权重,兼容 [lightx2v](https://github.com/ModelTC/lightx2v) 推理框架。详情请参考 [lightx2v 使用文档](https://llmc-zhcn.readthedocs.io/en/latest/backend/lightx2v.html)。
@@ -49,6 +42,9 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
- **2024年9月26日:** 🔥 我们现在支持从🚀 `LLMC`导出💥 `FP8 量化(E4M3,E5M2)`模型到一些先进的推理后端,例如[VLLM](https://github.com/vllm-project/vllm)和[SGLang](https://github.com/sgl-project/sglang)。关于详细使用方法,请参阅[VLLM文档](https://llmc-zhcn.readthedocs.io/en/latest/backend/vllm.html)和[SGLang文档](https://llmc-zhcn.readthedocs.io/en/latest/backend/sglang.html)。
+
+更早动态
+
- **2024年9月24日:** 🔥 我们正式发布了 ✨`Llama-3.1-405B` 的 ✅INT4 和 ✅INT8 模型,这些模型通过 🚀`LLMC` 使用 `save_lightllm` 模式进行量化。你可以在[此处](https://huggingface.co/Dongz/llama31-405b-quant)下载模型参数。
- **2024年9月23日:** 🔥 我们现在支持从 🚀`LLMC` 导出 ✨`真正量化的(INT4, INT8)` 模型到先进推理后端,例如 [VLLM](https://github.com/vllm-project/vllm), [SGLang](https://github.com/sgl-project/sglang), [AutoAWQ](https://github.com/casper-hansen/AutoAWQ), 和 [MLC-LLM](https://github.com/mlc-ai/mlc-llm) 用于量化推理部署,从而实现 ✨`减少内存使用` 和 ✨`加快推理速度`。
@@ -70,9 +66,6 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
(\* 表示同等贡献,📧 表示通讯作者。)
-
-历史消息
-
- **2024年7月16日:** 🔥我们现在支持 Wanda/Naive(幅度)进行 LLM 稀疏化和逐层混合比特量化!
- **2024年7月14日:** 🔥我们现在支持基于旋转的量化 QuaRot!
@@ -97,7 +90,7 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
-## 亮点功能
+## 🚀 亮点功能
- 💥**综合算法支持**: 提供广泛的 ✨`SOTA压缩算法` 支持,包括 ✅量化、✅混合精度量化 和 ✅稀疏化,同时保持与原始仓库一致的精度。我们还提供 ✨`量化最佳实践`(参见✨`最佳实践` 章节[此处](https://llmc-zhcn.readthedocs.io/en/latest/)),确保最佳性能和效率。
@@ -109,177 +102,129 @@ docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-lates
- 💥**性能效率**: 支持大规模LLM的量化,例如 ✨`Llama3.1-405B` 和 ✨`DeepSeek-R1-671B`,并可在 `单个 A100/H100/H800 GPU` 上评估 PPL。
-## 使用指南
+## ⚙️ 快速上手
请参阅 🚀`快速入门`章节[此处](https://llmc-zhcn.readthedocs.io/en/latest/)。
-## 支持的模型列表
-
-✅ [BLOOM](https://huggingface.co/bigscience/bloom)
-
-✅ [LLaMA](https://github.com/facebookresearch/llama)
-
-✅ [LLaMA V2](https://huggingface.co/meta-llama)
-
-✅ [StarCoder](https://github.com/bigcode-project/starcoder)
-
-✅ [OPT](https://huggingface.co/docs/transformers/model_doc/opt)
-
-✅ [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)
-
-✅ [InternLM2](https://huggingface.co/internlm)
-
-✅ [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)
-
-✅ [LLaMA V3](https://huggingface.co/meta-llama)
-
-✅ [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)
-
-✅ [Qwen V2](https://github.com/QwenLM/Qwen2)
-
-✅ [LLaVA](https://github.com/haotian-liu/LLaVA)
-
-✅ [InternLM2.5](https://huggingface.co/internlm)
-
-✅ [StableLM](https://github.com/Stability-AI/StableLM)
-
-✅ [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2)
-
-✅ [Phi2](https://huggingface.co/microsoft/phi-2)
-
-✅ [Phi 1.5](https://huggingface.co/microsoft/phi-1_5)
-
-✅ [MiniCPM](https://github.com/OpenBMB/MiniCPM)
-
-✅ [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)
-
-✅ [DeepSeekv2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5)
-
-✅ [LLaMA V3.2 Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)
-
-✅ [Qwen MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B)
+## :robot: 支持的模型
+
+- ✅ [BLOOM](https://huggingface.co/bigscience/bloom)
+- ✅ [LLaMA](https://github.com/facebookresearch/llama)
+- ✅ [LLaMA V2](https://huggingface.co/meta-llama)
+- ✅ [StarCoder](https://github.com/bigcode-project/starcoder)
+- ✅ [OPT](https://huggingface.co/docs/transformers/model_doc/opt)
+
+
+更多模型
+
+- ✅ [Falcon](https://huggingface.co/docs/transformers/model_doc/falcon)
+- ✅ [InternLM2](https://huggingface.co/internlm)
+- ✅ [Mistral](https://huggingface.co/docs/transformers/model_doc/mistral)
+- ✅ [LLaMA V3](https://huggingface.co/meta-llama)
+- ✅ [Mixtral](https://huggingface.co/docs/transformers/model_doc/mixtral)
+- ✅ [Qwen V2](https://github.com/QwenLM/Qwen2)
+- ✅ [LLaVA](https://github.com/haotian-liu/LLaVA)
+- ✅ [InternLM2.5](https://huggingface.co/internlm)
+- ✅ [StableLM](https://github.com/Stability-AI/StableLM)
+- ✅ [Gemma2](https://huggingface.co/docs/transformers/main/en/model_doc/gemma2)
+- ✅ [Phi2](https://huggingface.co/microsoft/phi-2)
+- ✅ [Phi 1.5](https://huggingface.co/microsoft/phi-1_5)
+- ✅ [MiniCPM](https://github.com/OpenBMB/MiniCPM)
+- ✅ [SmolLM](https://huggingface.co/collections/HuggingFaceTB/smollm-6695016cad7167254ce15966)
+- ✅ [DeepSeekv2.5](https://huggingface.co/deepseek-ai/DeepSeek-V2.5)
+- ✅ [LLaMA V3.2 Vision](https://huggingface.co/meta-llama/Llama-3.2-11B-Vision)
+- ✅ [Qwen MOE](https://huggingface.co/Qwen/Qwen1.5-MoE-A2.7B)
+- ✅ [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)
+- ✅ [InternVL2](https://huggingface.co/OpenGVLab/InternVL2-2B)
-✅ [Qwen2-VL](https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct)
-
-✅ [InternVL2](https://huggingface.co/OpenGVLab/InternVL2-2B)
-
-你可以参考 `llmc/models/*.py` 文件添加自己的模型类型。
-
-## 支持的后端列表
-
-✅ [VLLM](https://github.com/vllm-project/vllm)
-
-✅ [LightLLM](https://github.com/ModelTC/lightllm)
+
-✅ [Sglang](https://github.com/sgl-project/sglang)
+您可参考 `llmc/models/*.py` 添加自定义模型。
-✅ [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
+## :bus: 支持的后端
-✅ [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)
+- ✅ [VLLM](https://github.com/vllm-project/vllm)
+- ✅ [LightLLM](https://github.com/ModelTC/lightllm)
+- ✅ [Sglang](https://github.com/sgl-project/sglang)
+- ✅ [MLC-LLM](https://github.com/mlc-ai/mlc-llm)
+- ✅ [AutoAWQ](https://github.com/casper-hansen/AutoAWQ)
-## 支持的算法列表
+## 💡 支持的算法
### 量化
-✅ Naive
-
-✅ [AWQ](https://arxiv.org/abs/2306.00978)
-
-✅ [GPTQ](https://arxiv.org/abs/2210.17323)
-
-✅ [SmoothQuant](https://arxiv.org/abs/2211.10438)
-
-✅ [OS+](https://arxiv.org/abs/2304.09145)
-
-✅ [OmniQuant](https://arxiv.org/abs/2308.13137)
-
-✅ [NormTweaking](https://arxiv.org/abs/2309.02784)
-
-✅ [AdaDim](https://arxiv.org/pdf/2309.15531.pdf)
-
-✅ [QUIK](https://arxiv.org/abs/2310.09259)
-
-✅ [SpQR](https://arxiv.org/abs/2306.03078)
+- ✅ Naive
+- ✅ [AWQ](https://arxiv.org/abs/2306.00978)
+- ✅ [GPTQ](https://arxiv.org/abs/2210.17323)
+- ✅ [SmoothQuant](https://arxiv.org/abs/2211.10438)
+- ✅ [OS+](https://arxiv.org/abs/2304.09145)
+
+
+更多算法
+
+- ✅ [OmniQuant](https://arxiv.org/abs/2308.13137)
+- ✅ [NormTweaking](https://arxiv.org/abs/2309.02784)
+- ✅ [AdaDim](https://arxiv.org/pdf/2309.15531.pdf)
+- ✅ [QUIK](https://arxiv.org/abs/2310.09259)
+- ✅ [SpQR](https://arxiv.org/abs/2306.03078)
+- ✅ [DGQ](https://arxiv.org/abs/2310.04836)
+- ✅ [OWQ](https://arxiv.org/abs/2306.02272)
+- ✅ [LLM.int8()](https://arxiv.org/abs/2208.07339)
+- ✅ [HQQ](https://mobiusml.github.io/hqq_blog/)
+- ✅ [QuaRot](https://arxiv.org/abs/2404.00456)
+- ✅ [SpinQuant](https://arxiv.org/abs/2405.16406) **([见此分支](https://github.com/ModelTC/llmc/tree/dev_spinquant))**
+- ✅ [TesseraQ](https://arxiv.org/abs/2410.19103)
-✅ [DGQ](https://arxiv.org/abs/2310.04836)
-
-✅ [OWQ](https://arxiv.org/abs/2306.02272)
-
-✅ [LLM.int8()](https://arxiv.org/abs/2208.07339)
-
-✅ [HQQ](https://mobiusml.github.io/hqq_blog/)
-
-✅ [QuaRot](https://arxiv.org/abs/2404.00456)
-
-✅ [SpinQuant](https://arxiv.org/abs/2405.16406) **([见此分支](https://github.com/ModelTC/llmc/tree/dev_spinquant))**
-
-✅ [TesseraQ](https://arxiv.org/abs/2410.19103)
+
### 剪枝
-✅ Naive(Magnitude)
+- ✅ Naive(Magnitude)
+- ✅ [Wanda](https://arxiv.org/abs/2306.11695)
+- ✅ [ShortGPT](https://arxiv.org/abs/2403.03853)
-✅ [Wanda](https://arxiv.org/abs/2306.11695)
+## 🤝 致谢
-✅ [ShortGPT](https://arxiv.org/abs/2403.03853)
+本项目参考了以下仓库:
-## 鸣谢
+- [mit-han-lab/llm-awq](https://github.com/mit-han-lab/llm-awq)
+- [mit-han-lab/smoothquant](https://github.com/mit-han-lab/smoothquant)
+- [OpenGVLab/OmniQuant](https://github.com/OpenGVLab/OmniQuant)
+- [IST-DASLab/gptq](https://github.com/IST-DASLab/gptq)
+- [ModelTC/Outlier_Suppression_Plus](https://github.com/ModelTC/Outlier_Suppression_Plus)
-我们的代码参考了以下仓库:
+
+更多相关实现
-- https://github.com/mit-han-lab/llm-awq
-- https://github.com/mit-han-lab/smoothquant
-- https://github.com/OpenGVLab/OmniQuant
-- https://github.com/IST-DASLab/gptq
-- https://github.com/ModelTC/Outlier_Suppression_Plus
-- https://github.com/IST-DASLab/QUIK
-- https://github.com/Vahe1994/SpQR
-- https://github.com/ilur98/DGQ
-- https://github.com/xvyaward/owq
-- https://github.com/TimDettmers/bitsandbytes
-- https://github.com/mobiusml/hqq
-- [https://github.com/spcl/QuaRot](https://github.com/spcl/QuaRot)
-- [https://github.com/locuslab/wanda](https://github.com/locuslab/wanda)
-- [https://github.com/EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
-- [https://github.com/facebookresearch/SpinQuant](https://github.com/facebookresearch/SpinQuant)
-- [https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ](https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ)
+- [IST-DASLab/QUIK](https://github.com/IST-DASLab/QUIK)
+- [Vahe1994/SpQR](https://github.com/Vahe1994/SpQR)
+- [ilur98/DGQ](https://github.com/ilur98/DGQ)
+- [xvyaward/owq](https://github.com/xvyaward/owq)
+- [TimDettmers/bitsandbytes](https://github.com/TimDettmers/bitsandbytes)
+- [mobiusml/hqq](https://github.com/mobiusml/hqq)
+- [spcl/QuaRot](https://github.com/spcl/QuaRot)
+- [locuslab/wanda](https://github.com/locuslab/wanda)
+- [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness)
+- [facebookresearch/SpinQuant](https://github.com/facebookresearch/SpinQuant)
+- [Intelligent-Computing-Lab-Yale/TesseraQ](https://github.com/Intelligent-Computing-Lab-Yale/TesseraQ)
-## Star 历史
+
-[](https://star-history.com/#ModelTC/llmc&Timeline)
+## 🌟 Star 历史
-## 引用
+[](https://star-history.com/#ModelTC/llmc&Timeline)
-## 引用
+## ✏️ 引用
-如果您认为我们的 LLM-QBench 论文/llmc 工具对您的研究有用或相关,请务必引用我们的论文:
+如果您觉得本工具包或相关论文对您的研究有帮助,请引用:
```
-@misc{llmc,
- author = {llmc contributors},
- title = {llmc: Towards Accurate and Efficient LLM Compression},
- year = {2024},
- publisher = {GitHub},
- journal = {GitHub repository},
- howpublished = {\url{https://github.com/ModelTC/llmc}},
-}
-
-@misc{gong2024llmqbench,
- title={LLM-QBench: A Benchmark Towards the Best Practice for Post-training Quantization of Large Language Models},
- author={Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Yunchen Zhang and Xianglong Liu and Dacheng Tao},
- year={2024},
- eprint={2405.06001},
- archivePrefix={arXiv},
- primaryClass={cs.LG}
-}
-
-@misc{gong2024llmcbenchmarkinglargelanguage,
- title={LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit},
- author={Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Chentao Lv and Yunchen Zhang and Xianglong Liu and Dacheng Tao},
- year={2024},
- eprint={2405.06001},
- archivePrefix={arXiv},
- primaryClass={cs.LG},
- url={https://arxiv.org/abs/2405.06001},
+@inproceedings{DBLP:conf/emnlp/GongYGHLZT024,
+ author = {Ruihao Gong and Yang Yong and Shiqiao Gu and Yushi Huang and Chengtao Lv and Yunchen Zhang and Dacheng Tao and Xianglong Liu},
+ title = {LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit},
+ booktitle = {EMNLP (Industry Track)},
+ year = {2024},
+ pages = {132--152},
+ url = {https://aclanthology.org/2024.emnlp-industry.12}
}
```
diff --git a/docs/en/source/conf.py b/docs/en/source/conf.py
index 879d62c58..7e78c0fb8 100644
--- a/docs/en/source/conf.py
+++ b/docs/en/source/conf.py
@@ -1,17 +1,26 @@
# Configuration file for the Sphinx documentation builder.
#
-# For the full list of built-in configuration values, see the documentation:
-# https://www.sphinx-doc.org/en/master/usage/configuration.html
+# This file adopts the theme and basic settings used by the Lightx2v docs
+# but keeps the llmc-specific information from the original configuration.
+# -----------------------------------------------------------------------------
-# -- Project information -----------------------------------------------------
-# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+import os
+import sys
+from typing import List
+
+# -- Path setup --------------------------------------------------------------
+# Add the project root (three levels above this source directory) so autodoc
+# can find the llmc modules.
+ROOT_DIR = os.path.abspath(os.path.join(__file__, "../../../.."))
+sys.path.append(ROOT_DIR)
+# -- Project information -----------------------------------------------------
project = "llmc"
copyright = "2024, llmc contributors"
author = "ModelTC"
release = "1.0.0"
-github_url = f"https://github.com/ModelTC/llmc"
+# GitHub repository ----------------------------------------------------------
+github_url = "https://github.com/ModelTC/llmc"
html_context = {
"display_github": True,
@@ -20,50 +29,86 @@
"github_version": "main",
"conf_py_path": "/docs/en/source/", # Path in the checkout to the docs root
}
-html_theme_options = {
- "github_url": github_url,
- "doc_items": {
- "paper": "https://arxiv.org/abs/2405.06001",
- "institution": "https://github.com/ModelTC",
- },
- "logo": "images/logo/llmc.svg",
- "logo_dark": "images/logo/llmc.svg",
- "logo_icon": "images/logo/llmc.svg",
-}
-
# -- General configuration ---------------------------------------------------
-# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
extensions = [
- "myst_parser",
- "sphinx.ext.autodoc",
- "sphinx.ext.viewcode",
"sphinx.ext.napoleon",
- "sphinxcontrib.contentui",
+ "sphinx.ext.viewcode",
+ "sphinx.ext.intersphinx",
+ "sphinx.ext.autodoc",
+ "sphinx.ext.autosummary",
+ "myst_parser",
+ "sphinx_copybutton",
"sphinx.ext.doctest",
"sphinx.ext.mathjax",
"sphinx.ext.ifconfig",
- "sphinx-prompt",
- "sphinxcontrib.jquery",
- "sphinx.ext.autosectionlabel",
"sphinx.ext.githubpages",
- "sphinx.ext.intersphinx",
+ "sphinx.ext.autosectionlabel",
"sphinxcontrib.katex",
- "sphinx_copybutton",
+ "sphinxcontrib.contentui",
]
-templates_path = ["_templates"]
-exclude_patterns = []
+templates_path: List[str] = ["_templates"]
+exclude_patterns: List[str] = []
language = "en"
+# Exclude the prompt "$" when copying code blocks --------------------------
+copybutton_prompt_text = r"\$ "
+copybutton_prompt_is_regexp = True
+
# -- Options for HTML output -------------------------------------------------
-# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+html_title = project
+html_theme = "sphinx_book_theme"
+html_logo = "images/logo/llmc.svg"
+html_static_path = ["_static"]
+
+# Theme options compatible with sphinx_book_theme / pydata-sphinx-theme
+html_theme_options = {
+ "path_to_docs": "docs/en/source",
+ "repository_url": github_url,
+ "use_repository_button": True,
+ "logo": {
+ "text": "LLMC",
+ "image_light": "images/logo/llmc.svg",
+ "image_dark": "images/logo/llmc.svg",
+ },
+ "doc_items": {
+ "paper": "https://arxiv.org/abs/2405.06001",
+ "institution": "https://github.com/ModelTC",
+ },
+}
+
+# -- Intersphinx mapping (optional) -----------------------------------------
+intersphinx_mapping = {
+ "python": ("https://docs.python.org/3", {}),
+ "sphinx": ("https://www.sphinx-doc.org/en/master", {}),
+}
+# -- Mock heavy external dependencies ---------------------------------------
+autodoc_mock_imports = [
+ "torch",
+ "transformers",
+ "sentencepiece",
+ "tensorizer",
+]
-html_theme = "trojanzoo_sphinx_theme"
+# Remove base-class note in generated docs ----------------------------------
+from sphinx.ext import autodoc # noqa: E402, isort: skip
-html_static_path = ["_static"]
+class MockedClassDocumenter(autodoc.ClassDocumenter):
+ """Remove note about base class when a class is derived from object."""
+
+ def add_line(self, line: str, source: str, *lineno: int) -> None:
+ if line == " Bases: :py:class:`object`":
+ return
+ super().add_line(line, source, *lineno)
+
+autodoc.ClassDocumenter = MockedClassDocumenter
+
+# -- Customisation hooks -----------------------------------------------------
-source_suffix = [".rst", ".md"]
+def setup(app):
+ """Optional Sphinx setup hooks."""
+ pass
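
A quick way to exercise the reworked English configuration locally is a plain `sphinx-build`; this is a sketch that assumes the docs dependencies (including `sphinx-book-theme`) are installed, and the output directory is arbitrary:

```shell
# Build the English docs against the new sphinx_book_theme configuration.
sphinx-build -b html docs/en/source docs/en/build/html
```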
diff --git a/docs/zh_cn/source/conf.py b/docs/zh_cn/source/conf.py
index 9b1ae0785..f6ef270a4 100644
--- a/docs/zh_cn/source/conf.py
+++ b/docs/zh_cn/source/conf.py
@@ -1,69 +1,110 @@
-# Configuration file for the Sphinx documentation builder.
-#
-# For the full list of built-in configuration values, see the documentation:
-# https://www.sphinx-doc.org/en/master/usage/configuration.html
+# Configuration file for the Sphinx documentation builder (中文文档).
+# -----------------------------------------------------------------------------
+# 参考 Lightx2v 样式,把原先 trojanzoo_sphinx_theme 改为 sphinx_book_theme,
+# 并修正 logo 配置格式。
-# -- Project information -----------------------------------------------------
-# https://www.sphinx-doc.org/en/master/usage/configuration.html#project-information
+import os
+import sys
+from typing import List
+# -- Path setup --------------------------------------------------------------
+ROOT_DIR = os.path.abspath(os.path.join(__file__, "../../../.."))
+sys.path.append(ROOT_DIR)
+
+# -- 项目信息 ---------------------------------------------------------------
project = "llmc"
copyright = "2024, llmc contributors"
author = "ModelTC"
release = "1.0.0"
-github_url = f"https://github.com/ModelTC/llmc"
+# GitHub 信息 ---------------------------------------------------------------
+github_url = "https://github.com/ModelTC/llmc"
html_context = {
"display_github": True,
"github_user": author,
"github_repo": "llmc",
"github_version": "main",
- "conf_py_path": "/docs/zh_cn/source/", # Path in the checkout to the docs root
-}
-html_theme_options = {
- "github_url": github_url,
- "doc_items": {
- "paper": "https://arxiv.org/abs/2405.06001",
- "institution": "https://github.com/ModelTC",
- },
- "logo": "images/logo/llmc.svg",
- "logo_dark": "images/logo/llmc.svg",
- "logo_icon": "images/logo/llmc.svg",
+ "conf_py_path": "/docs/zh_cn/source/", # 文档根路径
}
-
-# -- General configuration ---------------------------------------------------
-# https://www.sphinx-doc.org/en/master/usage/configuration.html#general-configuration
-
+# -- 通用配置 ----------------------------------------------------------------
extensions = [
- "myst_parser",
- "sphinx.ext.autodoc",
- "sphinx.ext.viewcode",
"sphinx.ext.napoleon",
- "sphinxcontrib.contentui",
+ "sphinx.ext.viewcode",
+ "sphinx.ext.intersphinx",
+ "sphinx.ext.autodoc",
+ "sphinx.ext.autosummary",
+ "myst_parser",
+ "sphinx_copybutton",
"sphinx.ext.doctest",
"sphinx.ext.mathjax",
"sphinx.ext.ifconfig",
- "sphinx-prompt",
- "sphinxcontrib.jquery",
- "sphinx.ext.autosectionlabel",
"sphinx.ext.githubpages",
- "sphinx.ext.intersphinx",
+ "sphinx.ext.autosectionlabel",
"sphinxcontrib.katex",
- "sphinx_copybutton",
+ "sphinxcontrib.contentui",
]
-templates_path = ["_templates"]
-exclude_patterns = []
+templates_path: List[str] = ["_templates"]
+exclude_patterns: List[str] = []
+
+language = "zh_CN"
+
+# 复制代码块时去除shell提示符 ---------------------------------------------
+copybutton_prompt_text = r"\$ "
+copybutton_prompt_is_regexp = True
+
+# -- HTML 输出选项 -----------------------------------------------------------
+html_title = project
+html_theme = "sphinx_book_theme"
+html_logo = "images/logo/llmc.svg"
+html_static_path = ["_static"]
+
+html_theme_options = {
+ "path_to_docs": "docs/zh_cn/source",
+ "repository_url": github_url,
+ "use_repository_button": True,
+ "logo": {
+ "text": "LLMC",
+ "image_light": "images/logo/llmc.svg",
+ "image_dark": "images/logo/llmc.svg",
+ },
+ "doc_items": {
+ "paper": "https://arxiv.org/abs/2405.06001",
+ "institution": "https://github.com/ModelTC",
+ },
+}
+
+# -- Intersphinx -------------------------------------------------------------
+intersphinx_mapping = {
+ "python": ("https://docs.python.org/3", {}),
+ "sphinx": ("https://www.sphinx-doc.org/en/master", {}),
+}
-language = "cn"
+# -- Mock 外部依赖 -----------------------------------------------------------
+autodoc_mock_imports = [
+ "torch",
+ "transformers",
+ "sentencepiece",
+ "tensorizer",
+]
-# -- Options for HTML output -------------------------------------------------
-# https://www.sphinx-doc.org/en/master/usage/configuration.html#options-for-html-output
+# -- 自定义处理 -------------------------------------------------------------
+from sphinx.ext import autodoc # noqa: E402, isort: skip
+class MockedClassDocumenter(autodoc.ClassDocumenter):
+ """移除“Bases: object”行。"""
-html_theme = "trojanzoo_sphinx_theme"
+ def add_line(self, line: str, source: str, *lineno: int) -> None:
+ if line == " Bases: :py:class:`object`":
+ return
+ super().add_line(line, source, *lineno)
-html_static_path = ["_static"]
+autodoc.ClassDocumenter = MockedClassDocumenter
+
+# -- 额外钩子 ---------------------------------------------------------------
-source_suffix = [".rst", ".md"]
+def setup(app):
+ """可选的 Sphinx setup。"""
+ pass
diff --git a/requirements/docs.txt b/requirements/docs.txt
index a15a4fc07..1c8eec42f 100644
--- a/requirements/docs.txt
+++ b/requirements/docs.txt
@@ -1,15 +1,7 @@
-docutils
-modelindex
-myst-parser
-sphinx
-sphinx-copybutton
-sphinx-design
-sphinx-notfound-page
-sphinx-tabs
-sphinxcontrib-jquery
-tabulate
-sphinxcontrib.contentui
--e git+https://github.com/ain-soph/trojanzoo_sphinx_theme.git#egg=trojanzoo_sphinx_theme
-sphinx-prompt
-sphinxcontrib-katex
-sphinx-copybutton
+sphinx == 6.2.1
+sphinx-book-theme == 1.0.1
+sphinx-copybutton == 0.5.2
+myst-parser == 2.0.0
+sphinx-argparse
+sphinxcontrib.redoc
+sphinxcontrib.openapi
+sphinxcontrib-katex
+sphinxcontrib.contentui
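
With the pinned requirements above, a hedged sketch of installing them and building the Chinese tree against the matching configuration change; the output path is an assumption, not part of this patch:

```shell
pip install -r requirements/docs.txt
sphinx-build -b html docs/zh_cn/source docs/zh_cn/build/html
```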