Refactor module replacement #1153
base: main
Conversation
Signed-off-by: yiliu30 <yi4.liu@intel.com>
```python
from auto_round.utils import LazyImport, logger

BUILTIN_MODULES = {
```
Hi @WeiweiZhang1, please be aware of this change: for any newly added built-in replacement module, we need to 1) implement the new module by inheriting from ReplacementModuleBase, and 2) add it to BUILTIN_MODULES manually.
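A minimal sketch of that two-step workflow, assuming `BUILTIN_MODULES` is a dict keyed by the original class name and `ReplacementModuleBase` subclasses `nn.Module`; the import path, `OriginalMLP`, and `MyMLPReplacement` are hypothetical, while `original_module_class` is the hook this PR uses elsewhere:

```python
import torch.nn as nn

# Hypothetical import path; adjust to wherever this PR defines these symbols.
from auto_round.module_replacement import BUILTIN_MODULES, ReplacementModuleBase


class MyMLPReplacement(ReplacementModuleBase):
    """Hypothetical replacement for a model's `OriginalMLP` block."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.new_gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act_fn = nn.GELU()

    @classmethod
    def original_module_class(cls) -> str:
        # Name of the class this module replaces; this PR calls
        # `module.original_module_class()` to recover it later.
        return "OriginalMLP"

    def forward(self, x):
        return self.down_proj(self.act_fn(self.new_gate_proj(x)))


# Step 2: register the replacement manually.
BUILTIN_MODULES["OriginalMLP"] = MyMLPReplacement
```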
LGTM
If it's not targeted at 0.9.3, then please hold off on the merge.
It's not targeting 0.9.3, but we'd like to merge it this week.
```python
        self.act_fn = nn.GELU()

    def forward(self, x):
        gate = self.new_gate_proj(x)
```
I remember there was a warning reminding users that transformers could no longer run these models. Since this PR adds support for that, has the warning been deleted? Please double-check.
I'm not aware of the background; could you point me to the code? Thanks.
I couldn't find the warnings. It's better to run one model and double-check the log for any warnings like this.
I did not see any such suspicious warnings:

```
2025-12-23 05:58:48 WARNING modeling_utils.py L4793: `torch_dtype` is deprecated! Use `dtype` instead!
Converting model: 0it [00:00, ?it/s]2025-12-23 05:58:49 WARNING configuration_utils.py L430: `torch_dtype` is deprecated! Use `dtype` instead!
2025-12-23 05:58:50 WARNING mx.py L159: MXFP quantization is still in experimental stage, the inference speed might be slow.
2025-12-23 05:59:08 WARNING utils.py L2077: The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
(ao) yliu7@mlp-dgx-01:examples$
```
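For reference, a smoke test along these lines should reproduce a log like the one above. This is a sketch: the model choice is arbitrary, and the exact `AutoRound` arguments and the `"MXFP4"` scheme string are assumptions that may differ by version.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # any small causal LM works for a log check
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# "MXFP4" is assumed here to exercise the MXFP path seen in the log above;
# tiny iters/nsamples keep the smoke test fast. Inspect the log output for
# the old replacement warning: it should no longer appear.
autoround = AutoRound(model, tokenizer, scheme="MXFP4", iters=1, nsamples=1)
autoround.quantize()
```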
The warning was removed in PR #1067.
```python
for name, module in model.named_modules():
    class_name = (
        module.original_module_class()
        if hasattr(module, "original_module_class") and callable(module.original_module_class)
```
If the module has the attribute original_module_class, then it has already been replaced. Why do we need to replace it again?
Resolves #899.
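For context, a sketch of how the quoted condition reads; the fallback branch is my assumption, since the diff is truncated after the `if`:

```python
def resolve_class_name(module) -> str:
    # A module swapped in by a previous replacement pass exposes
    # original_module_class(), so match it by the class it replaced
    # rather than treating it as a fresh candidate.
    if hasattr(module, "original_module_class") and callable(module.original_module_class):
        return module.original_module_class()
    # Fallback (assumed): an untouched module is matched by its own class name.
    return module.__class__.__name__
```

This way a second pass over the model resolves already-replaced modules to their original class names instead of missing or re-wrapping them.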