
Conversation

@yiliu30 (Contributor) commented Dec 17, 2025

resolve #899

  • Replace GPT-OSS/LLAMA
  • UT

Signed-off-by: yiliu30 <yi4.liu@intel.com>
@yiliu30 yiliu30 requested a review from wenhuach21 December 19, 2025 03:57
@yiliu30 yiliu30 requested a review from wenhuach21 December 19, 2025 09:23
@yiliu30 yiliu30 requested a review from WeiweiZhang1 December 23, 2025 03:42

from auto_round.utils import LazyImport, logger

BUILTIN_MODULES = {
@yiliu30 (Contributor, Author):

Hi @WeiweiZhang1, please be aware of this change: for any newly added built-in replacement module, we need to 1) implement the new module by inheriting from ReplacementModuleBase, and 2) add it to BUILTIN_MODULES manually.
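The two-step contract described above can be sketched as follows. This is a minimal illustration inferred from this thread, not the actual auto_round API: the exact shape of `ReplacementModuleBase`, the `original_module_class` method, the `BUILTIN_MODULES` mapping, and the `SomeUpstreamMLP` name are all assumptions, and `torch.nn.Module` inheritance is omitted to keep the sketch self-contained.

```python
# Sketch of the built-in replacement-module contract (assumed API, see lead-in).


class ReplacementModuleBase:
    """Base class every built-in replacement module must inherit from."""

    @classmethod
    def original_module_class(cls) -> str:
        # Name of the upstream module class this replacement stands in for.
        raise NotImplementedError


class MyNewMLP(ReplacementModuleBase):
    """Step 1: implement the new module by inheriting from the base class."""

    @classmethod
    def original_module_class(cls) -> str:
        return "SomeUpstreamMLP"  # hypothetical upstream class name


# Step 2: add the new module to BUILTIN_MODULES manually.
BUILTIN_MODULES = {
    MyNewMLP.original_module_class(): MyNewMLP,
}
```

Keying the registry by the original class name keeps the replacement step a simple dictionary lookup over `model.named_modules()`.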

@n1ck-guo (Contributor):

LGTM

@wenhuach21 (Contributor):

If it's not targeted to 0.9.3, then please hold off on the merge.

@yiliu30 (Contributor, Author) commented Dec 23, 2025

If it's not targeted to 0.9.3, then please hold off on the merge.

It’s not targeting 0.9.3, but we’d like to merge it this week.
Since we already have an RC branch for 0.9.3, would it be okay to merge this into main? cc @chensuyue @thuang6

@yiliu30 yiliu30 added the ready only add when the PR is ready to merge label Dec 23, 2025
        self.act_fn = nn.GELU()

    def forward(self, x):
        gate = self.new_gate_proj(x)
@wenhuach21 (Contributor) commented Dec 23, 2025:
I remember there was a warning reminding users that transformers could no longer run these models. This PR adds support for them, so has that warning been deleted? Please double-check.

@yiliu30 (Contributor, Author):

I'm not aware of the background; could you point me to the code? Thanks.

Contributor:

I couldn't find the warning. It would be better to run one model and double-check the log for warnings like this.

@yiliu30 (Contributor, Author):

I did not see any such suspicious warnings:

2025-12-23 05:58:48 WARNING modeling_utils.py L4793: `torch_dtype` is deprecated! Use `dtype` instead!
Converting model: 0it [00:00, ?it/s]2025-12-23 05:58:49 WARNING configuration_utils.py L430: `torch_dtype` is deprecated! Use `dtype` instead!
2025-12-23 05:58:50 WARNING mx.py L159: MXFP quantization is still in experimental stage, the inference speed might be slow.
2025-12-23 05:59:08 WARNING utils.py L2077: The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.

Contributor:

The warning was removed in PR #1067.

@yiliu30 yiliu30 removed the ready only add when the PR is ready to merge label Dec 23, 2025
@yiliu30 yiliu30 added the WIP label Dec 23, 2025
for name, module in model.named_modules():
    class_name = (
        module.original_module_class()
        if hasattr(module, "original_module_class") and callable(module.original_module_class)
Contributor:

If the module has the attribute original_module_class, it has already been replaced. Why do we need to replace it again?
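The question above is about the same attribute check. One way the check could be used to avoid double replacement (a hypothetical sketch, not the PR's actual code; the class and helper names here are illustrative) is to treat the presence of a callable `original_module_class` as "already replaced" and skip such modules:

```python
# Hypothetical sketch: skip modules that were already replaced, detected via
# the original_module_class attribute. Names are illustrative, not auto_round.


class ReplacedMLP:
    """Stands in for a module that has already been replaced."""

    @classmethod
    def original_module_class(cls) -> str:
        return "OriginalMLP"


class OriginalMLP:
    """Stands in for an untouched upstream module."""


def modules_to_replace(named_modules):
    """Yield only the (name, module) pairs that still need replacement."""
    for name, module in named_modules:
        already_replaced = hasattr(module, "original_module_class") and callable(
            module.original_module_class
        )
        if already_replaced:
            continue  # this module was replaced earlier; leave it alone
        yield name, module


pending = list(modules_to_replace([("mlp", ReplacedMLP()), ("mlp2", OriginalMLP())]))
```

Under this sketch only `mlp2` would be scheduled for replacement, which is the behavior the review question is asking about.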



Successfully merging this pull request may close these issues.

Refactor modelling replacement code

5 participants