Refactor module replacement #1153
base: main
Conversation
Signed-off-by: yiliu30 <yi4.liu@intel.com>
```python
from auto_round.utils import LazyImport, logger

BUILTIN_MODULES = {
```
Hi @WeiweiZhang1, please be aware of this change: for any newly added built-in replacement module, we need to 1) implement the new module by inheriting from ReplacementModuleBase, and 2) add it to BUILTIN_MODULES manually.
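A minimal sketch of that two-step workflow, assuming `BUILTIN_MODULES` is a dict keyed by the original class name and `ReplacementModuleBase` subclasses `nn.Module`; the import path, `OriginalMLP`, and `MyMLPReplacement` are hypothetical, while `original_module_class` is the hook this PR uses elsewhere:

```python
import torch.nn as nn

# Hypothetical import path; adjust to wherever this PR defines these symbols.
from auto_round.module_replacement import BUILTIN_MODULES, ReplacementModuleBase


class MyMLPReplacement(ReplacementModuleBase):
    """Hypothetical replacement for a model's `OriginalMLP` block."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.new_gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)
        self.act_fn = nn.GELU()

    @classmethod
    def original_module_class(cls) -> str:
        # Name of the class this module replaces; this PR calls
        # `module.original_module_class()` to recover it later.
        return "OriginalMLP"

    def forward(self, x):
        return self.down_proj(self.act_fn(self.new_gate_proj(x)))


# Step 2: register the replacement manually.
BUILTIN_MODULES["OriginalMLP"] = MyMLPReplacement
```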
LGTM
If it's not targeted at 0.9.3, then please hold off on the merge.
It's not targeting 0.9.3, but we'd like to merge it this week.
```python
        self.act_fn = nn.GELU()

    def forward(self, x):
        gate = self.new_gate_proj(x)
```
I remember there was a warning reminding users that transformers could no longer run these models. Since this PR adds support for that, has the warning been deleted? Please double-check.
I'm not aware of the background; could you point me to the code? Thanks.
I couldn't find the warnings. It's better to run one model and double-check the log for any warnings like this.
I did not see any such suspicious warnings:

```
2025-12-23 05:58:48 WARNING modeling_utils.py L4793: `torch_dtype` is deprecated! Use `dtype` instead!
Converting model: 0it [00:00, ?it/s]2025-12-23 05:58:49 WARNING configuration_utils.py L430: `torch_dtype` is deprecated! Use `dtype` instead!
2025-12-23 05:58:50 WARNING mx.py L159: MXFP quantization is still in experimental stage, the inference speed might be slow.
2025-12-23 05:59:08 WARNING utils.py L2077: The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
(ao) yliu7@mlp-dgx-01:examples$
```
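For reference, a smoke test along these lines should reproduce a log like the one above. This is a sketch: the model choice is arbitrary, and the exact `AutoRound` arguments and the `"MXFP4"` scheme string are assumptions that may differ by version.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from auto_round import AutoRound

model_name = "Qwen/Qwen2.5-0.5B-Instruct"  # any small causal LM works for a log check
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# "MXFP4" is assumed here to exercise the MXFP path seen in the log above;
# tiny iters/nsamples keep the smoke test fast. Inspect the log output for
# the old replacement warning: it should no longer appear.
autoround = AutoRound(model, tokenizer, scheme="MXFP4", iters=1, nsamples=1)
autoround.quantize()
```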
The warning was removed in PR #1067.
```python
for name, module in model.named_modules():
    class_name = (
        module.original_module_class()
        if hasattr(module, "original_module_class") and callable(module.original_module_class)
```
If the module has the attribute original_module_class, then it has already been replaced. Why do we need to replace it again?
Resolves #899.
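For context, a sketch of how the quoted condition reads; the fallback branch is my assumption, since the diff is truncated after the `if`:

```python
def resolve_class_name(module) -> str:
    # A module swapped in by a previous replacement pass exposes
    # original_module_class(), so match it by the class it replaced
    # rather than treating it as a fresh candidate.
    if hasattr(module, "original_module_class") and callable(module.original_module_class):
        return module.original_module_class()
    # Fallback (assumed): an untouched module is matched by its own class name.
    return module.__class__.__name__
```

This way a second pass over the model resolves already-replaced modules to their original class names instead of missing or re-wrapping them.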