Adds get_quantization_layer_structure hooks for GPTQ #2462
Conversation
Summary of Changes

Hello @JyotinderSingh, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces a significant architectural improvement by refactoring how KerasHub models expose their internal layer structure, specifically for quantization, by implementing `get_quantization_layer_structure` hooks.
Code Review
This pull request introduces get_quantization_layer_structure hooks to various models, which is a good step towards decoupling quantization logic from the core library and improving maintainability. The implementations for specific models like Gemma and GPT-2 correctly encapsulate model-specific pre-processing logic.
I've left a couple of minor suggestions to improve code quality:
- One comment addresses code duplication between `CausalLM` and `MaskedLM` for better long-term maintainability (one possible shape of that refactor is sketched after this review).
- Another comment points out an unnecessary local import that can be removed to adhere to standard Python style conventions.
The added tests are thorough and correctly validate the new functionality. Overall, this is a solid contribution.
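On the duplication point, here is one hedged way the shared discovery logic could be factored out so both task classes delegate to a single helper. The `_backbone_quantization_structure` helper and the stand-in task classes below are illustrative assumptions, not the actual keras-hub code:

```python
# Hypothetical sketch: both task classes delegate layer discovery to their
# backbone through one shared helper instead of duplicating the traversal.
# All names here are assumptions for illustration only.
def _backbone_quantization_structure(task):
    backbone = task.backbone
    if hasattr(backbone, "get_quantization_layer_structure"):
        return backbone.get_quantization_layer_structure()
    return None


class CausalLM:  # stand-in for the real keras_hub task class
    def __init__(self, backbone):
        self.backbone = backbone

    def get_quantization_layer_structure(self):
        return _backbone_quantization_structure(self)


class MaskedLM:  # stand-in for the real keras_hub task class
    def __init__(self, backbone):
        self.backbone = backbone

    def get_quantization_layer_structure(self):
        return _backbone_quantization_structure(self)
```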
divyashreepathihalli
left a comment
LGTM
Description of the change
This change shifts the logic for identifying a KerasHub model's layer structure (such as embedding layers and transformer blocks) to the `keras-hub` layer itself. This is achieved by implementing `get_quantization_layer_structure` hooks.

The original setup placed this layer discovery logic within the `keras` core library. This created a dependency inversion issue, requiring `keras` to know the internal implementation details of models in `keras-hub`. This made the code difficult to maintain, prone to becoming outdated, and harder to understand.

Reference
keras-team/keras#21894
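To make the hook concrete, here is a minimal, hedged sketch of what a backbone-level `get_quantization_layer_structure` implementation might look like. The toy backbone and the dictionary keys (`"embedding"`, `"transformer_blocks"`) are assumptions for illustration; this PR page does not show the exact structure the `keras` GPTQ path consumes.

```python
import keras


class ToyBackbone(keras.Model):
    """Hypothetical stand-in for a KerasHub backbone (e.g. Gemma, GPT-2)."""

    def __init__(self, vocab_size=128, hidden_dim=32, num_layers=2, **kwargs):
        super().__init__(**kwargs)
        self.token_embedding = keras.layers.Embedding(vocab_size, hidden_dim)
        # Dense layers stand in for real transformer blocks.
        self.transformer_layers = [
            keras.layers.Dense(hidden_dim, name=f"block_{i}")
            for i in range(num_layers)
        ]

    def call(self, inputs):
        x = self.token_embedding(inputs)
        for block in self.transformer_layers:
            x = block(x)
        return x

    def get_quantization_layer_structure(self):
        # Expose the pieces a GPTQ pass needs without keras having to know
        # this model's internals: the input embedding and the ordered list
        # of blocks to quantize. The dict layout is an assumption.
        return {
            "embedding": self.token_embedding,
            "transformer_blocks": list(self.transformer_layers),
        }


backbone = ToyBackbone()
structure = backbone.get_quantization_layer_structure()
print(structure["embedding"], len(structure["transformer_blocks"]))
```

With a hook like this, the quantization code in `keras` can ask any model for its own structure rather than hard-coding per-model layer discovery, which is the decoupling described above.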
Checklist