[JAX] Add support for per-layer config customization and composable configs by bkowalskiINTEL · Pull Request #2481 · intel/neural-compressor

bkowalskiINTEL · 2026-05-27T15:10:27Z

No description provided.

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

for more information, see https://pre-commit.ci

Copilot

Pull request overview

Adds JAX quantization support for (1) per-layer filtering via include/exclude patterns and (2) composing multiple quantization configs, while refactoring quantize flow to centralize model wrapping and improve (de)serialization behavior.

Changes:

Add include / exclude layer filters to StaticQuantConfig and DynamicQuantConfig, and apply them when generating model info.
Add ComposableConfig support in quantization config JSON serialization/deserialization and in quantize_model() mapping construction.
Refactor JAX static/dynamic algorithms to operate on per-layer config mappings and move wrapper application into quantize_model().

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| `test/jax/test_config_on_vit.py` | Adds a ViT config-filter demonstration (currently executes at import time). |
| `neural_compressor/jax/quantization/saving.py` | Adds ComposableConfig (de)serialization and updates deserialization preparation to handle multiple sub-configs and include/exclude. |
| `neural_compressor/jax/quantization/quantize.py` | Adds ComposableConfig mapping merge logic, enforces static-first algorithm application, and centralizes model wrapping. |
| `neural_compressor/jax/quantization/config.py` | Introduces include/exclude filtering and switches model info to use layer paths (with filtering). |
| `neural_compressor/jax/algorithms/static.py` | Updates static quantization to prepare only layers selected by configs_mapping and to support per-layer params. |
| `neural_compressor/jax/algorithms/dynamic.py` | Updates dynamic quantization to prepare only layers selected by configs_mapping and to support per-layer params. |
| `examples/jax/keras/gemma/quantization.py` | Updates Gemma quantization example (currently contains an early `exit()`). |

+    def _matches(pattern: str) -> bool:
+        if pattern == class_name:
+            return True
+        return re.search(pattern, layer_id) is not None
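
For readers following the filter logic, here is a minimal sketch of how a matcher like `_matches` could drive include/exclude selection. The helper names `layer_matches` and `is_selected`, the sample layer paths, and the rule that `exclude` vetoes `include` are all assumptions made for illustration; the PR's actual precedence may differ.

```python
import re
from typing import List, Optional


def layer_matches(pattern: str, class_name: str, layer_id: str) -> bool:
    """True if the pattern equals the class name or regex-matches the layer path."""
    if pattern == class_name:
        return True
    return re.search(pattern, layer_id) is not None


def is_selected(
    class_name: str,
    layer_id: str,
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> bool:
    """Hypothetical helper: include narrows the set first, then exclude vetoes."""
    if include is not None and not any(
        layer_matches(p, class_name, layer_id) for p in include
    ):
        return False
    if exclude and any(layer_matches(p, class_name, layer_id) for p in exclude):
        return False
    return True


# Made-up (class name, layer path) pairs standing in for real model info.
layers = [
    ("Dense", "encoder/block_0/mlp/dense_1"),
    ("Dense", "encoder/block_1/mlp/dense_1"),
    ("EinsumDense", "encoder/block_0/attention/query"),
    ("Dense", "head/classifier"),
]

selected = [
    layer_id
    for class_name, layer_id in layers
    if is_selected(class_name, layer_id, include=["Dense"], exclude=[r"^head/"])
]
print(selected)
```

Note that `"Dense"` selects by exact class name here, so `EinsumDense` is not picked up; the regex branch only runs against the layer path.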


+    # Build configs_mapping - handle ComposableConfig by calling sub-configs individually
+    if isinstance(quant_config, ComposableConfig):
+        configs_mapping = _build_configs_mapping_composable(model, quant_config)
+    else:
+        model_info = quant_config.get_model_info(model)
+        configs_mapping = quant_config.to_config_mapping(model_info=model_info)
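
To make the composable branch concrete, the following sketch merges per-layer mappings produced by several sub-configs. It does not reproduce `_build_configs_mapping_composable`; the function `build_mapping_composable`, the callable sub-configs, and the "later sub-config wins" rule are assumptions chosen only to illustrate the idea.

```python
from typing import Callable, Dict, List, Tuple

ModelInfo = List[Tuple[str, str]]  # (layer path, class name)
ConfigMapping = Dict[str, dict]    # layer path -> per-layer params


def build_mapping_composable(
    model_info: ModelInfo,
    sub_configs: List[Callable[[ModelInfo], ConfigMapping]],
) -> ConfigMapping:
    """Hypothetical merge: each sub-config maps its own layers, later ones override."""
    merged: ConfigMapping = {}
    for to_mapping in sub_configs:
        merged.update(to_mapping(model_info))
    return merged


def static_cfg(info: ModelInfo) -> ConfigMapping:
    # Stand-in for a static config that selects every Dense layer.
    return {path: {"algo": "static_quant"} for path, cls in info if cls == "Dense"}


def dynamic_cfg(info: ModelInfo) -> ConfigMapping:
    # Stand-in for a dynamic config restricted to layers under "head/".
    return {path: {"algo": "dynamic_quant"} for path, _ in info if path.startswith("head/")}


model_info = [
    ("encoder/dense_0", "Dense"),
    ("head/classifier", "Dense"),
    ("encoder/norm", "LayerNormalization"),
]

merged = build_mapping_composable(model_info, [static_cfg, dynamic_cfg])
print(merged)
```

Layers that no sub-config selects (`encoder/norm` here) simply never appear in the merged mapping, so they stay unquantized.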


anko-intel

Some comments from offline review

anko-intel · 2026-06-02T11:17:55Z


    Returns:
-        keras.Model: The quantized model wrapped for inference.
+        keras.Model: The quantized model.


The previous comment seems more accurate.

anko-intel · 2026-06-02T12:32:12Z

+        causal_lm_make_replace_generate_function(model)
+
+    # Execute algorithms - static first to ensure calibration runs on original FP32 model
+    algo_order = sorted(algos_mapping.keys(), key=lambda name: (0 if name == STATIC_QUANT else 1))


Can we use a priority or something like that? I am not sure, but I think I saw such a list of algos somewhere.
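
One way the priority idea could look, as a sketch only: a lookup table replaces the inline `0 if name == STATIC_QUANT else 1` key. The string values of the constants, the `ALGO_PRIORITY` table, and `order_algorithms` are assumptions for illustration, not names from the repository.

```python
from typing import Iterable, List

# Assumed values; the real constants live in the library.
STATIC_QUANT = "static_quant"
DYNAMIC_QUANT = "dynamic_quant"

# Lower number runs first. Static goes first so calibration sees the FP32 model.
ALGO_PRIORITY = {STATIC_QUANT: 0, DYNAMIC_QUANT: 1}


def order_algorithms(names: Iterable[str]) -> List[str]:
    """Sort algorithm names by priority; unknown names keep their order at the end."""
    fallback = len(ALGO_PRIORITY)
    return sorted(names, key=lambda name: ALGO_PRIORITY.get(name, fallback))


print(order_algorithms([DYNAMIC_QUANT, STATIC_QUANT]))
print(order_algorithms(["other_algo", DYNAMIC_QUANT, STATIC_QUANT]))
```

Because `sorted` is stable, algorithms that share a priority keep their original relative order.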

anko-intel · 2026-06-02T12:39:03Z

-    iterate_over_layers(qmodel, operations, filter_function=lambda c: c in static_quant_mapping)
+    # Phase 1: Prepare layers and add observers
+    for layer in qmodel._flatten_layers():
+        if layer.__class__ not in static_quant_mapping:


Maybe we can check whether the class can be quantized at an earlier stage.

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 9 comments.

+    if quant_type == "composable":
+        sub_configs = [quant_config_from_json_object(cfg) for cfg in json_obj["configs"]]
+        result = sub_configs[0]
+        for cfg in sub_configs[1:]:
+            result = result + cfg
+        return result
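
The loop above folds the sub-configs together with `+`. A sketch of the same fold using `functools.reduce`, with an explicit check for an empty `configs` list (where `sub_configs[0]` would raise `IndexError`), might look like this. `ToyConfig`, `ToyComposable`, and `compose` are stand-ins invented for the example and say nothing about how the real config classes implement `__add__`.

```python
import operator
from dataclasses import dataclass
from functools import reduce
from typing import List, Union


@dataclass
class ToyConfig:
    """Stand-in for a single quantization config."""
    name: str

    def __add__(self, other: "Union[ToyConfig, ToyComposable]") -> "ToyComposable":
        return ToyComposable([self]) + other


@dataclass
class ToyComposable:
    """Stand-in for a composed config holding an ordered list of sub-configs."""
    configs: List[ToyConfig]

    def __add__(self, other: "Union[ToyConfig, ToyComposable]") -> "ToyComposable":
        extra = other.configs if isinstance(other, ToyComposable) else [other]
        return ToyComposable(self.configs + extra)


def compose(sub_configs: list):
    """Fold sub-configs left to right with `+`, rejecting an empty list up front."""
    if not sub_configs:
        raise ValueError("a composable config needs at least one sub-config")
    return reduce(operator.add, sub_configs)


combined = compose([ToyConfig("static"), ToyConfig("dynamic"), ToyConfig("extra")])
print([cfg.name for cfg in combined.configs])
```

With a single element, `reduce` returns that element unchanged, which matches the behavior of the loop when `configs` has length one.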


+    def _matches(pattern: str) -> bool:
+        if pattern == class_name:
+            return True
+        return re.search(pattern, layer_id) is not None


+        white_list = self.white_list
+        if white_list is None:
+            white_list = []
+        elif white_list == DEFAULT_WHITE_LIST:
+            from neural_compressor.jax.quantization.layers_dynamic import dynamic_quant_mapping
+
+            white_list = [layer_class.__name__ for layer_class in dynamic_quant_mapping.keys()]
        filter_result = []


+        white_list = self.white_list
+        if white_list is None:
+            white_list = []
+        elif white_list == DEFAULT_WHITE_LIST:
+            from neural_compressor.jax.quantization.layers_static import static_quant_mapping
+
+            white_list = [layer_class.__name__ for layer_class in static_quant_mapping.keys()]
        filter_result = []


+            include (Optional[List[str]]): List of layer class names or path patterns to include.
+                When set, only matching layers are quantized. Supports fnmatch patterns.
+            exclude (Optional[List[str]]): List of layer class names or path patterns to exclude.
+                Matching layers are skipped. Supports fnmatch patterns.
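
The docstring above mentions fnmatch patterns, while the `_matches` helper quoted in this review calls `re.search`. The two pattern languages behave differently, as this small comparison shows. The layer path is a made-up example, and converting globs with `fnmatch.translate` is only one possible way to reconcile them, not something the PR is known to do.

```python
import fnmatch
import re

layer_id = "encoder/block_0/mlp/dense_1"  # hypothetical layer path

# Glob semantics: the pattern must cover the whole string, `*` is a wildcard.
glob_whole = fnmatch.fnmatchcase(layer_id, "*dense*")
glob_partial = fnmatch.fnmatchcase(layer_id, "dense")

# Regex search semantics: any substring may match, `*` is a quantifier.
regex_partial = re.search("dense", layer_id) is not None
try:
    re.search("*dense*", layer_id)
    regex_accepts_glob = True
except re.error:
    # A leading `*` has nothing to repeat, so the glob is not a valid regex.
    regex_accepts_glob = False

# fnmatch.translate turns a glob into a regex with the same meaning.
translated = re.compile(fnmatch.translate("*dense*"))
translated_match = translated.match(layer_id) is not None

print(glob_whole, glob_partial, regex_partial, regex_accepts_glob, translated_match)
```

A user who writes `"*dense*"` based on the docstring would hit a regex error under `re.search`, so either the docstring or the matcher likely needs to change.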


-        lambda layer: layer.add_observers(),
-    ]
-    iterate_over_layers(qmodel, operations, filter_function=lambda c: c in static_quant_mapping)
    calib_function(qmodel)


+    # Build configs_mapping - handle ComposableConfig by calling sub-configs individually
+    if isinstance(quant_config, ComposableConfig):
+        configs_mapping = _build_configs_mapping_composable(model, quant_config)
+    else:
+        model_info = quant_config.get_model_info(model)


bkowalskiINTEL and others added 9 commits May 27, 2026 08:08

add include/exclude filtering and path-based get_model_info

aed7004

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

refactor algorithms to use per-layer config from configs_mapping

09c989c

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

support ComposableConfig and centralize model wrapping

3b1aa05

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

handle ComposableConfig serialization and per-layer deserialization

24ba943

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

add ViT config filtering integration test

bc8d9a6

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

update gemma example for composable config API

6b1f2b1

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

simplify deserialization to use class-based layer matching

8835c1c

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Fix configs merging

306286e

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

0f17bea

for more information, see https://pre-commit.ci

anko-intel reviewed May 29, 2026

Comment thread examples/jax/keras/gemma/quantization.py Outdated

anko-intel requested a review from Copilot June 2, 2026 09:07

Copilot started reviewing on behalf of anko-intel June 2, 2026 09:07

Copilot AI reviewed Jun 2, 2026

anko-intel reviewed Jun 2, 2026

bkowalskiINTEL added 2 commits June 3, 2026 06:33

Align dynamic algo to static algo and fix white list bug

c3cdd3f

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Merge branch 'master' into dev/bkowalsk/jax_composable_configs

f8b4ad3

bkowalskiINTEL commented Jun 3, 2026

Comment thread neural_compressor/jax/algorithms/dynamic.py Outdated

Remove debug stuff committed by accident

c8e0698

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

bkowalskiINTEL commented Jun 3, 2026

Comment thread neural_compressor/jax/algorithms/static.py Outdated

bkowalskiINTEL commented Jun 3, 2026

Comment thread neural_compressor/jax/algorithms/dynamic.py Outdated

bkowalskiINTEL commented Jun 3, 2026

Comment thread neural_compressor/jax/quantization/quantize.py Outdated

bkowalskiINTEL force-pushed the dev/bkowalsk/jax_composable_configs branch from 2c557c7 to c45031b June 3, 2026 14:03

bkowalskiINTEL added 6 commits June 3, 2026 07:03

Apply suggestion from @bkowalskiINTEL

56176ef

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Apply suggestions from code review

690bb51

Co-authored-by: Bartosz Kowalski <bartosz.kowalski@intel.com> Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Apply suggestion from @bkowalskiINTEL

eb4c879

Fixes missing import Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Remove redundant op support checks in static/dynamic algorithms

c45031b

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Modify docstring for static_quantize

f4119e7

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Comments cleanup

5301022

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

bkowalskiINTEL requested review from anko-intel and Copilot June 3, 2026 14:13

Copilot started reviewing on behalf of bkowalskiINTEL June 3, 2026 14:13

Restore gemma example

c79ef1e

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Copilot AI reviewed Jun 3, 2026

bkowalskiINTEL marked this pull request as ready for review June 3, 2026 15:02

bkowalskiINTEL and others added 4 commits June 3, 2026 17:03

Fix comments

f9bac3f

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Fix comments

d768b7b

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Apply suggestion from code review

368d95e

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

Fix and refactor keras and keras_hub layers registrations

185cdde

Signed-off-by: Bartosz Kowalski <bartosz.kowalski@intel.com>

3 participants