FIX: Fix KNN predictions on `X=None` #2821

david-cortes-intel · 2025-12-03T08:44:13Z

Description

Sklearn allows passing X=None to all prediction-related methods of KNN estimators, but sklearnex takes that as an all-nan data, which is not the same. This PR fixes it.

Checklist:

Completeness and readability

Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.

david-cortes-intel · 2025-12-03T08:44:24Z

/intelci: run

david-cortes-intel · 2025-12-03T08:49:41Z

/azp run Nightly

azure-pipelines · 2025-12-03T08:49:50Z

Azure Pipelines successfully started running 1 pipeline(s).

codecov · 2025-12-03T09:10:57Z

Codecov Report

❌ Patch coverage is 62.50000% with 3 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
sklearnex/neighbors/knn_classification.py	50.00%	0 Missing and 3 partials ⚠️

Flag	Coverage Δ
azure	`80.41% <62.50%> (-0.02%)`	⬇️
github	`82.00% <62.50%> (-0.04%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines	Coverage Δ
onedal/neighbors/neighbors.py	`80.05% <100.00%> (-0.28%)`	⬇️
sklearnex/neighbors/common.py	`92.52% <100.00%> (+0.04%)`	⬆️
sklearnex/neighbors/knn_classification.py	`94.59% <50.00%> (-4.00%)`	⬇️

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

icfaust · 2025-12-03T20:41:15Z

sklearnex/neighbors/common.py

+                        data[0] is None
+                        and method_name in ["predict", "predict_proba", "score"]
+                    ),
+                    "Predictions on 'None' data are handled by internal sklearn methods.",


Shouldn't this instead look for the _fit_X attribute if X is None? I'm not entirely sure I follow the solution of this PR (I do see the problem however).

No, that wouldn't give the same results. The sklearn docs mention:

In this case, the query point is not considered its own neighbor
https://github.com/scikit-learn/scikit-learn/blob/4a10d0ed8d85e6ed24a647bd28a65c0c64b101ef/sklearn/neighbors/_base.py#L767C13-L767C77

And one can see that forwarding the same fit data doesn't generate the same results:

import numpy as np from sklearn.neighbors import KNeighborsClassifier rng = np.random.default_rng(seed=123) X = rng.standard_normal(size=(5,3)) y = rng.integers(2, size=X.shape[0]) knn = KNeighborsClassifier(n_neighbors=2).fit(X, y) knn.predict_proba(None)

array([[0.5, 0.5], [1. , 0. ], [0. , 1. ], [0.5, 0.5], [0.5, 0.5]])

knn.predict_proba(X)

array([[1. , 0. ], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0. , 1. ]])

But this is why we have the following logic: https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/onedal/neighbors/neighbors.py#L297-L306 Just to full block it is a loss of functionality seems a bit heavy handed. At a minimum the explanation text for debugging needs to be made clearer, it needs to be discussed with someone with KNN knowledge on oneDAL side about this case @ethanglaser ?

If I'm understanding the logic correctly, that would either way produce incorrect results in cases where there are observations with the same 'X' features but different 'y'.

Actually no, turns out that sklearn suffers from the same problem.

Either way, even if this ends up offloading the operation to sklearn, since the class overrides the method kneighbors which sklearn ends up calling, the code from onedal.neighbors ends up being used instead of sklearn's.

I've now modified it to use the sklearnex methods for predict_proba so that the log entry would be reflective of what happens, but it looks like there's no implementations of predict and score for X=None on either the onedal or sklearnex side, so we'll still need to fall back on those, unless they were to be reimplemented on the python side by calling kneighbors internally like sklearn does for theirs. Note that it still ends up calling oneDAL since it overrides the methods from the sklearn class that are called internally.

david-cortes-intel · 2025-12-04T09:53:54Z

/intelci: run

fix predictions on X=None

233a9f3

david-cortes-intel requested review from Vika-F, ethanglaser and icfaust as code owners December 3, 2025 08:44

david-cortes-intel added the bug Something isn't working label Dec 3, 2025

icfaust reviewed Dec 3, 2025

View reviewed changes

david-cortes-intel added 2 commits December 4, 2025 05:10

don't fall back on predict_proba

ac5e52c

add comment about misleading log

5eb1674

FIX: Fix KNN predictions on X=None #2821

Are you sure you want to change the base?

FIX: Fix KNN predictions on X=None #2821

Uh oh!

Conversation

david-cortes-intel commented Dec 3, 2025

Description

Uh oh!

david-cortes-intel commented Dec 3, 2025

Uh oh!

david-cortes-intel commented Dec 3, 2025

Uh oh!

azure-pipelines bot commented Dec 3, 2025

Uh oh!

codecov bot commented Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

icfaust Dec 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

icfaust Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel Dec 4, 2025

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel Dec 4, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel commented Dec 4, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

FIX: Fix KNN predictions on `X=None` #2821

FIX: Fix KNN predictions on `X=None` #2821

codecov bot commented Dec 3, 2025 •

edited

Loading

icfaust Dec 3, 2025 •

edited

Loading

icfaust Dec 4, 2025 •

edited

Loading

david-cortes-intel Dec 4, 2025 •

edited

Loading