FIX: Allow KNN with missing values #2822

david-cortes-intel · 2025-12-03T09:33:48Z

Description

KNN algorithms in sklearn support data with missing values, but sklearnex checks that all values are finite. This PR makes sklearnex fall back to sklearn when it receives data that has missing values.

Disclaimer: I'm not 100% sure whether oneDAL supports KNN with missing values.

Checklist:

Completeness and readability

Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.

david-cortes-intel · 2025-12-03T09:51:46Z

/intelci: run

david-cortes-intel · 2025-12-03T09:52:23Z

/azp run Nightly

azure-pipelines · 2025-12-03T09:52:33Z

Azure Pipelines successfully started running 1 pipeline(s).

david-cortes-intel · 2025-12-03T17:12:20Z

/intelci: run

david-cortes-intel · 2025-12-03T17:12:27Z

/azp run Nightly

azure-pipelines · 2025-12-03T17:12:37Z

Azure Pipelines successfully started running 1 pipeline(s).

codecov · 2025-12-03T17:41:19Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag	Coverage Δ
azure	`80.42% <ø> (-0.01%)`	⬇️
github	`82.02% <ø> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

Files with missing lines	Coverage Δ
sklearnex/neighbors/common.py	`92.44% <ø> (-0.05%)`	⬇️


sklearnex/neighbors/common.py

icfaust · 2025-12-04T10:25:02Z

scikit-learn/scikit-learn#25319 @david-cortes-intel . Can you detail the missing value use case with an example? As of now we support these metrics: https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/sklearnex/neighbors/common.py#L221 which does not include nan_euclidean which would be the use case of this missing value support.
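
For context on the metric named above, here is a minimal pure-Python sketch of what a NaN-aware Euclidean distance computes. It follows the formula documented for scikit-learn's `nan_euclidean_distances` (squared differences over the coordinates present in both vectors, rescaled by total over present coordinates). The function name `nan_euclidean` is an illustrative helper, not code from sklearnex or oneDAL.

```python
import math


def nan_euclidean(x, y):
    """Euclidean distance that skips coordinates missing (NaN) in either vector.

    Illustrative helper only: the sum of squared differences over the
    coordinates present in both vectors is scaled up by
    (total coordinates / present coordinates) before taking the square root.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    present = [
        (a, b) for a, b in zip(x, y) if not (math.isnan(a) or math.isnan(b))
    ]
    if not present:
        # No coordinate is observed in both vectors: the distance is undefined.
        return math.nan
    squared = sum((a - b) ** 2 for a, b in present)
    return math.sqrt(len(x) / len(present) * squared)


nan = float("nan")
# Only the first and last coordinates are present in both vectors.
d = nan_euclidean([3.0, nan, nan, 6.0], [1.0, nan, 4.0, 5.0])
```

With no missing values the scaling factor is 1, so the result reduces to the ordinary Euclidean distance.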

david-cortes-intel · 2025-12-04T10:40:42Z

> scikit-learn/scikit-learn#25319 @david-cortes-intel . Can you detail the missing value use case with an example? As of now we support these metrics: https://github.com/uxlfoundation/scikit-learn-intelex/blob/main/sklearnex/neighbors/common.py#L221 which does not include nan_euclidean which would be the use case of this missing value support.

It still fails this test:
https://github.com/scikit-learn/scikit-learn/blob/4a10d0ed8d85e6ed24a647bd28a65c0c64b101ef/sklearn/neighbors/tests/test_neighbors.py#L2356

icfaust · 2025-12-04T10:43:35Z

@david-cortes-intel then I think there is a bug in the metrics check. I think that should be the avenue taken to fix the issue.

icfaust · 2025-12-04T10:46:22Z

Just wanted to note that I know it's a bit convoluted and annoying on the neighbors side of things here. I just want to make sure that if we can solve it without a separate finite check, we do that.

david-cortes-intel · 2025-12-04T11:09:52Z

> @david-cortes-intel then I think there is a bug in the metrics check. I think that should be the avenue taken to fix the issue.

Switched it to use the metric checks instead.
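
The idea of routing by metric rather than by a separate finite check can be sketched roughly as follows. This is not the code from the PR: `ACCELERATED_METRICS` and `choose_backend` are hypothetical names, and the metric set is a made-up subset for illustration (the real list is the one linked in `sklearnex/neighbors/common.py`). The assumption is that a metric outside the accelerated list, such as `nan_euclidean`, is handed to stock scikit-learn, which does its own NaN-tolerant validation for that metric.

```python
# Hypothetical subset for illustration only; the actual supported metrics
# are defined in sklearnex/neighbors/common.py and may differ.
ACCELERATED_METRICS = frozenset(
    {"euclidean", "minkowski", "manhattan", "chebyshev"}
)


def choose_backend(metric):
    """Return "onedal" for metrics on the accelerated list, else "sklearn".

    Callables and unlisted metric names (including "nan_euclidean") fall
    back to scikit-learn, so no extra pass over the data is needed just to
    look for missing values.
    """
    if isinstance(metric, str) and metric in ACCELERATED_METRICS:
        return "onedal"
    return "sklearn"


backend_for_nan_metric = choose_backend("nan_euclidean")
backend_for_euclidean = choose_backend("euclidean")
```

Deciding on the metric alone avoids validating the input twice, which matches the intent stated in the follow-up commit message.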

david-cortes-intel · 2025-12-04T13:44:19Z

/intelci: run

david-cortes-intel requested a review from Alexandr-Solovev December 3, 2025 09:33

david-cortes-intel requested review from Vika-F, ethanglaser, icfaust, maria-Petrova, syakov-intel and yuejiaointel as code owners December 3, 2025 09:33

david-cortes-intel added the sklearn-patch sklearn patching label Dec 3, 2025

icfaust reviewed Dec 3, 2025

sklearnex/neighbors/common.py (outdated)

better fix without validating data twice

937b209

david-cortes-intel force-pushed the knn_nan branch from 236d53e to 937b209 Compare December 4, 2025 11:08

fix

a4f4014

ethanglaser approved these changes Dec 5, 2025

3 participants

codecov bot commented Dec 3, 2025 •

edited

Loading