GH-22081: [Python] Support reading numpy arrays with ndims > 1 into pa arrays of Lists #48581

NickCrews · 2025-12-18T06:27:06Z

Rationale for this change

Fixes #22081

Efficiently and correctly create List arrays from numpy arrays with ndim > 1.

What changes are included in this PR?

Before, pa.array(np.arange(6).reshape(2,3)) would fail. Now it returns an array of length 2, where each element is a size-3 list.

I think this is the only intuitive behavior. But if you can think of an alternative behavior a user might want/expect from this, then please let's talk about it.

I am not familiar enough with numpy/pyarrow memory layout internals to know whether there are other cases besides the C-contiguous memory layout where we could use zero-copy. Even if there are, I'm not sure we need to bother with them; I bet the C-contiguous case covers 95% of usage.

I'm also not sure whether this is a good way to do this, or whether there is a more succinct way.

This was written entirely by GitHub Copilot. You can see my dialog with Copilot, as I tweaked its directions and chose an implementation, in NickCrews#3

Perhaps this logic should be pulled into its own _from_n_dim_numpy(np_arr) helper function to keep the larger control flow of the function more clear, let me know if you think so.

Are these changes tested?

Yes, I think adequately. The tests don't actually verify that the zero-copy path is used, just that the results are correct. I didn't really want to deal with monkeypatching or spying on things to detect the zero-copy path, but I can add this if we want to verify it.

We also just compare the results to the result via the .tolist() path, but perhaps we should instead write out the actual expected value as boilerplate so that it is even more obvious what the expected behavior is.

Are there any user-facing changes?

No breaking changes, only newly supported features!

GitHub Issue: [Python] Support inferring nested ndarray with ndim > 1 #22081


github-actions · 2025-12-18T06:27:31Z

⚠️ GitHub issue #22081 has been automatically assigned in GitHub to PR creator.


NickCrews · 2025-12-18T06:40:55Z

I marked this as WIP until the tests actually pass. Sigh, time to babysit AI again...


Copilot AI and others added 5 commits December 17, 2025 22:24

Initial plan

2044e99

Add support for multidimensional numpy arrays in pa.array()

68f8583

Co-authored-by: NickCrews <10820686+NickCrews@users.noreply.github.com>

Add documentation for multidimensional numpy array support

46c098c

Co-authored-by: NickCrews <10820686+NickCrews@users.noreply.github.com>

Address code review feedback: fix formatting and remove placeholder

ab2e1ac

Co-authored-by: NickCrews <10820686+NickCrews@users.noreply.github.com>

Implement efficient recursive FixedSizeListArray for multidimensional arrays

52afba2

Co-authored-by: NickCrews <10820686+NickCrews@users.noreply.github.com>

NickCrews requested review from AlenkaF, raulcd and rok as code owners December 18, 2025 06:27

github-actions bot added the Component: Python and awaiting review labels Dec 18, 2025

NickCrews mentioned this pull request Dec 18, 2025

[Python] Support inferring nested ndarray with ndim > 1 #22081

Open

NickCrews marked this pull request as draft December 18, 2025 06:40

Fix Python/Cython formatting (remove trailing whitespace)

4136be2

Co-authored-by: NickCrews <10820686+NickCrews@users.noreply.github.com>

Fix tests: compare against tolist() results for correct type matching

937b64c

Co-authored-by: NickCrews <10820686+NickCrews@users.noreply.github.com>
