-
Notifications
You must be signed in to change notification settings - Fork 1.1k
[arrow] Minimize allocation in GenericViewArray::slice() #9016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| nulls.shrink_to_fit(); | ||
| } | ||
| */ | ||
| todo!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you mean to leave this todo here? It seems to me that it might be important to finish this part. If it isn't necessary could you add a comment explaining why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is also bad we have no test coverage for this. we should make a PR to add some
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! I didn't mean to do that. I've pushed a fix. But at least it caught this test coverage miss? 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! I didn't mean to do that. I've pushed a fix. But at least it caught this test coverage miss? 😅
Yes indeed -- it is bad we don't have a test for this
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @maxburke and @mhilton -- this is a really cool PR. I like it a lot
I think once we sort out the "shrink_to_fit" bit it will be ready to go
cc @XiangpengHao and @Dandandan
| } | ||
|
|
||
| fn shrink_to_fit(&mut self) { | ||
| /* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can use Arc::make_mut here to modify (or clone) the buffers here as needed
fn shrink_to_fit(&mut self) {
self.views.shrink_to_fit();
Arc::make_mut(&mut self.buffers).iter_mut().for_each(|b| b.shrink_to_fit());
if let Some(nulls) = &mut self.nulls {
nulls.shrink_to_fit();
}
}I think the call to
self.buffers.shrink_to_fit();which would shrink the Vec today doesn't really have an analog when the view has an Arc of the slice -- we could potentially call shrink_to_fit that prior to creating the Arc but I think that might be unecessary.
I would personally suggest not messing with the actual self.buffers call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I just committed a fix before you posted this 😅
Is shrink_to_fit a best-effort operation? If so, it's probably not necessary to try to shrink self.buffers...?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah my review overlapped with your push 🏃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is shrink_to_fit a best-effort operation? If so, it's probably not necessary to try to shrink self.buffers...?
I think it is reasonable behavior. Maybe we can leave a comment explaining the rationale ('best effort' or something) -- the memory that will be saved from a few elements in a Vec is not likely high -- especially as slicing a generic view array doesn't adjust the buffers anyways
| let len = array.len(); | ||
| array.buffers.insert(0, array.views.into_inner()); | ||
|
|
||
| let mut buffers = array.buffers.iter().cloned().collect::<Vec<_>>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could call to_vec here
| let mut buffers = array.buffers.iter().cloned().collect::<Vec<_>>(); | |
| let mut buffers = array.buffers.into_vec(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
Since buffers is exposed I think this is an API change that will have to wait for the next major release (in Feb) |
alamb
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @maxburke -- this looks great to me. I do think some small finagling of shrink_to_fit would be useful too
| /// | ||
| /// Panics if [`GenericByteViewArray::try_new`] returns an error | ||
| pub fn new(views: ScalarBuffer<u128>, buffers: Vec<Buffer>, nulls: Option<NullBuffer>) -> Self { | ||
| pub fn new<U>(views: ScalarBuffer<u128>, buffers: U, nulls: Option<NullBuffer>) -> Self |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very elegant. It is nice to keep this API backwards compatible (anything that used to compile still compiles).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
|
|
||
| /// Deconstruct this array into its constituent parts | ||
| pub fn into_parts(self) -> (ScalarBuffer<u128>, Vec<Buffer>, Option<NullBuffer>) { | ||
| pub fn into_parts(self) -> (ScalarBuffer<u128>, Arc<[Buffer]>, Option<NullBuffer>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this I think is a breaking API change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tested it with Datafusion and it dropped right in (both the mainline version and the hacked up and patched version we're using).
But, yeah, I can't speak for other users of the package.
| self.buffers.iter_mut().for_each(|b| b.shrink_to_fit()); | ||
| self.buffers.shrink_to_fit(); | ||
|
|
||
| if let Some(buffers) = Arc::get_mut(&mut self.buffers) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this will only shrink the buffers when there are no other outstanding references to this code. I think it would be better to call Arc::make_mut here to ensure that the buffers get shrunken
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the issue I have with Arc::make_mut is that if I slice array I now have two references to the underlying buffers. If I call shrink_to_fit on one of them, Arc::make_mut will clone the buffers and then shrink_to_fit will be called on one of them. But because the underlying buffers end up being cloned, it doesn't make the original allocation shrink and in the end it'll end up using more memory, until the other reference is dropped.
Additionally it'll create more allocator pressure because the buffer cloning will duplicate the buffer at it's pre-shrunken size before it's shrunk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense -- let's keep it this way then. I do think it is worth a comment explaining the rationale, though, for future readers that may wonder the same thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done 👍
|
run benchmark view_types |
|
🤖 |
|
🤖: Benchmark completed Details
|
Use the suggested Arc<[Buffer]> storage for ViewArray storage instead of an owned Vec<Buffer> so that the slice clone does not allocate.
Not bad 😎 |
|
run benchmark view_types |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤔 it actually looks like GC'ing is slowing down -- probably from the need to do an extra allocation for new buffers |
Upon some more thought I think an extra allocation in GC is a better tradeoff, especially since a lot of GC'ing in downstream systems like DataFusion actually happens as part of concat and coalesce kernels @XiangpengHao and @ctsk do you haven any thoughts on this tradeoff? |
|
run benchmark coalesce_kernels concatenate_kernel |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
run benchmark coalesce_kernels concatenate_kernel view_types |
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
|
🤖 |
|
🤖: Benchmark completed Details
|
Use the suggested Arc<[Buffer]> storage for ViewArray storage instead of an owned Vec so that the slice clone does not allocate.
Which issue does this PR close?
StringViewArray::slice()andBinaryViewArray::slice()are slow (they allocate) #6408 .