-
-
Notifications
You must be signed in to change notification settings - Fork 157
fix query scan metrics #1483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
parmesant
wants to merge
2
commits into
parseablehq:main
Choose a base branch
from
parmesant:query-scan-fix
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
fix query scan metrics #1483
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Some comments aren't visible on the classic Files Changed page.
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🌐 Web query:
DataFusion 51.0.0 ParquetExec metrics "bytes_scanned" metric name and sum_by_name return type💡 Result:
Do you mean the metric named "bytes_scanned" added in DataFusion 51.0.0 (yes — name is bytes_scanned)? Which API do you mean by sum_by_name (the Rust function/crate or the CLI/EXPLAIN display)? I can look up the exact return type once you confirm which sum_by_name you mean. Source: DataFusion 51.0.0 release notes. [1]
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 148
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 9319
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 367
🌐 Web query:
DataFusion metrics.sum_by_name return type ScalarValue u64💡 Result:
DataFusion’s metrics API (metrics.sum_by_name) returns a ScalarValue (a dynamically-typed nullable value). For a metric that is a sum it is represented as a u64 inside a ScalarValue::UInt64 (i.e. logically a u64). To get the raw u64 you must match/cast the ScalarValue.
Example:
References: DataFusion ScalarValue docs and Metrics user guide. [1][2]
Sources
[1] ScalarValue enum (datafusion docs).
[2] DataFusion metrics user guide.
🌐 Web query:
DataFusion 51.0.0 ExecutionPlan metrics API u64 accessor bytes_scanned💡 Result:
Short answer — call ExecutionPlan::metrics(), find the metric named "bytes_scanned" and read its MetricValue::UInt64. Example (DataFusion 51.0.0):
Notes:
Sources:
[1] DataFusion — Reading Explain Plans / EXPLAIN ANALYZE (shows bytes_scanned metric).
[2] datafusion crate docs — physical_plan::metrics (MetricValue / metrics API).
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 487
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 119
🌐 Web query:
DataFusion ScalarValue as_usize() method truncation 32-bit u64💡 Result:
Short answer: yes — when DataFusion (or your code) converts a ScalarValue::UInt64 to usize it uses a plain cast to usize which will truncate on 32‑bit targets. This is a consequence of Rust's
ascasting semantics (it silently truncates when converting a larger integer to a smaller one) and the fact thatusizesize is target‑dependent. Use a fallible conversion (e.g.usize::try_from(u64)/TryFrom/TryInto) or explicitly checkvalue <= usize::MAXbefore converting to avoid silent truncation. (docs.rs)Suggested pattern:
let n = u64_val.try_into().map_err(|_| ...)?;(orusize::try_from(u64_val))if u64_val <= usize::MAX as u64 { let n = u64_val as usize; } else { /* handle overflow */ }References:
asvsFrom/TryFrom. (stackoverflow.com).as_usize()and related concerns. (mail-archive.com)Citations:
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 10510
🏁 Script executed:
Repository: parseablehq/parseable
Length of output: 50378
Unsafe truncation when summing bytes scanned — prefer direct u64 extraction.
metrics.sum_by_name()returns aScalarValue(not rawu64), and callingas_usize()silently truncates on 32-bit systems. Instead ofscanned.as_usize() as u64, match theScalarValue::UInt64variant directly:The metric name
"bytes_scanned"is confirmed as stable in DataFusion 51.0.0.🤖 Prompt for AI Agents