
Conversation


@parmesant parmesant commented Dec 15, 2025

add fields workspace_id and org_id

Fixes #XXXX.

Description


This PR has:

  • been tested to ensure log ingestion and log query work.
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added documentation for new or modified features or behaviors.

Summary by CodeRabbit

  • New Features
    • Server logs now include workspace and organization identifiers for improved tracking and filtering across your infrastructure.
    • Log parsing and formatting updated to recognize the new identifiers while preserving existing timestamp, level, and message fields so downstream behavior remains consistent.



coderabbitai bot commented Dec 15, 2025

Walkthrough

Added two capture groups, workspace_id and org_id, to the first regex entry of parseable_server_logs in resources/formats.json, and updated the test fixtures in src/event/format/known_schema.rs to account for the two additional log-prefix fields.

Changes

  • Formats config (resources/formats.json)
    Updated the first regex entry under parseable_server_logs to insert (?P<workspace_id>\S+) and (?P<org_id>\S+) after deployment_id, and added workspace_id and org_id to the fields array, shifting the subsequent fields (a sketch follows below).
  • Tests (src/event/format/known_schema.rs)
    Updated the test log strings and assertions to include the additional leading identifiers (workspace_id, org_id) so the parsed field checks align with the new regex/field order.
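
For illustration, here is a minimal Rust sketch of how the updated first entry is expected to behave, assuming the two new capture groups sit directly after deployment_id as described above. The sample log line and its field values are made up, the regex crate dependency is assumed, and the pattern is written unescaped here (resources/formats.json doubles the backslashes for JSON), so treat this as a sketch rather than the exact contents of the file:

    // Assumed dependency: regex = "1" in Cargo.toml.
    use regex::Regex;

    fn main() {
        // Sketch of the updated pattern: workspace_id and org_id inserted
        // directly after deployment_id, per the change summary above.
        let pattern = r"^(?P<customer_id>\S+)\s+(?P<deployment_id>\S+)\s+(?P<workspace_id>\S+)\s+(?P<org_id>\S+)\s+(?P<timestamp>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z?)\s+(?P<level>\w+)\s+(?P<logger_context>\S+)\s+(?P<thread_id>ThreadId\(\d+\))\s+(?P<module>.*?):(?P<line_number>\d+):\s+(?P<body>.*)";
        let re = Regex::new(pattern).expect("pattern should compile");

        // Hypothetical log line; every token is illustrative, not taken from real logs.
        let line = "cust-1 deploy-1 ws-42 org-7 2025-12-15T10:00:00.123Z INFO parseable::handlers ThreadId(5) src/handlers/http/ingest.rs:42: ingested 10 events";

        let caps = re.captures(line).expect("line should match the first entry");
        assert_eq!(&caps["workspace_id"], "ws-42");
        assert_eq!(&caps["org_id"], "org-7");
        assert_eq!(&caps["level"], "INFO");
        println!("body = {}", &caps["body"]);
    }

Because both new groups are mandatory \S+ tokens, a line in the old prefix layout (without the two extra identifiers) will generally no longer match this entry, which is presumably why the test fixtures had to be updated alongside the pattern.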

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

  • Review points:
    • Verify that the regex capture-group order in resources/formats.json matches the fields array (a quick check is sketched after this list).
    • Confirm that the test strings in src/event/format/known_schema.rs correctly reflect the new prefix layout and that the assertions validate the intended parsed fields.
    • Search for other consumers of parseable_server_logs that might assume the previous field order.
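
As a mechanical aid for the first review point, the sketch below compares the regex's named capture groups, in order, against a fields list. Both the pattern and the field list are hard-coded here from the description in this PR rather than loaded from resources/formats.json, so treat them as assumptions; the sub-patterns are simplified because only the group names and their order matter for this check:

    // Assumed dependency: regex = "1" in Cargo.toml.
    use regex::Regex;

    fn main() {
        // Group names and order mirror the description in this PR; sub-patterns
        // are simplified since only the names and their order are checked here.
        let pattern = r"^(?P<customer_id>\S+)\s+(?P<deployment_id>\S+)\s+(?P<workspace_id>\S+)\s+(?P<org_id>\S+)\s+(?P<timestamp>\S+)\s+(?P<level>\w+)\s+(?P<logger_context>\S+)\s+(?P<thread_id>ThreadId\(\d+\))\s+(?P<module>.*?):(?P<line_number>\d+):\s+(?P<body>.*)";

        // The fields array as it is expected to read after this change (assumption).
        let fields = [
            "customer_id", "deployment_id", "workspace_id", "org_id", "timestamp",
            "level", "logger_context", "thread_id", "module", "line_number", "body",
        ];

        let re = Regex::new(pattern).expect("pattern should compile");
        // capture_names() yields None for the unnamed group 0, so flatten() drops it.
        let named: Vec<&str> = re.capture_names().flatten().collect();
        assert_eq!(named, fields, "capture group order should match the fields array");
        println!("capture group order matches the fields array");
    }

In the repository, a similar check could read the format entry with serde_json instead of hard-coding the pattern and field list.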

Possibly related PRs

Suggested labels

for next release

Suggested reviewers

  • nitisht

Poem

🐰 I hop through logs at break of day,
Two new names join the parsing play.
workspace_id, org_id — what a pair!
I nibble tokens with careful care,
Happy trails in every log I bear. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Description check: ⚠️ Warning
    Explanation: The description is largely incomplete; only the opening line naming the added fields is filled in, while all substantive sections (goal, solutions, key changes) are left empty with only template comments remaining.
    Resolution: Replace the template comments with actual details: describe the goal of adding these fields, explain the rationale and the solutions considered, document the key changes made, and confirm completion of the testing and documentation checklist items.
✅ Passed checks (2 passed)
  • Title check: ✅ Passed. The title accurately summarizes the main change: updating the parseable_server_logs schema to add the workspace_id and org_id fields.
  • Docstring Coverage: ✅ Passed. Docstring coverage is 100.00%, which is sufficient; the required threshold is 80.00%.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2ad76cc and 857b032.

📒 Files selected for processing (2)
  • resources/formats.json (1 hunks)
  • src/event/format/known_schema.rs (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/event/format/known_schema.rs
  • resources/formats.json
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: coverage
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] previously approved these changes Dec 15, 2025
"name": "parseable_server_logs",
"regex": [
{
"pattern": "^(?P<customer_id>\\S+)\\s+(?P<deployment_id>\\S+)\\s+(?P<timestamp>\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}\\.\\d+Z?)\\s+(?P<level>\\w+)\\s+(?P<logger_context>\\S+)\\s+(?P<thread_id>ThreadId\\(\\d+\\))\\s+(?P<module>.*?):(?P<line_number>\\d+):\\s+(?P<body>.*)",

add workspace_id and org_id in the regex pattern as well


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/event/format/known_schema.rs (1)

569-585: Add assertions to verify extraction of workspace_id and org_id fields.

The test data has been updated to include workspace_id and org_id at the beginning of each log line, but the test assertions don't verify that these fields are successfully extracted. This creates a test coverage gap where extraction failures for these new fields would go undetected.

Apply this diff to add assertions for the new fields:

             assert!(
                 obj.contains_key("module"),
                 "Missing module field for log {}",
                 i + 1
             );

             // ThreadId and line_number are only present in the first format (logs 0-3)
             if i < 4 {
+                assert!(
+                    obj.contains_key("workspace_id"),
+                    "Missing workspace_id field for log {}",
+                    i + 1
+                );
+                assert!(
+                    obj.contains_key("org_id"),
+                    "Missing org_id field for log {}",
+                    i + 1
+                );
                 assert!(
                     obj.contains_key("logger_context"),
                     "Missing logger_context field for log {}",
                     i + 1
                 );
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between efa2c6f and 2ad76cc.

📒 Files selected for processing (2)
  • resources/formats.json (1 hunks)
  • src/event/format/known_schema.rs (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • resources/formats.json
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: resources/formats.json:1469-1484
Timestamp: 2025-09-18T10:08:05.101Z
Learning: The "rust_server_logs" format in resources/formats.json was intentionally renamed to "parseable_server_logs" with no backward compatibility concerns because it was unreleased and had no external users depending on it. This was confirmed by nikhilsinhaparseable on PR 1415.
📚 Learning: 2025-09-18T10:08:05.101Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: resources/formats.json:1469-1484
Timestamp: 2025-09-18T10:08:05.101Z
Learning: The "rust_server_logs" format in resources/formats.json was intentionally renamed to "parseable_server_logs" with no backward compatibility concerns because it was unreleased and had no external users depending on it. This was confirmed by nikhilsinhaparseable on PR 1415.

Applied to files:

  • src/event/format/known_schema.rs
📚 Learning: 2025-09-18T09:59:20.177Z
Learnt from: nikhilsinhaparseable
Repo: parseablehq/parseable PR: 1415
File: src/metrics/mod.rs:700-756
Timestamp: 2025-09-18T09:59:20.177Z
Learning: In src/event/mod.rs, the parsed_timestamp used in increment_events_ingested_by_date() is correctly UTC-normalized: for dynamic streams it remains Utc::now(), and for streams with time partition enabled it uses the time partition value. Both cases result in proper UTC date strings for metrics labeling, preventing double-counting issues.

Applied to files:

  • src/event/format/known_schema.rs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: coverage
  • GitHub Check: Quest Smoke and Load Tests for Distributed deployments
  • GitHub Check: Build Default x86_64-pc-windows-msvc
  • GitHub Check: Build Default aarch64-unknown-linux-gnu
  • GitHub Check: Build Default x86_64-unknown-linux-gnu
  • GitHub Check: Quest Smoke and Load Tests for Standalone deployments
  • GitHub Check: Build Default aarch64-apple-darwin
  • GitHub Check: Build Kafka x86_64-unknown-linux-gnu
  • GitHub Check: Build Kafka aarch64-apple-darwin

