Skip to content

Conversation

@tkattkat
Copy link
Collaborator

@tkattkat tkattkat commented Dec 5, 2025

why

  • when transitioning to v3, we did not use the latest version of screenshot collector
  • screenshot collector currently fails due to not having page.on and page.off support for the load, and domcontentloaded events.

what changed

  • added latest version of screenshot collector

test plan

  • ran evals in cli with additional logging to also verify everything is working as expected

Summary by cubic

Updated the evals CLI screenshot collector to the latest version, adding image-diff filtering and compatibility with v3 Stagehand pages where navigation events are disabled by default. This reduces duplicate screenshots and stabilizes capture during eval runs.

  • New Features

    • Skip similar screenshots using MSE/SSIM thresholds with sharp.
    • Non-blocking initial/final captures and safer interval capture with error handling.
    • Added addScreenshot method for manual collection.
  • Dependencies

    • Added sharp ^0.34.5 for image processing.
    • Patch bump via changeset for @browserbasehq/stagehand-evals.

Written for commit c0131eb. Summary will update automatically on new commits.

@changeset-bot
Copy link

changeset-bot bot commented Dec 5, 2025

🦋 Changeset detected

Latest commit: c0131eb

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@browserbasehq/stagehand-evals Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

Copy link
Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 4 files

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Dec 5, 2025

Greptile Overview

Greptile Summary

Updates the ScreenshotCollector utility for V3 compatibility and adds intelligent screenshot deduplication.

  • Switched from generic ScreenshotCapablePage interface to the concrete V3 Page type from @browserbasehq/stagehand
  • Added sharp library for image processing to enable screenshot deduplication via MSE (Mean Squared Error) and SSIM (Structural Similarity Index) comparisons
  • Disabled navigation event listeners (page.on("load"), etc.) since V3 pages don't support these yet - documented with TODO comment
  • Changed default captureOnNavigation from true to false to reflect current V3 limitations
  • Added new addScreenshot() method for manual screenshot additions with deduplication logic
  • Improved error handling with .catch() for async operations instead of fire-and-forget calls

Confidence Score: 4/5

  • This PR is safe to merge with minor considerations around the async final screenshot behavior
  • The changes are well-structured and address real V3 compatibility issues. The image deduplication logic is sound. One minor issue exists where the final screenshot in stop() may not be captured before returning, but this is unlikely to cause problems in practice since the collector is typically used for incremental captures.
  • packages/evals/utils/ScreenshotCollector.ts - review async final screenshot handling in stop() method

Important Files Changed

File Analysis

Filename Score Overview
packages/evals/utils/ScreenshotCollector.ts 4/5 Major update adding image deduplication via MSE/SSIM, switching to V3 Page type, and adding sharp dependency for image comparison. Minor issue with async final screenshot not being awaited in stop().
packages/evals/package.json 5/5 Added sharp ^0.34.5 dependency for image processing used in screenshot deduplication.
.changeset/beige-taxes-punch.md 5/5 Standard changeset for patch update to stagehand-evals package.
pnpm-lock.yaml 5/5 Lock file updates for sharp 0.34.5 and its native dependencies across platforms.

Sequence Diagram

sequenceDiagram
    participant Caller
    participant ScreenshotCollector
    participant Page
    participant Sharp

    Caller->>ScreenshotCollector: start()
    ScreenshotCollector->>Page: screenshot() [initial]
    Page-->>ScreenshotCollector: Buffer
    ScreenshotCollector->>ScreenshotCollector: store screenshot
    
    loop Every interval (5000ms default)
        ScreenshotCollector->>Page: screenshot()
        Page-->>ScreenshotCollector: Buffer
        ScreenshotCollector->>Sharp: resize & compare (MSE)
        Sharp-->>ScreenshotCollector: MSE value
        alt MSE >= threshold
            ScreenshotCollector->>Sharp: calculate SSIM
            Sharp-->>ScreenshotCollector: SSIM value
            alt SSIM < threshold (different enough)
                ScreenshotCollector->>ScreenshotCollector: store screenshot
            end
        end
    end

    Caller->>ScreenshotCollector: stop()
    ScreenshotCollector->>Page: screenshot() [final, async]
    ScreenshotCollector-->>Caller: Buffer[]
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines +81 to 85
// Capture final screenshot without blocking
this.captureScreenshot("final").catch((error) => {
console.error("Failed to capture final screenshot:", error);
});
return this.getScreenshots();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: The final screenshot capture is fire-and-forget, so getScreenshots() returns before the final screenshot is captured. If the caller needs the final screenshot included, this won't work as expected. Consider making stop() async and awaiting the capture, or document that the final screenshot is best-effort.

Suggested change
// Capture final screenshot without blocking
this.captureScreenshot("final").catch((error) => {
console.error("Failed to capture final screenshot:", error);
});
return this.getScreenshots();
// Capture final screenshot without blocking
this.captureScreenshot("final").catch((error) => {
console.error("Failed to capture final screenshot:", error);
});
// Note: final screenshot may not be included as it's captured asynchronously
return this.getScreenshots();
Prompt To Fix With AI
This is a comment left during a code review.
Path: packages/evals/utils/ScreenshotCollector.ts
Line: 81:85

Comment:
**logic:** The final screenshot capture is fire-and-forget, so `getScreenshots()` returns before the final screenshot is captured. If the caller needs the final screenshot included, this won't work as expected. Consider making `stop()` async and awaiting the capture, or document that the final screenshot is best-effort.

```suggestion
    // Capture final screenshot without blocking
    this.captureScreenshot("final").catch((error) => {
      console.error("Failed to capture final screenshot:", error);
    });
    // Note: final screenshot may not be included as it's captured asynchronously
    return this.getScreenshots();
```

How can I resolve this? If you propose a fix, please make it concise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants