2 changes: 1 addition & 1 deletion .markdownlint-cli2.jsonc
@@ -30,6 +30,6 @@
"node_modules/**",
"target/**",
"**/_temp/**",
-"**/agent/**"
+"**/_ai/**"
]
}
11 changes: 3 additions & 8 deletions AGENTS.md
@@ -82,17 +82,12 @@ For Rust packages, you can add features as needed with `--all-features`, specifi

## Contextual Rules

-CRITICAL: For the files referenced below (e.g., @rules/general.md), use your Read tool to load it on a need-to-know basis, ONLY when relevant to the SPECIFIC task at hand.
+CRITICAL: For the files referenced below, use your Read tool to load it on a need-to-know basis, ONLY when relevant to the SPECIFIC task at hand:
+
+- @.config/agents/rules/*.md

Instructions:

- Do NOT preemptively load all references - use lazy loading based on actual need
- When loaded, treat content as mandatory instructions that override defaults
- Follow references recursively when needed
-
-Rule files:
-
-- @.config/agents/rules/ark-ui.md
-- @.config/agents/rules/mastra.md
-- @.config/agents/rules/panda-css.md
-- @.config/agents/rules/zod.md
3 changes: 0 additions & 3 deletions apps/hash-ai-agent/.gitignore
@@ -7,8 +7,5 @@ dist
*.db
*.db-*

-# Fixtures
-src/mastra/fixtures/entity-schemas/*.json
-
# Quokka files
*.quokka.*
208 changes: 208 additions & 0 deletions apps/hash-ai-agent/_ai/wiki/conditional-branching.md
@@ -0,0 +1,208 @@
# Conditional Branching in Plan Execution

> **Status**: Deferred — see [gaps-and-next-steps.md](./gaps-and-next-steps.md)
> **Created**: 2024-12-18
> **Moved to wiki**: 2024-12-19
> **Context**: Design options for runtime branching in plan execution

## Overview

Conditional branching allows plan execution to take different paths based on
runtime evaluation results. This is essential for:

- **Evaluation → retry cycles**: When a synthesis/evaluation step determines that quality is insufficient
- **Quality gates**: Pass/fail/escalate decisions at key checkpoints
- **Human-in-the-loop decision points**: Routing to human review when confidence is low
- **Adaptive execution**: Choosing different strategies based on intermediate results

## Current State

The plan compiler (`plan-compiler.ts`) currently supports:

- Linear execution (`.then()`)
- Parallel execution (`.parallel()`)
- Fan-in patterns (multiple steps → single synthesis)
- Fan-out patterns (single step → multiple parallel steps)

**Not yet supported**:

- Conditional branching (`.branch()`)
- Loop constructs (`.dowhile()`, `.dountil()`)

## Mastra Primitive

Mastra workflows support conditional branching via `.branch()`:

```typescript
workflow.branch([
  [async ({ inputData }) => inputData.decision === "retry", retryStep],
  [async ({ inputData }) => inputData.decision === "pass", continueStep],
  [async ({ inputData }) => true, fallbackStep], // default case
]);
```

The condition functions receive the output of the previous step and return a boolean.
The first matching condition determines which branch is taken.

## PlanSpec Extension Options

### Option A: Explicit Conditional Edges

Add a `conditionalEdges` array to PlanSpec with serializable condition specs:

```typescript
interface ConditionSpec {
  field: string; // Path in previous step output, e.g., "decision"
  operator: "eq" | "neq" | "gt" | "lt" | "in" | "contains";
  value: unknown; // Value to compare against
}

interface ConditionalEdge {
  id: string;
  fromStepId: string; // Source step (typically an evaluation step)
  conditions: Array<{
    condition: ConditionSpec;
    toStepId: string; // Target step if condition matches
  }>;
  defaultStepId?: string; // Fallback if no condition matches
}

// In PlanSpec:
interface PlanSpec {
  // ... existing fields ...
  conditionalEdges?: ConditionalEdge[];
}
```
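
A declarative `ConditionSpec` still needs an interpreter at runtime. The following is a minimal sketch of one, not code from `plan-compiler.ts`: the names `readField` and `evaluateCondition` are hypothetical, and treating `field` as a dot-separated path is an assumption (the spec above only shows a single-segment example).

```typescript
// Hypothetical evaluator for the ConditionSpec shape above (illustration only).
type Operator = "eq" | "neq" | "gt" | "lt" | "in" | "contains";

interface ConditionSpec {
  field: string;
  operator: Operator;
  value: unknown;
}

// Assumption: `field` is a dot-separated path such as "scores.quality".
// Only own properties are followed, so inherited members are never read.
function readField(output: unknown, path: string): unknown {
  let current: unknown = output;
  for (const key of path.split(".")) {
    if (typeof current !== "object" || current === null) return undefined;
    if (!Object.prototype.hasOwnProperty.call(current, key)) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function evaluateCondition(spec: ConditionSpec, output: unknown): boolean {
  const actual = readField(output, spec.field);
  switch (spec.operator) {
    case "eq":
      return actual === spec.value;
    case "neq":
      return actual !== spec.value;
    case "gt":
      return typeof actual === "number" && typeof spec.value === "number" && actual > spec.value;
    case "lt":
      return typeof actual === "number" && typeof spec.value === "number" && actual < spec.value;
    case "in":
      // `value` is the list of accepted values.
      return Array.isArray(spec.value) && spec.value.includes(actual);
    case "contains":
      // The field itself is a string or array that must contain `value`.
      if (typeof actual === "string") {
        return typeof spec.value === "string" && actual.includes(spec.value);
      }
      return Array.isArray(actual) && actual.includes(spec.value);
  }
}
```

Because every operator is a fixed comparison, a missing field or a type mismatch simply evaluates to `false` rather than throwing, which keeps routing decisions predictable.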

**Pros**:

- Explicit and declarative
- Easy to validate statically
- Clear visualization in UI

**Cons**:

- Adds complexity to PlanSpec schema
- Condition language is limited (no arbitrary expressions)

### Option B: Gateway Steps

Use evaluation steps that output decisions, followed by a special "gateway" step type:

```typescript
interface GatewayStep {
  type: "gateway";
  id: string;
  dependsOn: [string]; // Must depend on exactly one step
  routes: Array<{
    condition: ConditionSpec;
    toStepId: string;
  }>;
  defaultRoute?: string;
}
```

**Pros**:

- Gateway is a first-class step type
- Clearer semantic meaning
- Easier to reason about in isolation

**Cons**:

- More verbose plans
- Gateway steps don't "do" anything (just routing)

### Option C: Inline Conditions on dependsOn

Extend `dependsOn` to optionally include conditions:

```typescript
interface ConditionalDependency {
  stepId: string;
  condition?: ConditionSpec; // Only proceed if condition matches
}

// In PlanStep:
dependsOn: Array<string | ConditionalDependency>;
```
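
One way to contain the mixed `string | ConditionalDependency` union is to normalize it once, up front. The sketch below assumes hypothetical helpers `normalizeDependencies` and `dependencyIds`; only the `ConditionalDependency` shape comes from the option described here.

```typescript
// Hypothetical helpers (illustration only); the interfaces mirror Option C.
interface ConditionSpec {
  field: string;
  operator: "eq" | "neq" | "gt" | "lt" | "in" | "contains";
  value: unknown;
}

interface ConditionalDependency {
  stepId: string;
  condition?: ConditionSpec;
}

type Dependency = string | ConditionalDependency;

// Convert every entry to the object form so later passes handle one shape.
function normalizeDependencies(dependsOn: Dependency[]): ConditionalDependency[] {
  return dependsOn.map((dep) => (typeof dep === "string" ? { stepId: dep } : dep));
}

// Plain step IDs, e.g. for a topological sort that ignores conditions.
function dependencyIds(dependsOn: Dependency[]): string[] {
  return normalizeDependencies(dependsOn).map((dep) => dep.stepId);
}
```

With this in place, existing dependency analysis could keep working on `dependencyIds(...)` while only the branching logic inspects `condition`.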

**Pros**:

- Minimal schema changes
- Natural extension of existing pattern

**Cons**:

- Makes dependency analysis more complex
- Harder to visualize

## Relationship to Decision Points

Earlier discussion identified the need for plans to surface uncertainty:

- Assumptions
- Missing inputs
- Clarifying questions
- Risks / decision points

These are currently tracked in `unknownsMap` as metadata. Decision points for
conditional branching could be:

| Decision Type | How It Maps to Branching |
| --------------- | ---------------------------------------------- |
| `clarification` | Branch to HITL step for user input |
| `assumption` | Branch based on assumption validation |
| `risk` | Branch to mitigation path if risk materializes |
| `tradeoff` | Branch based on user preference or heuristic |

The relationship between `unknownsMap` and conditional edges needs further design:

- Should decision points automatically generate conditional edges?
- Or should they remain separate (metadata vs control flow)?

**Tracked for future**: Clarify this relationship when implementing Phase 4.

## Security Considerations

Condition evaluation must be carefully constrained:

1. **No arbitrary code execution**: Conditions must be declarative, not executable JS
2. **Limited operators**: Only safe comparison operators
3. **Field path validation**: Ensure field paths don't access sensitive data
4. **Timeout protection**: Condition evaluation should be bounded
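
The first three constraints can be enforced statically, before a plan ever runs. The following is a sketch under stated assumptions: `validateConditionSpec`, the forbidden-segment list, the depth limit of 5, and the idea of an allowlist of root fields are all hypothetical choices, not existing behavior of `plan-validator.ts`.

```typescript
// Hypothetical static checks for a condition spec (illustration only).
const ALLOWED_OPERATORS = new Set(["eq", "neq", "gt", "lt", "in", "contains"]);
const FORBIDDEN_SEGMENTS = new Set(["__proto__", "prototype", "constructor"]);
const SEGMENT_PATTERN = /^[A-Za-z_][A-Za-z0-9_]*$/;
const MAX_PATH_DEPTH = 5; // Assumed bound; keeps path resolution cheap.

function validateConditionSpec(
  spec: { field: string; operator: string },
  allowedRoots: ReadonlySet<string>,
): string[] {
  const errors: string[] = [];

  if (!ALLOWED_OPERATORS.has(spec.operator)) {
    errors.push(`unsupported operator: ${spec.operator}`);
  }

  const segments = spec.field.split(".");
  if (segments.length > MAX_PATH_DEPTH) {
    errors.push(`field path too deep: ${spec.field}`);
  }
  for (const segment of segments) {
    if (!SEGMENT_PATTERN.test(segment) || FORBIDDEN_SEGMENTS.has(segment)) {
      errors.push(`invalid path segment: ${segment}`);
    }
  }

  // Only fields the source step is known to output may be referenced.
  const root = segments[0] ?? "";
  if (!allowedRoots.has(root)) {
    errors.push(`field root not allowed: ${root}`);
  }

  return errors;
}
```

The allowlist of roots would most naturally come from the source step's output schema, so a condition can only read what that step explicitly exposes.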

## Deferred Because

1. **Current focus**: Phases 1-3 focus on basic DAG execution with streaming
2. **Schema design**: Need to finalize which option (A, B, or C) to pursue
3. **Validation complexity**: Conditional edges require additional validation:
- All branches must be reachable
- No orphaned steps after conditions
- Conditions must be evaluable given step outputs
4. **UI implications**: Conditional branches need visualization support
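
As a rough illustration of the extra validation, the sketch below checks only the simplest structural property: every conditional edge must point at steps that exist and must offer at least one target. The function name `validateConditionalEdges` is hypothetical, and reachability and orphan analysis are deliberately left out.

```typescript
// Hypothetical structural check for conditional edges (illustration only).
interface ConditionalEdge {
  id: string;
  fromStepId: string;
  conditions: Array<{ toStepId: string }>;
  defaultStepId?: string;
}

function validateConditionalEdges(stepIds: string[], edges: ConditionalEdge[]): string[] {
  const known = new Set(stepIds);
  const errors: string[] = [];

  for (const edge of edges) {
    if (!known.has(edge.fromStepId)) {
      errors.push(`${edge.id}: unknown source step "${edge.fromStepId}"`);
    }

    // Collect every step this edge can route to, including the fallback.
    const targets = edge.conditions.map((route) => route.toStepId);
    if (edge.defaultStepId !== undefined) targets.push(edge.defaultStepId);

    if (targets.length === 0) {
      errors.push(`${edge.id}: edge has no targets`);
    }
    for (const target of targets) {
      if (!known.has(target)) {
        errors.push(`${edge.id}: unknown target step "${target}"`);
      }
    }
  }

  return errors;
}
```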

## Implementation Plan (Phase 4)

When we return to this:

1. **Design decision**: Choose between Option A, B, or C (likely A)
2. **Schema extension**: Add conditional edges to PlanSpec
3. **Validation**: Extend plan-validator.ts for conditional edge checks
4. **Compiler**: Implement `.branch()` generation from conditional edges
5. **Tests**: Add conditional branching test fixtures
6. **Topology**: Update topology-analyzer.ts to handle conditional paths
7. **Streaming**: Emit events for branch decisions
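
For step 4, the compiler would need to turn each conditional edge into the `[predicate, step]` pairs that `.branch()` accepts. The following is a sketch only: `toBranchEntries` and the injected `matches` evaluator are hypothetical, and the fallback predicate is written as "no explicit condition matched" as a conservative assumption, so it behaves the same whether or not the engine stops at the first matching branch.

```typescript
// Hypothetical edge-to-branch compilation (illustration only).
interface ConditionSpec {
  field: string;
  operator: "eq" | "neq" | "gt" | "lt" | "in" | "contains";
  value: unknown;
}

interface ConditionalEdge {
  id: string;
  fromStepId: string;
  conditions: Array<{ condition: ConditionSpec; toStepId: string }>;
  defaultStepId?: string;
}

type Predicate = (ctx: { inputData: unknown }) => Promise<boolean>;
type BranchEntry<Step> = [Predicate, Step];

function toBranchEntries<Step extends object>(
  edge: ConditionalEdge,
  stepsById: Map<string, Step>,
  matches: (spec: ConditionSpec, output: unknown) => boolean,
): BranchEntry<Step>[] {
  const entries: BranchEntry<Step>[] = [];
  const specs = edge.conditions.map((route) => route.condition);

  for (const route of edge.conditions) {
    const step = stepsById.get(route.toStepId);
    if (step === undefined) throw new Error(`unknown step: ${route.toStepId}`);
    entries.push([async ({ inputData }) => matches(route.condition, inputData), step]);
  }

  if (edge.defaultStepId !== undefined) {
    const fallback = stepsById.get(edge.defaultStepId);
    if (fallback === undefined) throw new Error(`unknown step: ${edge.defaultStepId}`);
    // Fires only when none of the explicit conditions matched.
    entries.push([async ({ inputData }) => !specs.some((spec) => matches(spec, inputData)), fallback]);
  }

  return entries;
}
```

The resulting array could then be passed to `workflow.branch(...)`, assuming the compiled steps are the same objects the workflow builder expects.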

## Related Files

- `src/mastra/tools/plan-compiler.ts` - Main compiler (needs `.branch()` support)
- `src/mastra/schemas/plan-spec.ts` - Schema (needs conditional edge types)
- `src/mastra/tools/plan-validator.ts` - Validation (needs edge validation)
- `src/mastra/tools/topology-analyzer.ts` - Analysis (needs conditional path handling)

## References

- Mastra workflow `.branch()` documentation
- Earlier conversation about decision points and surfaced uncertainty
- XState/Stately statechart patterns (for future consideration)
121 changes: 121 additions & 0 deletions apps/hash-ai-agent/_ai/wiki/deployment-requirements.md
@@ -0,0 +1,121 @@
# Deployment Requirements: Mastra Workflow State Management

> Technical requirements for deploying human-in-the-loop workflows with Mastra.
> Captured 2024-12-22. Source: Analysis of Mastra core (vNext/Evented model).

## Key Finding: Storage is Required for Human-in-the-Loop

**In-memory state is sufficient for single-run workflows**, but **storage is essential for suspend/resume across execution sessions**.

### Why This Matters

- **Single-run workflow**: Steps pass outputs in memory via an accumulated `stepResults` object. No database is involved.
- **Human-in-the-loop workflow**: A step calls `suspend()`, the workflow state is persisted to storage, and the process can terminate. Hours or days later, `resume()` loads the state from storage and continues.

Without storage configured, suspended workflows lose all state on process restart.

## Storage Requirements by Use Case

| Use Case | Storage Required? |
|----------|-------------------|
| Workflows completing in single execution | No |
| Human approval gates | **Yes** |
| Long-running workflows (survive restarts) | **Yes** |
| External webhook callbacks | **Yes** |
| Audit trail / workflow history queries | **Yes** |

## Configuring Storage

```typescript
import { Mastra } from "@mastra/core";
import { PostgresStore } from "@mastra/pg";

const mastra = new Mastra({
  storage: new PostgresStore({
    connectionString: process.env.DATABASE_URL
  })
});
```

### Supported Backends

- **PostgreSQL** - production recommended
- **LibSQL/SQLite** - local development, serverless edge
- **Custom** - implement `BaseStorage` interface

## Suspend/Resume Pattern

### Suspending a Workflow

```typescript
import { createStep } from "@mastra/core/workflows";
import { z } from "zod";

const approvalStep = createStep({
  id: "request-approval",
  inputSchema: z.object({ proposal: z.string() }),
  // Payload surfaced to the human reviewer while the run is suspended
  suspendSchema: z.object({
    reason: z.string(),
    context: z.record(z.unknown())
  }),
  // Shape of the data the reviewer must supply on resume
  resumeSchema: z.object({
    approved: z.boolean(),
    notes: z.string().optional()
  }),
  execute: async ({ inputData, resumeData, suspend }) => {
    // If not yet approved, suspend and wait
    if (!resumeData?.approved) {
      await suspend({
        reason: "Human approval required",
        context: { proposal: inputData.proposal }
      });
      return; // Execution stops here
    }

    // Resumed with approval
    return { approved: true, notes: resumeData.notes };
  }
});
```

### Resuming a Workflow

```typescript
// Later, when human approves via API/UI:
const run = await workflow.getRunById(runId);
const result = await run.resume({
  resumeData: { approved: true, notes: "LGTM" }
});
```

### Querying Suspended Workflows

```typescript
// Find workflows awaiting human input
const pending = await mastra.getStorage().getWorkflowRuns({
  workflowName: "approval-workflow",
  status: "suspended"
});
```

## vNext (Evented) Model

The evented execution model treats suspend/resume as first-class workflow states:

- **Suspend** publishes a `workflow.suspend` event and persists a snapshot
- **Resume** publishes a `workflow.resume` event, loads the snapshot, and continues from `suspendedPaths`
- **External systems** can subscribe to events for notifications

This model is cleaner for human-in-the-loop because state transitions are explicit events rather than implicit control flow.
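
To make the "explicit events" idea concrete, here is a toy, in-process model of the pattern. It is not Mastra's implementation: `EventBus`, `suspendRun`, `resumeRun`, and the in-memory `snapshots` map (standing in for a storage backend) are all hypothetical, and `suspendedPaths` is not modeled. Only the two event names come from the notes above.

```typescript
// Toy model of evented suspend/resume (illustration only, not Mastra code).
type WorkflowEvent =
  | { type: "workflow.suspend"; runId: string; snapshot: unknown }
  | { type: "workflow.resume"; runId: string; resumeData: unknown };

type Listener = (event: WorkflowEvent) => void;

class EventBus {
  private listeners: Listener[] = [];

  subscribe(listener: Listener): void {
    this.listeners.push(listener);
  }

  publish(event: WorkflowEvent): void {
    for (const listener of this.listeners) listener(event);
  }
}

const bus = new EventBus();
const snapshots = new Map<string, unknown>(); // Stand-in for persistent storage.
const notifications: string[] = [];

// Persistence subscriber: a suspend event is what causes the snapshot write.
bus.subscribe((event) => {
  if (event.type === "workflow.suspend") snapshots.set(event.runId, event.snapshot);
});

// External subscriber: e.g. a notifier that pings a reviewer.
bus.subscribe((event) => {
  notifications.push(`${event.type}:${event.runId}`);
});

function suspendRun(runId: string, snapshot: unknown): void {
  bus.publish({ type: "workflow.suspend", runId, snapshot });
}

function resumeRun(runId: string, resumeData: unknown): { snapshot: unknown; resumeData: unknown } {
  if (!snapshots.has(runId)) throw new Error(`no snapshot for run ${runId}`);
  bus.publish({ type: "workflow.resume", runId, resumeData });
  return { snapshot: snapshots.get(runId), resumeData };
}
```

Because persistence and notification are both just subscribers, adding another consumer (audit logging, metrics) would not require touching the suspend or resume paths.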

## Implementation Checklist

1. **Configure storage backend** (PostgreSQL for production)
2. **Define `suspendSchema`** - what context surfaces to human reviewer
3. **Define `resumeSchema`** - validate human input on resume
4. **Build resume trigger** - API endpoint or UI action
5. **Handle edge cases** - abandoned workflows, failed resumes, timeouts
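
For the abandoned-workflow edge case in item 5, one plausible approach is a periodic sweep that flags runs suspended longer than some limit. The sketch below is an assumption-laden illustration: the `SuspendedRun` shape, `findExpiredRuns`, and the idea of a fixed maximum age are hypothetical and not part of any Mastra API.

```typescript
// Hypothetical expiry sweep for suspended runs (illustration only).
interface SuspendedRun {
  runId: string;
  suspendedAt: Date;
}

// Returns runs that have been suspended for longer than maxAgeMs.
function findExpiredRuns(runs: SuspendedRun[], now: Date, maxAgeMs: number): SuspendedRun[] {
  return runs.filter((run) => now.getTime() - run.suspendedAt.getTime() > maxAgeMs);
}
```

What happens to an expired run (cancel, escalate, or re-notify the reviewer) is a separate policy decision.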

## Open Questions

1. **Storage backend**: PostgreSQL vs LibSQL for our deployment?
2. **Resume mechanism**: API endpoint, UI component, or event subscription?
3. **Suspend payload design**: What information do human reviewers need?
4. **Workflow lifecycle**: How to handle abandoned/expired suspended workflows?