@ChrisRackauckas-Claude

Summary

  • Fixes the issue where Enzyme AD with sparse matrices corrupts the primal matrix's sparsity pattern (rowval, colptr)
  • Adds sparse-safe helper functions that operate directly on nzval arrays instead of using broadcast operations that can change sparsity
  • For dense matrices, falls back to standard broadcast operations

Root Cause

Enzyme.make_zero creates shadow sparse matrices that share the structural arrays (rowval, colptr) with the primal matrix. When the reverse rule executes a broadcast such as dA .-= z * transpose(y), the outer product on the right-hand side is generally dense, so the update can insert new stored entries and change the sparsity pattern of the shadow matrix. Because the structural arrays are shared, this also corrupts the primal matrix's structure.

Evidence from investigation:

Primal rowval === shadow rowval? true
Primal colptr === shadow colptr? true
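
The aliasing hazard can be illustrated without Enzyme. The following is a minimal sketch, not the reproduction from the issue: it builds a "shadow" by hand that reuses the primal's colptr and rowval (an assumption standing in for what Enzyme.make_zero is reported to do above), and it uses a scalar setindex! instead of the broadcast from the reverse rule as the structure-changing operation.

```julia
using SparseArrays

# Primal matrix: 3x3 diagonal with three stored entries.
A = sparse([1, 2, 3], [1, 2, 3], [1.0, 2.0, 3.0])

# Hand-built stand-in for the shadow: same colptr/rowval objects, fresh nzval.
# (Assumption for illustration; this is not a call into Enzyme.)
dA = SparseMatrixCSC(size(A, 1), size(A, 2), A.colptr, A.rowval,
                     zeros(length(A.nzval)))

println("Primal rowval === shadow rowval? ", A.rowval === dA.rowval)
println("Primal colptr === shadow colptr? ", A.colptr === dA.colptr)

# Writing to a position that is not stored inserts a new entry into the
# shadow, which mutates the shared colptr and rowval in place.
dA[2, 1] = 5.0

# The primal now claims more stored entries than its nzval buffer holds.
println("nnz(A) = ", nnz(A), ", length(A.nzval) = ", length(A.nzval))
```

After the write, the primal's structural arrays describe one more stored entry than its value buffer contains, which is the kind of inconsistency the buffer check in SparseArrays rejects.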

Solution

Add sparse-safe helper functions that dispatch on AbstractSparseMatrix:

  1. _safe_add!(dst, src): For sparse matrices, adds values via dst.nzval .+= src.nzval, which relies on dst and src having the same sparsity pattern
  2. _safe_zero!(A): For sparse matrices, zeros the stored values via fill!(A.nzval, 0)
  3. _sparse_outer_sub!(dA, z, y): For sparse matrices, accumulates gradients only into existing stored positions, using a direct loop over the CSC structure

These preserve the sparsity pattern by operating only on the nzval array.
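
A rough sketch of what the three helpers described above might look like is given below. The function names come from this PR, but the bodies are an assumption written for illustration and may differ from the merged code. In particular, the sketch dispatches on SparseMatrixCSC rather than AbstractSparseMatrix so that rowvals, nonzeros, and nzrange are guaranteed to apply.

```julia
using SparseArrays

# Dense fallback: plain broadcast.
_safe_add!(dst, src) = (dst .+= src; dst)

# Sparse version: touches only the stored values.
# Assumes dst and src have identical sparsity patterns.
function _safe_add!(dst::SparseMatrixCSC, src::SparseMatrixCSC)
    dvals = nonzeros(dst)
    dvals .+= nonzeros(src)
    return dst
end

# Dense fallback: overwrite every element with zero.
_safe_zero!(A) = fill!(A, zero(eltype(A)))

# Sparse version: zero the stored values, keep colptr/rowval untouched.
function _safe_zero!(A::SparseMatrixCSC)
    fill!(nonzeros(A), zero(eltype(A)))
    return A
end

# Dense fallback: full outer-product update.
_sparse_outer_sub!(dA, z, y) = (dA .-= z * transpose(y); dA)

# Sparse version: dA[i, j] -= z[i] * y[j], but only at stored positions.
function _sparse_outer_sub!(dA::SparseMatrixCSC, z::AbstractVector, y::AbstractVector)
    rows = rowvals(dA)
    vals = nonzeros(dA)
    for j in axes(dA, 2)          # loop over columns
        for k in nzrange(dA, j)   # indices of stored entries in column j
            vals[k] -= z[rows[k]] * y[j]
        end
    end
    return dA
end

# Usage: the structural arrays are identical before and after the update.
A  = sparse([1, 2, 3], [1, 2, 3], [1.0, 2.0, 3.0])
dA = _safe_zero!(copy(A))
colptr0, rowval0 = copy(dA.colptr), copy(dA.rowval)
_sparse_outer_sub!(dA, [1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
println(dA.colptr == colptr0 && dA.rowval == rowval0)
```

Restricting the update to stored positions means gradient contributions at structural zeros are dropped, which matches the goal stated above of keeping the shadow's pattern fixed.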

Test plan

  • Verify sparsity pattern is preserved after AD (rowval, colptr, nnz unchanged)
  • Verify the matrix can still be displayed without crashing with AssertionError: _goodbuffers(S)
  • Verify dense matrix Enzyme tests still pass (note: existing tests have pre-existing failures unrelated to this change)
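
The first two checks could be expressed along the following lines. This is only a sketch: pattern_snapshot, pattern_preserved, and displays_cleanly are hypothetical helper names, not functions from this PR or from SparseArrays, and the Enzyme call itself is left out.

```julia
using SparseArrays

# Hypothetical helper: record the structure of A before differentiation.
pattern_snapshot(A::SparseMatrixCSC) =
    (colptr = copy(A.colptr), rowval = copy(A.rowval), nnz = nnz(A))

# Hypothetical helper: compare the current structure against a snapshot.
function pattern_preserved(A::SparseMatrixCSC, snap)
    return A.colptr == snap.colptr && A.rowval == snap.rowval && nnz(A) == snap.nnz
end

# Hypothetical helper: true if rendering A as text does not throw.
function displays_cleanly(A)
    try
        sprint(show, MIME("text/plain"), A)
        return true
    catch
        return false
    end
end

A = sparse([1, 2, 3], [1, 2, 3], [1.0, 2.0, 3.0])
snap = pattern_snapshot(A)
# ... run the Enzyme differentiation involving A here (omitted) ...
println(pattern_preserved(A, snap) && displays_cleanly(A))
```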

Fixes #835

🤖 Generated with Claude Code

This fix addresses the issue where Enzyme AD with sparse matrices causes
the primal matrix's sparsity pattern (rowval, colptr) to be corrupted.

## Root Cause
Enzyme.make_zero shares the structural arrays (rowval, colptr) between
the primal and shadow sparse matrices. When broadcast operations like
`dA .-= z * transpose(y)` modify the shadow's sparsity pattern, they
inadvertently corrupt the primal's structure as well.

## Solution
Add sparse-safe helper functions that operate directly on nzval arrays:
- `_safe_add!`: Adds sparse matrices by operating on nzval
- `_safe_zero!`: Zeros sparse matrices by operating on nzval
- `_sparse_outer_sub!`: Accumulates outer product gradients only into
  existing non-zero positions

These functions dispatch on AbstractSparseMatrix for sparse matrices
and fall back to standard broadcast operations for dense matrices.

## Changes
- Import SparseArrays types for dispatch
- Replace broadcast operations with sparse-safe versions in forward
  and reverse rules
- Add documentation for new helper functions

Fixes SciML#835

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

Enzyme + SparseArrays + LinearSolve = Not Happy
