Fix Enzyme sparse matrix sparsity pattern corruption (issue #835) #860
+81
−12
Summary
`nzval` arrays instead of using broadcast operations that can change sparsity
Root Cause
`Enzyme.make_zero` creates shadow sparse matrices that share the structural arrays (`rowval`, `colptr`) with the primal matrix. When the reverse rule executes broadcast operations like `dA .-= z * transpose(y)`, these can change the sparsity pattern of the shadow matrix. Because the structural arrays are shared, the change also corrupts the primal matrix's structure.
Evidence from investigation:
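The aliasing hazard can be illustrated without Enzyme. The following is a minimal sketch, not the reproduction from the investigation: it assumes a hand-built shadow `dA` that reuses the primal's `colptr` and `rowval` objects (a stand-in for what `Enzyme.make_zero` is described as producing), and it uses a scalar insertion as a stand-in for any operation that changes the sparsity pattern.

```julia
using SparseArrays

# Primal matrix with two stored entries on the diagonal.
A = sparse([1, 2], [1, 2], [1.0, 2.0], 2, 2)

# Hypothetical stand-in for the shadow: same colptr/rowval objects, fresh nzval.
dA = SparseMatrixCSC(2, 2, A.colptr, A.rowval, zeros(2))
shared = dA.rowval === A.rowval && dA.colptr === A.colptr

# Storing a new entry in the shadow edits the shared structural arrays in place.
dA[2, 1] = 1.0

# The primal's structure now claims more stored entries than A.nzval holds.
claimed = A.colptr[end] - 1
held = length(A.nzval)
println((shared = shared, claimed = claimed, held = held))
```

Once `claimed` and `held` disagree, the primal's buffers are inconsistent, which is the kind of state a buffer-consistency assertion would reject.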
Solution
Add sparse-safe helper functions that dispatch on `AbstractSparseMatrix`:
- `_safe_add!(dst, src)`: For sparse matrices, adds values via `dst.nzval .+= src.nzval`
- `_safe_zero!(A)`: For sparse matrices, zeros via `fill!(A.nzval, 0)`
- `_sparse_outer_sub!(dA, z, y)`: For sparse matrices, only accumulates gradients into existing non-zero positions using a direct loop over the CSC structure
These preserve the sparsity pattern by operating only on the `nzval` array.
Test plan
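A check along the following lines could exercise the helpers described under Solution. This is a sketch only: the helper names come from the description above, but the bodies, the dense fallback methods, and the assumption that `dst` and `src` share one sparsity pattern are illustrative guesses, not the code in this PR.

```julia
using SparseArrays

# Dense fallbacks (assumed; the description only covers the sparse methods).
_safe_add!(dst, src) = (dst .+= src; dst)
_safe_zero!(A) = fill!(A, zero(eltype(A)))
_sparse_outer_sub!(dA, z, y) = (dA .-= z * transpose(y); dA)

# Sparse methods touch only the stored values, never rowval/colptr.
function _safe_add!(dst::AbstractSparseMatrix, src::AbstractSparseMatrix)
    # Assumes dst and src have identical sparsity patterns.
    dvals = nonzeros(dst)
    dvals .+= nonzeros(src)
    return dst
end

_safe_zero!(A::AbstractSparseMatrix) = (fill!(nonzeros(A), zero(eltype(A))); A)

function _sparse_outer_sub!(dA::SparseMatrixCSC, z::AbstractVector, y::AbstractVector)
    rows = rowvals(dA)
    vals = nonzeros(dA)
    # Walk the CSC structure and update only entries that are already stored.
    for j in axes(dA, 2), k in nzrange(dA, j)
        vals[k] -= z[rows[k]] * y[j]
    end
    return dA
end

# Usage: the structural arrays are identical before and after.
dA = sparse([1, 2], [1, 2], [1.0, 1.0], 2, 2)
colptr_before, rowval_before = copy(dA.colptr), copy(dA.rowval)
_safe_zero!(dA)
_sparse_outer_sub!(dA, [1.0, 2.0], [3.0, 4.0])
pattern_kept = dA.colptr == colptr_before && dA.rowval == rowval_before
```

Gradient contributions that fall outside the stored pattern are dropped by the loop, which is the trade-off for leaving `rowval` and `colptr` untouched.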
`AssertionError: _goodbuffers(S)` crash
Fixes #835
🤖 Generated with Claude Code