Skip to content

Feature sparse linalg solvers#2841

Open
abagusetty wants to merge 79 commits into
IntelPython:masterfrom
abagusetty:feature-sparse-linalg-solvers
Open

Feature sparse linalg solvers#2841
abagusetty wants to merge 79 commits into
IntelPython:masterfrom
abagusetty:feature-sparse-linalg-solvers

Conversation

@abagusetty
Copy link
Copy Markdown
Contributor

@abagusetty abagusetty commented Apr 9, 2026

Adds support for from dpnp.scipy.sparse.linalg import LinearOperator, cg, gmres, minres
Fixes: #2831

  • Have you provided a meaningful PR description?
  • Have you added a test, reproducer or referred to an issue with a reproducer?
  • Have you tested your changes locally for CPU and GPU devices?
  • Have you made sure that new changes do not introduce compiler warnings?
  • Have you checked performance impact of proposed changes?
  • Have you added documentation for your changes, if necessary?
  • Have you added your changes to the changelog?

abagusetty and others added 30 commits April 2, 2026 12:52
…oneMKL hooks

- _interface.py: add full operator algebra (.H, .T, +, *, **, neg),
  _AdjointLinearOperator, _TransposedLinearOperator, _SumLinearOperator,
  _ProductLinearOperator, _ScaledLinearOperator, _PowerLinearOperator,
  IdentityOperator, MatrixLinearOperator, _AdjointMatrixOperator,
  _CustomLinearOperator factory dispatch; extend aslinearoperator
  to handle dpnp sparse and dense arrays

- _iterative.py: add _make_system (dtype validation, preconditioner
  wiring, working dtype selection); add _make_fast_matvec CSR/oneMKL
  SpMV hook; fix GMRES Arnoldi inner product to single oneMKL BLAS
  gemv (dpnp.dot) instead of slow Python vdot loop; offload
  Hessenberg lstsq to numpy.linalg.lstsq (CPU, matches CuPy);
  fix SciPy host-fallback tol->rtol deprecation via _scipy_tol_kwarg;
  add preconditioner support to CG; keep MINRES as SciPy-backed stub

Refs: CuPy v14.0.1 cupyx/scipy/sparse/linalg/_interface.py,
      cupyx/scipy/sparse/linalg/_iterative.py"
…gmres, minres

Modeled after CuPy's cupyx_tests/scipy_tests/sparse_tests/test_linalg.py.
Covers:
  - LinearOperator: shape, dtype inference, matvec/rmatvec/matmat,
    subclassing, __matmul__, __call__, edge cases
  - aslinearoperator: dense array, duck-type, identity passthrough,
    rmatvec from dense, invalid inputs
  - cg: SPD convergence, scipy reference match, x0 warm start, b_ndim=2,
    callback, atol, LinearOperator path, invalid inputs,
    non-convergence info check
  - gmres: diag-dominant convergence, scipy reference match, restart
    variants, x0, b_ndim=2, callbacks, complex systems, atol,
    non-convergence info check, Hilbert-matrix stress test
  - minres: SPD, symmetric-indefinite, scipy reference, shift parameter,
    non-square guard, LinearOperator path, callback
  - Integration: parametric (n, dtype) cross-solver tests via LinearOperator
  - Import smoke tests: __all__ completeness
- Use dpnp.tests.helper: assert_dtype_allclose, generate_random_numpy_array,
  get_all_dtypes, get_float_complex_dtypes, has_support_aspect64
- Use dpnp.tests.third_party.cupy testing harness (with_requires, etc.)
- Use numpy.testing assert_allclose / assert_array_equal / assert_raises
- Use dpnp.asnumpy() instead of numpy.asarray()
- Use pytest parametrize ids matching existing test conventions
- Use is_scipy_available() helper from tests/helper.py
- Strict class-per-solver organisation matching TestCholesky / TestDet etc.
…or dtype

Two bugs fixed:
1. _init_dtype() was calling dpnp.zeros(n) which defaults to float64,
   so a float32 matvec would upcast and return float64, making the
   inferred dtype wrong.  Fix: use dpnp.zeros(n, dtype=dpnp.int8) as
   SciPy/CuPy do — any numeric matvec will promote int8 to its own dtype.
2. _CustomLinearOperator.__init__ called _init_dtype() even when an
   explicit dtype was already supplied, overwriting the caller's value.
   Fix: _init_dtype() now short-circuits when self.dtype is already set.
…ption handling

Align gemv.cpp with the conventions established in blas/gemm.cpp:

Headers added:
- ext/common.hpp         (dpctl_td_ns, consistent with other extensions)
- utils/memory_overlap.hpp   (MemoryOverlap guard on x vs y)
- utils/output_validation.hpp (CheckWritable + AmpleMemory on y)
- utils/type_utils.hpp       (validate_type_for_device<T> in impl)
- <sstream>                  (needed for stringstream error_msg)

Exception handling added in sparse_gemv_impl():
- try/catch(oneapi::mkl::exception) around all oneMKL sparse calls
- try/catch(sycl::exception) around all oneMKL sparse calls
- release_matrix_handle cleanup in the exception error path
- throw std::runtime_error with descriptive message on catch

Input validation added in sparse_gemv():
- ndim checks: x and y must be 1-D
- queues_are_compatible() across all 5 USM arrays
- MemoryOverlap()(x, y) aliasing guard
- CheckWritable::throw_if_not_writable(y)
- AmpleMemory::throw_if_not_ample(y, num_rows)
- keep_args_alive() at function exit (was missing, returning empty event)
… table

Modeled after blas/gemm.cpp (2-D table: value type x index type) and
blas/gemv.cpp (dispatch vector pattern with ContigFactory + init_dispatch_table).

Changes:
- Add sparse/types_matrix.hpp with SparseGemvTypePairSupportFactory<Tv, Ti>
  encoding the 4 supported combinations: {float32,float64} x {int32,int64}
- Rewrite sparse_gemv_impl() to take typeless char* pointers (matching
  the blas gemv_impl signature style) — type info flows through template
  params only, no runtime branching inside the impl
- Replace the 60-line if/else val_typenum/idx_typenum chain in sparse_gemv()
  with a 2-D dispatch table lookup (gemv_dispatch_table[val_id][idx_id])
- Rename init_sparse_gemv_dispatch_vector -> init_sparse_gemv_dispatch_table
  and implement it via init_dispatch_table<> from ext/common.hpp
- All validation guards and exception handling from prior commit are preserved
…se_gemv_dispatch_table

Follows the rename made in gemv.cpp when the dispatch mechanism was
changed from a 1-D vector to a 2-D table (value type x index type).
All other declarations (sparse_gemv signature, parameters) are unchanged.
The oneMKL 2025-2 sparse BLAS API deprecated the old 8-argument
set_csr_data(queue, handle, nrows, ncols, index_base, row_ptr, col_ind,
values, deps) overload in favour of a new signature that takes the
sparse matrix handle as `spmat` and adds an explicit `nnz` argument:

  set_csr_data(queue, spmat, nrows, ncols, nnz, index_base,
               row_ptr, col_ind, values, deps)

Fixes:
- Replace old set_csr_data call with the new nnz-aware signature
- Silences the resulting -Wunused-parameter warning on `nnz` (now used)
- No functional change; all other logic is unchanged
…tring

Line 477: `hasattr(A, "rmatmat\")` had a Markdown-escaped backslash
leaked into the Python source, causing an unterminated string literal.
Fixed to `hasattr(A, "rmatmat")`.
dpnp.ndarray blocks implicit NumPy conversion via __array__ to prevent
silent dtype=object arrays. All test assertions must use .asnumpy()
to materialize device arrays onto the host explicitly.

Also replaces numpy.asarray(x_dp) in _rel_residual helper.
…dation order

- _iterative.py: raise NotImplementedError for M != None *before* the
  _HOST_N_THRESHOLD SciPy fast-path in cg() and gmres(), so the contract
  is enforced regardless of system size (fixes test_cg_preconditioner_unsupported_raises,
  test_gmres_preconditioner_unsupported_raises).
- _iterative.py: validate callback_type and raise NotImplementedError for
  'pr_norm' *before* the _HOST_N_THRESHOLD branch in gmres(), so small-n
  systems also see the error (fixes test_gmres_callback_type_pr_norm_raises).
- _iterative.py: pass callback_type='legacy' to scipy.sparse.linalg.gmres
  when delegating on the fast path to suppress SciPy DeprecationWarning.
- test_scipy_sparse_linalg.py: add dtype=numpy.float64 to expected arange()
  calls in test_identity_operator and test_gmres_happy_breakdown so strict
  NumPy 2.0 dtype-equality checks pass (float64 result vs int64 expected).
- Replace .asnumpy() method calls with dpnp.asnumpy() module fn
  (asnumpy is not an ndarray method in dpnp; it is a top-level fn)
- Fix dpnp.any(x) ambiguous truth value in x0 zero-check; replace
  with explicit `x0 is not None` guard for r0 initialisation
- Fix V_mat.T.conj() -> dpnp.conj(V_mat.T) in GMRES Arnoldi step
- Guard minres beta sqrt against tiny negative floats: sqrt(abs(...))
- Unify GMRES Hessenberg h_np assignment to avoid .real stripping
  producing wrong dtype for complex systems
- Fix float() cast on dpnp scalar norm inside GMRES inner h_j1 line
…failures)

The committed code used hypot(gbar, oldb) as delta_k which is the
gamma (norm) from the PREVIOUS rotation step, not the correct diagonal
entry from applying the previous Givens rotation to the current column.

The correct Paige-Saunders (1975) two-rotation recurrence is:

  oldeps = epsln
  delta  = cs * dbar + sn * alpha   # apply previous rotation
  gbar_k = sn * dbar - cs * alpha   # residual -> new rotation input
  epsln  = sn * beta
  dbar   = -cs * beta

  gamma = hypot(gbar_k, beta)       # NEW rotation eliminates beta
  cs    = gbar_k / gamma
  sn    = beta   / gamma

  w_new = (v - oldeps*w - delta*w2) / gamma  # three-term update

This matches scipy.sparse.linalg.minres and Choi (2006) eq. 6.11.

The buggy recurrence produced solutions ~1.08x away from the true
solution (rel_err ~1e0) instead of the expected ~1e-13.

Co-authored-by: fix-minres-recurrence
@abagusetty
Copy link
Copy Markdown
Contributor Author

@antonwolfy Apologies for the delay. I got side tracked for a bit. The current workflow is failing with a sycl-post-link issue with 2026.0.0 compiler version during the link of libdpnp_backend_c.so and I see an other workflows are also affected by this.

Is this a known limitation and being triaged by compiler team

@abagusetty
Copy link
Copy Markdown
Contributor Author

Seems like it is: #2887

@antonwolfy
Copy link
Copy Markdown
Contributor

@abagusetty, yep, it's expected for the coverage due to known compiler bug, we will implement the w/a in #2887
Other workflows should pass now.

Comment thread dpnp/backend/extensions/sparse/CMakeLists.txt Outdated

pybind11_add_module(${python_module_name} MODULE ${_module_src})
add_sycl_to_target(TARGET ${python_module_name} SOURCES ${_module_src})

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# Ensure Cython modules build first so _usmarray.h exists
add_dependencies(${python_module_name} _usmarray)

Comment thread dpnp/backend/extensions/sparse/gemv.cpp Outdated
Comment thread dpnp/backend/extensions/sparse/gemv.cpp Outdated
Comment on lines +38 to +39
// dpctl namespace alias for type dispatch utilities
namespace dpctl_td_ns = dpnp::tensor::type_dispatch;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// dpctl namespace alias for type dispatch utilities
namespace dpctl_td_ns = dpnp::tensor::type_dispatch;
// namespace for operations with types
namespace dpnp_td_ns = dpnp::tensor::type_dispatch;

Comment thread dpnp/backend/extensions/sparse/types_matrix.hpp Outdated
Comment thread dpnp/backend/extensions/sparse/gemv.cpp Outdated
Comment thread CHANGELOG.md
* Added implementation of `dpnp.scipy.linalg.lu` (SciPy-compatible) [#2787](https://github.com/IntelPython/dpnp/pull/2787)
* Added support for ndarray subclassing via `dpnp.ndarray.view` method with `type` parameter [#2815](https://github.com/IntelPython/dpnp/pull/2815)
* Migrated tensor implementation from `dpctl.tensor` into `dpnp.tensor`, making `dpnp` the primary owner of the Array API-compliant tensor layer [#2856](https://github.com/IntelPython/dpnp/pull/2856)
* Added implementation of `dpnp.scipy.sparse.linalg import LinearOperator, cg, gmres, minres` [#2841](https://github.com/IntelPython/dpnp/pull/2841)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to move to 0.21.0 release

Comment thread dpnp/scipy/sparse/linalg/_interface.py Outdated
# Lazy import: dpnp.scipy.sparse may import this module during
# package initialisation, so we cannot import it at module scope.
# pylint: disable=import-outside-toplevel
from dpnp.scipy.sparse import issparse
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems issparse is missed and not implemented currently

# Helpers for constructing SPD, diagonally dominant, and symmetric
# indefinite test matrices. Kept small and local, matching the style of
# vvsort() at the top of test_linalg.py.
def _spd_matrix(n, dtype):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add CuPy tests to dpnp/tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py to check the compatibly?

abagusetty and others added 13 commits May 26, 2026 10:29
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
Co-authored-by: Anton <100830759+antonwolfy@users.noreply.github.com>
Closes the performance gap where csr_matrix.dot densified the matrix
via toarray() before every dpnp.dot, defeating the point of CSR
outside the solver fast path. cupyx routes csr_matrix.dot directly
through cuSPARSE SpMV; we now do the equivalent with oneMKL.

_csr.py
  - Add _ensure_spmv_handle: lazily allocates a oneMKL matrix_handle
    (set_csr_data + optimize_gemv, once) and caches it on the instance.
    Skipped when the compiled backend extension is absent or the
    (value, index) dtype pair is not in the dispatch table -- the
    instance silently uses the dense fallback in that case.
  - Add _spmv_supported predicate; tighter than the old check in
    _make_fast_matvec because it now validates both value AND index
    dtype against the C++ dispatch table, preventing a runtime
    ValueError from the backend for unsupported index types.
  - Rewrite dot(x) to dispatch sparse::gemv for 1-D x of matching
    dtype; preserve the dense fallback for 2-D x (batched SpMM is a
    separate oneMKL entry point not wired up yet).
  - Add __del__ to release the cached handle. The handle attrs are
    initialised at the very top of __init__ so __del__ never sees a
    partially constructed object even if an _init_* helper raises.
  - Add __matmul__ = dot so 'A @ x' takes the same fast path.

linalg/_iterative.py
  - _CachedSpMVPair.matvec now drives the csr's own cached handle
    instead of building a second forward handle, so a user-issued
    A.dot(x) and a solver-issued matvec(x) share a single handle.
    The adjoint handle is still owned by _CachedSpMV because csr
    only caches the forward direction.
  - _make_fast_matvec collapses to a probe of A._ensure_spmv_handle:
    one None-check replaces three (extension import, dtype guard,
    handle init try/except). The downstream behaviour is identical
    but the dtype check now covers indices, not just values.

linalg/_interface.py
  - Rewrite aslinearoperator in the canonical SciPy/CuPy dispatch
    order: LinearOperator -> sparse -> dense ndarray -> duck-type.
    Remove the unreachable second isinstance(dpnp.ndarray) block and
    the conflicting ndim policy (was both <=2 and ==2 in different
    branches). Promote 1-D dpnp arrays via atleast_2d to match CuPy.
  - Explicitly reject numpy.ndarray with a directed message instead
    of falling through to the generic 'type not understood'. Silent
    host->device upload would mask real device/queue selection bugs.
  - Drop the try/except (ImportError, AttributeError) wrapping the
    sparse import. dpnp.scipy.sparse is always present in this build;
    swallowing those exceptions hid real failures.

_base.py
  - Docstring only: document why the legacy isspmatrix / isspmatrix_csr
    aliases are intentionally omitted (dpnp has no spmatrix vs sparray
    discriminator, and modern SciPy/CuPy code is steered toward
    issparse + A.format == 'csr').
…; add cupyx tests

Five bug fixes informed by a full audit of LinearOperator, cg, gmres,
and minres against the upstream SciPy and cupyx reference
implementations, plus the cupyx-style compatibility tests requested in
the PR review.

_interface.py
  - _init_dtype: replace the float64 trial vector with int8. The
    float64 path silently upcast every dtype-inferred operator to
    float64, breaking float32 and complex64 workflows and contradicting
    the commit-message claim of an earlier fix that did not actually
    land. Matches scipy/cupyx semantics: int8 is the lowest-precedence
    numeric dtype and lets the matvec promote to its natural output
    dtype.
  - matvec / rmatvec: turn the second line of the dimension-mismatch
    error message into an f-string. Previously reported the literal
    text '{x.shape}' to users.

_iterative.py (cg)
  - Fix the info contract: previously returned -1 for an rz/pAp
    breakdown. SciPy and cupyx reserve info < 0 for illegal-input
    errors only and use info > 0 for any non-convergence (maxiter
    exhausted, breakdown, etc.). User code that branched on
    'if info > 0' was silently broken. Now report the iteration index
    at which the solver gave up.
  - Document the info contract in the docstring so the regression
    cannot resurface unnoticed.

_iterative.py (gmres)
  - Cast b_norm to host once via float(...) instead of leaving it as a
    0-D device tensor. The 'if b_norm == 0.0' guard and the per-iter
    'if r_norm <= atol' comparison previously triggered an implicit
    __bool__ sync each time they evaluated -- exactly the sync pattern
    the module docstring claims to avoid. Track r_norm_host explicitly
    so the loop condition and the post-loop info check operate on
    host scalars; the residual norm is still computed device-side via
    dpnp.linalg.norm.
  - Final 'not bool(r_norm <= atol)' replaced with the equivalent host
    comparison 'r_norm_host > atol'.

_iterative.py (minres)
  - Cast beta1 = float(dpnp.inner(r1, y)) before the < 0 and == 0
    guards. Previously the guards compared a 0-D device tensor to a
    Python int, forcing an implicit sync inside each branch.
  - Use numpy.sqrt on the already-host beta1 instead of dpnp.sqrt +
    float() round-trip.

tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py
  - New file (with sibling __init__.py for the package) requested in
    the PR review thread. Modelled on
    cupy/tests/cupyx_tests/scipy_tests/sparse_tests/test_linalg.py and
    fitted to dpnp's in-tree testing harness (dpnp as cupy, the
    dpnp.tests.third_party.cupy.testing helpers, scipy as the reference
    via testing.with_requires('scipy')).
  - Coverage:
      * LinearOperator: explicit-dtype preservation, int8 dtype
        inference, dimension-mismatch raise, __matmul__ dispatch, H
        property.
      * aslinearoperator: pass-through, dense dpnp.ndarray wrap,
        csr_matrix wrap, explicit numpy.ndarray rejection.
      * cg: dense-SPD convergence vs scipy, x0 warm-start, info > 0
        when maxiter is hit (guards the breakdown-contract fix above),
        zero-rhs returns zero.
      * gmres: diag-dominant convergence vs scipy, restart parameter,
        info > 0 when maxiter is hit.
      * minres: symmetric-indefinite convergence vs scipy, shift
        parameter, zero-rhs returns zero.
      * Module surface: import smoke test for the five PR-promised
        names (LinearOperator, aslinearoperator, cg, gmres, minres).
…ost Hessenberg lstsq

Closes audit items IntelPython#9, IntelPython#15, IntelPython#16, IntelPython#17, IntelPython#19 from the prior solver review.
All four touch the iterative-solver inner loop and were left out of the
previous correctness-only commit because they require a new C++ binding.

backend/extensions/blas/gemv.{cpp,hpp}, blas_py.cpp
  - Extend the typed gemv_impl signature with alpha and beta as
    doubles; the per-T impl casts them to the matrix value type at
    dispatch time. dpnp.dot and other legacy callers are unaffected
    -- the existing gemv() public function now forwards (1.0, 0.0) to
    the shared gemv_dispatch helper.
  - Add gemv_alpha_beta() entry point and a _gemv_alpha_beta pybind
    method computing y = alpha * op(A) * x + beta * y for caller-
    supplied scalars. Required by the GMRES Arnoldi fast path which
    fires gemv with (alpha=1, beta=0) writing into a Hessenberg
    column slice, then (alpha=-1, beta=1) fusing u -= V @ h into
    one kernel. For complex matrices the scalars are always one of
    {-1, 0, 1} and so survive the cast exactly; the impl docstring
    flags the silent imag-loss caveat for any other complex caller.
  - Refactor: hoist the existing validation / strides / dispatch
    plumbing into a static gemv_dispatch helper so both entry points
    share identical behaviour without duplication.

scipy/sparse/linalg/_iterative.py
  - _make_compute_hu now takes both V and H. The closure writes the
    projection coefficients h = V[:, :j+1]^H @ u directly into the
    Hessenberg column slice H[:j+1, j] via a single gemv with the
    output pointing at that slice -- no intermediate h buffer, no
    slice-assign copy (audit item IntelPython#16). Pass 2 fuses the AXPY-style
    update u = -V @ h + 1 * u into one gemv with alpha=-1, beta=1 --
    no tmp buffer, one kernel instead of gemv-plus-subtract (audit
    item IntelPython#19). For complex V the (j+1)-element h slice is conjugated
    in-place between the two passes (V^T -> V^H), negligible cost
    next to the n*(j+1) gemv.
  - Switch the Hessenberg least-squares H y = e from a device
    dpnp.linalg.lstsq (which dispatches an SVD kernel for a tiny
    21x20 problem per restart) to numpy.linalg.lstsq on the host
    (audit item IntelPython#17). Matches CuPy's choice and removes a device-
    side SVD launch that on Intel GPU dominates the restart cost
    for the default restart=20. RHS e is now maintained as a numpy
    array; H is copied via dpnp.asnumpy once per restart and the
    resulting y is shipped back as a dpnp array for the V @ y
    update.
  - V[:, j+1] = v retained as a single contiguous USM slice store
    (audit item IntelPython#15 closes as no-change-required: the assignment is
    already one dpnp op on an F-order buffer and there is no fused
    'normalise-then-store' path without further binding work).
  - cg per-iter syncs collapsed from 3-4 down to 1 (audit item IntelPython#9).
    The pAp and rz_new breakdown checks are no longer transferred
    to the host on every iteration; instead the loop relies on
    IEEE-754 inf / NaN propagation through alpha = rz / pAp. When
    pAp underflows the resulting alpha is inf or NaN, poisons the
    next residual via x += alpha * p and r -= alpha * Ap, and the
    single norm sync at the top of the next iteration detects the
    breakdown via numpy.isfinite(rnorm_host) and exits with
    info > 0. Mirrors the cuBLAS-style CG inner loop (nrm2 + scalar
    test, one host barrier per iter); the initial rz breakdown
    guard remains so a zero preconditioned residual still short-
    circuits correctly.

tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py
  - test_gmres_complex_arnoldi_fast_path: complex-dtype regression
    guard for the conjugate-in-place branch of _make_compute_hu --
    a silent miss of the conjugate would lose orthogonality and
    misconverge.
  - test_cg_inf_breakdown_returns_positive_info: regression guard
    for the per-iter-sync collapse. Runs cg on a deliberately
    singular SPD operator and asserts info > 0 (not zero, not -1)
    so a future re-introduction of explicit breakdown syncs would
    still pass but a regression to the old info contract would not.
… __del__ against shutdown races

Closes audit items IntelPython#5 and IntelPython#29 from the prior solver review. Item IntelPython#4
(_matmat default uses a per-column matvec loop) is closed as wontfix:
SciPy's scipy.sparse.linalg.LinearOperator and cupyx's analogue both
ship the same hstack-of-matvecs default, so dpnp matches the
reference exactly and there is no portable improvement to make
without subclass-level _matmat overrides (which _CustomLinearOperator
already exposes via its matmat= constructor argument).

scipy/sparse/linalg/_interface.py
  - Set __array_ufunc__ = None on the LinearOperator base class.
    This is the SciPy contract: a host numpy.ndarray on the left of
    np_array * linop or np_array @ linop previously triggered
    NumPy's ufunc dispatch first, which would attempt to broadcast
    the operator element-wise before falling back to its reflected
    operator method -- producing either an opaque error or a wrong-
    typed result. With __array_ufunc__ = None NumPy returns
    NotImplemented from the ufunc protocol and Python's operator
    dispatch falls through cleanly to LinearOperator.__rmul__ /
    __rmatmul__. dpnp.ndarray itself sets __array_ufunc__ = None
    (see dpnp/dpnp_array.py:222) for the same reason, so the two
    dispatch systems now agree.

scipy/sparse/_csr.py, scipy/sparse/linalg/_iterative.py
  - Harden __del__ in csr_matrix and in _CachedSpMV against the
    interpreter-shutdown race where the compiled _sparse_impl
    extension is garbage-collected before the matrix instance whose
    oneMKL handle it owns. Previous code used a single
    except Exception: pass which silenced two qualitatively
    different failure modes:
      1. shutdown race -- extension gone, si._sparse_gemv_release
         evaluates to None or AttributeError; the handle is
         unrecoverable and leaving the OS to reclaim it at process
         exit is the only sane option;
      2. genuine backend error while the interpreter is healthy --
         a real bug we want to surface eventually, but raising from
         __del__ produces only an 'Exception ignored in:' warning
         and the handle is gone either way.

    The new code probes getattr(si, '_sparse_gemv_release', None)
    explicitly so case (1) takes the fast non-call path, and then
    splits the except into (AttributeError, TypeError) for case (1)-
    style residuals (queue / handle attribute access racing the
    shutdown) versus a final broad except for case (2). Both still
    return silently from __del__ -- raising is never valid here --
    but the intent is now documented and a real backend regression
    is no longer indistinguishable from the GC race in code review.

tests/third_party/cupyx/scipy_tests/sparse_tests/test_linalg.py
  - test_array_ufunc_opt_out: asserts the __array_ufunc__ = None
    marker is present on LinearOperator. Mirrors SciPy's own test
    suite test_interface.py::test_array_ufunc_opt_out.
  - test_numpy_scalar_times_linop_dispatches_to_rmul: the concrete
    runtime consequence -- numpy.float64(2.0) * linop must
    produce a scaled LinearOperator, not raise or yield an array.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Request support for scipy.sparse.linalg LinearOperator, GMRES, and MINRES

5 participants