
Matrix multiplication inconsistent for dask and sparse: dask @ sparse works, sparse @ dask fails #9934

Open
brendan-m-murphy opened this issue Jan 9, 2025 · 4 comments

Comments


brendan-m-murphy commented Jan 9, 2025

What is your issue?

The order of matrix multiplication matters for dask and sparse arrays, but probably shouldn't.

Here is an example:

import numpy as np
from sparse import COO
import xarray as xr

sparse_da = xr.DataArray(COO.from_numpy(np.arange(10)))
dask_da = xr.DataArray(np.arange(100).reshape(10, 10)).chunk({"dim_1": 5})

# this works:
dask_da @ sparse_da

# this raises a TypeError for "unsupported types"
sparse_da @ dask_da

# this works as expected
sparse_da @ dask_da.as_numpy()

In the workflow where this is used, the dask array has no chunks along its common dimensions with the sparse array, so it seems like sparse @ chunk should be fine. Also, in this workflow, loading the dask array into memory or making the sparse array dense would use a very large amount of memory.
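The chunk layout described above can be checked directly on the example arrays; a small sketch (assuming dask and xarray are installed), showing that the common dimension `dim_0` is a single chunk while only `dim_1` is split:

```python
import numpy as np
import xarray as xr

dask_da = xr.DataArray(np.arange(100).reshape(10, 10)).chunk({"dim_1": 5})

# Chunk sizes per dimension: dim_0 is one whole chunk of 10,
# dim_1 is split into two chunks of 5.
print(dict(dask_da.chunksizes))  # {'dim_0': (10,), 'dim_1': (5, 5)}
```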

@brendan-m-murphy added the "needs triage" label Jan 9, 2025
keewis (Collaborator) commented Jan 9, 2025

this is a side-effect of using opt_einsum: while

sparse_da @ dask_da

fails, this:

with xr.set_options(use_opt_einsum=False):
    sparse_da @ dask_da

works (but is slower). You could also avoid this issue by converting sparse_da to a single-chunk dask array using .chunk().

(not sure what to do to fix this in xarray, but I agree that this is annoying)

brendan-m-murphy (Author) commented

Thanks for the suggestions!

This is the ufunc we've been using instead: https://github.com/openghg/openghg_inversions/blob/sparse-xarray-fix/openghg_inversions/array_ops.py#L75. opt_einsum ends up calling sparse.tensordot, so I made this ufunc just call sparse.tensordot immediately. There's a bit of a "gotcha" where sparse.tensordot leaves extra broadcast dimensions around, so they need to be removed.

I might just rewrite it to apply @ in the opposite order, or use your chunking suggestion. (Or remove the function and just inline this code.)

Maybe xarray could check for the case of sparse and dask and either swap the order or convert the sparse array into a single chunk dask array? I guess opt-einsum could probably fix this too, but it seems like they just check if sparse has a tensordot attribute and apply it, so they would also need to insert some ugly logic to check both operands and decide what tensordot to use.

brendan-m-murphy added a commit to openghg/openghg_inversions that referenced this issue Jan 9, 2025
Used chunking suggestion from pydata/xarray#9934
@dcherian added the "upstream issue" label and removed the "needs triage" label Jan 9, 2025
dcherian (Contributor) commented Jan 9, 2025

Can you open an issue at pydata/sparse instead?

brendan-m-murphy (Author) commented

> Can you open an issue at pydata/sparse instead?

Sure, I'll see what they say. I think they would need to change tensordot to accept dask arrays (or any duck array?). At least in this case, the non-sparse array just needs to support indexing (I think).
