Sparse+dense mixed arrays #43
base: main
Conversation
Previously admm would rechunk the columns to be in a single chunk, and then pass delayed numpy arrays to the local_update function. If the chunks along columns were of different types, like a numpy array and a sparse array, then these would be inefficiently coerced to a single type. Now we pass a list of numpy arrays to the local_update function. If this list has more than one element then we construct a local dask.array so that operations like dot do the right thing and call two different local dot functions, one for each type.
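A minimal sketch of the idea in the last sentence (not the code in this PR; the array shapes and the use of the pydata/sparse library are my own illustration). Each column block calls its own dot implementation and the partial products are summed, which is what the locally constructed dask.array does per chunk:

```python
import numpy as np
import sparse  # pydata/sparse, standing in for a sparse column block

# Two column blocks of different types for the same set of rows.
dense_block = np.random.random((1000, 5))
sparse_block = sparse.random((1000, 5), density=0.01)

beta = np.random.random(10)

# The dense block uses numpy's dot, the sparse block uses sparse's dot;
# nothing is coerced to a common type before the partial results are summed.
Xbeta = dense_block.dot(beta[:5]) + sparse_block.dot(beta[5:])
```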
There is a non-trivial cost to using dask.array within the function given to the solver. @moody-marlin is this generally possible? (please let me know if this description was not clear)
Hmmm I'm not sure how possible this is; it definitely won't be as easy as evaluating […], which doesn't split as a simple sum on each chunk. There might be fancier ways of combining the results, or possibly even altering ADMM to take this into account, but it will require some non-trivial thinking.
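(A possible concrete reading of the difficulty described above, for anyone following along: with column blocks X = [X1, X2] and coefficients split as beta = [beta1, beta2], the local loss has the form l(X1·beta1 + X2·beta2) for a nonlinear l, which is not l(X1·beta1) + l(X2·beta2). The per-block partial products therefore have to be combined before the loss is applied; the chunk results cannot simply be computed and added independently.)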
The fancy way here is already handled by dask.array. I was just hoping to avoid having to recreate graphs every time. I can probably be clever here though. I'll give it a shot
This creates a dask graph for the local_update computation once and then hands the solver a function that just puts in a new beta and evaluates with the single-threaded scheduler. Currently this fails to converge.
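A rough sketch of that pattern (not the actual dask-glm code; `make_local_f`, the key names, and the squared-loss objective are illustrative, and a real local_update would also return the gradient). The graph is built once with a placeholder task for beta, and each evaluation only swaps in the new coefficients and runs the synchronous scheduler:

```python
import numpy as np
import dask

def make_local_f(X_blocks, y):
    # Build the task graph for the objective once, with a placeholder for beta.
    widths = [b.shape[1] for b in X_blocks]
    splits = np.cumsum(widths)[:-1]

    dsk = {"beta": None}                                   # filled in per call
    dsk["beta-blocks"] = (np.split, "beta", splits)
    for i, block in enumerate(X_blocks):
        # each block uses its own dot implementation (numpy, sparse, ...)
        dsk[("xb", i)] = (lambda X, parts, i=i: X.dot(parts[i]),
                          block, "beta-blocks")
    dsk["xbeta"] = (sum, [("xb", i) for i in range(len(X_blocks))])
    dsk["value"] = (lambda xb: float(((xb - y) ** 2).sum()), "xbeta")

    def f(beta):
        dsk["beta"] = beta                 # swap in the new coefficients
        return dask.get(dsk, "value")      # single-threaded scheduler
    return f

# Usage: graph construction happens once; each call to f is just dask.get.
f = make_local_f([np.random.random((100, 3)), np.random.random((100, 2))],
                 np.random.random(100))
print(f(np.zeros(5)), f(np.ones(5)))
```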
OK, I've pushed a solution that, I think, avoids most graph construction costs. However my algorithm is failing to converge. @moody-marlin if you find yourself with a free 15 minutes can you take a look at …
print(result, gradient)
return result, gradient

beta, _, _ = solver(f2, beta, args=solver_args, maxiter=200, maxfun=250)
^^^ I think this line is incorrect; solver (in this case fmin_l_bfgs_b) has this call signature: the first argument is the function f, and a keyword argument needs to be fprime.
I thought that I didn't have to specify fprime if f returned two results
Oh, it looks like you're right; sorry, I've never called it that way. Hmm back to the drawing board.
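For reference, a small self-contained check of the behaviour settled on above (the quadratic objective and array shapes here are made up): scipy's fmin_l_bfgs_b uses the gradient returned by the objective itself when fprime is not supplied and approx_grad is left at its default.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f(beta, A, b):
    # Returns both the value and the gradient, so no separate fprime is needed.
    resid = A.dot(beta) - b
    value = 0.5 * (resid ** 2).sum()
    gradient = A.T.dot(resid)
    return value, gradient

A = np.random.random((100, 5))
b = np.random.random(100)

beta, fval, info = fmin_l_bfgs_b(f, np.zeros(5), args=(A, b),
                                 maxiter=200, maxfun=250)
```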
print(result, gradient)
return result, gradient

solver_args = ()
Why no solver_args in this case?
They are all, I think, in the task graph. My assumption is that these will not change during the call to local_update. Is this correct?
Yea that's correct.
This currently depends on dask/dask#2272 though I may be able to avoid this dependency.