Sparse+dense mixed arrays #43

Open

mrocklin wants to merge 4 commits into main
Conversation

mrocklin (Member)
Previously admm would rechunk the columns to be in a single chunk, and
then pass delayed numpy arrays to the local_update function. If the
chunks along columns were of different types, like a numpy array and a
sparse array, then these would be inefficiently coerced to a single
type.

Now we pass a list of numpy arrays to the local_update function. If
this list has more than one element then we construct a local dask.array
so that operations like dot do the right thing and call two different
local dot functions, one for each type.

This currently depends on dask/dask#2272 though I may be able to avoid this dependency.
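For context, here is a minimal sketch (not the code in this PR) of why keeping the column blocks separate inside a local dask.array helps: with one dense and one sparse block, X.dot(beta) dispatches to each block's own dot rather than coercing everything to a single type. The shapes and the use of the pydata/sparse library are just for illustration, and this assumes a reasonably recent dask.

import numpy as np
import dask.array as da
import sparse  # pydata/sparse, used here only for illustration

nrows = 100
X_dense = np.random.random((nrows, 3))
X_sparse = sparse.random((nrows, 5), density=0.1)

# Wrap each column block as a single-chunk dask array and concatenate along
# the columns; the numpy and sparse chunks keep their own types.
X = da.concatenate(
    [da.from_array(X_dense, chunks=X_dense.shape),
     da.from_array(X_sparse, chunks=X_sparse.shape)],
    axis=1,
)

beta = np.random.random(8)
Xbeta = X.dot(beta).compute()  # each chunk uses its own dot implementation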

mrocklin (Member, Author) commented May 1, 2017

There is a non-trivial cost to using dask.array within the function given to scipy.optimize; graph generation in particular appears to be expensive. I can reduce this somewhat, but I'm also curious whether it is possible to apply f and fprime individually to each chunk of the input to local_update. In this case each chunk corresponds to a block of columns. Looking at the methods in families.py, it looks like it might be possible to evaluate f on each block and add the results together, and to evaluate fprime on each block and concatenate the results.

@moody-marlin is this generally possible?

(please let me know if this description was not clear)
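For concreteness, a rough sketch of the combination being asked about. The helper names and signatures here are made up and are not part of dask-glm; whether this per-block decomposition is actually valid for the families is exactly the question.

import numpy as np

def blockwise_f(f, X_blocks, beta_blocks, y):
    # evaluate f on each block of columns and add the results together
    return sum(f(Xb, bb, y) for Xb, bb in zip(X_blocks, beta_blocks))

def blockwise_fprime(fprime, X_blocks, beta_blocks, y):
    # evaluate fprime on each block and concatenate the gradient pieces
    return np.concatenate([fprime(Xb, bb, y)
                           for Xb, bb in zip(X_blocks, beta_blocks)])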

cicdw (Collaborator) commented May 2, 2017

Hmmm, I'm not sure how possible this is; it definitely won't be as easy as evaluating f and fprime on each block and combining them, though. For example,

((y - X1.dot(beta) - X2.dot(beta)) ** 2).sum()

doesn't split as a simple sum over the chunks.

There might be fancier ways of combining the results, or possibly even altering ADMM to take this into account, but it will require some non-trivial thinking.
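A small numeric check of this point (illustrative only, with the columns split as X = [X1 | X2] and beta split accordingly): the per-block squared losses do not add up to the full loss, because the residual couples the two blocks when the square is expanded.

import numpy as np

rng = np.random.default_rng(0)
X1, X2 = rng.random((10, 3)), rng.random((10, 4))
beta1, beta2 = rng.random(3), rng.random(4)
y = rng.random(10)

full = ((y - X1.dot(beta1) - X2.dot(beta2)) ** 2).sum()
per_block = ((y - X1.dot(beta1)) ** 2).sum() + ((y - X2.dot(beta2)) ** 2).sum()
print(np.isclose(full, per_block))  # False in general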

mrocklin (Member, Author) commented May 2, 2017

There might be fancier ways of combining the results

The fancy way here is already handled by dask.array. I was just hoping to avoid having to recreate graphs every time. I can probably be clever here, though. I'll give it a shot.

This creates a dask graph for the local_update computation once and then hands the solver a function that just plugs in a new beta and evaluates with the single-threaded scheduler.

Currently this fails to converge.
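A rough sketch of that idea (not the code in this PR; the graph layout, loss, and key names are invented for illustration, and each block is assumed to support .dot and .T.dot with numpy operands): build a small dask graph once with a placeholder key for beta, then hand the solver a closure that overwrites that key and runs the synchronous scheduler.

import numpy as np
import dask

def make_f2(X_blocks, y):
    # Build the graph once.  'beta' is a placeholder key; each call to f2
    # overwrites it and re-runs the single-threaded scheduler, so no graph
    # construction happens inside the optimizer loop.
    splits = np.cumsum([Xb.shape[1] for Xb in X_blocks])[:-1]

    def value_and_grad(beta, *blocks):
        betas = np.split(beta, splits)
        resid = sum(Xb.dot(b) for Xb, b in zip(blocks, betas)) - y
        grad = np.concatenate([2 * Xb.T.dot(resid) for Xb in blocks])
        return (resid ** 2).sum(), grad

    dsk = {'beta': None}
    block_keys = []
    for i, Xb in enumerate(X_blocks):
        dsk['block-%d' % i] = Xb
        block_keys.append('block-%d' % i)
    dsk['result'] = (value_and_grad, 'beta') + tuple(block_keys)

    def f2(beta, *args):
        dsk['beta'] = beta
        return dask.get(dsk, 'result')  # synchronous, single-threaded scheduler

    return f2

The closure returns a (value, gradient) tuple, which is what the fmin_l_bfgs_b discussion below relies on.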
mrocklin (Member, Author) commented May 2, 2017

OK, I've pushed a solution that, I think, avoids most graph-construction costs. However, my algorithm is failing to converge. @moody-marlin, if you find yourself with a free 15 minutes, can you take a look at def local_update() and see if I'm doing anything wonky? I suspect that there is something dumb going on. When I print out the result and gradients, it doesn't seem to converge (the gradients stay large).

print(result, gradient)
return result, gradient

beta, _, _ = solver(f2, beta, args=solver_args, maxiter=200, maxfun=250)
cicdw (Collaborator):
^^^ I think this line is incorrect; solver (in this case fmin_l_bfgs_b) has a call signature where the first argument is the function f, and the gradient needs to be passed as the fprime keyword argument.

mrocklin (Member, Author):

I thought that I didn't have to specify fprime if f returned two results.

cicdw (Collaborator):

Oh, it looks like you're right; sorry, I've never called it that way. Hmm, back to the drawing board.
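A quick standalone check of this point (toy example, not dask-glm code): scipy.optimize.fmin_l_bfgs_b accepts a single function that returns (value, gradient) when fprime is omitted and approx_grad is left at its default.

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f2(beta):
    value = ((beta - 3.0) ** 2).sum()
    gradient = 2.0 * (beta - 3.0)
    return value, gradient

beta0 = np.zeros(5)
beta, fmin, info = fmin_l_bfgs_b(f2, beta0, maxiter=200, maxfun=250)
print(beta)  # each entry close to 3.0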

print(result, gradient)
return result, gradient

solver_args = ()
cicdw (Collaborator):

Why no solver_args in this case?

mrocklin (Member, Author):

They are all, I think, in the task graph. My assumption is that these will not change during the call to local_update. Is this correct?

cicdw (Collaborator):

Yeah, that's correct.

Base automatically changed from master to main February 10, 2021 01:06