Sparse+dense mixed arrays #43

Open

mrocklin wants to merge 4 commits into main
Conversation

mrocklin (Member)
Previously admm would rechunk the columns to be in a single chunk, and
then pass delayed numpy arrays to the local_update function. If the
chunks along columns were of different types, like a numpy array and a
sparse array, then these would be inefficiently coerced to a single
type.

Now we pass a list of numpy arrays to the local_update function. If
this list has more than one element then we construct a local dask.array
so that operations like dot do the right thing and call two different
local dot functions, one for each type.

This currently depends on dask/dask#2272 though I may be able to avoid this dependency.
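For context, here is a minimal sketch (not the code in this PR) of why keeping the column blocks separate inside a local dask.array helps: with one dense and one sparse block, X.dot(beta) dispatches to each block's own dot rather than coercing everything to a single type. The shapes and the use of the pydata/sparse library are just for illustration, and this assumes a reasonably recent dask.

import numpy as np
import dask.array as da
import sparse  # pydata/sparse, used here only for illustration

nrows = 100
X_dense = np.random.random((nrows, 3))
X_sparse = sparse.random((nrows, 5), density=0.1)

# Wrap each column block as a single-chunk dask array and concatenate along
# the columns; the numpy and sparse chunks keep their own types.
X = da.concatenate(
    [da.from_array(X_dense, chunks=X_dense.shape),
     da.from_array(X_sparse, chunks=X_sparse.shape)],
    axis=1,
)

beta = np.random.random(8)
Xbeta = X.dot(beta).compute()  # each chunk uses its own dot implementation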

mrocklin (Member, Author) commented May 1, 2017

There is a non-trivial cost to using dask.array within the function given to scipy.optimize; graph generation in particular appears to be expensive. I can reduce this somewhat, but I'm also curious whether it is possible to apply f and fprime individually to each chunk of the input to local_update. In this case each chunk corresponds to a block of columns. Looking at the methods in families.py, it looks like it might be possible to evaluate f on each block and add the results together, and to evaluate fprime on each block and concatenate the results.

@moody-marlin is this generally possible?

(please let me know if this description was not clear)
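For concreteness, a rough sketch of the combination being asked about. The helper names and signatures here are made up and are not part of dask-glm; whether this per-block decomposition is actually valid for the families is exactly the question.

import numpy as np

def blockwise_f(f, X_blocks, beta_blocks, y):
    # evaluate f on each block of columns and add the results together
    return sum(f(Xb, bb, y) for Xb, bb in zip(X_blocks, beta_blocks))

def blockwise_fprime(fprime, X_blocks, beta_blocks, y):
    # evaluate fprime on each block and concatenate the gradient pieces
    return np.concatenate([fprime(Xb, bb, y)
                           for Xb, bb in zip(X_blocks, beta_blocks)])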

cicdw (Collaborator) commented May 2, 2017

Hmmm, I'm not sure how possible this is; it definitely won't be as easy as evaluating f and fprime on each block and combining them, though. For example,

((y - X1.dot(beta) - X2.dot(beta)) ** 2).sum()

doesn't split as a simple sum over the chunks.

There might be fancier ways of combining the results, or possibly even altering ADMM to take this into account, but it will require some non-trivial thinking.
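A small numeric check of this point (illustrative only, with the columns split as X = [X1 | X2] and beta split accordingly): the per-block squared losses do not add up to the full loss, because the residual couples the two blocks when the square is expanded.

import numpy as np

rng = np.random.default_rng(0)
X1, X2 = rng.random((10, 3)), rng.random((10, 4))
beta1, beta2 = rng.random(3), rng.random(4)
y = rng.random(10)

full = ((y - X1.dot(beta1) - X2.dot(beta2)) ** 2).sum()
per_block = ((y - X1.dot(beta1)) ** 2).sum() + ((y - X2.dot(beta2)) ** 2).sum()
print(np.isclose(full, per_block))  # False in general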

mrocklin (Member, Author) commented May 2, 2017

There might be fancier ways of combining the results

The fancy way here is already handled by dask.array. I was just hoping to avoid having to recreate graphs every time. I can probably be clever here, though. I'll give it a shot.

This creates a dask graph for the local_update computation once and then hands the solver a function that just plugs in a new beta and evaluates with the single-threaded scheduler.

Currently this fails to converge.
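A rough sketch of that idea (not the code in this PR; the graph layout, loss, and key names are invented for illustration, and each block is assumed to support .dot and .T.dot with numpy operands): build a small dask graph once with a placeholder key for beta, then hand the solver a closure that overwrites that key and runs the synchronous scheduler.

import numpy as np
import dask

def make_f2(X_blocks, y):
    # Build the graph once.  'beta' is a placeholder key; each call to f2
    # overwrites it and re-runs the single-threaded scheduler, so no graph
    # construction happens inside the optimizer loop.
    splits = np.cumsum([Xb.shape[1] for Xb in X_blocks])[:-1]

    def value_and_grad(beta, *blocks):
        betas = np.split(beta, splits)
        resid = sum(Xb.dot(b) for Xb, b in zip(blocks, betas)) - y
        grad = np.concatenate([2 * Xb.T.dot(resid) for Xb in blocks])
        return (resid ** 2).sum(), grad

    dsk = {'beta': None}
    block_keys = []
    for i, Xb in enumerate(X_blocks):
        dsk['block-%d' % i] = Xb
        block_keys.append('block-%d' % i)
    dsk['result'] = (value_and_grad, 'beta') + tuple(block_keys)

    def f2(beta, *args):
        dsk['beta'] = beta
        return dask.get(dsk, 'result')  # synchronous, single-threaded scheduler

    return f2

The closure returns a (value, gradient) tuple, which is what the fmin_l_bfgs_b discussion below relies on.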
mrocklin (Member, Author) commented May 2, 2017

OK, I've pushed a solution that, I think, avoids most graph-construction costs. However, my algorithm is failing to converge. @moody-marlin, if you find yourself with a free 15 minutes, can you take a look at def local_update() and see if I'm doing anything wonky? I suspect that there is something dumb going on. When I print out the result and gradients, it doesn't seem to converge (the gradients stay large).

print(result, gradient)
return result, gradient

beta, _, _ = solver(f2, beta, args=solver_args, maxiter=200, maxfun=250)
cicdw (Collaborator):
^^^ I think this line is incorrect; solver (in this case fmin_l_bfgs_b) has a call signature where the first argument is the function f, and the gradient needs to be passed as the fprime keyword argument.

mrocklin (Member, Author):

I thought that I didn't have to specify fprime if f returned two results.

cicdw (Collaborator):

Oh, it looks like you're right; sorry, I've never called it that way. Hmm, back to the drawing board.
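A quick standalone check of this point (toy example, not dask-glm code): scipy.optimize.fmin_l_bfgs_b accepts a single function that returns (value, gradient) when fprime is omitted and approx_grad is left at its default.

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f2(beta):
    value = ((beta - 3.0) ** 2).sum()
    gradient = 2.0 * (beta - 3.0)
    return value, gradient

beta0 = np.zeros(5)
beta, fmin, info = fmin_l_bfgs_b(f2, beta0, maxiter=200, maxfun=250)
print(beta)  # each entry close to 3.0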

print(result, gradient)
return result, gradient

solver_args = ()
cicdw (Collaborator):

Why no solver_args in this case?

mrocklin (Member, Author):

They are all, I think, in the task graph. My assumption is that these will not change during the call to local_update. Is this correct?

cicdw (Collaborator):

Yeah, that's correct.

Base automatically changed from master to main February 10, 2021 01:06