`reduceby` is pytoolz's implementation of the split-apply-combine strategy, a very common data analysis pattern. It corresponds to the split-apply-combine operation in pandas and to the `by` operation on Julia's DataFrames: http://juliastats.github.io/DataFrames.jl/split_apply_combine.html
To my knowledge, no one uses `reduceby` directly; its interface, while simple, is hard enough to scare away most non-experts.
The current interface accepts a Python function to split (often a `get`) and pushes the apply and combine steps into a single associative binary operator. This use of external functions is idiomatic for `toolz` but limits efficiency for `cytoolz`. How can we modify the API (or create an entirely new API) to hit this application with great efficiency?
A fast, intuitive, streaming split-apply-combine operation on core data structures would be a serious motivator for some.
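For readers unfamiliar with the interface described above, here is a minimal pure-Python sketch of its semantics. It is an illustration only, not the toolz or cytoolz source: `reduceby_sketch` and the `accounts` data are hypothetical names, `operator.itemgetter` stands in for a `get`-style key function, and the argument order `(key, binop, seq, init)` is chosen to mirror `toolz.reduceby`. The sketch assumes an immutable `init` value.

```python
from operator import itemgetter


def reduceby_sketch(key, binop, seq, init):
    """Streaming split-apply-combine: one pass, one accumulator per group.

    Hypothetical illustration; assumes `init` is immutable (e.g. 0).
    """
    acc = {}
    for item in seq:
        k = key(item)                           # split: choose the group
        acc[k] = binop(acc.get(k, init), item)  # apply and combine in one step
    return acc


# Made-up example records
accounts = [
    {"name": "Alice", "state": "NY", "balance": 100},
    {"name": "Bob", "state": "CA", "balance": 50},
    {"name": "Carol", "state": "NY", "balance": 150},
]

# Total balance per state
totals = reduceby_sketch(
    itemgetter("state"),                        # key function (the "split")
    lambda total, row: total + row["balance"],  # binary operator
    accounts,
    0,                                          # initial accumulator value
)
print(totals)  # {'NY': 250, 'CA': 50}
```

Note that the loop invokes two Python callables per item (`key` and `binop`). That per-item call overhead is presumably what remains even if the loop itself is compiled, which is the efficiency concern raised above.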