New treatment of relational information in Flambda 2 types #3219
In light of the discussions related to the relations, and the realization that this could also be useful to make CSE more robust, I have done some more research on the topic, and I think that I have a more promising approach than my prototype. This approach takes inspiration from egglog, which combines the best of congruence closure and deductive databases. This has somewhat grown in scope compared to the initial goal we set with @lthls to extract

In addition, the approach and underlying data structures can be extended (in a second step) to also provide a principled and generic implementation of env extensions, which would remove the need to thread support for extensions through specific places as was done in #1324.

### Details

Roughly, the idea would be to implement a proper deductive database in the

We want to store a combination of:
I mention

The database is kept canonical using a rebuilding procedure similar to what is

New facts (such as

Queries, such as finding the set of names

Inter-reductions (e.g. when we learn

The elephant in the room is: how do we join two such environments? Logic

Ignore for a moment the issue of equality and the union find, i.e. assume that

Very roughly, Leapfrog Triejoin looks at variables sequentially and computes

When joining in this way, we only consider the left-hand side of rules; if we

If we take into account the union-find and the fact that each environment can
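To make the rebuilding idea concrete, here is a minimal, self-contained sketch of a fact store kept canonical under a union-find, in the spirit of egglog. All names, fact encodings, and ids are illustrative, not Flambda 2 code:

```ocaml
(* Union-find over integer ids, with path compression. Roots are absent
   from the [parent] table. *)
let parent : (int, int) Hashtbl.t = Hashtbl.create 16

let rec find x =
  match Hashtbl.find_opt parent x with
  | None -> x
  | Some p ->
    let r = find p in
    Hashtbl.replace parent x r;
    r

let union x y =
  let rx = find x and ry = find y in
  if rx <> ry then Hashtbl.replace parent rx ry

(* Facts are tuples over ids, stored keyed by canonical arguments. *)
let facts : (string * int list, unit) Hashtbl.t = Hashtbl.create 16

let canon (rel, args) = (rel, List.map find args)

let assert_fact f = Hashtbl.replace facts (canon f) ()

(* Rebuilding: after new unions, re-canonicalize every stored fact so
   that queries never see stale representatives. *)
let rebuild () =
  let entries = Hashtbl.fold (fun k () acc -> k :: acc) facts [] in
  Hashtbl.reset facts;
  List.iter (fun f -> Hashtbl.replace facts (canon f) ()) entries

let holds f = Hashtbl.mem facts (canon f)

let () =
  assert_fact ("is_int", [ 1 ]);
  union 1 2;                       (* learn that 1 and 2 are aliases *)
  rebuild ();
  assert (holds ("is_int", [ 2 ])) (* the fact transfers to the alias *)
```

In a real implementation, rebuilding would of course be incremental rather than a full re-hash, but the invariant is the same: after rebuilding, all stored tuples mention only canonical representatives.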
However, we can leverage sharing between equivalence classes to build compact
Iterating over all the entries in the database still has complexity

The practical complexity is further driven down by the fact that the leapfrog

### Advantages

This approach can be seen as an extension of the current representation, where

The database approach thus streamlines the existing implementation, and makes

This approach also opens up exciting (to me) possibilities by simply encoding
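For intuition, the core of the leapfrog join on a single variable is an intersection of several sorted sequences, computed by repeatedly seeking each sequence past the current maximum head. The sketch below is a simplification (real Leapfrog Triejoin advances one iterator at a time and works over trie levels), intended only to show the seek-to-maximum idea:

```ocaml
(* Smallest index i with a.(i) >= x, by binary search. *)
let seek a x =
  let lo = ref 0 and hi = ref (Array.length a) in
  while !lo < !hi do
    let mid = (!lo + !hi) / 2 in
    if a.(mid) < x then lo := mid + 1 else hi := mid
  done;
  !lo

(* Intersect several sorted arrays of candidate values for one variable. *)
let leapfrog_intersect (arrays : int array list) : int list =
  let arrays = Array.of_list arrays in
  let pos = Array.map (fun _ -> 0) arrays in
  let out = ref [] in
  (try
     while true do
       (* Heads of all iterators; [Exit] when any is exhausted. *)
       let heads =
         Array.mapi
           (fun i a ->
             if pos.(i) >= Array.length a then raise Exit else a.(pos.(i)))
           arrays
       in
       let x = Array.fold_left max min_int heads in
       if Array.for_all (fun h -> h = x) heads then begin
         out := x :: !out;
         Array.iteri (fun i _ -> pos.(i) <- pos.(i) + 1) arrays
       end
       else
         (* Leapfrog: seek every lagging iterator forward to >= x. *)
         Array.iteri (fun i a -> pos.(i) <- seek a x) arrays
     done
   with Exit -> ());
  List.rev !out

let () =
  let r = leapfrog_intersect [ [| 1; 3; 5; 7 |]; [| 3; 4; 5; 8 |]; [| 0; 3; 5; 9 |] ] in
  assert (r = [ 3; 5 ])
```

The key property is that each iterator only ever moves forward, by binary-search-style seeks, which is what gives the join its good complexity on skewed inputs.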
where the extensions stored on the right-hand side of the

### Drawbacks

We need a logic database in the compiler. This is not as big a drawback as it

Due to the genericity of the approach, there will probably be a slightly higher

We do need big-endian patricia trees to implement the leapfrog triejoin

### Implementation plan

I plan to implement this in a gradual way, in order to minimize the possibility
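On the patricia-tree requirement mentioned above: leapfrog's seek primitive asks for the smallest key greater than or equal to some bound, in key order, and a big-endian layout is what makes in-order traversal of integer keys coincide with numeric order. As a stand-in illustration using `Stdlib.Map` (not the actual Flambda 2 data structures), the needed operation is:

```ocaml
module IntMap = Map.Make (Int)

(* The leapfrog seek primitive: smallest binding with key >= x.
   [find_first_opt] requires a monotone predicate, which [k >= x] is;
   a big-endian patricia tree supports the same ordered query. *)
let seek_geq (m : 'a IntMap.t) (x : int) : (int * 'a) option =
  IntMap.find_first_opt (fun k -> k >= x) m

let () =
  let m = IntMap.of_seq (List.to_seq [ (1, "a"); (4, "b"); (9, "c") ]) in
  assert (seek_geq m 2 = Some (4, "b"));
  assert (seek_geq m 10 = None)
```

A little-endian patricia tree (as in some existing libraries) iterates in bit-reversed order, which is why it cannot serve here without modification.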
At this point, we would have a more robust version of the existing system where

The last two steps are:
@bclement-ocp That may be a dumb question, but with this new database, including the rules, would it be possible to implement some kind of peephole optimizations (see the examples in #2188), or does the canonicalization/rebuilding of the database prevent that in some way (or would that have prohibitive cost)?
@Gbury there are no dumb questions. I did not think about this use case too much, but one cool thing about a database is indeed that you can do queries! The short answer is yes, and from my understanding the database would have most of the pieces needed to build the "efficient and incremental pattern matching engine" mentioned in #2188. The longer answer is that there are two ways you could do peephole optimizations with this:
This rule would automatically add the additional equality

The second approach is closer to equality saturation and to what egglog does; it is more complete, but also obviously more expensive if applied aggressively. My intuition is that if we are reasonable with the rewrite rules we want to consider, this should scale reasonably well by exploiting some of the incremental tricks from egglog.
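As a toy illustration of the first approach, a rewrite rule can be expressed as a query over the fact database followed by asserting a derived equality. Everything here is hypothetical (the fact representation, the ids, the rule itself); the rule below encodes "x + 0 behaves like x" by unioning the result with the argument:

```ocaml
(* Tiny union-find, as before. *)
let parent : (int, int) Hashtbl.t = Hashtbl.create 16

let rec find x =
  match Hashtbl.find_opt parent x with
  | None -> x
  | Some p ->
    let r = find p in
    Hashtbl.replace parent x r;
    r

let union x y =
  let rx = find x and ry = find y in
  if rx <> ry then Hashtbl.replace parent rx ry

(* A fact records that applying [op] to [args] produced [result]. *)
type fact = { op : string; args : int list; result : int }

let zero = 0 (* id of the constant 0, by convention in this sketch *)

(* Rule: from add(x, z) = r with z ~ 0, derive r ~ x. *)
let rule_add_zero (db : fact list) =
  List.iter
    (fun f ->
      match f with
      | { op = "add"; args = [ x; z ]; result } when find z = find zero ->
        union result x
      | _ -> ())
    db

let () =
  let db = [ { op = "add"; args = [ 42; 0 ]; result = 7 } ] in
  rule_add_zero db;
  assert (find 7 = find 42)
```

The point is that the rule never rewrites terms in place: it only adds an equality, and the rebuilding/canonicalization machinery then propagates it, which is exactly why canonicalization helps rather than hinders this use case.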
Some questions that I think are worth considering before embarking on what sounds like a decent amount of work, and could feasibly result in making flambda2's compile times worse:
To be clear, those questions are really for the flambda 2 team as a whole, not @bclement-ocp specifically.
I'd say maintaining information across join points is quite important for at least two optimizations:
As for the time spent in the join algorithm, I don't know; maybe @lthls or @chambart might have better insights. That being said, I think it would be pertinent to see if we could measure that in practical experiments.
There are several kinds of relations stored in the typing env. Aliases are crucial, but I assume that you were not thinking about that. Projection relations are fairly important, but we don't plan to change how they're represented at the moment.
We don't have numbers for how much the new relations would help. For conditions, the main use would be to remove array bounds checks, but I assume that all performance-critical code where the bounds checks could be removed by the compiler already uses the unsafe primitives; for CSE, it's hard to know how many CSE opportunities we miss without implementing the better version.
If we had a traditional case instead of our current version, we probably wouldn't need the
@Gbury has more data than me about the need for join, but I will note that our initial design did not activate join points by default, only starting at
First off, I have done some quick and unscientific experiments to see how much time we spend joining environments. We don't really have a good infrastructure to answer this question, but looking at a few arbitrary files from the stdlib and compiler, it looks to be somewhere between 5% and 30% of compile times. This is again an unscientific experiment done on a laptop and on a small random sample of files, but 30% is too high, and higher than we collectively expected. I have opened a PR ( #3298 ) to track join times with

Regarding @lpw25's other points, in addition to what @Gbury and @lthls said, I want to clarify a couple of things, notably regarding performance. The implementation plan suggested above is in two parts. The first part is to rework the join algorithm using the egglog-inspired approach, but without (significantly) changing the way information is currently stored and accessed. This should have no impact outside of

The second part is to allow more flexibility in the way that information is stored in the typing env. I have thought quite a bit about the performance implications and, for information that we currently store elsewhere and would move to the database (this makes the code simpler and makes information preservation more robust to joins), I believe this should have minimal impact on compile times. On the other hand, this would provide a lot of flexibility and grounds for experimenting with new optimizations at a low implementation cost (see e.g. the discussion about peephole optimizations). A more relational representation of information would also make it easier to tune the information that we actually want to preserve during joins, which opens up some possibilities for reducing compile times.

### Why I think the egglog-inspired join would be faster

The current join algorithm is a bit of a mess, and was not written with aliases in mind.
It tries to maintain aliases in a sort of best-effort way, which involves casting a wide net of variables that could be useful in the resulting environment, computing a join on these variables, then filtering the result down to the variables that are actually useful. It depends on a consistent order on variables, which in practice does not exist, and also depends on the names of existential variables. This is not very efficient, it is fragile, and in practice it can also miss some important information that we would want in the resulting environment, prompting more fragile hacks. On the other hand, the egglog-inspired join algorithm deals with aliases and typing information separately, and directly tracks which variables are relevant. The output depends neither on variable definition order nor on the names of existential variables, and there is no need for a post-filtering pass. For these reasons alone I think that this would improve join performance, but the egglog-inspired join algorithm is also able to use an actual n-way join on variables (rather than a sequence of pairwise joins), which should also be faster (potentially much faster) for large matches.
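For intuition on why aliases can be joined directly: two names remain aliased after a join point only if they are aliased in both incoming environments, so the joined partition is determined by the pair of canonical representatives on each side. A hypothetical sketch (variables as integers, representatives as functions; not the Flambda 2 API):

```ocaml
(* Group variables by the pair of their canonical representatives in the
   left and right environments; each group is an alias class that
   survives the join. *)
module PairMap = Map.Make (struct
  type t = int * int
  let compare = compare
end)

let join_aliases (vars : int list) (repr1 : int -> int) (repr2 : int -> int) :
    int list list =
  let groups =
    List.fold_left
      (fun acc v ->
        let key = (repr1 v, repr2 v) in
        let cls = Option.value (PairMap.find_opt key acc) ~default:[] in
        PairMap.add key (v :: cls) acc)
      PairMap.empty vars
  in
  PairMap.fold (fun _ cls acc -> List.rev cls :: acc) groups []

let () =
  (* Left env: {1,2,3} aliased; right env: {1,2} aliased, 3 separate. *)
  let repr1 v = if List.mem v [ 1; 2; 3 ] then 1 else v in
  let repr2 v = if List.mem v [ 1; 2 ] then 1 else v in
  let classes = join_aliases [ 1; 2; 3; 4 ] repr1 repr2 in
  assert (List.mem [ 1; 2 ] classes);  (* still aliased after the join *)
  assert (List.mem [ 3 ] classes);
  assert (List.mem [ 4 ] classes)
```

This single grouping pass replaces the "cast a wide net, join pairwise, filter afterwards" dance, and its output is independent of variable order and of existential names, since only the partition structure matters.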
FWIW I don't think the introduction of artificial join points is fundamentally required for implementing match-in-match, but I see how you've ended up in that situation. Have we attempted to measure the cost of continuation lifting on compilation times?
It depends which aliases you mean. Aliases to in-scope variables are the mechanism by which we do beta-reduction, but aliases via existentially bound logical variables seem much more expensive to manage, and the benefit is much less clear.
Could you say more about why they are important? Is that just for CSE or also for something else?
Whilst it is neat that you get that optimisation for free, I don't think I'd trade it for a more complex and probably noticeably slower optimizer. It appears to me that the current approach to sums and match statements has consumed a very large fraction of the effort that has gone into flambda 2, and my expectation is that it will continue to do so, so I do think it is valuable to spend time considering if it is worth it.
I think you make a compelling case for why -- if you take the information we currently track as fixed -- your approach is a good one and will improve the status quo. I just want to encourage people to think more carefully about whether we really should be tracking the information we currently track, and in particular to do more to measure the cost of tracking that information and to do more to measure the benefit of tracking that information in practice.
Yes, we did measure the cost of lifting. While on most files the cost was negligible (I suspect partly because there were not that many situations where it triggered), one file from the compiler testsuite (

On

Finally, note that we could indeed perform match-in-match without lifting, but it's not that easy to estimate beforehand the cost of the potentially duplicated handlers; for that we would likely need some kind of speculative continuation specialization (akin to speculative function inlining), but from discussions I had, people did not seem keen on having that, which is why we've tried to have a more deterministic approach to estimate when to do match-in-match, with lifting.
Although this would likely change with @bclement-ocp's work, currently I'm assuming that keeping existentially bound logical variables is cheaper than not doing so, because it makes scope changes basically free (no need to rewrite the whole environment to remove occurrences of variables that are no longer in scope). At the moment we don't introduce existentially bound variables except to replace formerly in-scope ones, so the number of variables stays reasonable. Also, when specialising functions (which occurs when simplifying the set of closures), the contents of the closure are not in scope in the function's body, but having existentially bound variables means that we can still track that precisely.
We use relations in the typing env in place of CSE for projections, yes (because our current CSE engine is not good at following aliases, this gives noticeably better results).