Replies: 5 comments 1 reply
-
There was never any serious thought around this issue, because "typically" it didn't cause issues: e.g. for IDE support, you would run
-
I think we may want to differentiate between two concurrency scenarios.
All instances we have witnessed and linked so far are examples of scenario 1. This is great, as it can be solved much more easily than scenario 2. A simple lock should be sufficient to avoid a second parallel evaluation of the same task. We just need to wait until the first evaluation is finished and then re-use its result (after some checks) for all waiting parallel evaluations as well. Essentially, we solve this by synchronizing the task evaluation. I think we can completely ignore the downstream consumers, as those are waiting for the result anyway and we most likely won't change the result once it's available.
A solution for scenario 2 is harder to achieve, but it's also less important. We may always run into situations where changes to task sources can break things (e.g. removing a source file which we are trying to compile at the same time), so it's probably OK to keep these potential issues a while longer and tackle the apparent and disruptive concurrency issues first.
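To make the "wait for the first evaluation and re-use its result" idea a bit more concrete, here is a minimal sketch in plain Scala. It is not Mill's actual implementation: `TaskSynchronizer`, `evaluate` and the use of a task key are hypothetical names chosen for illustration, and the "some checks" before re-using a result are left out entirely.

```scala
import java.util.concurrent.ConcurrentHashMap
import scala.concurrent.{Await, Promise}
import scala.concurrent.duration.Duration

// Hypothetical helper, NOT part of Mill: ensures that at most one
// evaluation per task key runs at a time within this JVM.
final class TaskSynchronizer[K, V] {
  // One entry per task that is currently being evaluated.
  private val inFlight = new ConcurrentHashMap[K, Promise[V]]()

  def evaluate(key: K)(body: => V): V = {
    val mine = Promise[V]()
    val existing = inFlight.putIfAbsent(key, mine)
    if (existing == null) {
      // First caller for this key: run the evaluation and publish the outcome.
      try {
        val result = body
        mine.success(result)
        result
      } catch {
        case t: Throwable =>
          mine.failure(t) // waiting callers see the same failure
          throw t
      } finally {
        inFlight.remove(key, mine) // later requests evaluate again
      }
    } else {
      // Someone else is already evaluating this key: wait and re-use the result.
      Await.result(existing.future, Duration.Inf)
    }
  }
}
```

A caller would wrap the evaluation like `sync.evaluate("foo.compile") { ... }`. Note that this sketch only deduplicates evaluations that overlap in time, and that it would deadlock if a task body tried to evaluate its own key again.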
-
Yeah, a simpler per-target lock sounds like a good first step.
-
A first implementation doing the in-process and in-memory task synchronization is ready for review.
-
I wonder whether there is no special handling for task concurrency in Mill just because it was never an issue, or because it was considered too slow. I have almost never experienced any such issue in my daily work with Mill, but some users already have.
These are the potential scenarios:
1. One single Mill process handling one request at a time. There is no need for any special concurrency management. Even when Mill is executed with parallel task processing, there is no risk, as the dependency graph ensures each target is only evaluated once.
2. One single Mill process handling more than one request at a time, e.g. as a BSP server. If two requests cover the same transitive dependency targets, a target can run more than once at the same time. This can result in all kinds of typical concurrency issues, e.g. a missing or already existing T.dest folder. We could easily add an in-memory lock table to never evaluate a target more than once at a time. The latest documented case is "BSP: tasks seem to be executed duplicately and concurrently" (#2818).
3. Two Mill processes evaluating the same target, e.g. when we run Mill in two different terminals. The potential issues and effects are the same as in point 2, but we can't apply cheap in-process locks to mitigate them. We probably need to use some filesystem-based lock mechanism, which may slow down the evaluation process.
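For the two-process case, a filesystem-based lock could look roughly like the following sketch, which uses the JDK's `FileChannel.lock()`. This is only an illustration under assumptions: `TaskFileLock`, `withFileLock` and the idea of one lock file per target are hypothetical and not something Mill provides.

```scala
import java.nio.channels.FileChannel
import java.nio.file.{Files, Path, StandardOpenOption}

// Hypothetical helper, NOT part of Mill: holds an exclusive OS-level lock
// on `lockFile` while `body` runs, so other processes block until it is done.
object TaskFileLock {
  def withFileLock[T](lockFile: Path)(body: => T): T = {
    // Make sure the directory for the lock file exists.
    Option(lockFile.getParent).foreach(p => Files.createDirectories(p))
    val channel = FileChannel.open(
      lockFile,
      StandardOpenOption.CREATE,
      StandardOpenOption.WRITE
    )
    try {
      val lock = channel.lock() // blocks until no other process holds the lock
      try body
      finally lock.release()
    } finally {
      channel.close()
    }
  }
}
```

JDK file locks are held on behalf of the whole JVM, so they do not synchronize threads inside one process; a second thread locking the same file gets an `OverlappingFileLockException`. A real solution would therefore have to combine such a lock with an in-process lock, and the extra file I/O per target is where the slowdown mentioned above would come from.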
cc @lihaoyi @lolgab