Stability of ATE estimates #931
Comments
I don't immediately see anything wrong. I'm not super familiar with Databricks, but I wonder whether it guarantees that rows are returned in the same order, or whether additional rows could be added over time?
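One way to test both possibilities is to fingerprint the input right before each fit and compare the values between runs. This is only a sketch using the Python standard library; `fingerprint` and the toy rows are hypothetical, and on a real cluster the rows would first have to be collected or converted to a local structure.

```python
import hashlib

def fingerprint(rows):
    """Return (row_count, order_sensitive_digest, order_insensitive_digest)."""
    encoded = [repr(tuple(r)).encode("utf-8") for r in rows]
    # Digest of the rows in the order they were received.
    ordered = hashlib.sha256(b"\n".join(encoded)).hexdigest()
    # Digest of the rows after sorting, so it ignores row order.
    unordered = hashlib.sha256(b"\n".join(sorted(encoded))).hexdigest()
    return len(encoded), ordered, unordered

# Toy rows standing in for (confounder, treatment, outcome) values.
run_a = [(1.0, 20.0, 3.5), (2.0, 25.0, 4.1), (3.0, 22.0, 3.9)]
run_b = [(3.0, 22.0, 3.9), (1.0, 20.0, 3.5), (2.0, 25.0, 4.1)]  # same rows, new order
run_c = run_a + [(4.0, 21.0, 3.7)]                              # one row added later

fa, fb, fc = fingerprint(run_a), fingerprint(run_b), fingerprint(run_c)
print("a vs b, same order:  ", fa[1] == fb[1])
print("a vs b, same content:", fa[2] == fb[2])
print("a vs c, same count:  ", fa[0] == fc[0])
```

If the row count or the order-insensitive digest changes between two runs, the data itself changed; if only the order-sensitive digest changes, the rows arrived in a different order.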
Thanks for the response. I sorted the dataframe to ensure the order remains consistent before running DML or KernelDML.
I am not sure what you mean by additional rows being added over time. Can you clarify? One thing I observed is
That behavior is very strange: the nuisance scores should always be a nested list where the length of the outer list is
Thanks. The difference in lengths was due to cancelling the job run before it finished. When I let it run to completion, the lengths are consistent. Another thing I observed: when I passed custom folds in the cv argument, the results for
These are results for
I tested the same with the cv=5 argument; in that case the results are different.
Results for
As can be seen, the results vary across inner as well as outer list elements. Question: How are these 5 different folds created across different I observed some difference in outputs of
Is there anything wrong in the logic being used? How can I achieve stability in the ATE estimates?
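For comparison, here is a minimal sketch of building the folds explicitly so that the split is identical on every run. It is not EconML's internal fold logic; `make_folds` and its default seed are hypothetical, and it assumes the rows are already in a stable order.

```python
import random

def make_folds(n_rows, n_splits=5, seed=123):
    """Return a list of (train_indices, test_indices) pairs from a seeded shuffle."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)       # private generator, no global state
    folds = []
    for k in range(n_splits):
        test = sorted(order[k::n_splits])     # every n_splits-th shuffled index
        held_out = set(test)
        train = [i for i in range(n_rows) if i not in held_out]
        folds.append((train, test))
    return folds

folds_run1 = make_folds(20)
folds_run2 = make_folds(20)
print(folds_run1 == folds_run2)  # same seed and row count give the same split
print(folds_run1[0][1])          # test indices of the first fold
```

A list like this can be passed wherever an estimator accepts an iterable of (train, test) index splits for `cv`. The indices only refer to the same observations across runs if the row order is also fixed.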
I am getting different ATE estimates with the same input dataset and the same random seed. If I run it today, the average ATE is around 1. If I run it again after a few hours, it increases to 4 or even more. This is for the same treatment (T1) and control (T0) value combination. What could be wrong here? I have one confounder, one treatment, and one outcome column. All are continuous. I tried manually passing folds in the cv argument, but the estimates are still not stable. I have also tried passing the input dataset in a specific order, but again the results are not the same. I observed this with both DoubleML and KernelDML. What should be changed to get more stable ATE estimates?
'common_causes2' contains one continuous variable.
For DoubleML, I am using the following final model.
The average value differs a lot between runs, although there is not much variation within "results" in a single run. E.g., sometimes I get "results" in the range from 1 to 3; other times the range increases to 8 to 9, which drives the "average_result" value to differ significantly between runs. This is for the same treatment (T1) and control (T0) value combination. For example, at one point with T0=20 and T1=25 the average value is 2, while a few hours later with the same T0 and T1 values it is 10.
I am running this on a Databricks cluster.
Is there anything wrong in the arguments specified above?
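To help separate the possible sources of variation, below is a small cross-fitted partialling-out sketch in plain Python. It is not EconML's implementation: the linear nuisance models, the synthetic data with a true effect of 0.5, and the helper names `make_folds`, `ols_fit` and `crossfit_effect` are all assumptions made for illustration.

```python
import random

def make_folds(n_rows, n_splits, seed):
    """Seeded (train, test) index pairs; each row is in exactly one test fold."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    folds = []
    for k in range(n_splits):
        test = sorted(order[k::n_splits])
        held_out = set(test)
        folds.append(([i for i in range(n_rows) if i not in held_out], test))
    return folds

def ols_fit(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def crossfit_effect(w, t, y, folds):
    """Residualize t and y on w out of fold, then regress y residuals on t residuals."""
    t_res, y_res = [0.0] * len(w), [0.0] * len(w)
    for train, test in folds:
        a_t, b_t = ols_fit([w[i] for i in train], [t[i] for i in train])
        a_y, b_y = ols_fit([w[i] for i in train], [y[i] for i in train])
        for i in test:
            t_res[i] = t[i] - (a_t + b_t * w[i])
            y_res[i] = y[i] - (a_y + b_y * w[i])
    return sum(a * b for a, b in zip(t_res, y_res)) / sum(a * a for a in t_res)

# Synthetic data (assumption): one confounder w, continuous t and y.
rng = random.Random(0)
n = 500
w = [rng.gauss(0, 1) for _ in range(n)]
t = [2.0 * wi + rng.gauss(0, 1) for wi in w]
y = [0.5 * ti + 1.5 * wi + rng.gauss(0, 1) for ti, wi in zip(t, w)]

fixed = make_folds(n, 5, seed=123)
est_a = crossfit_effect(w, t, y, fixed)
est_b = crossfit_effect(w, t, y, fixed)
print("same data, same folds:", est_a, est_b)

# Re-drawing the folds with different seeds shows how much the split alone matters.
by_seed = [crossfit_effect(w, t, y, make_folds(n, 5, seed=s)) for s in range(5)]
print("spread across fold seeds:", max(by_seed) - min(by_seed))
```

In this sketch, identical data and identical folds reproduce the estimate exactly, and changing only the fold seed moves it only slightly. If real runs drift far more than that with fixed folds, the remaining suspects are the data changing between runs or a nuisance learner with its own unseeded randomness.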
econml-0.15.1