User Story:
As a data engineer
I want to set up an internal batch scoring of addresses
so that I can process large datasets efficiently for the data team and provide results
Acceptance Criteria:
GIVEN the current system of batch processing
WHEN the batch processing runs and there are errors
THEN there is an easy way to reprocess those errors quickly and efficiently and provide the consolidated results
The batch process is working well and pulls results fairly quickly (400k in ~24-30hrs). However, it's inconvenient for the Engineers & Data Scientists to have to manually pull & re-run the errors.
Update the DB connection to switch from a dedicated connection to a connection pool
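For discussion, here is a minimal sketch of what moving from a dedicated connection to a pool could look like. It only illustrates the idea: `ConnectionPool` and `make_conn` are hypothetical names, and `sqlite3` stands in for whatever database the batch process actually uses, which this issue does not specify. In practice a pooling feature of the database driver or framework would likely be preferable to a hand-rolled class.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Tiny fixed-size pool: connections are opened once and then reused."""

    def __init__(self, factory, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Hand the connection back to the pool instead of closing it.
            self._idle.put(conn)


def make_conn():
    # Assumption: an in-memory sqlite3 database stands in for the real DB.
    return sqlite3.connect(":memory:", check_same_thread=False)


pool = ConnectionPool(make_conn, size=2)

with pool.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```

Each batch worker would borrow a connection for the duration of one unit of work and return it, so the number of open connections stays bounded by the pool size.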
Can we add the following once a batch process has run:
New UI in Django (see error messages, retry, re-run batch)
Keep all records in the DB (store each address in DB)
Automatically re-run after the batch completes so that only the error records are re-processed (e.g. if a single network has failed, only that network will be re-run)
Download the finished consolidated file with all wallets with a click of a button
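A rough sketch of the re-run and consolidation behaviour requested above, assuming each address/network result is stored as one record with either a score or an error. The names `ScoreRecord`, `rerun_errors`, `to_csv`, and the `score_fn` callback are all hypothetical and are not taken from the existing codebase; the real records would live in the DB rather than in a list.

```python
import csv
import io
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ScoreRecord:
    address: str
    network: str
    score: Optional[float] = None
    error: Optional[str] = None  # set when scoring failed


def rerun_errors(
    records: List[ScoreRecord],
    score_fn: Callable[[str, str], float],
    max_attempts: int = 3,
) -> List[ScoreRecord]:
    """Re-score only the records that currently carry an error."""
    for _ in range(max_attempts):
        failed = [r for r in records if r.error is not None]
        if not failed:
            break
        for rec in failed:
            try:
                rec.score = score_fn(rec.address, rec.network)
                rec.error = None
            except Exception as exc:  # keep the message for the UI
                rec.error = str(exc)
    return records


def to_csv(records: List[ScoreRecord]) -> str:
    """Consolidate every record (successes and remaining errors) into one CSV."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["address", "network", "score", "error"])
    for r in records:
        writer.writerow([r.address, r.network, r.score, r.error])
    return buf.getvalue()
```

Because successful records are never touched, a failure on a single network for one address re-runs only that network, and the consolidated file can be produced from the same records once the retries finish.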
Product & Design Links:
Tech Details:
Open Questions:
Notes/Assumptions:
User Story:
As a data engineer
I want to set up an internal batch scoring of addresses
so that I can process large datasets efficiently for the data team and provide results
Acceptance Criteria:
GIVEN the current system of batch processing
WHEN the batch processing runs and there are errors
THEN there is an easy way to reprocess those errors quickly and efficiently and provide the consolidated results
This is a continuation of #3184
The batch process is working well and pulls results fairly quickly (400k in ~24-30hrs). However, it's inconvenient for the Engineers & Data Scientists to have to manually pull & re-run the errors.
Update the DB connection to switch from a dedicated connection to a connection pool
Can we add the following once a batch process has run
Product & Design Links:
Tech Details:
Open Questions:
Notes/Assumptions: