Tasks taking long to get executed #36798
Replies: 5 comments 10 replies
-
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
-
As can be seen, there are gaps of a couple of seconds between some steps. I am not referring to the time between tasks, only to the log lines within a single task.
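One way to make those gaps explicit, rather than reading them off a log dump, is to compute the time between consecutive log lines. This is only a minimal sketch: it assumes each log line starts with a bracketed timestamp such as `[2024-01-01T10:00:00.000+0000]`, which may not match your logging configuration, and `find_gaps` and `gaps_in_file` are names made up for this example.

```python
import re
from datetime import datetime

# Assumed line prefix: "[2024-01-01T10:00:00.000+0000] ..." (adjust to your log format).
TS_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+[+-]\d{4})\]")


def find_gaps(lines, threshold_s=1.0):
    """Return (gap_seconds, previous_line, current_line) for every pair of
    consecutive timestamped lines at least threshold_s seconds apart."""
    gaps = []
    prev_ts = None
    prev_line = None
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # skip lines without a timestamp (e.g. tracebacks)
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f%z")
        if prev_ts is not None:
            delta = (ts - prev_ts).total_seconds()
            if delta >= threshold_s:
                gaps.append((delta, prev_line.rstrip(), line.rstrip()))
        prev_ts, prev_line = ts, line
    return gaps


def gaps_in_file(path, threshold_s=1.0):
    """Convenience wrapper: apply find_gaps to a single task log file."""
    with open(path, encoding="utf-8") as handle:
        return find_gaps(handle, threshold_s)
```

Calling `gaps_in_file` on one task's log file and posting only the returned line pairs would show exactly which step precedes each delay.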
-
This is not an Airflow issue; it's a discussion about your executor. Can you please explain your investigation in more detail and extract the relevant pieces of the logs in an organized way? We have no time to help you debug your executor and worker logs, but I think the discussion would be interesting if you spend a little time explaining your problem and extracting only the relevant pieces of the logs. Looking at the log dump here, it is not clear what the logs are about, especially since people here do not know anything about your executor. You increase the chances that someone will take a look if, rather than dumping logs, you explain your line of thought: the details of the investigation you've done, and small snippets of code relevant to the conclusions you reached and to where you see the problems.
-
The logs I provided before are from the
Jumps of a few seconds can be seen, which for small tasks (like the ones in example_complex) add up to a large amount of time. I believe the problem is not in the scheduler, as tasks arrive at the worker very quickly; the problem is that the actual command Executing from a worker:
Each time a task is scheduled, it is sent to a Kafka topic and a worker then receives it. Messages about which tasks should be executed look like this:
So for each task a command like this needs to be executed
-
We are using the Celery executor, and plugins are loaded every time a task executes, even though the actual task execution does not need them. We use custom timetables and DAG run notification plugins. Is there a way to stop the Airflow Celery worker from loading plugins every time it executes a task? We would also like to know why loading the plugins takes a different amount of time on each run. We left lazy_load_plugins at its default value of True so that plugins are loaded only when required.
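To see how much of the per-task delay actually comes from plugin loading, and how much it varies, it may help to time it in isolation on a worker. The following is a rough sketch rather than a diagnosis: `timed` and `time_plugin_loading` are hypothetical helper names, and the only Airflow call used is `ensure_plugins_loaded` from `airflow.plugins_manager`. It has to run in the worker's own environment, in a fresh interpreter each time, so that the import is cold.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the wall-clock duration of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start


def time_plugin_loading():
    """Time a cold import of Airflow and an explicit plugin load.

    Imports are deferred into this function so the helper above can be
    used (and tested) without Airflow installed.
    """
    results = {}
    with timed("import airflow", results):
        import airflow  # noqa: F401
    with timed("ensure_plugins_loaded", results):
        from airflow.plugins_manager import ensure_plugins_loaded

        ensure_plugins_loaded()
    return results
```

Running `time_plugin_loading()` several times, each in a new Python process, and comparing the two numbers would show whether the variation is in the plugins themselves or in importing Airflow.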
-
Apache Airflow version
2.8.0
If "Other Airflow 2 version" selected, which one?
No response
What happened?
Tasks are taking too long to get executed. Workers receive the order to execute a task, but once a worker runs the command, it takes several seconds for the actual task to start. For example, the example_complex DAG takes between a minute and a half and two minutes to complete, but the actual execution time of each task is less than a second.
I have custom workers that receive tasks from Kafka and execute them.
I'll append a few logs:
What you think should happen instead?
Each task takes less than a second to actually execute, yet the total time to run the DAG is much longer than expected.
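One way to check whether the missing seconds are process start-up rather than task work is to time a fresh Python interpreter with and without importing Airflow on the worker. This is a sketch under assumptions: `time_command` and `compare_startup` are hypothetical names, and it measures only interpreter and import cost, not the actual command the custom worker runs.

```python
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd in a child process; return (elapsed_seconds, return_code)."""
    start = time.perf_counter()
    completed = subprocess.run(cmd, capture_output=True, text=True)
    return time.perf_counter() - start, completed.returncode


def compare_startup():
    """Compare a bare interpreter start with one that also imports Airflow."""
    baseline, _ = time_command([sys.executable, "-c", "pass"])
    with_airflow, rc = time_command([sys.executable, "-c", "import airflow"])
    return {
        "bare interpreter (s)": baseline,
        "import airflow (s)": with_airflow,
        "import succeeded": rc == 0,
    }
```

If the `import airflow` figure is already in the range of seconds, a per-task overhead of that size would be expected for any worker that starts a new process for each task, regardless of how quickly the message arrives.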
How to reproduce
I just execute the example_complex DAG, although the described behaviour is the same for every other DAG.
Operating System
Debian GNU/Linux 12 (bookworm)
Versions of Apache Airflow Providers
pip freeze | grep providers
apache-airflow-providers-amazon==8.16.0
apache-airflow-providers-celery==3.5.0
apache-airflow-providers-cncf-kubernetes==7.11.0
apache-airflow-providers-common-io==1.1.0
apache-airflow-providers-common-sql==1.9.0
apache-airflow-providers-docker==3.8.2
apache-airflow-providers-elasticsearch==5.3.0
apache-airflow-providers-ftp==3.7.0
apache-airflow-providers-google==10.12.0
apache-airflow-providers-grpc==3.4.0
apache-airflow-providers-hashicorp==3.6.0
apache-airflow-providers-http==4.8.0
apache-airflow-providers-imap==3.5.0
apache-airflow-providers-microsoft-azure==8.4.0
apache-airflow-providers-mysql==5.5.0
apache-airflow-providers-odbc==4.2.0
apache-airflow-providers-openlineage==1.3.0
apache-airflow-providers-postgres==5.9.0
apache-airflow-providers-redis==3.5.0
apache-airflow-providers-sendgrid==3.4.0
apache-airflow-providers-sftp==4.8.0
apache-airflow-providers-slack==8.5.0
apache-airflow-providers-snowflake==5.2.0
apache-airflow-providers-sqlite==3.6.0
apache-airflow-providers-ssh==3.9.0
Deployment
Official Apache Airflow Helm Chart
Deployment details
My deployment is pretty similar to the Docker Compose one, but a big difference is that I have a Kafka executor and workers. To rule them out as the cause, I ran several tests to ensure that they are not introducing unexpected delays.
Anything else?
No response
Are you willing to submit PR?
Code of Conduct