airflow

I want to run an Airflow dag and watch the logs in the terminal.

Trouble is, each time a task runs, a new directory and log file are created. Something like:

~/airflow/logs/my-dag/my-task/2018-03-06T09:59:10.427477/1.log

This makes it hard to tail-follow the logs. Thankfully, starting from Airflow 1.9, logging can be configured easily, allowing you to put all of a dag’s logs into one file.

If you make this change, you won’t be able to view task logs in the web UI, because the UI expects log filenames to be in the normal format.

Logging to a single file is useful for development (using the SequentialExecutor), but it is not suitable for production, where multiple tasks may try to write to the same log file at once.

Easy Solution

Requires Airflow 1.10+

Set the log_filename_template setting in the core section, for example through its environment variable:

export AIRFLOW__CORE__LOG_FILENAME_TEMPLATE="{{ ti.dag_id }}.log"

Requires Airflow 1.9+

Since Airflow 1.9, logging is configured pythonically.

Grab Airflow’s default log config, airflow_local_settings.py, and copy it somewhere in your PYTHONPATH.

curl -O https://raw.githubusercontent.com/apache/incubator-airflow/master/airflow/config_templates/airflow_local_settings.py
cp airflow_local_settings.py $AIRFLOW__CORE__DAGS_FOLDER

Set the logging_config_class setting. Make sure it is set in the environments of both the scheduler and the workers. Alternatively, set the equivalent option in airflow.cfg.

export AIRFLOW__CORE__LOGGING_CONFIG_CLASS=airflow_local_settings.DEFAULT_LOGGING_CONFIG

Now you can configure logging to your liking.

Edit airflow_local_settings.py, changing FILENAME_TEMPLATE to:

FILENAME_TEMPLATE = '{{ ti.dag_id }}.log'

You should now get all of a dag’s log output in a single file.
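To see why this puts everything into one file, here is a rough sketch of how the two layouts map task runs to paths. It uses plain string formatting rather than Airflow's actual Jinja rendering, and the per-run layout is inferred from the example path near the top of this post. The function log_path, its parameters, and the task name other-task are hypothetical names used only for illustration.

```python
import os


def log_path(base, dag_id, task_id, execution_date, try_number, single_file=False):
    """Return the log path for one task run (illustrative, not Airflow's code)."""
    if single_file:
        # Mirrors FILENAME_TEMPLATE = '{{ ti.dag_id }}.log'
        return os.path.join(base, f"{dag_id}.log")
    # Mirrors the per-run layout: <dag>/<task>/<execution date>/<try>.log
    return os.path.join(base, dag_id, task_id, execution_date, f"{try_number}.log")


# Two task runs from the same dag run ("other-task" is a made-up task name).
runs = [
    ("my-dag", "my-task", "2018-03-06T09:59:10.427477", 1),
    ("my-dag", "other-task", "2018-03-06T09:59:10.427477", 1),
]

default_paths = {log_path("logs", *run) for run in runs}
single_paths = {log_path("logs", *run, single_file=True) for run in runs}

print(len(default_paths))  # one distinct file per task run
print(single_paths)        # a single shared file for the whole dag
```

Because every task in the dag resolves to the same path, the file is easy to tail, but it is also why concurrent tasks would end up writing to the same file.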

Tailing the logs

Start the scheduler, then trigger a dag from a second terminal (the scheduler keeps running in the foreground).

$ airflow scheduler
$ airflow trigger_dag my-dag

Watch the output with tail -f.

$ tail -f ~/airflow/logs/my-dag.log
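If you would rather follow the log from Python (for example, to filter lines as they arrive), a minimal sketch along these lines might do. It is not part of Airflow; follow, poll_interval, and max_idle are hypothetical names, and the approach is simple polling rather than anything tail itself does internally.

```python
import os
import time


def follow(path, poll_interval=0.5, max_idle=None):
    """Yield complete lines appended to `path`, roughly like `tail -f`.

    Stops after about `max_idle` seconds of sleeping without new output
    (None means keep following forever).
    """
    with open(path, "r") as f:
        f.seek(0, os.SEEK_END)  # skip existing content, start at the end
        pending = ""            # holds a partially written line
        idle = 0.0
        while max_idle is None or idle < max_idle:
            chunk = f.readline()
            if chunk.endswith("\n"):
                idle = 0.0
                yield pending + chunk
                pending = ""
            else:
                # Nothing new, or only part of a line: wait for the rest.
                pending += chunk
                time.sleep(poll_interval)
                idle += poll_interval
```

Looping over follow(os.path.expanduser("~/airflow/logs/my-dag.log")) and printing each line with end="" should then behave much like the tail command above, assuming the log file already exists.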