Airflow Integration

I hear lots of questions from users who are integrating Great Expectations into existing pipelines that use Airflow to manage scheduling and execution. For a recent meetup, I built a minimal working Airflow example that demonstrates a simple deployment pattern of wrapping Great Expectations execution in a PythonOperator:

There is also an example of Airflow integration in the main Great Expectations repository here:
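
Separately from those examples, here is a rough sketch of what the PythonOperator pattern can look like. This is not the meetup code. It assumes Airflow 2.x import paths, and `validate_or_fail`, `run_my_validation`, `validation_task`, `build_dag`, and the DAG and task ids are hypothetical names made up for illustration. The actual Great Expectations call is left as a placeholder because its API differs between versions; the sketch only assumes it yields a result with a boolean "success" entry.

```python
from datetime import datetime


def validate_or_fail(run_validation):
    """Run a validation callable and raise if it reports failure.

    `run_validation` is a hypothetical zero-argument callable that returns
    a dict-like result with a boolean "success" key.
    """
    result = run_validation()
    if not result["success"]:
        # Any uncaught exception makes Airflow mark the task as failed,
        # which stops downstream tasks from running on bad data.
        raise RuntimeError("Data validation failed")
    return result


def run_my_validation():
    # Placeholder: replace with a real Great Expectations validation call
    # for your installed version.
    return {"success": True}


def validation_task():
    # Discard the result so nothing large is pushed to XCom.
    validate_or_fail(run_my_validation)


def build_dag():
    # Imported here so the helpers above can be used without Airflow installed.
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    dag = DAG(
        dag_id="ge_validation_example",
        start_date=datetime(2020, 1, 1),
        catchup=False,
    )
    PythonOperator(
        task_id="validate_data",
        python_callable=validation_task,
        dag=dag,
    )
    return dag
```

In a DAG file you would assign `dag = build_dag()` at module level so the scheduler can discover it. Raising on failure is the key design choice: it lets Airflow's normal retry and alerting settings handle a failed validation like any other failed task.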

Please feel free to post Airflow questions, challenges, successes, tips, or anecdotes here!

Here’s what’s on my Airflow wishlist:

  • an operator that uses a Great Expectations DataContext to make loading suites easy
  • an operator that makes validation easy, ideally with notification options
  • great docs for these operators