Deploying Great Expectations with Google Cloud Composer (Hosted Airflow)

This article is for comments to: https://docs.greatexpectations.io/en/latest/guides/workflows_patterns/deployment_google_cloud_composer.html
Please comment +1 if this How-to Guide is important to you.

This guide is really useful, but one thing I couldn't find is how to pass the service account credentials for both GCS buckets and BigQuery via Airflow connections when instantiating the DataContextConfig, if that is possible at all.

Great Expectations has its own configuration for connections to GCS buckets, databases, and other resources. As of now, we don't support Airflow connections for this configuration, but we will look into it and evaluate whether it is helpful and feasible.


For Great Expectations Stores, Great Expectations uses the GOOGLE_APPLICATION_CREDENTIALS environment variable, which should point to your GCP credentials JSON file. Inside Cloud Composer this variable is already set, so you don't have to worry about it.
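To make this concrete, here is a minimal sketch of how you could check which credentials file will be picked up. The helper name is my own invention; the only real mechanism it relies on is that Google's client libraries (and therefore Great Expectations' GCS store backend) read the GOOGLE_APPLICATION_CREDENTIALS environment variable when resolving Application Default Credentials:

```python
import os
from typing import Optional


def resolve_gcp_credentials_path() -> Optional[str]:
    """Hypothetical helper: return the credentials file that Application
    Default Credentials would use, if one is set explicitly.

    On Cloud Composer this variable is pre-set by the environment, so
    Great Expectations Stores work without extra configuration."""
    return os.environ.get("GOOGLE_APPLICATION_CREDENTIALS")


if __name__ == "__main__":
    path = resolve_gcp_credentials_path()
    if path is None:
        print("GOOGLE_APPLICATION_CREDENTIALS not set; "
              "ADC will fall back to another credential source.")
    else:
        print(f"Credentials file: {path}")
```

If the variable is unset outside Composer, Application Default Credentials falls back to other sources (e.g. gcloud user credentials or the metadata server).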

For BigQuery credentials it's different. You can provide the credentials path in the connection string, for example: bigquery://<your-project>/<your-dataset>?credentials_path=<path-to-credentials>
That way you tell Great Expectations where to fetch your credentials from. If you don't set credentials_path, I believe GE falls back to the same GOOGLE_APPLICATION_CREDENTIALS environment variable mentioned above.
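A small sketch of building that connection string, assuming the URL shape shown above. The function name and the placeholder project/dataset/key-file values in the comments are my own; substitute your real ones, and pass the result wherever your datasource config expects a connection_string:

```python
from typing import Optional


def bigquery_connection_string(project: str, dataset: str,
                               credentials_path: Optional[str] = None) -> str:
    """Hypothetical helper: build a SQLAlchemy-style BigQuery URL of the
    form bigquery://<project>/<dataset>[?credentials_path=<path>]."""
    url = f"bigquery://{project}/{dataset}"
    if credentials_path:
        # Point Great Expectations at an explicit service-account key file.
        url += f"?credentials_path={credentials_path}"
    return url


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(bigquery_connection_string("my-project", "my_dataset",
                                     "/home/airflow/gcs/data/sa-key.json"))
```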