I have spent an entire day trying to figure out how to install Great Expectations (GE) on Google Cloud so that I can run my GE data validation tasks via DAGs in Cloud Composer. I found that the recommended way of doing this is via airflow-provider-great-expectations, but this is where the ‘clarity’ ends, and I cannot figure out how to set up the rest of the process. My questions, in bullet-point form:
- I assume that airflow-provider-great-expectations needs to be installed as a PyPI package in the Cloud Composer environment. Is that correct?
- If so, do I also need to install GE as a separate PyPI package? (I read that GE is installed as a dependency whenever airflow-provider-great-expectations is installed.)
- I found that the GE folders and files need to be made available to the Airflow DAGs. How do I make this happen? Do I need to install GE separately on my local machine (in addition to the copy already installed as a dependency of airflow-provider-great-expectations) and then manually transfer all the files and directories to a Google Cloud Storage location that the Airflow DAGs can access?