I’d like to be able to run Great Expectations at several steps in an Airflow pipeline. Between those steps, Spark is used to cleanse and transform data and write Parquet files to S3 buckets. Can Great Expectations access these as a data source? I can only find references to Spark on the local filesystem, and to S3 in conjunction with Pandas.
Any help appreciated.