How to configure an EMR Spark Datasource

This article is for comments on: https://docs.greatexpectations.io/en/latest/how_to_guides/configuring_datasources/how_to_configure_an_emr_spark_datasource.html

Please comment +1 if this how-to guide is important to you.

+1
It seems that using the S3GlobReaderBatchKwargsGenerator translates an s3:// path into s3a://, which prevents Spark from opening the file within the EMRFS context.
I might be doing something wrong; having documentation would show whether it's a bug or not.
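
In case it helps anyone hitting the same thing, here is a minimal sketch of a workaround: rewriting the scheme back to s3:// before handing the path to Spark. The helper name `to_emrfs_path` is hypothetical and not part of Great Expectations, and I'm assuming the generated path is available as a plain string.

```python
def to_emrfs_path(path: str) -> str:
    """Rewrite an s3a:// or s3n:// URI to the s3:// scheme.

    Hypothetical helper, not a Great Expectations API. Paths with any
    other scheme (or none) are returned unchanged.
    """
    for scheme in ("s3a://", "s3n://"):
        if path.startswith(scheme):
            # Keep the bucket and key, swap only the scheme prefix.
            return "s3://" + path[len(scheme):]
    return path


# Example with a made-up bucket and key:
print(to_emrfs_path("s3a://my-bucket/data/file.csv"))
```

The rewritten path could then be passed to the Spark reader on the cluster. Whether this is the right fix or just hides a bug is exactly what I'd hope the docs clear up.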

+1
Just joined so I could +1 this! I’ve been playing with ge for a few weeks locally now. Good work so far, very impressed! I see a lot of potential to assist us with our data quality problems and the next step would be to try it on a bigger scale on our AWS instance.
Keep up the good work :+1: