GE in AWS Lambda

My intention with this discussion is to raise the possibility of a slim version of GE that can be used within a Lambda function. This is different from running it on an EC2 instance, because I’m looking for a serverless, event-driven function that runs QA/QC on files arriving in S3 buckets as part of a fully serverless flow.

I’ve tested other libraries such as pandera, which worked perfectly within a Lambda function. The problem is that the full GE distribution is too big to fit into a deployment package: AWS has a hard limit of 250 MB (unzipped) per package.
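For anyone who wants to check a build against that limit before uploading, here is a minimal sketch. It assumes the dependencies were installed into a local directory (for example with `pip install --target`), and it treats 250 MB as 250 * 1024 * 1024 bytes; the function names are made up for illustration.

```python
import os

# Assumption: "250 MB" is interpreted as 250 * 1024 * 1024 bytes.
LAMBDA_UNZIPPED_LIMIT_BYTES = 250 * 1024 * 1024


def directory_size_bytes(root):
    """Sum the sizes of all regular files under root (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_in_lambda(root, limit=LAMBDA_UNZIPPED_LIMIT_BYTES):
    """Return True if the unzipped contents of root are within the limit."""
    return directory_size_bytes(root) <= limit
```

Calling `fits_in_lambda("build")` on a hypothetical `build` directory tells you whether the unzipped package is within the limit; `directory_size_bytes` shows how much headroom is left.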

I did some research and removed the following libraries, which made it small enough to fit within the limit (barely):

  • Notebook
  • IPython widgets
  • Jupyter client
  • Jupyter core
  • widgetsnbextension
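A rough sketch of how this removal could be scripted, assuming the packages were installed into a flat directory (for example with `pip install great_expectations --target build`). The on-disk names in `UNNEEDED` are my assumption of how the libraries above appear in that directory, and `prune_packages` is a made-up helper. This is not a verified recipe: whether GE still imports cleanly afterwards has to be tested separately.

```python
import shutil
from pathlib import Path

# Assumed on-disk names for the libraries listed above.
UNNEEDED = (
    "notebook",
    "ipywidgets",
    "jupyter_client",
    "jupyter_core",
    "widgetsnbextension",
)


def prune_packages(package_dir, names=UNNEEDED):
    """Delete top-level entries for the given packages and return their names.

    An entry matches if it is named exactly like the package or starts with
    "<package>-" (which covers "<package>-<version>.dist-info" folders).
    """
    removed = []
    for entry in sorted(Path(package_dir).iterdir()):
        lowered = entry.name.lower()
        if any(lowered == n or lowered.startswith(n + "-") for n in names):
            if entry.is_dir():
                shutil.rmtree(entry)
            else:
                entry.unlink()
            removed.append(entry.name)
    return removed
```

The match is deliberately strict so that unrelated packages sharing a prefix (a package named `notebook_shim`, say) are left alone.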

Can we have an official version of GE for AWS Lambda? Ideally a better one than mine, which just removes things I guess I won’t use.


We don’t have an official distribution of Great Expectations for AWS Lambda yet, although we have used it on Lambda internally. This is a great feature request!

Do you have some reproducible steps for removing the libraries?