How do I deploy GE on AWS lambda?

Several people have suggested that we provide instructions for deploying GE on AWS lambda. I’ve created a test repo that (partially) implements that workflow: https://github.com/superconductive/great_expectations_lambda

Note: this doesn’t actually run great_expectations; it just manages all the layers and dependencies so that you can import great_expectations.
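For anyone unfamiliar with how that layer part works: Lambda only makes a layer’s packages importable if they sit under a top-level `python/` directory inside the layer zip. Here’s a minimal stdlib-only sketch of that packaging step (the function name and directory names are illustrative, not from the repo):

```python
import os
import zipfile

def zip_layer(site_packages_dir, zip_path):
    """Zip an installed-dependencies directory into the layout Lambda
    layers expect: every file under a top-level 'python/' prefix,
    which Lambda adds to sys.path for Python runtimes."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(site_packages_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, site_packages_dir)
                # Use forward slashes in archive names regardless of OS.
                zf.write(full, "python/" + rel.replace(os.sep, "/"))
```

So if you `pip install great_expectations -t some_dir` and then zip `some_dir` this way, the resulting archive can be published as a layer.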

From that point, there are a whole bunch of different ways you might want to use GE within Lambda.

Contributions and forks are very welcome!

Hi Abegong,

Happy to contribute our work, where we build the zip file in a reproducible manner.

May I ask what platform you used to create the build in build_layer.sh? We’ve had issues with some C libraries when building on a Windows machine and running on Lambda’s Amazon Linux distro (which is why we use Docker for the build). Just wondering if you’ve had similar issues.

Ha! This is exactly the step that I had trouble reproducing. You need to build on an AWS AMI or docker image compatible with lambda, and we—alas—did not document that step.

Let me ask others on the team if they can fill in details.

https://aws.amazon.com/premiumsupport/knowledge-center/lambda-layer-simulated-docker/

Ah right!

Yeah, so I solved it by pulling the Docker image for amazonlinux (https://hub.docker.com/_/amazonlinux/, the same image Lambda uses) and building from there, but this seems like a good solution too.
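For anyone reproducing that approach, the build boils down to running pip inside the amazonlinux container with an output directory mounted. A hedged sketch of assembling that command (the image tag and the yum/pip invocation are assumptions; adjust them for your runtime’s Python version):

```python
import subprocess

def docker_layer_build_cmd(package, out_dir, image="amazonlinux:2"):
    """Build the argv for installing `package` inside an Amazon Linux
    container, writing the result to `out_dir`/python so any compiled
    C extensions match the Lambda runtime environment. The exact yum
    packages needed depend on which Python version your runtime uses."""
    return [
        "docker", "run", "--rm",
        "-v", f"{out_dir}:/out",
        image,
        "bash", "-c",
        f"yum -y -q install python3 && pip3 install {package} -t /out/python",
    ]

# Example (requires Docker):
# subprocess.run(docker_layer_build_cmd("great_expectations", "/tmp/layer"), check=True)
```

The mounted `/out/python` directory can then be zipped and published as a layer.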

tl;dr: use the Amazon Linux docker image (as @joostboonzajerflaes suggests), or an EC2 instance with an Amazon Linux AMI (@beau’s preferred method)

As you note, it is necessary to perform builds under Amazon Linux. This docs page provides the details on how the various Lambda runtimes map to the underlying environment.

There are two main ways to easily create an environment that will allow you to transfer builds into a lambda runtime.

  1. You’ve already discovered one, which is to use a suitable docker container. For others’ reference, those docker images can be found here on docker hub.

  2. Because docker is not part of my own dev workflow, I avoid it when possible. Instead, I spin up a suitable Amazon Linux EC2 instance for these builds. If you use the EC2 console to perform the launch, the appropriate Amazon Linux AMIs appear at the very top of their Quick Start list. Here is the complete list of AMI IDs by region and storage type.

What is the difference between Amazon Linux and Amazon Linux 2?

Glad you asked. This overview page provides the official explanation, but I’ve found it a bit hard to parse the distinction. If I understand correctly, AWS found their own AMI infrastructure somewhat limiting when providing LTS for their own operating system, so for AL2 they moved to a “base VM image + updatable packages” model. That said, you can still install AL2 from AMIs.

All that said, I use the python3.7 runtime for my lambdas, which (as one of the docs linked above says) uses the original Amazon Linux environment. And finally, I’ve unintentionally used AL2 for some of my python3.7 lambda builds, and they have worked just fine when deployed.
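If you ever want to double-check which distribution a build box (or a test Lambda) is actually running, `/etc/os-release` is the easiest signal. A small stdlib-only sketch; the sample values are what Amazon Linux 2 reports, and the parser works on any os-release-style file:

```python
def parse_os_release(text):
    """Parse /etc/os-release-style KEY=VALUE lines into a dict,
    stripping surrounding quotes from values."""
    info = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            info[key] = value.strip().strip('"')
    return info

# On a real box: parse_os_release(open("/etc/os-release").read())
sample = 'NAME="Amazon Linux"\nVERSION="2"\nID="amzn"\n'
parse_os_release(sample)["NAME"]  # -> "Amazon Linux"
```

Running the same check in your build environment and in a deployed function is a quick way to confirm the two actually match.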

I hope I’m answering the question you have, but let me know.


Hi Beau,

Thanks for the comment! Our stack publishes artefacts through a build pipeline (which is why spinning up an EC2 instance is not really an option; we use Azure DevOps for that). Those artefacts are then pushed to S3 and picked up when we create the Lambda (using CloudFormation). Happy to hear any alternative solutions or remarks!

Just for my understanding, where is the question about Amazon Linux vs. Amazon Linux 2 coming from, or am I missing something?