No Zipping Needed: Importing Python Packages in AWS Lambda

Seamless Access to Pandas, NumPy, and Other Data Science Modules

Melissa Bain
3 min read · Sep 27, 2020

I recently decided to migrate a Python script of mine to AWS Lambda so that I could schedule a daily serverless run. Upon testing my new Lambda function, I encountered a “No module named ‘pandas’” error. While you can hear the whole saga behind the project here, I wanted to offer others a concise set of instructions for importing some key packages into AWS Lambda without the usual hassle of zipping, version conflicts, and dependency management.

When it works, it works

It’s important to note that many Python libraries are already available without any additional steps. For instance, I had no problem loading os or smtplib in my Lambda function. Here’s a more comprehensive list of such libraries.
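If you’d rather check for yourself than rely on a list, a small probe function can report which modules your runtime can find. This is only a sketch: the helper name `available_modules` and the list of module names are my own choices for illustration, not something provided by AWS.

```python
import importlib.util


def available_modules(names):
    """Map each top-level module name to True if the runtime can locate it.

    find_spec only looks the module up; it does not import it.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


def lambda_handler(event, context):
    # The names below are examples only; list whatever your own script imports.
    return available_modules(["os", "smtplib", "pandas", "numpy"])
```

Running this as a test event in the Lambda console returns a dictionary such as `{"os": true, ...}`, with `false` for anything you still need to supply through a layer or a deployment package.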

If you’re reading this article, you’re probably trying to use a module that isn’t already accessible. No worries; there are a few solutions. Initially I tried zipping Pandas with its NumPy dependency and uploading these to AWS, as described in this blog post. However, a series of version conflicts led me to abandon this solution. I mention this failed approach because any homemade or lesser-used module will likely require a similar fix. May your luck be better than mine! Fortunately, if you want to use one of the standard data science libraries, there’s an easier way. Read on for a straightforward solution to import Pandas, NumPy, SciPy, Matplotlib, and many other modules.

The Steps

We can leverage pre-made AWS Lambda Layers available through this Git repo. According to AWS, “A layer is a ZIP archive that contains libraries, a custom runtime, or other dependencies.” In short, this solution allows us to use modules that are already configured, zipped, and made accessible to Lambda. Here’s how to connect to the desired layer:

  • Identify which version of Python you’re using. There should be a “Basic settings” section in your AWS Lambda window with the Python version listed under “Runtime”. If you’re using 3.6, you’ll want to consider upgrading in order for this solution to work.
  • Next, identify which AWS region you’re using. The region is part of your function’s ARN (Amazon Resource Name). If you search “ARN”, it should be in the upper right-hand corner of the Lambda page. For my project, the region was us-east-2, which matches the Ohio location also displayed.
  • With these two pieces of information, you can navigate from the deployments folder inside the Git repo to the CSV for your Python version and region. For my specific setup I used this CSV. Inside it there are many packages, including the module I needed: Pandas. Choose the latest version of your desired package.
  • Lastly, add the information for your desired package into the layers section of AWS Lambda. You’ll want to name it Klayers-python<version number>-<package name>. For the “Version ARN” section, use the layer_version_arn column value in the CSV. The layer version number can be found at the end of the layer_version_arn value. Here’s what mine ended up looking like: [screenshot of the completed layer form]
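Separately from the console form, here is a rough Python sketch of how the pieces of a layer_version_arn value fit together and how you might pick the latest version from a downloaded CSV. It relies only on the layer_version_arn column mentioned above and on the general shape of a Lambda layer version ARN (`arn:aws:lambda:<region>:<account>:layer:<name>:<version>`). The sample rows, the account ID `123456789012`, and the version numbers are made up for illustration, and the helper names are hypothetical; the real CSV has additional columns that this sketch ignores.

```python
import csv
import io
from typing import NamedTuple


class LayerArn(NamedTuple):
    region: str
    account_id: str
    name: str
    version: int


def parse_layer_arn(arn: str) -> LayerArn:
    """Split arn:aws:lambda:<region>:<account>:layer:<name>:<version>."""
    parts = arn.strip().split(":")
    if len(parts) != 8 or parts[2] != "lambda" or parts[5] != "layer":
        raise ValueError(f"not a Lambda layer version ARN: {arn!r}")
    return LayerArn(parts[3], parts[4], parts[6], int(parts[7]))


def latest_layer_arn(csv_text: str, package: str) -> str:
    """Return the highest-versioned ARN whose layer name ends in -<package>."""
    rows = csv.DictReader(io.StringIO(csv_text))
    arns = [row["layer_version_arn"] for row in rows]
    matches = [a for a in arns if parse_layer_arn(a).name.endswith(f"-{package}")]
    if not matches:
        raise LookupError(f"no layer found for package {package!r}")
    # Compare versions as integers so that 16 sorts above 9.
    return max(matches, key=lambda a: parse_layer_arn(a).version)


# Made-up sample data; the account ID and version numbers are placeholders.
SAMPLE_CSV = """layer_version_arn
arn:aws:lambda:us-east-2:123456789012:layer:Klayers-python38-pandas:9
arn:aws:lambda:us-east-2:123456789012:layer:Klayers-python38-pandas:16
arn:aws:lambda:us-east-2:123456789012:layer:Klayers-python38-numpy:4
"""

if __name__ == "__main__":
    best = latest_layer_arn(SAMPLE_CSV, "pandas")
    print(best)
    print(parse_layer_arn(best))
```

The region field of the parsed ARN should match the region of your own function, and the trailing number is the layer version that goes with the ARN in the console form.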

You should now be all set to import your package in your Lambda Function. Happy coding!
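To confirm the layer is attached correctly, a throwaway handler along these lines can help. It is a minimal sketch rather than part of my original script: the `package` event key is a hypothetical convention of mine, and the handler reads the region from the `AWS_REGION` environment variable that the Lambda runtime sets.

```python
import importlib
import os
import platform


def lambda_handler(event, context):
    """Report the runtime, the region, and whether the layered package imports."""
    # "package" is a hypothetical event key; it defaults to pandas.
    package = event.get("package", "pandas")
    report = {
        "python": platform.python_version(),
        "region": os.environ.get("AWS_REGION", "unknown"),
        "package": package,
    }
    try:
        module = importlib.import_module(package)
    except ImportError as exc:
        # This is the case behind the "No module named ..." error.
        report["imported"] = False
        report["error"] = str(exc)
    else:
        report["imported"] = True
        report["version"] = getattr(module, "__version__", "unknown")
    return report
```

Invoke it with an empty test event to check Pandas, or with something like `{"package": "numpy"}` to check another module. Once `imported` comes back true, a plain `import pandas as pd` at the top of your real function should work too.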
