
Publishing a TensorFlow model on AWS Lambda to make a sellable API

In Data Science, a production-ready model is the result of a lot of hard work. Suppose you want to sell your excellent model to a financial business or a promising startup through an API subscription: AWS Lambda and API Gateway are the best choice for you. Once your model works on a serverless architecture, deploying it into production and scaling it out to thousands upon thousands of users is almost done. Unlike deploying on an EC2 instance, it costs you nothing when there is no usage.

 

CuongQuay · Aug 26, 2020 · 5 min read

There are three popular Python frameworks: SciKit-Learn, TensorFlow, and PyTorch. Today, TensorFlow and PyTorch are widely used for Deep Learning, while SciKit-Learn is mostly used for classical Machine Learning. In March 2019, Google released TensorFlow Lite (TFLite), a lite version intended for smartphone and embedded applications. This is great news for Data Scientists.

It’s important to reduce the footprint of a Machine Learning or Neural Network model that runs on devices such as a Raspberry Pi. But when you distribute a model to client devices, it isn’t easy to protect your intellectual property from jailbreaking. You need to keep your model on the server side and expose an API for any usage.

Is the full version of a Deep Learning framework suited to a modern software architecture such as Lambda serverless? No, it’s designed to support end-to-end Data Science solutions. As a result, the full version of those frameworks is too large to fit in the 250MB space limit of a Lambda function. Installing the full TensorFlow package takes nearly 1GB of space!

AWS has just released the Lambda EFS feature, which lets you attach an HDD-like file system to your Lambda. This solves the problem of very large model files (over 250MB), but you must go through many configuration steps before you can use EFS with Lambda.

Why can’t I use TFLite with AWS Lambda?

TFLite has a small footprint, so what is the trouble with using it from Python on a Lambda function? It’s all about the native builds for different processors, OS platforms, and customized kernels. AWS Lambda runs on its own customized Linux, so the pre-built TFLite packages from the community don’t work there.

What can I do to make one of them work?

Let’s take TensorFlow as the case. Cook it yourself to fit your needs! That’s all you have to do: build TensorFlow Lite from source code on the Amazon Linux platform, which results in a binary compatible with the AWS Lambda runtime. Cross-compiling a native library from source code on my PC was a nightmare story back in the year 2000. Now, with Docker and the community, you are happy and very lucky: it is done easily within 15 minutes. Let’s create a Dockerfile:

FROM amazonlinux
WORKDIR /tflite
RUN yum groupinstall -y development
RUN yum install -y python3.7
RUN yum install -y python3-devel
RUN pip3 install numpy wheel pybind11
RUN git clone --branch v2.3.0 https://github.com/tensorflow/tensorflow.git
RUN sh ./tensorflow/tensorflow/lite/tools/make/download_dependencies.sh
RUN sh ./tensorflow/tensorflow/lite/tools/pip_package/build_pip_package.sh
RUN pip3 install tensorflow/tensorflow/lite/tools/pip_package/gen/tflite_pip/python3/dist/tflite_runtime-2.3.0-cp37-cp37m-linux_x86_64.whl
CMD tail -f /dev/null

All you need is to build a Docker image that compiles the TensorFlow Lite library inside the amazonlinux image (you can skip this step if you don’t have a Docker machine on your PC or don’t want to build it yourself):

docker build -t tflite_amazonlinux .

This process will take a couple of dozen minutes depending on your machine’s computation speed. If you use simplify-cli to generate your Serverless project, pre-built tflite_runtime and numpy libraries are ready to use.

A Pre-built TFLite library for AWS Lambda:

Simplify CLI offers you a tool to create a Serverless project and manage the deployment and its layers gracefully. Now, let’s create a Lambda function with “simplify-cli”.

npm install -g simplify-cli         # install serverless framework
mkdir tensorflow-lite               # create a project folder
cd tensorflow-lite                  # enter this project folder
simplify-cli init -t python         # generate a python project

In this default project, the file main.py uses the tflite_runtime library to load a pre-built model named detect_object.tflite that was generated beforehand.
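As a hedged illustration only (the generated main.py may differ), a minimal handler built on the tflite_runtime API could look like this; the request format and the "input" key are assumptions, not part of the generated project:

import json
import numpy as np
from tflite_runtime.interpreter import Interpreter

# Load the model once at module level so that only the first (cold start)
# invocation pays the loading cost and warm invocations reuse it.
interpreter = Interpreter(model_path='detect_object.tflite')
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def handler(event, context):
    # Assume the request body carries the input tensor as a nested list.
    body = json.loads(event.get('body') or '{}')
    data = np.array(body['input'], dtype=np.float32)

    # Run inference: feed the input tensor, invoke, read the output tensor.
    interpreter.set_tensor(input_details[0]['index'], data)
    interpreter.invoke()
    result = interpreter.get_tensor(output_details[0]['index'])

    return {'statusCode': 200, 'body': json.dumps({'prediction': result.tolist()})}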

 

Check out the “tflite-python-layer” repository into the layer folder:

git clone https://github.com/simplify-framework/tflite-python-layer layer

In the tensorflow-lite/layer/python folder, there are two pre-built libraries for running on AWS Lambda:

  • tflite_runtime (2.3.0)
  • numpy (1.19.1)

All you need to run your project is to set up the variables inside the .env file. To do so, you need an AWS account with credentials set up as a profile, or leave the profile blank if you use the default one.

https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-profiles.html

### - Application Deployment
DEPLOYMENT_ENV=demo
DEPLOYMENT_BUCKET=tensorflow-deployment-37216
DEPLOYMENT_REGION=eu-central-1
DEPLOYMENT_ACCOUNT=your-aws-account
DEPLOYMENT_PROFILE=your-aws-profile
### - Application StackName
PROJECT_NAME=TensorFlowTest
### - Backend Serverless Lambda
FUNCTION_NAME=detectObject
FUNCTION_RUNTIME=python3.7
FUNCTION_HANDLER=main.handler
FUNCTION_SOURCE=src

You should change the “37216” number so that your bucket name doesn’t conflict with someone else who picked the same number before your test.

Publish your Python code to the AWS Lambda service with this command:

simplify-cli deploy

Then, deploy the layer that contains your TFLite library and numpy:

simplify-cli deploy --layer --source layer

Finally, go to the AWS Console, look for your Lambda function named detectObject-demo, then Test your code. See this link if you don’t know how to run a Test for your Lambda: https://aws.amazon.com/getting-started/hands-on/run-serverless-code/. You just need to do “Step 4: Invoke Lambda Function and Verify Results”.
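If you prefer testing from code instead of the Console, a small boto3 sketch can invoke the deployed function directly; the payload shape below is illustrative only and must match whatever your handler expects:

import json
import boto3

# The function name and region follow the .env values used above.
client = boto3.client('lambda', region_name='eu-central-1')

response = client.invoke(
    FunctionName='detectObject-demo',
    Payload=json.dumps({'body': json.dumps({'input': [[0.0, 0.0, 0.0]]})}),
)
print(json.loads(response['Payload'].read()))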

(To continue with the Docker build, there is a script for your last step: layer/build-tflite.sh. It takes the result of the Docker build out into the layer folder.)

Testing the library loading time: the Lambda cold start took ~8 seconds, since the first invocation must import tflite_runtime and load the model. After that first request, each invocation takes only around 25 milliseconds.

Organize your Lambda with an API Gateway and the AWS Marketplace. After finishing this setup, you will be ready to sell your API to the world. Then, let’s start finding your customers.
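The article doesn’t detail this step, but as a hedged sketch of the API Gateway side of the monetization plumbing, here is how an API key could be attached to a usage plan on an existing stage with boto3; the API ID, names, and limits are placeholders:

import boto3

apigw = boto3.client('apigateway', region_name='eu-central-1')

# Create an API key for a customer and a usage plan with throttling and a
# monthly quota, then attach the key to the plan. Replace 'your-rest-api-id'
# with the ID of the REST API fronting your Lambda.
key = apigw.create_api_key(name='customer-1', enabled=True)
plan = apigw.create_usage_plan(
    name='basic-tier',
    apiStages=[{'apiId': 'your-rest-api-id', 'stage': 'demo'}],
    throttle={'rateLimit': 10.0, 'burstLimit': 20},
    quota={'limit': 10000, 'period': 'MONTH'},
)
apigw.create_usage_plan_key(usagePlanId=plan['id'], keyId=key['id'], keyType='API_KEY')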
