How to deploy and serve predictions using TensorFlow from an API?

rayhan · Jan 27, 2016 · Viewed 7.5k times

From the Google tutorials we know how to train a model in TensorFlow. But what is the best way to save a trained model and then serve predictions through a basic, minimal Python API on a production server?

My question is essentially about TensorFlow best practices for saving the model and serving predictions on a live server without compromising speed or running into memory issues, since the API server will be running in the background indefinitely.

A small snippet of Python code would be appreciated.

Answer

Jarek Wilkiewicz · Feb 16, 2016

TensorFlow Serving is a high-performance, open-source serving system for machine learning models, designed for production environments and optimized for TensorFlow. The initial release contains a C++ server and Python client examples based on gRPC. The basic architecture is shown in the diagram below.

[Diagram: TensorFlow Serving basic architecture]
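Since the question asked for a small Python snippet, here is a minimal sketch of exporting a trained model in a format the model server can load. The toy graph, the `x`/`y` tensor names, and the export path are illustrative assumptions, not from the original post; the 2016 release shipped its own exporter, while this sketch uses the SavedModel API that later became the standard serving format:

```python
import tensorflow as tf

# Toy graph standing in for the trained model from the tutorial.
x = tf.placeholder(tf.float32, shape=[None, 784], name="x")
w = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.nn.softmax(tf.matmul(x, w) + b, name="y")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... train the model here ...

    # Write a versioned SavedModel; the "1" subdirectory is the model
    # version, which lets the server hot-swap newer versions later.
    tf.saved_model.simple_save(
        sess,
        export_dir="/tmp/my_model/1",  # hypothetical path
        inputs={"x": x},
        outputs={"y": y},
    )
```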

To get started quickly, check out the tutorial.
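For completeness, a minimal gRPC client sketch is below. It assumes the server was started with something like `tensorflow_model_server --port=8500 --model_name=my_model --model_base_path=/tmp/my_model`, that the `tensorflow-serving-api` package is installed, and that the model was exported with the input/output names from the sketch above:

```python
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2
from tensorflow_serving.apis import prediction_service_pb2_grpc

# One channel and stub, created once and reused for every request;
# this keeps a long-running API server from opening new connections
# on every call.
channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

# Build a PredictRequest addressed to the exported model.
request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"              # matches --model_name
request.model_spec.signature_name = "serving_default"
request.inputs["x"].CopyFrom(
    tf.make_tensor_proto(np.random.rand(1, 784).astype(np.float32))
)

response = stub.Predict(request, timeout=5.0)
print(response.outputs["y"])
```

Because the model lives in a separate server process, the Python API layer stays small and memory use is bounded by the model server, which speaks to the asker's concern about a process running in the background indefinitely.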