Is there any example code for deploying a Tensorflow Model via a RESTful API? I see examples for a command line program and for a mobile app. Is there a framework for this or people just load the model and expose the predict method via a web framework (like Flask)to take input (say via JSON) and return the response? By framework I mean scaling for large number of predict requests. Of course since the models are immutable we can launch multiple instances of our prediction server and put it behind a load balancer (like HAProxy). My question is, are people using some framework for this or doing this from scratch, or, maybe this is already available in Tensorflow and I have not noticed it.
https://github.com/sugyan/tensorflow-mnist shows a simple restAPI example by using Flask and loading pre-trained mode (restore).
@app.route('/api/mnist', methods=['POST'])
def mnist():
input = ((255 - np.array(request.json, dtype=np.uint8)) / 255.0).reshape(1, 784)
output1 = simple(input)
output2 = convolutional(input)
return jsonify(results=[output1, output2])
Also, see the online demo at https://tensorflow-mnist.herokuapp.com/. It seems the API is fast enough.