We have used TensorFlow to train and export a model, and now we want to implement an inference service around that model, similar to what tensorflow/serving does.
My question is whether the tf.Session object is thread-safe. If it is, we could initialize a single session at startup and use that singleton to process concurrent requests.
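Concretely, the pattern I have in mind looks roughly like the sketch below (the export path and tensor names are only placeholders for whatever the real model uses):

```python
import tensorflow as tf

# Load the exported SavedModel once at startup into a single, shared session.
sess = tf.Session(graph=tf.Graph())
with sess.graph.as_default():
    tf.saved_model.loader.load(
        sess, [tf.saved_model.tag_constants.SERVING], "/path/to/exported_model")

def handle_request(input_batch):
    # Every request-handling thread would reuse the same singleton session.
    return sess.run("output:0", feed_dict={"input:0": input_batch})
```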
The tf.Session object is thread-safe for Session.run() calls from multiple threads.
Before TensorFlow 0.10, graph modification was not thread-safe. This was fixed in the 0.10 release, so you can add nodes to the graph concurrently with Session.run() calls, although this is not advised for performance reasons; instead, it is recommended to call sess.graph.finalize() before using the session from multiple threads, which prevents accidental memory leaks from unintended graph growth.
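As a minimal sketch of that recommendation (the toy graph here merely stands in for a real exported model, and the thread-pool setup is illustrative):

```python
import tensorflow as tf
from concurrent.futures import ThreadPoolExecutor

graph = tf.Graph()
with graph.as_default():
    # Hypothetical model: y = x * w, standing in for a real exported graph.
    x = tf.placeholder(tf.float32, shape=[None, 4], name="x")
    w = tf.Variable(tf.ones([4, 1]), name="w")
    y = tf.matmul(x, w, name="y")
    init = tf.global_variables_initializer()

sess = tf.Session(graph=graph)
sess.run(init)

# Freeze the graph so no thread can accidentally add nodes (and leak memory)
# while the session is shared across threads.
sess.graph.finalize()

def predict(batch):
    # Session.run() on the same session is safe to call from multiple threads.
    return sess.run(y, feed_dict={x: batch})

with ThreadPoolExecutor(max_workers=8) as pool:
    futures = [pool.submit(predict, [[1.0, 2.0, 3.0, 4.0]]) for _ in range(32)]
    results = [f.result() for f in futures]
```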