TensorFlow: using 2 GPUs at the same time

Maystro · May 23, 2017

First, I'm still a newbie with TensorFlow. I'm using v0.9 and trying to use the 2 GPUs installed in the machine we have. Here is what's happening:

  1. When I launch a training script on the machine, it runs on only one of the 2 GPUs. By default it takes the first one, /gpu:0.
  2. When I launch another training script on the second GPU (after making the needed changes, i.e. with tf.device(...); a sketch of this change is shown after this question) while keeping the first process running on the first GPU, TensorFlow kills the first process and uses only the second GPU to run the second one. So it seems TensorFlow allows only one process at a time?

What I need is to be able to launch two separate training scripts for 2 different models on 2 different GPUs installed on the same machine. Am I missing something? Is this the expected behavior? Should I go through distributed TensorFlow on a local machine to do so?
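
For reference, the change mentioned in item 2 above was along these lines (a hypothetical sketch; the model and variable names are made up, only the tf.device placement matters):

import tensorflow as tf

# Pin this script's graph to the second GPU (TF 0.9 graph-mode API).
with tf.device("/gpu:1"):
    x = tf.placeholder(tf.float32, [None, 784])
    w = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    logits = tf.matmul(x, w) + b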

Answer

nessuno · May 23, 2017

TensorFlow tries to allocate memory on every GPU it sees.

To work around this, make TensorFlow see a single (and different) GPU for each script. To do that, use the environment variable CUDA_VISIBLE_DEVICES like this:

CUDA_VISIBLE_DEVICES=0 python script_one.py
CUDA_VISIBLE_DEVICES=1 python script_two.py

In both script_one.py and script_two.py, use tf.device("/gpu:0") to place the operations on the only GPU each script sees.
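
A minimal sketch of what that looks like inside either script (the computation here is made up; only the tf.device placement and session setup matter):

import tensorflow as tf

# CUDA_VISIBLE_DEVICES exposes a single GPU to this process, so from
# TensorFlow's point of view that GPU is always "/gpu:0".
with tf.device("/gpu:0"):
    a = tf.random_normal([1000, 1000])
    b = tf.random_normal([1000, 1000])
    c = tf.matmul(a, b)

# allow_soft_placement lets ops without a GPU kernel fall back to the CPU.
with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
    print(sess.run(c))

Launched as above (CUDA_VISIBLE_DEVICES=0 for one script, CUDA_VISIBLE_DEVICES=1 for the other), each process only ever touches its own physical GPU.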