Python project using protocol buffers, Deployment issues

jan · Jan 8, 2015

I have a Python project that uses setuptools for deployment, and I mostly followed this guide regarding project structure. The project uses Google Protocol Buffers to define a network message format. My main issue is how to make setup.py call the protoc compiler during installation to compile the definitions into _pb2.py files.

In this question the advice was given to just distribute the resulting _pb2.py files along with the project. While this might work for very similar platforms, I've found several cases where this does not work. For example, when I develop on a Mac that uses Anaconda Python and copy the resulting _pb2.py along with the rest of the project to a Raspberry Pi running Raspbian, there are always import errors coming from the _pb2.py modules. However, if I compile the .proto files freshly on the Pi, the project works as expected. So, distributing the compiled files does not seem like an option.

I'm looking for a working, best-practice solution here. It can be assumed that the protoc compiler is installed on the target platform.

Edit:

Since people asked for the reason of the failure: on the Mac, the protobuf version is 2.6.1, and on the Pi it's 2.4.1. Apparently, the internal API used by the code that protoc generates has changed between these versions. The output is basically:

  File "[...]network_manager.py", line 8, in <module>
      import InstrumentControl.transports.serial_bridge_protocol_pb2 as protocol
  File "[...]serial_bridge_protocol_pb2.py", line 9, in <module>
      from google.protobuf import symbol_database as _symbol_database
  ImportError: cannot import name symbol_database
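
To confirm this kind of mismatch on a target machine, one can probe whether the runtime module named in the traceback is importable there. This is only a minimal sketch; the helper name can_import is hypothetical and not part of protobuf.

```python
import importlib


def can_import(module_name):
    """Return True if module_name can be imported by this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        # Also covers a missing parent package such as google.protobuf.
        return False
    return True


if __name__ == "__main__":
    # The generated _pb2.py in the traceback above imports this module.
    print(can_import("google.protobuf.symbol_database"))
```

If this prints False on the target, a _pb2.py that imports symbol_database cannot load there, which matches the ImportError above.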

Answer

jan · Jan 8, 2015

Ok, I solved the issue without requiring the user to install a specific old version or compile the proto files on another platform than my dev machine. It's inspired by this setup.py script from protobuf itself.

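First, protoc needs to be found. This can be done with:

# Assumed imports; the original snippet does not show them.
import os
from distutils.spawn import find_executable

# Find the Protocol Compiler.
if 'PROTOC' in os.environ and os.path.exists(os.environ['PROTOC']):
  protoc = os.environ['PROTOC']
else:
  # Returns None if protoc is not on the PATH.
  protoc = find_executable("protoc")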
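
On Python 3.3 and later, the same lookup can be written with shutil.which from the standard library. The following is a sketch rather than part of the original script; the function name find_protoc and its environ parameter are hypothetical and exist only to make the lookup easy to test.

```python
import os
import shutil


def find_protoc(environ=None):
    """Return a path to protoc, or None if it cannot be located.

    An existing path in $PROTOC wins; otherwise PATH is searched.
    """
    environ = os.environ if environ is None else environ
    candidate = environ.get("PROTOC")
    if candidate and os.path.exists(candidate):
        return candidate
    # shutil.which returns None when no executable named protoc is found.
    return shutil.which("protoc", path=environ.get("PATH"))
```

Calling find_protoc() with no arguments uses the real process environment.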

The following function compiles a .proto file and puts the resulting _pb2.py next to it. This behavior can be changed as needed, for example by adjusting the arguments passed to protoc.

import os
import subprocess
import sys

def generate_proto(source):
  """Invokes the Protocol Compiler to generate a _pb2.py from the given
  .proto file.  Does nothing if the output already exists and is newer than
  the input."""

  output = source.replace(".proto", "_pb2.py")

  if (not os.path.exists(output) or
      (os.path.exists(source) and
       os.path.getmtime(source) > os.path.getmtime(output))):
    # The parenthesized form works on both Python 2 and Python 3.
    print("Generating %s..." % output)

    if not os.path.exists(source):
      sys.stderr.write("Can't find required file: %s\n" % source)
      sys.exit(-1)

    # protoc is the module-level path located earlier (None if not found).
    if protoc is None:
      sys.stderr.write(
          "Protocol buffers compiler 'protoc' not installed or not found.\n"
          )
      sys.exit(-1)

    # -I. and --python_out=. share the same root, so the generated
    # _pb2.py ends up next to its .proto file.
    protoc_command = [protoc, "-I.", "--python_out=.", source]
    if subprocess.call(protoc_command) != 0:
      sys.exit(-1)
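
The path mapping and the staleness check inside generate_proto can also be pulled out into small helpers, which makes them easy to test without running protoc. This is a sketch under my own naming (pb2_path and needs_rebuild are hypothetical); it uses os.path.splitext so that only the file extension is rewritten, whereas str.replace would also touch a ".proto" appearing in a directory name.

```python
import os


def pb2_path(source):
    """Map 'pkg/msg.proto' to 'pkg/msg_pb2.py'; only the extension changes."""
    root, ext = os.path.splitext(source)
    if ext != ".proto":
        raise ValueError("not a .proto file: %r" % source)
    return root + "_pb2.py"


def needs_rebuild(source, output):
    """True if output is missing, or if source exists and is newer."""
    if not os.path.exists(output):
        return True
    return (os.path.exists(source) and
            os.path.getmtime(source) > os.path.getmtime(output))
```

With these, the condition in generate_proto reduces to needs_rebuild(source, pb2_path(source)).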

Next, the commands build_py and clean are subclassed from _build_py and _clean so that building and cleaning also handle the protocol buffer files.

# Assumed imports; the original snippet does not show them.
import os
from setuptools.command.build_py import build_py as _build_py
from distutils.command.clean import clean as _clean

# List of all .proto files
proto_src = ['file1.proto', 'path/to/file2.proto']

class build_py(_build_py):
  def run(self):
    # Generate the _pb2.py modules before the regular build collects them.
    for f in proto_src:
      generate_proto(f)
    _build_py.run(self)

class clean(_clean):
  def run(self):
    # Delete generated files in the code tree.
    for (dirpath, dirnames, filenames) in os.walk("."):
      for filename in filenames:
        filepath = os.path.join(dirpath, filename)
        if filepath.endswith("_pb2.py"):
          os.remove(filepath)
    # _clean is an old-style class, so super() doesn't work.
    _clean.run(self)
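
Instead of maintaining the proto_src list by hand, the .proto files could be discovered by walking the source tree, in the same way the clean command walks it to find _pb2.py files. A minimal sketch; the helper name find_proto_files is hypothetical.

```python
import os


def find_proto_files(root="."):
    """Return the sorted paths, relative to root, of all .proto files below it."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(".proto"):
                full = os.path.join(dirpath, filename)
                found.append(os.path.relpath(full, root))
    return sorted(found)
```

When setup.py is run from the project root, proto_src = find_proto_files() would then yield paths in the same form as the hand-written list.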

And finally, the parameter

cmdclass = { 'clean': clean, 'build_py': build_py }   

needs to be added to the call to setup, and everything should work. I still have to check for possible quirks, but so far it works flawlessly on the Mac and on the Pi.