_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv

Mona Jalal · Apr 16, 2021

What is the reason for this error and how can I fix it? I am running the code from this repo: https://github.com/facebookresearch/frankmocap

(frank) mona@goku:~/research/code/frankmocap$ python -m demo.demo_frankmocap --input_path ./sample_data/han_short.mp4 --out_dir ./mocap_output
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/mona/research/code/frankmocap/demo/demo_frankmocap.py", line 25, in <module>
    from handmocap.hand_bbox_detector import HandBboxDetector
  File "/home/mona/research/code/frankmocap/handmocap/hand_bbox_detector.py", line 33, in <module>
    from detectors.hand_object_detector.lib.model.roi_layers import nms # might raise segmentation fault at the end of program
  File "/home/mona/research/code/frankmocap/detectors/hand_object_detector/lib/model/roi_layers/__init__.py", line 3, in <module>
    from .nms import nms
  File "/home/mona/research/code/frankmocap/detectors/hand_object_detector/lib/model/roi_layers/nms.py", line 3, in <module>
    from model import _C
ImportError: /home/mona/research/code/frankmocap/detectors/hand_object_detector/lib/model/_C.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv

I have:

$ lsb_release -a
LSB Version:    core-11.1.0ubuntu2-noarch:security-11.1.0ubuntu2-noarch
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.2 LTS
Release:    20.04
Codename:   focal

and

$ python
Python 3.8.5 (default, Jan 27 2021, 15:41:15) 
[GCC 9.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> torch.__version__
'1.8.1+cu111'
>>> import detectron2
>>> detectron2.__version__
'0.4'
>>> from detectron2 import _C

and:

$ python -m detectron2.utils.collect_env
/home/mona/venv/frank/lib/python3.8/site-packages/torch/cuda/__init__.py:52: UserWarning: CUDA initialization: Unexpected error from cudaGetDeviceCount(). Did you run some cuda functions before calling NumCudaDevices() that might have already set an error? Error 804: forward compatibility was attempted on non supported HW (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:109.)
  return torch._C._cuda_getDeviceCount() > 0
No CUDA runtime is found, using CUDA_HOME='/usr'
---------------------  --------------------------------------------------------------------------
sys.platform           linux
Python                 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
numpy                  1.19.5
detectron2             0.4 @/home/mona/venv/frank/lib/python3.8/site-packages/detectron2
Compiler               GCC 7.3
CUDA compiler          CUDA 11.1
DETECTRON2_ENV_MODULE  <not set>
PyTorch                1.8.1+cu111 @/home/mona/venv/frank/lib/python3.8/site-packages/torch
PyTorch debug build    False
GPU available          False
Pillow                 8.1.0
torchvision            0.9.1+cu111 @/home/mona/venv/frank/lib/python3.8/site-packages/torchvision
fvcore                 0.1.3.post20210311
cv2                    4.5.1
---------------------  --------------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

Answer

Ellon · Nov 29, 2021

This error usually shows up when there is a compatibility issue between the installed PyTorch version and the detector library version (Detectron2 or mmdet).
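One way to see that the missing symbol belongs to PyTorch rather than to the detector code is to read the names embedded in it. The sketch below is not a full demangler: `nested_name_parts` is a hypothetical helper that only extracts the leading length-prefixed identifiers of an Itanium-style mangled name, which is enough here. A tool such as `c++filt` would give the complete signature.

```python
import re

# The symbol reported in the ImportError above.
SYMBOL = "_ZN6caffe28TypeMeta21_typeMetaDataInstanceIdEEPKNS_6detail12TypeMetaDataEv"


def nested_name_parts(mangled):
    """Return the leading <length><identifier> components of a nested mangled name.

    Illustrative helper only: it stops at the first component that is not a
    plain length-prefixed identifier (template arguments, types, and so on).
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not an Itanium-mangled nested name")
    pos = 3  # skip the "_ZN" prefix
    parts = []
    while pos < len(mangled) and mangled[pos].isdigit():
        digits = re.match(r"\d+", mangled[pos:]).group()
        start = pos + len(digits)
        end = start + int(digits)
        parts.append(mangled[start:end])
        pos = end
    return parts


print("::".join(nested_name_parts(SYMBOL)))
# caffe2::TypeMeta::_typeMetaDataInstance
```

`caffe2::TypeMeta` is part of PyTorch's C++ core, so the compiled `_C` extension expects a function that the installed PyTorch libraries do not export. That is consistent with the extension having been built against a different PyTorch version.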

Both the detector library and PyTorch have to be built against the same CUDA version; otherwise their compiled packages can conflict, either at import time (as in the error above) or when training your model.
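A quick way to gather the versions that need to agree is a small report like the one below. This is a rough sketch: `report_versions` is a hypothetical helper, and the default module names are simply the packages mentioned in the question. It relies on `torch.version.cuda`, which PyTorch exposes; packages without such an attribute get `None` for the CUDA entry.

```python
import importlib


def report_versions(module_names=("torch", "torchvision", "detectron2")):
    """Map each module name to (package version, CUDA version it reports)."""
    report = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            # A compiled extension with an undefined symbol also surfaces
            # as ImportError, so the message is kept for inspection.
            report[name] = ("not importable", str(exc))
            continue
        package_version = getattr(module, "__version__", None)
        # torch exposes `torch.version.cuda`; packages without a
        # `version.cuda` attribute are reported as None.
        cuda_version = getattr(getattr(module, "version", None), "cuda", None)
        report[name] = (package_version, cuda_version)
    return report
```

Calling `report_versions()` inside the same virtual environment that raises the error shows, side by side, which PyTorch and CUDA builds are actually installed.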

It is possible that the PyTorch (1.8.1) + CUDA (11.1) combination you have is incompatible with detectron2 v0.4.

According to the detectron2 repo, v0.4 is built with torch 1.8 + CUDA 11.1. It might help to use torch 1.8.0 instead of 1.8.1.
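To make the 1.8.0 versus 1.8.1 comparison concrete, here is a minimal sketch that splits version strings of the form printed by `torch.__version__` and reports where two builds differ. Both `parse_build` and `compare_builds` are hypothetical helpers, and the "expected" string `1.8.0+cu111` is an assumption based on the versions named above, not something read from the detectron2 wheel. The parser only handles plain numeric releases such as `1.8.1`.

```python
def parse_build(version):
    """Split '1.8.1+cu111' into ((1, 8, 1), 'cu111'); the tag is None if absent."""
    release, _, local_tag = version.partition("+")
    numbers = tuple(int(part) for part in release.split("."))
    return numbers, (local_tag or None)


def compare_builds(installed, expected):
    """Report which parts of two PyTorch version strings agree."""
    inst_release, inst_tag = parse_build(installed)
    exp_release, exp_tag = parse_build(expected)
    return {
        "same_minor": inst_release[:2] == exp_release[:2],
        "same_patch": inst_release == exp_release,
        "same_cuda_tag": inst_tag == exp_tag,
    }


# Installed version from the question vs. the assumed build target.
print(compare_builds("1.8.1+cu111", "1.8.0+cu111"))
# {'same_minor': True, 'same_patch': False, 'same_cuda_tag': True}
```

Here the CUDA tag and the minor release match and only the patch release differs, which is the difference the suggestion to try torch 1.8.0 is aimed at.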