I am confused about the meaning of "backbone" in neural networks, especially in the DeepLabv3+ paper. I did some research and found that "backbone" can mean the feature extraction part of a network.
DeepLabv3+ uses Xception and ResNet-101 as its backbones. However, I am not familiar with the overall structure of DeepLabv3+: which part does the backbone refer to, and which parts remain the same?
A generalized description or definition of backbone would also be appreciated.
In my understanding, the "backbone" refers to the feature extraction network used within the DeepLab architecture. This feature extractor encodes the network's input into a feature representation. The DeepLab framework "wraps" additional functionality around this feature extractor. Because of this separation, the feature extractor can be exchanged, and a model can be chosen to fit the task at hand in terms of accuracy, efficiency, etc.
In the case of DeepLab, the term "backbone" might refer to models like ResNet, Xception, MobileNet, etc.
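To make the "wrapping" idea concrete, here is a minimal toy sketch in plain Python. It is not DeepLab code: `small_backbone`, `large_backbone`, and `SegmentationModel` are hypothetical names, and average pooling plus a threshold only stand in for a real feature extractor and a real segmentation head. The point is just that the wrapper stays the same while the backbone is swapped.

```python
from typing import Callable, List

Grid = List[List[float]]


def average_pool(image: Grid, stride: int) -> Grid:
    """Toy feature extraction: average each stride x stride block."""
    height, width = len(image), len(image[0])
    features = []
    for i in range(0, height - stride + 1, stride):
        row = []
        for j in range(0, width - stride + 1, stride):
            block = [image[i + di][j + dj]
                     for di in range(stride) for dj in range(stride)]
            row.append(sum(block) / len(block))
        features.append(row)
    return features


# Two interchangeable "backbones" (hypothetical stand-ins for e.g. a
# lighter and a heavier feature extractor). Both map an image to features.
def small_backbone(image: Grid) -> Grid:
    return average_pool(image, stride=2)


def large_backbone(image: Grid) -> Grid:
    return average_pool(image, stride=4)


class SegmentationModel:
    """Wrapper whose head is fixed; only the backbone is exchanged."""

    def __init__(self, backbone: Callable[[Grid], Grid], threshold: float = 0.5):
        self.backbone = backbone
        self.threshold = threshold

    def head(self, features: Grid) -> List[List[int]]:
        # Toy "head": label each feature cell as 1 or 0 by thresholding.
        return [[1 if v > self.threshold else 0 for v in row] for row in features]

    def __call__(self, image: Grid) -> List[List[int]]:
        return self.head(self.backbone(image))


image = [[1.0] * 4 for _ in range(4)]
model_a = SegmentationModel(small_backbone)
model_b = SegmentationModel(large_backbone)
print(model_a(image))
print(model_b(image))
```

Both models share the same `head` and calling code; only the function passed as `backbone` differs, which is roughly the sense in which a framework lets you pick a backbone to trade accuracy against efficiency.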