Say we have a single channel image (5x5)
A = [ 1 2 3 4 5
6 7 8 9 2
1 4 5 6 3
4 5 6 7 4
3 4 5 6 2 ]
And a filter K (2x2)
K = [ 1 1
1 1 ]
An example of applying convolution (let us take the first 2x2 from A) would be
1*1 + 2*1 + 6*1 + 7*1 = 16
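The worked step above can be reproduced with a short NumPy sketch. The helper name `slide_filter` is made up for illustration; it slides K over A without padding and with a stride of 1, which are assumptions the question does not state. It also does not flip the kernel (strictly speaking a cross-correlation), which makes no difference for this symmetric K.

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 2],
              [1, 4, 5, 6, 3],
              [4, 5, 6, 7, 4],
              [3, 4, 5, 6, 2]])
K = np.ones((2, 2), dtype=int)

def slide_filter(image, kernel):
    """Slide kernel over image (no padding, stride 1) and sum the products."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.result_type(image, kernel))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the current window with the kernel, then sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

result = slide_filter(A, K)
print(result[0, 0])   # first 2x2 window: 1*1 + 2*1 + 6*1 + 7*1 = 16
print(result.shape)   # (4, 4)
```

Under these assumptions a 5x5 input and a 2x2 filter give a 4x4 output, one value per window position.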
This is very straightforward. But let us introduce a depth factor to matrix A, i.e., an RGB image with 3 channels, or even the conv layers in a deep network (with depth = 512, maybe). How would the convolution operation be done with the same filter? A similar worked example for the RGB case would be really helpful.
Let's say we have a 3-channel (RGB) image given by some matrix A
A = [[[198 218 227] [196 216 225] [196 214 224] ... ... [185 201 217] [176 192 208] [162 178 194]]
and a blur kernel as
K = [[0.1111, 0.1111, 0.1111], [0.1111, 0.1111, 0.1111], [0.1111, 0.1111, 0.1111]] # each entry is 0.1111 ~= 1/9
The convolution can be represented as shown in the image below
As you can see in the image, each channel is convolved individually with the kernel, and the per-channel results are then combined to form the output pixel.
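The per-channel idea can be sketched in NumPy as follows. This is only an illustration under stated assumptions: the 4x4 image `rgb` is made-up data (not the matrix A above), the helper `filter_channel` is a hypothetical name, and there is no padding and a stride of 1. It shows two ways of "combining": stacking the three filtered channels back into an RGB pixel, as one would for a blur, and summing them into a single value, which is how a conv layer in a network typically collapses the depth. In a real conv layer each channel would normally get its own 2D slice of the filter; the same K is reused here for simplicity.

```python
import numpy as np

def filter_channel(channel, kernel):
    """Slide kernel over one 2D channel (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = channel.shape[0] - kh + 1
    out_w = channel.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(channel[i:i + kh, j:j + kw] * kernel)
    return out

# Hypothetical 4x4 RGB image laid out as (height, width, channel); values are made up
rgb = np.arange(4 * 4 * 3, dtype=float).reshape(4, 4, 3)

# 3x3 blur kernel, every entry 1/9
K = np.full((3, 3), 1.0 / 9.0)

# Image blur: filter R, G and B separately, then stack them back into RGB pixels
blurred = np.stack([filter_channel(rgb[:, :, c], K) for c in range(3)], axis=-1)

# Conv-layer style (assumption: same K for every channel): sum the per-channel results
feature_map = sum(filter_channel(rgb[:, :, c], K) for c in range(3))

print(blurred.shape)      # (2, 2, 3): still three channels per output pixel
print(feature_map.shape)  # (2, 2): depth collapsed to one value per position
```

The same pattern extends to a depth of 512: the loop runs over 512 channels instead of 3, and the summed variant still yields one value per spatial position for each filter.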