I have a 416x416 input image. How can I produce a 4 x 10 output, where 4 is the number of columns and 10 is the number of rows?
My label data is a 2D array with 4 columns and 10 rows.
I know about the reshape() method, but it requires that the resulting shape has the same number of elements as the input.
With a 416 x 416 input and max-pooling layers, the most I can get is a 13 x 13 output.
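As a quick check of where 13 x 13 comes from, here is a minimal sketch of the size arithmetic. It assumes five non-overlapping 2x2 max-pooling layers, which is not stated in the question but is one common way to get from 416 to 13. The function name size_after_pools is made up for illustration.

```python
def size_after_pools(size, num_pools, pool=2):
    """Spatial size after num_pools non-overlapping pool x pool max-pool layers."""
    for _ in range(num_pools):
        size //= pool  # each pooling layer divides height/width by the pool size
    return size

# Assumption: five 2x2 pooling layers (416 / 2**5 = 13).
print(size_after_pools(416, 5))
```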
Is there a way to get a 4x10 output without losing data?
My input label data looks like this, for example:
[[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[ 0 0 0 0]
[116 16 128 51]
[132 16 149 52]
[ 68 31 77 88]
[ 79 34 96 92]
[126 37 147 112]
[100 41 126 116]]
This indicates that there are 6 objects in my image that I want to detect. In each row, the first value is xmin, the second ymin, the third xmax, and the fourth ymax.
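To make the label layout concrete, here is a small sketch that reads the example array above. It assumes that all-zero rows are padding and that every other row is one box in (xmin, ymin, xmax, ymax) order. The helper name real_boxes is hypothetical.

```python
# Label array copied from the example above.
label = [[0, 0, 0, 0]] * 7 + [
    [116, 16, 128, 51],
    [132, 16, 149, 52],
    [68, 31, 77, 88],
    [79, 34, 96, 92],
    [126, 37, 147, 112],
    [100, 41, 126, 116],
]

def real_boxes(rows):
    """Return only the rows that describe an object (assumed: not all zeros)."""
    return [row for row in rows if any(v != 0 for v in row)]

for xmin, ymin, xmax, ymax in real_boxes(label):
    # Width and height follow directly from the corner coordinates.
    print(f"box at ({xmin}, {ymin}), width {xmax - xmin}, height {ymax - ymin}")
```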
The last layer of my network has this output shape:
(None, 13, 13, 1024)
First, flatten the (None, 13, 13, 1024) layer:
model.add(Flatten())
This gives a 1-dimensional tensor of 13*13*1024 = 173056 values per sample.
Then add a dense layer:
model.add(Dense(4*10))
It outputs 40 values, so your 3D feature map has now been reduced to a 1D vector.
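Then simply reshape to the shape you need. Reshape takes the target shape as a single tuple, and 10 rows by 4 columns is (10, 4):
model.add(Reshape((10, 4)))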
This will work, but it will completely destroy the spatial structure of your data.
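Putting the three steps together, a minimal self-contained sketch might look like the following. It assumes TensorFlow's bundled Keras, and it uses a bare Input of shape (13, 13, 1024) as a stand-in for the rest of the network, which is not shown in the question.

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    # Stand-in for the real backbone: only its output shape matters here.
    keras.Input(shape=(13, 13, 1024)),
    layers.Flatten(),         # (None, 13, 13, 1024) -> (None, 173056)
    layers.Dense(10 * 4),     # (None, 173056) -> (None, 40)
    layers.Reshape((10, 4)),  # (None, 40) -> (None, 10, 4): 10 rows, 4 columns
])

print(model.output_shape)
print(model.count_params())
```

The Dense layer alone holds 173056 * 40 weights plus 40 biases, which is roughly 6.9 million parameters. Every output value is connected to every position of the feature map, which is why the spatial layout is lost.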