What happens when you do a conversion from AV_SAMPLE_FMT_S16P to AV_SAMPLE_FMT_S16? How is the AVFrame structure going to contain the planar and non-planar data?
AV_SAMPLE_FMT_S16P is planar signed 16-bit audio, i.e. 2 bytes per sample, which is the same sample size as AV_SAMPLE_FMT_S16.
The only difference is that in AV_SAMPLE_FMT_S16 the samples of each channel are interleaved, i.e. if you have two-channel audio, the sample buffer will look like
c1 c2 c1 c2 c1 c2 c1 c2...
where c1 is a sample for channel 1 and c2 is a sample for channel 2.
For one frame of planar audio, by contrast, each channel's samples are stored together in their own plane, so you will have something like
c1 c1 c1 c1 .... c2 c2 c2 c2 ..
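To make the two layouts concrete, here is a minimal sketch of what a planar-to-interleaved conversion of 16-bit samples amounts to. It is plain C and does not call FFmpeg; the helper name interleave_s16 and its parameters are made up for illustration. In a real program you would normally let libswresample (swr_convert) do this for you.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of FFmpeg: copy `channels` planar buffers
 * of `nb_samples` 16-bit samples each into one interleaved buffer.
 * `out` must have room for nb_samples * channels samples. */
static void interleave_s16(const int16_t *const *planes, int channels,
                           size_t nb_samples, int16_t *out)
{
    for (size_t s = 0; s < nb_samples; s++) {
        for (int c = 0; c < channels; c++) {
            /* Sample s of channel c lands at position s * channels + c. */
            out[s * (size_t)channels + (size_t)c] = planes[c][s];
        }
    }
}
```

With two planes c1 c1 c1 and c2 c2 c2, the output buffer ends up as c1 c2 c1 c2 c1 c2.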
Now, how is it stored in AVFrame?
For planar audio (AV_SAMPLE_FMT_S16P): data[i] contains the data of channel i (assuming channel 0 is the first channel). However, data only has room for 8 pointers (AV_NUM_DATA_POINTERS), so if you have more than 8 channels you need to go through the extended_data attribute of AVFrame, which holds a pointer for every channel.
For non-planar audio (AV_SAMPLE_FMT_S16): data[0] contains the data for all channels in an interleaved manner.
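As a rough sketch of how you would read one sample in either layout, the helper below takes a plain pointer array that stands in for AVFrame's extended_data. The function get_sample_s16 and its parameters are hypothetical and not part of FFmpeg. In real code you would pass frame->extended_data and could decide the is_planar flag with av_sample_fmt_is_planar().

```c
#include <stdint.h>

/* Hypothetical helper, not part of FFmpeg. `planes` stands in for
 * AVFrame.extended_data: one pointer per channel for planar audio,
 * a single pointer to the interleaved buffer otherwise. */
static int16_t get_sample_s16(uint8_t *const *planes, int is_planar,
                              int channels, int channel, int index)
{
    if (is_planar) {
        /* Planar: each channel has its own buffer. */
        const int16_t *plane = (const int16_t *)planes[channel];
        return plane[index];
    }
    /* Interleaved: all channels share planes[0]. */
    const int16_t *packed = (const int16_t *)planes[0];
    return packed[index * channels + channel];
}
```

The same logical sample (channel, index) is found at plane[index] in the planar case and at packed[index * channels + channel] in the interleaved case.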