h264 RTP timestamp

user269090 · Mar 13, 2010 · Viewed 18.9k times

I am confused about the timestamp of H.264 RTP packets. I know the clock rate for video is 90 kHz, which is what I declared in the SIP SDP. My encoder's frame rate is not a fixed 30 FPS; it varies between 15 and 30 FPS on the fly, so I cannot use a fixed timestamp increment.

Could anyone tell me the timestamps of the following encoded packets?
After 0 milliseconds, encoded RTP timestamp = 0 (let the starting timestamp be 0)
After 50 milliseconds, encoded RTP timestamp = ?
After 40 milliseconds, encoded RTP timestamp = ?
After 33 milliseconds, encoded RTP timestamp = ?

What is the formula when the encoded frame rate is variable?

Thank you in advance.

Answer

Cipi · May 31, 2010

It doesn't matter whether your encoder produces video at 10 FPS or 30 FPS: the RTP timestamp tells the receiver how much time elapsed between two frames, so you determine it on the fly for each frame. That way you can send 10 frames in one second (10 FPS) and 30 frames in the next (30 FPS). You only need to set the RTP timestamp correctly. And if I understand your question, you are unsure how to do that...

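Let the starting timestamp be 0. For each new frame, take the time elapsed since the previous frame in milliseconds, multiply it by 90 (a 90 kHz clock ticks 90 times per millisecond), and add the result to the last RTP timestamp. The scale has to match the clock rate declared in the SDP, which is 90 kHz here. To make the decoder play 10 FPS video at 30 FPS, add 3000 (1/30 s × 90000) to the RTP timestamp for each frame... but let's look at your example: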

Frame #      RTP time   Time since previous frame [ms]
[  1]               0    0
[  2]            4500   50
[  3]            8100   40
[  4]           11070   33

So if you set the RTP timestamps like this (adding elapsed time in ms × 90 for each frame), the decoder will load and decode Frame 1, then load and decode Frame 2, but wait until 50 ms after Frame 1 (the time difference between the two frames) before it draws Frame 2, and so on...
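Here is a minimal Python sketch of that calculation, assuming the 90 kHz clock from the question and a starting timestamp of 0 (real senders usually pick a random start). The function names are made up for illustration and do not come from any RTP library.

```python
CLOCK_RATE_HZ = 90_000   # video clock rate from the question's SDP
TS_MODULUS = 2 ** 32     # the RTP timestamp field is 32 bits and wraps around


def next_rtp_timestamp(prev_ts, elapsed_ms):
    """Advance prev_ts by elapsed_ms, converted to 90 kHz clock ticks."""
    ticks = round(elapsed_ms * CLOCK_RATE_HZ / 1000)
    return (prev_ts + ticks) % TS_MODULUS


def timestamps_for(gaps_ms, start_ts=0):
    """Return one timestamp per frame; gaps_ms[i] is the time since frame i-1."""
    timestamps = []
    ts = start_ts
    for gap in gaps_ms:
        ts = next_rtp_timestamp(ts, gap)
        timestamps.append(ts)
    return timestamps


print(timestamps_for([0, 50, 40, 33]))  # → [0, 4500, 8100, 11070]
```

In a long-running stream it is probably safer to compute each timestamp from the frame's absolute capture time instead of summing gaps, so that rounding errors do not accumulate.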

As you can see, the decoder uses the RTP timestamps to know when to display each frame, and it doesn't care whether the video was encoded at 30 or 10 FPS.

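Also, if the video is 30 FPS, that doesn't mean there will be 30 RTP packets per second. A large frame is split across several packets, so there can be more than 100 packets in a second. All packets that carry the same frame share that frame's timestamp, so you cannot derive the timestamp from the packet count; derive it from the frame's capture time.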
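To illustrate the packet-versus-frame point, here is a rough Python sketch that splits one encoded frame into several packets. It is not real H.264 packetization (it omits the NAL unit and fragmentation headers); `RtpPacket`, `packetize_frame`, and the 1400-byte payload limit are assumptions made up for this example. It only shows that the timestamp stays constant within a frame while the 16-bit sequence number advances per packet, with the marker bit set on the frame's last packet.

```python
from dataclasses import dataclass


@dataclass
class RtpPacket:
    sequence_number: int   # increments by one for every packet sent
    timestamp: int         # identical for all packets of one frame
    marker: bool           # set on the last packet of the frame
    payload: bytes


def packetize_frame(frame, timestamp, first_seq, max_payload=1400):
    """Split one frame into packets that all carry the same timestamp."""
    chunks = [frame[i:i + max_payload]
              for i in range(0, len(frame), max_payload)] or [b""]
    packets = []
    for i, chunk in enumerate(chunks):
        packets.append(RtpPacket(
            sequence_number=(first_seq + i) % 2 ** 16,  # 16-bit field wraps
            timestamp=timestamp,
            marker=(i == len(chunks) - 1),
            payload=chunk,
        ))
    return packets


# A 3000-byte frame stamped 4500 becomes three packets with one timestamp.
for pkt in packetize_frame(b"\x00" * 3000, timestamp=4500, first_seq=65535):
    print(pkt.sequence_number, pkt.timestamp, pkt.marker, len(pkt.payload))
```

The next frame would then get a new timestamp computed from its own capture time, no matter how many packets the previous frame needed.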

I guess this is what you need... hope I helped, don't -1 me if I didn't... =)