"RFC 2833 RTP Event" Consecutive Events and the E "End" Bit

brian_d · May 4, 2010

Why do I get a DTMF sound when the E bit is 0 and no sound when it is 1? (The RTP packets appear in Wireshark either way.)

Background:

I can send out an RFC 2833 DTMF event as outlined at http://www.ietf.org/rfc/rfc2833.txt, and I get the following behaviour when the E bit is NOT set:

If, for example, keys 7874556332111111145855885#3 are pressed, then ALL events are sent and show up in a program like Wireshark, but only 87456321458585#3 sound. So the first key (which I figure could be a separate issue) and any repeats of an event (i.e. 11111) fail to sound.

In section 3.9, figure 2 of the above linked document, they give a 911 example where all but the last event have the E bit set.

When I set the 'E' bit to 1 for all numbers, I never get an event to sound.

I have thought of some possible causes but do not know if they are the reason:

1) Figure 2 shows payload types 96 and 97 being sent. I have not sent these headers. In section 3.8, codes 96 and 97 are described as "the dynamic payload types 96 and 97 have been assigned for the redundancy mechanism and the telephone event payload respectively".

2) Section 3.5, under "E:", says: "A sender MAY delay setting the end bit until retransmitting the last packet for a tone, rather than on its first transmission". Does anyone have an idea of how to actually do this?

3) I have a separate output stream that also plays, so I wonder whether it might be interfering with hearing this stream.

4) I have also fiddled around with timestamp intervals and the RTP marker.

Any help is greatly appreciated. Here is a sample Wireshark event capture of the relevant areas:

6590 31.159045000 xx.x.x.xxx --.--.---.-- RTP EVENT Payload type=RTP Event, DTMF Pound # (end)
Real-Time Transport Protocol
 Stream setup by SDP (frame 6225)
  Setup frame: 6225
  Setup Method: SDP
 10.. .... = Version: RFC 1889 Version (2)
 ..0. .... = Padding: False
 ...0 .... = Extension: False
 .... 0000 = Contributing source identifiers count: 0
 0... .... = Marker: False
 Payload type: telephone-event (101)
 Sequence number: 0
 Extended sequence number: 65536
 Timestamp: 2000
 Synchronization Source identifier: 0x15f27104 (368210180)
RFC 2833 RTP Event
 Event ID: DTMF Pound # (11)
 1... .... = End of Event: True
 .0.. .... = Reserved: False
 ..00 0000 = Volume: 0
 Event Duration: 1000

Please note: A volume of zero is the loudest obtainable level as explained in the ietf.org/rfc/rfc2833.txt specification:

"volume: For DTMF digits and other events representable as tones, this field describes the power level of the tone, expressed in dBm0 after dropping the sign. Power levels range from 0 to -63 dBm0. The range of valid DTMF is from 0 to -36 dBm0 (must accept); lower than -55 dBm0 must be rejected (TR-TSY-000181, ITU-T Q.24A). Thus, larger values denote lower volume. This value is defined only for DTMF digits. For other events, it is set to zero by the sender and is ignored by the receiver."

The issue is when the "End of Event" bit is switched on.

Answer

Laurent Etiemble · May 12, 2010

I recommend starting with RFC 4733, for two reasons:

  1. It obsoletes RFC 2833.
  2. Section 5 is a great source for understanding how a DTMF digit is produced.

Here is my understanding of how a DTMF digit should be sent:

  • A start packet is emitted. It has its M flag set and its E flag cleared. The timestamp for the event is set.
  • One or more continuation packets are emitted (for as long as the user keeps the digit pressed). They have their M and E flags cleared. They use the timestamp defined in the start packet, but their sequence numbers and their durations are incremented (see the RFC for the intervals).
  • An end packet is sent (when the user stops pressing the digit). It has its M flag cleared and its E flag set. Like the continuation packets, it keeps the timestamp of the start packet.
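
The three kinds of packet above can be sketched in Python. This is only an illustration under stated assumptions, not code taken from the RFC: the helper names (`event_payload`, `rtp_header`, `digit_packets`), the 8 kHz clock, the 50 ms (400-unit) update interval and the volume value are all made up for the example. Payload type 101 and the SSRC are borrowed from the capture in the question.

```python
import struct

def event_payload(event, end, volume, duration):
    """Pack the 4-byte telephone-event payload: event, E|R|volume, duration."""
    flags = (0x80 if end else 0) | (volume & 0x3F)  # E bit, R=0, 6-bit volume
    return struct.pack("!BBH", event, flags, duration & 0xFFFF)

def rtp_header(marker, payload_type, seq, timestamp, ssrc):
    """Pack a 12-byte RTP header (version 2, no padding, extension or CSRCs)."""
    byte0 = 2 << 6
    byte1 = (0x80 if marker else 0) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc)

def digit_packets(event, start_ts, first_seq, ssrc, payload_type=101,
                  total_duration=1600, step=400, volume=10):
    """Build start, continuation and end packets for one digit.

    Assumed values: 8 kHz clock, so step=400 is a 50 ms update interval
    and total_duration=1600 is a 200 ms key press.
    """
    durations = list(range(step, total_duration, step)) + [total_duration]
    packets = []
    for i, duration in enumerate(durations):
        is_first = (i == 0)                  # M flag only on the start packet
        is_last = (i == len(durations) - 1)  # E flag only on the end packet
        header = rtp_header(is_first, payload_type, first_seq + i,
                            start_ts, ssrc)  # same timestamp for every packet
        packets.append(header + event_payload(event, is_last, volume, duration))
    return packets

packets = digit_packets(event=1, start_ts=8000, first_seq=100, ssrc=0x15F27104)
```

With these assumed defaults, a 200 ms digit produces four packets that share one timestamp and carry increasing sequence numbers and durations; only the first has M set and only the last has E set. The next digit, even if it is the same key, would start again with a new timestamp.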

Why should several packets be sent for one event? Because the network is not always perfect and some packets can be lost:

  • The RFC states (2.5.1.2. "Transmission of Event Packets") that:

    For robustness, the sender SHOULD retransmit "state" events periodically.

  • And (2.5.1.4. "Retransmission of Final Packet") that:

    The final packet for each event and for each segment SHOULD be sent a
    total of three times at the interval used by the source for updates.
    This ensures that the duration of the event or segment can be
    recognized correctly even if an instance of the last packet is lost.
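
As a rough sketch of that retransmission rule, the Python below appends two extra copies of the end packet, so that it is sent three times in total. The `EventPacket` class and `with_end_retransmissions` function are hypothetical names invented for this example. It assumes the copies keep the timestamp, duration and E flag of the original end packet and differ only in their RTP sequence number, which goes up by one for every packet sent.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class EventPacket:
    """Hypothetical, simplified view of one telephone-event RTP packet."""
    seq: int
    timestamp: int
    marker: bool
    end: bool
    event: int
    duration: int

def with_end_retransmissions(packets, copies=3):
    """Return the packets with the final one sent `copies` times in total."""
    if not packets:
        return []
    last = packets[-1]
    # Each retransmission is identical except for the next sequence number.
    extra = [replace(last, seq=(last.seq + i) & 0xFFFF)
             for i in range(1, copies)]
    return list(packets) + extra

digit = [
    EventPacket(seq=10, timestamp=8000, marker=True, end=False, event=1, duration=400),
    EventPacket(seq=11, timestamp=8000, marker=False, end=True, event=1, duration=800),
]
sent = with_end_retransmissions(digit)
```

Here `sent` holds four packets: the start packet, followed by the end packet three times with sequence numbers 11, 12 and 13. A receiver that sees any one of the three still learns the final duration.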

As for your problem:

    If, for example, keys 7874556332111111145855885#3 are pressed, then ALL events are sent and show up in a program like Wireshark, but only 87456321458585#3 sound. So the first key (which I figure could be a separate issue) and any repeats of an event (i.e. 11111) fail to sound.

Without a Wireshark trace it is hard to tell what is going on, but I suspect that the repeated 1 digits are ignored because there is no distinction between successive events: the first 1 digit is recognized, and the others are treated as retransmissions of the same event.
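
To make that suspicion concrete, here is a toy receiver model in Python. It is not how any particular stack is implemented, and the `detect_digits` function is a hypothetical name. It simply assumes that a receiver treats a packet as a new event only when its (timestamp, event) pair differs from the previous packet's, and that the sender in the question reused one timestamp (2000, as in the capture) for every key press.

```python
def detect_digits(packets):
    """Return the events a simplified receiver would report.

    `packets` is an iterable of (timestamp, event) tuples in arrival order.
    A packet that repeats the previous (timestamp, event) pair is treated
    as an update or retransmission of the same event, not as a new digit.
    """
    digits = []
    last_key = None
    for timestamp, event in packets:
        key = (timestamp, event)
        if key != last_key:
            digits.append(event)
            last_key = key
    return digits

keys = "7874556332111111145855885#3"

# Assumption: every key press is sent with the same RTP timestamp.
same_timestamp = [(2000, k) for k in keys]
heard_same = "".join(detect_digits(same_timestamp))

# Each key press gets its own timestamp (1600 units apart, an assumed gap).
distinct_timestamps = [(2000 + i * 1600, k) for i, k in enumerate(keys)]
heard_distinct = "".join(detect_digits(distinct_timestamps))
```

Under this model `heard_same` is 787456321458585#3, which is the sequence reported in the question apart from the missing first key, while `heard_distinct` equals the full key sequence. So giving each key press its own timestamp (and setting M on its first packet) would be the first thing to check.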