Meaning of "data_coding" field in SMPP

Prasad Weera · Aug 16, 2012

What is the meaning of "data_coding" field in the SMPP protocol?

I searched for this but couldn't find any helpful resource.

Answer

puzz · Aug 16, 2012

In short, data_coding tells the receiver how the text in an SMPP SubmitSM message (i.e. a typical SMS message) is encoded. The SubmitSM packet carries the message body as binary data, and data_coding identifies the encoding used to store the text in that body.

The most important values are:

  • 00000000 (0) - the SMSC default alphabet: usually GSM7 (the default 7-bit encoding for messages, in which a few characters are encoded as two septets, an escape code followed by the character), but technically it could be something else
  • 00000011 (3) for standard ISO-8859-1
  • 00001000 (8) for UCS2, the universal character set (ISO/IEC 10646) -- de facto UTF-16
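
To make the three common values concrete, here is a minimal Python sketch that picks a data_coding for a given text and encodes the body for values 3 and 8. The function names `pick_data_coding` and `encode_body` are made up for this illustration and do not come from any SMPP library. It assumes that value 0 really means GSM7 on your SMSC and that UCS2 is sent as big-endian UTF-16 without a byte order mark, which is the usual convention but worth confirming with your provider.

```python
# GSM 03.38 basic character set (the ESC code point 0x1B is left out on purpose).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# GSM 03.38 extension table: each of these takes two septets (ESC + code).
GSM7_EXTENSION = set("^{}\\[~]|€\f")


def pick_data_coding(text: str) -> int:
    """Return 0, 3 or 8 depending on the smallest alphabet that fits the text.

    Hypothetical helper: assumes data_coding 0 means GSM7 on the target SMSC.
    """
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENSION for ch in text):
        return 0  # fits the GSM7 default alphabet
    try:
        text.encode("latin-1")
    except UnicodeEncodeError:
        return 8  # needs UCS2 / UTF-16
    return 3  # fits ISO-8859-1


def encode_body(text: str, data_coding: int) -> bytes:
    """Encode the message body for data_coding 3 or 8.

    GSM7 (value 0) is not handled here because Python ships no GSM7 codec.
    """
    if data_coding == 3:
        return text.encode("latin-1")
    if data_coding == 8:
        # Assumption: big-endian UTF-16 without a byte order mark.
        return text.encode("utf-16-be")
    raise ValueError(f"no encoder in this sketch for data_coding {data_coding}")


if __name__ == "__main__":
    for sample in ("Hello", "São Paulo", "Привет"):
        print(sample, "->", pick_data_coding(sample))
```

Trying GSM7 first, then ISO-8859-1, then UCS2 keeps messages as short as possible, since UCS2 spends two bytes on every character.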

Other possible values (rarely used):

  • 00000001 - IA5_CCITT_T_50_ASCII_ANSI_X3_4
  • 00000010 - OCTET_UNSPECIFIED_8BIT_BINARY_1
  • 00000100 - OCTET_UNSPECIFIED_8BIT_BINARY_2
  • 00000101 - JIS_X_02081990
  • 00000110 - CYRILLIC_ISO88595
  • 00000111 - LATIN_HEBREW_ISO88598
  • 00001001 - PICTOGRAM_ENCODING
  • 00001010 - ISO2022JP_MUSIC_CODES
  • 00001101 - EXTENDED_KANJI_JISX_02121990
  • 00001110 - KS_C_5601

And two reserved for special uses:

  • 00001011 - RESERVED #1
  • 00001100 - RESERVED #2
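
If you want these values in code, one option is an enum that mirrors the lists above. This is only a sketch: the class `DataCoding`, its member names, and the helper `describe` are invented for this example, so check whether your SMPP library already ships its own constants before adding something like this.

```python
from enum import IntEnum


class DataCoding(IntEnum):
    """data_coding values as listed above (member names are illustrative)."""

    SMSC_DEFAULT = 0b00000000  # usually GSM7
    IA5_ASCII = 0b00000001
    OCTET_UNSPECIFIED_1 = 0b00000010
    LATIN1 = 0b00000011  # ISO-8859-1
    OCTET_UNSPECIFIED_2 = 0b00000100
    JIS_X_0208_1990 = 0b00000101
    CYRILLIC_ISO_8859_5 = 0b00000110
    LATIN_HEBREW_ISO_8859_8 = 0b00000111
    UCS2 = 0b00001000  # de facto UTF-16
    PICTOGRAM = 0b00001001
    ISO_2022_JP_MUSIC_CODES = 0b00001010
    RESERVED_1 = 0b00001011
    RESERVED_2 = 0b00001100
    EXTENDED_KANJI_JIS_X_0212_1990 = 0b00001101
    KS_C_5601 = 0b00001110


def describe(value: int) -> str:
    """Return the member name for a data_coding value, or a marker if unlisted."""
    try:
        return DataCoding(value).name
    except ValueError:
        return f"UNLISTED_0x{value:02X}"


if __name__ == "__main__":
    print(describe(8))
    print(describe(0x0F))
```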

To summarize: if your binary body is Unicode (UTF-16), set data_coding to 8. If your message is stored as GSM7, data_coding will (usually) be 0.
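
The same rule applies in the other direction when you receive a message: data_coding tells you which codec to use on the body. A rough sketch, where `decode_body` and the `CODECS` table are hypothetical names and the codec choices are assumptions (in particular big-endian UTF-16 for value 8). Value 0 is deliberately left out because its meaning depends on the SMSC and Python has no built-in GSM7 codec.

```python
# Assumed mapping from data_coding to a Python codec name (illustrative only).
CODECS = {
    1: "ascii",
    3: "latin-1",
    6: "iso-8859-5",
    7: "iso-8859-8",
    8: "utf-16-be",  # assumption: UCS2 sent as big-endian UTF-16, no BOM
}


def decode_body(body: bytes, data_coding: int) -> str:
    """Decode a received message body according to its data_coding value."""
    codec = CODECS.get(data_coding)
    if codec is None:
        raise ValueError(f"no decoder in this sketch for data_coding {data_coding}")
    return body.decode(codec)


if __name__ == "__main__":
    print(decode_body(b"\x04\x1f\x04\x40", 8))
    print(decode_body(b"caf\xe9", 3))
```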