We are using the Cloudhopper SMPP library to send long SMS messages to the SMS gateway Innovativetxt.com. When we split a long message into parts of 140 bytes each, every part ends up holding only 134 characters.
However, the industry standard seems to be 153 characters per part for a GSM-encoded long message. Are we doing something wrong by getting only 134 characters when we split at 140 bytes? If we try to submit a message larger than 140 bytes, the gateway provider rejects it with an "oversized message body" error.
Should we split the message into parts of 153 characters each before submitting to the SMSC, instead of splitting it into parts of 140 bytes each?
What is the best way to split a long message: by message size (i.e. 140 bytes) or by character count?
Has anyone faced the same issue with Cloudhopper or another Java-based library? What should we do?
It's a common confusion. You are doing everything right. A single message can hold 160 chars (7-bit GSM 03.38), 140 chars (8-bit Latin), or 70 chars (16-bit UCS-2). Notice: 160 * 7 == 140 * 8 == 70 * 16.
When you split a long message, additional info such as the total number of parts and the part index is stored in the message body, in the so-called User Data Header (UDH). This header also takes up space. So, with a UDH you are left with a payload of 153 GSM chars (7-bit), 134 chars/bytes (8-bit), or 67 two-byte Unicode chars (16-bit).
See also http://www.nowsms.com/long-sms-text-messages-and-the-160-character-limit
The UDH is 6 bytes long for a concatenated message with an 8-bit reference number, as in your case.
UDH structure:
0x05: Length of UDH (5 bytes to follow)
0x00: Concatenated message Information Element (8-bit reference number)
0x03: Length of Information Element data (3 bytes to follow)
0xXX: Reference number for this concatenated message
0xYY: Number of fragments in the concatenated message
0xZZ: Fragment number/index within the concatenated message
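To make the layout concrete, here is a minimal plain-Java sketch that builds those 6 bytes. It does not use Cloudhopper at all; the class name `ConcatUdhSketch` and the methods `buildUdh` and `toHex` are made up for illustration, and the reference number 0x2A is an arbitrary example value.

```java
class ConcatUdhSketch {

    // Builds the 6-byte UDH for a concatenated message with an 8-bit reference number.
    // partIndex is 1-based, as in the structure described above.
    static byte[] buildUdh(int referenceNumber, int totalParts, int partIndex) {
        if (totalParts < 1 || totalParts > 255 || partIndex < 1 || partIndex > totalParts) {
            throw new IllegalArgumentException("invalid part numbering");
        }
        return new byte[] {
            0x05,                   // length of UDH: 5 bytes follow
            0x00,                   // IE id: concatenated message, 8-bit reference
            0x03,                   // length of IE data: 3 bytes follow
            (byte) referenceNumber, // 0xXX: reference number shared by all parts
            (byte) totalParts,      // 0xYY: number of fragments
            (byte) partIndex        // 0xZZ: index of this fragment
        };
    }

    // Renders bytes as uppercase hex without separators.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // First of three parts, hypothetical reference number 0x2A.
        System.out.println(toHex(buildUdh(0x2A, 3, 1))); // prints 0500032A0301
    }
}
```

All parts of one long message share the same reference number; only the last byte changes from part to part.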
Total message length, bits: 160*7 = 140*8 = 1120
UDH length, bits: 6*8 = 48
Left payload, bits: 1120-48 = 1072
For GSM 03.38 you get 1072/7 = 153 GSM (7-bit) chars, with 1 unused fill bit left over.
For Latin you get 1072/8 = 134 (8-bit) chars.
For UCS-2 you get 1072/16 = 67 (16-bit) chars.
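The same arithmetic as a small plain-Java sketch, in case you want to check the numbers yourself. The class and method names (`PartCapacitySketch`, `charsPerPart`) are invented for this example and are not part of any library.

```java
class PartCapacitySketch {

    // Maximum user data of a single SMS, in bytes (1120 bits).
    static final int MAX_USER_DATA_BYTES = 140;

    // Whole characters that fit into one part after reserving udhBytes for the header.
    static int charsPerPart(int udhBytes, int bitsPerChar) {
        int payloadBits = (MAX_USER_DATA_BYTES - udhBytes) * 8;
        return payloadBits / bitsPerChar; // integer division drops leftover fill bits
    }

    public static void main(String[] args) {
        int[] bitsPerChar = {7, 8, 16}; // GSM 03.38, 8-bit Latin, UCS-2
        for (int bits : bitsPerChar) {
            System.out.println(bits + "-bit: "
                    + charsPerPart(0, bits) + " chars without UDH, "
                    + charsPerPart(6, bits) + " chars with a 6-byte UDH");
        }
        // prints:
        // 7-bit: 160 chars without UDH, 153 chars with a 6-byte UDH
        // 8-bit: 140 chars without UDH, 134 chars with a 6-byte UDH
        // 16-bit: 70 chars without UDH, 67 chars with a 6-byte UDH
    }
}
```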
As you can see, 153 GSM chars take up 134 bytes minus 1 bit. Those 134 chars are probably what Java reports to you. But once you split your long text message, you end up with binary messages containing both text and UDH, and you should treat each part as binary. I suggest you make binary dumps of the resulting parts and inspect them.
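If it helps, here is a rough plain-Java sketch of such a split plus hex dump for the 8-bit case. It is not how Cloudhopper does it internally; `SplitAndDumpSketch`, `split` and `hexDump` are hypothetical names, the reference number 0x2A is arbitrary, and the sketch assumes the message is already known to be longer than one part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitAndDumpSketch {

    static final int MAX_PART_BYTES = 140;
    static final int UDH_BYTES = 6;
    static final int MAX_PAYLOAD_BYTES = MAX_PART_BYTES - UDH_BYTES; // 134

    // Splits 8-bit message bytes into parts of at most 140 bytes, each starting with a 6-byte UDH.
    static List<byte[]> split(byte[] message, int referenceNumber) {
        int total = (message.length + MAX_PAYLOAD_BYTES - 1) / MAX_PAYLOAD_BYTES;
        if (total > 255) {
            throw new IllegalArgumentException("too many parts for an 8-bit fragment count");
        }
        List<byte[]> parts = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            int from = i * MAX_PAYLOAD_BYTES;
            int len = Math.min(MAX_PAYLOAD_BYTES, message.length - from);
            byte[] part = new byte[UDH_BYTES + len];
            part[0] = 0x05;                   // length of UDH
            part[1] = 0x00;                   // concatenated message, 8-bit reference
            part[2] = 0x03;                   // length of IE data
            part[3] = (byte) referenceNumber; // same for every part
            part[4] = (byte) total;           // number of fragments
            part[5] = (byte) (i + 1);         // 1-based fragment index
            System.arraycopy(message, from, part, UDH_BYTES, len);
            parts.add(part);
        }
        return parts;
    }

    // Space-separated uppercase hex, e.g. "05 00 03 2A".
    static String hexDump(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] message = new byte[300]; // 300 bytes of 'A' as stand-in text
        Arrays.fill(message, (byte) 'A');
        for (byte[] part : split(message, 0x2A)) {
            System.out.println(part.length + " bytes: " + hexDump(part));
        }
    }
}
```

For the 300-byte input this yields three parts of 140, 140 and 38 bytes, and the first six bytes of every dump are the UDH. Comparing such a dump with what your library actually sends should show where the 134 comes from.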