Create Binary files in UNIX

jaypal singh picture jaypal singh · Nov 10, 2011 · Viewed 9.6k times · Source

This question was out there for a while and I thought I should offer some bonus points if I can get it to work.

What did I do…

Recently at work, I wrote a parser that would convert a binary file in a readable format. Binary file isn't an Ascii file with 10101010 characters. It has been encoded in binary. So if I do a cat on the file, I get the following -

[jaypal~/Temp/GTP]$ cat T20111017153052.NEW 
==?sGTP?ղ?N????W????&Xx1?T?&Xx1?;
?d@#e?
      ?0H????????|?X?@@(?ղ??VtPOC01
cceE??k@9??W傇??R?K?i2??d@#e???&Xx1&Xx??!?
blackberrynet?/??!

??!

??#ripassword??W傅?W傆??0H??
                            #R??@Vtc@@(?ղ??n?POC01

So I used hexdump utility to make the file display following content and redirected it to a file. Now I had my output file which was a text file containing Hex values.

[jaypal~/Temp/GTP]$ hexdump -C T20111017153052.NEW 
00000000  3d 3d 01 f8 73 47 54 50  02 f1 d5 b2 be 4e e4 d7  |==..sGTP.....N..|
00000010  00 01 01 00 01 80 00 cc  57 e5 82 00 00 00 00 00  |........W.......|
00000020  00 00 00 00 00 00 00 00  87 d3 f5 13 00 00 00 00  |................|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 01 00 10  |................|
00000040  01 01 0f 00 00 00 00 00  26 58 78 31 00 b3 54 c5  |........&Xx1..T.|
00000050  26 58 78 31 00 b4 3b 0a  00 00 ad 64 13 40 01 03  |&Xx1..;....d.@..|
00000060  23 16 65 f3 01 01 0b 91  30 19 48 99 f2 ff ff ff  |#.e.....0.H.....|
00000070  ff ff ff 02 00 7c 00 dc  01 58 00 a0 40 40 28 02  |.....|...X..@@(.|
00000080  f1 d5 b2 b8 ca 56 74 50  4f 43 30 31 00 00 00 00  |.....VtPOC01....|
00000090  00 04 0a 63 63 07 00 00  00 00 00 00 00 00 00 00  |...cc...........|
000000a0  00 00 00 65 45 00 00 b4  fb 6b 40 00 39 11 16 cd  |[email protected]...|
000000b0  cc 57 e5 82 87 d3 f5 52  85 a1 08 4b 00 a0 69 02  |.W.....R...K..i.|
000000c0  32 10 00 90 00 00 00 00  ad 64 00 00 02 13 40 01  |2........d....@.|

After tons of awk, sed and cut, the script converted hex values into readable text. To do so, I used the offset positioning which would mark start and end position of each parameter converted. The resulting file after all conversion looks like this

[jaypal:~/Temp/GTP] cat textfile.txt 
Beginning of DB Package Identifier: ==
Total Package Length: 508
Offset to Data Record Count field: 115
Data Source: GTP
Timestamp: 2011-10-25
Matching Site Processor ID: 1
DB Package format version: 1
DB Package Resolution Type: 0
DB Package Resolution Value: 1
DB Package Resolution Cause Value: 128
Transport Protocol: 0
SGSN IP Address: 220.206.129.47
GGSN IP Address: 202.4.210.51

Why did I do it

I am a test engineer and to manually validate binary files was a major pain. I had to manually parse through the offsets and use a calculator to convert them and validate it against Wireshark and GUI.

Now the question part

I wish to do the reverse of what I did. This was my plan -

  • Have an easy to read Input text file which would have Parameters : Values.
  • User can simply put values next to them (eg Date would be a parameter and user can give date they want the data file to have).
  • The script will cut out all relevent information (user provided information) from the Input text file and convert them into hex values.
  • Once the file has been converted in to hex values, I wish to encode it back into binary.

First three steps are done

Problem

Once my script converts the Input text file in to a text file with hex values, I get a file like follows (notice I can do cat on it).

[visdba@hw-diam-test01 ParserDump]$ cat temp_file | sed 's/.\{32\}/&\n/g' | sed 's/../& /g'
3d 3d 01 fc 73 47 54 50 02 f1 d6 55 3c 9f 49 9c
00 01 01 00 01 80 00 dc ce 81 2f 00 00 00 00 00
00 00 00 00 00 00 00 00 ca 04 d2 33 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10
01 01 0f 00 00 07 04 ea 00 00 ff ff 00 00 14 b7
00 00 ff ff 00 00 83 ec 00 00 83 62 54 14 59 00
60 38 34 f5 01 01 0b 58 62 70 11 60 f6 ff ff ff
ff ff ff 02 00 7c 00 d0 01 4c 00 b0 40 40 28 02
f1 d6 55 38 cb 2b 23 50 4f 43 30 31 00 00 00 00
00 04 0a 63 63 07 00 00 00 00 00 00 00 00 00 00

My intension is to encoded this converted file in to a binary so that when I do cat on the file, I get bunch of garbage values.

[jaypal~/Temp/GTP]$ cat temp.file 
==?sGTP?ղ?N????W????&Xx1?T?&Xx1?;
?d@#e?
      ?0H????????|?X?@@(?ղ??VtPOC01
cceE??k@9??W傇??R?K?i2??d@#e???&Xx1&Xx??!?
blackberrynet?/??!

??!

So the question is this. How do I encode it in this form?

Why I want to do this?

We don't have a lot of GTP (GPRS Tunnelling Protocol) messages on production. I thought if I reverse engineer this, I could effectively create a data generator and make my own data.

Sum things up

There may be sophisticated tools out there, but I don't want to spend too much time learning them. It's been around 2 months, I have started working on the *nix platform and just getting hand around it's power tools like sed and awk.

What I do want is some help and guidance to make this happen.

Thanks again for reading! 200 points awaits for someone who can guide me in the right direction. :)

Sample Files

Here is a sample of Original Binary File

Here is a sample of Input Text File that would allow the User to punch in values

Here is a sample of File that my script creates after all the conversion from the Input Text File is complete.

How do I change the encoding of File 3 to File 1?

Answer

user353608 picture user353608 · Nov 28, 2011

You can use xxd to convert to and from binary files / hexdumps quite simply.

data to hex

echo  Hello | xxd -p 
48656c6c6f0a

hex to data

echo 48656c6c6f0a | xxd -r -p
Hello

or

echo 48 65 6c 6c 6f 0a | xxd -r -p
Hello

The -p is postscript mode which allows for a more freeform input

This is the output from xxd -r -p text where text is the data you give above

==▒sGTP▒▒U<▒I▒▒▒΁/▒▒3▒▒▒▒▒▒▒▒▒bTY`84▒
                                     Xbp`▒▒▒▒▒▒▒|▒L▒@@(▒▒U8▒+#POC01
:▒ިv▒b▒▒▒▒TY`84Ud▒▒▒▒>▒▒▒▒▒▒▒!▒
blackberrynet▒/▒▒!
M
▒▒!
N
▒▒#Oripassword▒▒΁/▒▒΁/▒▒Xbp`▒@@(▒▒U8▒IvPOC01
:qU▒b▒▒▒▒▒▒TY`84U▒▒▒*:▒▒!
▒k▒▒▒#O Welcmme!
▒!
M