I have been posting similar questions here for a couple of days now, but it seems like I was not asking the right thing, so excuse me if I have exhausted you with my XOR questions :D.
To the point - I have two hex strings and I want to XOR them such that each byte is XORed separately (i.e. each pair of hex digits is XORed separately). I want to do this in Python, and I want it to work for strings of different lengths. I will do an example manually to illustrate my point (I used the code environment because it allows me to put in spaces where I want them to be):
Input:
s1 = "48656c6c6f"
s2 = "61736b"
Encoding in binary:
48 65 6c 6c 6f = 01001000 01100101 01101100 01101100 01101111
      61 73 6b =                   01100001 01110011 01101011
XORing the strings:
01001000 01100101 01101100 01101100 01101111
                  01100001 01110011 01101011
                  00001101 00011111 00000100
Converting the result to hex:
00001101 00011111 00000100 = 0d 1f 04
Output:
0d1f04
So, to summarize, I want to be able to input two hex strings (these will usually be ASCII letters encoded in hex) of different or equal length, and get their XOR such that each byte is XORed separately.
Use binascii.unhexlify() to turn your hex strings into binary data, XOR that, then go back to hex with binascii.hexlify(). In Python 2:
>>> from binascii import unhexlify, hexlify
>>> s1 = "48656c6c6f"
>>> s2 = "61736b"
>>> hexlify(''.join(chr(ord(c1) ^ ord(c2)) for c1, c2 in zip(unhexlify(s1[-len(s2):]), unhexlify(s2))))
'0d1f04'
The actual XOR is applied per byte of the decoded data (using ord() and chr() to go to and from integers).
Note that, as in your example, I truncated s1 to the same length as s2 (ignoring characters from the start of s1). You can instead encode all of s1 with a shorter key s2 by cycling the bytes:
>>> from itertools import cycle
>>> hexlify(''.join(chr(ord(c1) ^ ord(c2)) for c1, c2 in zip(unhexlify(s1), cycle(unhexlify(s2)))))
'2916070d1c'
You don't have to use unhexlify(), but it is a lot easier than looping over s1 and s2 two characters at a time and using int(twocharacters, 16) to turn that into integer values for XOR operations.
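For comparison, here is a rough sketch of what that manual route could look like. The function name xor_hex_manual is made up for illustration, and it assumes both inputs are valid hex strings with an even number of digits; it keeps the same truncation as the first example (only the last len(s2) digits of s1 are used).

```python
def xor_hex_manual(s1, s2):
    # Hypothetical helper: XOR two hex strings without binascii.
    # Keep only the last len(s2) hex digits of s1, as in the first example.
    s1 = s1[-len(s2):]
    # Split each string into two-character chunks, one chunk per byte.
    pairs1 = [s1[i:i + 2] for i in range(0, len(s1), 2)]
    pairs2 = [s2[i:i + 2] for i in range(0, len(s2), 2)]
    # Parse each chunk as base 16, XOR, and format back as two hex digits.
    return ''.join('%02x' % (int(a, 16) ^ int(b, 16))
                   for a, b in zip(pairs1, pairs2))


print(xor_hex_manual("48656c6c6f", "61736b"))  # 0d1f04
```

This should run unchanged on Python 2 and Python 3, since it never touches byte strings, but the slicing and formatting are exactly the bookkeeping that unhexlify() and hexlify() take care of for you.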
The Python 3 version of the unhexlify() code is a little lighter; use bytes() instead of str.join(), and you can drop the chr() and ord() calls because iterating over a bytes object yields integers directly:
>>> from binascii import unhexlify, hexlify
>>> s1 = "48656c6c6f"
>>> s2 = "61736b"
>>> hexlify(bytes(c1 ^ c2 for c1, c2 in zip(unhexlify(s1[-len(s2):]), unhexlify(s2))))
b'0d1f04'
>>> from itertools import cycle
>>> hexlify(bytes(c1 ^ c2 for c1, c2 in zip(unhexlify(s1), cycle(unhexlify(s2)))))
b'2916070d1c'
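If you need this in more than one place, the two Python 3 variants could be wrapped in a small helper along these lines. This is only a sketch: the name xor_hex and the repeat_key flag are invented here, and it swaps in bytes.fromhex() and bytes.hex() (available in Python 3.5+) for unhexlify() and hexlify() so that the result is a str rather than bytes.

```python
from itertools import cycle


def xor_hex(s1, s2, repeat_key=False):
    """XOR two hex strings byte by byte and return a hex string."""
    b1 = bytes.fromhex(s1)
    b2 = bytes.fromhex(s2)
    if repeat_key:
        # Repeat the bytes of s2 so that every byte of s1 gets XORed.
        pairs = zip(b1, cycle(b2))
    else:
        # Use only the last len(b2) bytes of s1, as in the truncating example.
        pairs = zip(b1[-len(b2):], b2)
    return bytes(x ^ y for x, y in pairs).hex()


print(xor_hex("48656c6c6f", "61736b"))                   # 0d1f04
print(xor_hex("48656c6c6f", "61736b", repeat_key=True))  # 2916070d1c
```

Note that with repeat_key=False and s2 longer than s1, zip() stops at the end of s1, so the extra bytes of s2 are silently ignored; whether that is what you want depends on your use case.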