Obfuscating an ID

georg picture georg · Dec 18, 2011 · Viewed 21.4k times · Source

I'm looking for a way to encrypt/obfuscate an integer ID into another integer. More precisely, I need a function int F(int x), so that

  • x<->F(x) is one-to-one correspondence (if x != y, F(x) != F(y))
  • given F(x), it's easy to find out x - so F is not a hash function
  • given x and F(x) it's hard/impossible to find out F(y), something like x ^ 0x1234 won't work

For clarity, I'm not looking for a strong encryption solution, it's only obfuscation. Imagine a web application with urls like example.com/profile/1, example.com/profile/2 etc. The profiles themselves are not secret, but I'd like to prevent casual voyeurs to view/fetch all profiles one after another, so I'd rather hide them behind something like example.com/profile/23423, example.com/profile/80980234 etc. Although database-stored tokens can do the job quite easily, I'm curious if there's some simple math available for this.

One important requirement I wasn't clear about is that results should look "random", that is, given a sequence x,x+1,...,x+n , F(x),F(x+1)...F(x+n) shouldn't form a progression of any kind.

Answer

Evgeny Kluev picture Evgeny Kluev · Dec 18, 2011

Obfuscate it with some combination of 2 or 3 simple methods:

  • XOR
  • shuffle individual bits
  • convert to modular representation (D.Knuth, Vol. 2, Chapter 4.3.2)
  • choose 32 (or 64) overlapping subsets of bits and XOR bits in each subset (parity bits of subsets)
  • represent it in variable-length numberic system and shuffle digits
  • choose a pair of odd integers x and y that are multiplicative inverses of each other (modulo 232), then multiply by x to obfuscate and multiply by y to restore, all multiplications are modulo 232 (source: "A practical use of multiplicative inverses" by Eric Lippert)

Variable-length numberic system method does not obey your "progression" requirement on its own. It always produces short arithmetic progressions. But when combined with some other method, it gives good results.

The same is true for the modular representation method.

Here is C++ code example for 3 of these methods. Shuffle bits example may use some different masks and distances to be more unpredictable. Other 2 examples are good for small numbers (just to give the idea). They should be extended to obfuscate all integer values properly.

// *** Numberic system base: (4, 3, 5) -> (5, 3, 4)
// In real life all the bases multiplied should be near 2^32
unsigned y = x/15 + ((x/5)%3)*4 + (x%5)*12; // obfuscate
unsigned z = y/12 + ((y/4)%3)*5 + (y%4)*15; // restore

// *** Shuffle bits (method used here is described in D.Knuth's vol.4a chapter 7.1.3)
const unsigned mask1 = 0x00550055; const unsigned d1 = 7;
const unsigned mask2 = 0x0000cccc; const unsigned d2 = 14;

// Obfuscate
unsigned t = (x ^ (x >> d1)) & mask1;
unsigned u = x ^ t ^ (t << d1);
t = (u ^ (u  >> d2)) & mask2;
y = u ^ t ^ (t << d2);

// Restore
t = (y ^ (y >> d2)) & mask2;
u = y ^ t ^ (t << d2);
t = (u ^ (u >> d1)) & mask1;
z = u ^ t ^ (t << d1);

// *** Subset parity
t = (x ^ (x >> 1)) & 0x44444444;
u = (x ^ (x << 2)) & 0xcccccccc;
y = ((x & 0x88888888) >> 3) | (t >> 1) | u; // obfuscate

t = ((y & 0x11111111) << 3) | (((y & 0x11111111) << 2) ^ ((y & 0x22222222) << 1));
z = t | ((t >> 2) ^ ((y >> 2) & 0x33333333)); // restore