Algorithm to get all possible string combinations from array up to certain length

KingKongFrog picture KingKongFrog · Sep 6, 2012 · Viewed 10k times · Source

What is the best algorithm to get all possible string combinations from a given array with a minimum & maximum length value.

Note: This adds complexity since the value is variable unlike the questions these are linked to.

For example:

$letters = array('a','b','c','1','2','3');
$min_length = 1;
$max_length = 4;

a
b
c
1
2
3
.
.
.
aaaa
a123
b123
c123

Answer

beaker picture beaker · Mar 25, 2013

Almost a Base Conversion

This solution is motivated by the observation that, if it were not for the possibility of repeating the character at array index 0 in the high-order position of a valid combination, this problem would simply be a base conversion from decimal to the new base of all integers from 0 to (base^length)-1. So,

0 = a
1 = b
...
1294 = 3332
1295 = 3333

The difficulty is that this misses combinations with one or more leading 'a', like:

aa
...
a1
aa1
aaa1

And these are the only solutions missing. So one could simply, for each generated combination with length less than max_length, add 'a' (or whatever is at letters[0]) to the front of the string, and output that string, repeating if necessary. So if you generate the string 'b', you'd output:

b
ab
aab
aaab

This works, but it is unsatisfying, first because it looks like a kludged solution. Second, it does not generate the solutions in lexicographical order, which may or may not be a problem. Third, it would be really nice if the function to generate the combinations was bijective so that we knew the unique combination that results from any given decimal integer and the unique index of any combination. This will be critical in a moment.

Imagine There's No Zero

If the zero index is giving us problems, then why not do away with it? Say we change the array to:

letters = {∅,'a','b','c','1','2','3'}

where ∅ is an arbitrary placeholder that will never be used. We will now attempt to represent the decimal integers from 1 to base^max_length in the new base (in this case still 6, not 7) without using the digit zero. We'll represent the numbers from 1 to base-1 as normal (1, 2, 3, ...) but when we get to the number equal to the base, rather than represent it as 10, we'll represent it as the base digit (e.g., 6 rather than 10 in base 6). The next number, base+1, would be 11, then 12 up to 16 (which is equal to 12 decimal) and then up to 21. Each number

an,an-1,...,a1,a0

in the new base is equal to

an*bn+an-1*bn-1+...+a1*b1+a0*b0

in decimal, where b is the new base.

As I've come to learn, this is fittingly called a bijective numeration. Taking an example, in base 6 we would have:

Base 10    Base 6
1          1
2          2
...
6          6
7          11
8          12
...
11         15
12         16
13         21
...
36         56

From the Wikipedia link above, the first "digit" of the number in the new base is:

a0 = m - q0k

where k is the new base, m is the integer to convert, and q0 = ceiling(m/k)-1. Note the similarity to a normal base conversion where the only difference is that q0 would be floor(m/k) or ordinary integer division. Subsequent "digits" are computed similarly, using q0 instead of m to find a1, etc.

This formula can be broken down into two cases:

  • (m mod k) == 0: a0 = k and q0 = (m div k) - 1
  • (m mod k) != 0: a0 = (m mod k) and q0 = (m div k)

This is the heart of the algorithm. For any integer m we can iteratively apply this formula until the resulting qp == 0. Also note that the conversion is, naturally, reversible. Given a number 6666 in base 6, the decimal value is:

(6*6^3)+(6*6^2)+(6*6^1)+(6*6^0)=1554

We use this fact to find the range of integers to convert in order to get all combinations of a certain length. Sorry, but my PHP skills are non-existent. Hopefully the Java code is clear enough. Given:

    char [] letters = new char [] {'0','a','b','c','1','2','3'};
    int max_length = 4;

we have the function:

    int b = letters.length - 1;  // base to convert to
    int n = 0;
    for (int k = 0; k < max_length; k++)
        n = (n*b)+b;  // number of combinations

    int remainder;
    for (int i = 1; i <= n; i++) {
        int current = i;  // m and q_n in the formula
        String combination = "";
        do {
            remainder = current % b;
            if (remainder == 0) {
                combination += letters[b];
                current = current/b - 1;
            } else {
                combination += letters[remainder];
                current = current/b;
            }
        } while (current > 0);
        System.out.println(combination);
    }

where / represents integer division, or floor(a/b).

The function relies only on the input integer and not on the results of previous calculations, so the possibilities for parallel processing are pretty good.

Here is the output with the above input. Least significant digit is first, as per your example.

a
b
c
1
2
3
aa
ba
ca
1a
2a
3a
ab
bb
cb
1b
2b
3b
ac
bc
cc
1c
2c
3c
a1
b1
c1
11
21
31
a2
b2
c2
12
22
32
a3
b3
c3
13
23
33
aaa
baa
caa
1aa
2aa
3aa
aba
bba
cba
1ba
2ba
3ba
aca
bca
cca
1ca
2ca
3ca
a1a
b1a
c1a
11a
21a
31a
a2a
b2a
c2a
12a
22a
32a
a3a
b3a
c3a
13a
23a
33a
aab
bab
cab
1ab
2ab
3ab
abb
bbb
cbb
1bb
2bb
3bb
acb
bcb
ccb
1cb
2cb
3cb
a1b
b1b
c1b
11b
21b
31b
a2b
b2b
c2b
12b
22b
32b
a3b
b3b
c3b
13b
23b
33b
aac
bac
cac
1ac
2ac
3ac
abc
bbc
cbc
1bc
2bc
3bc
acc
bcc
ccc
1cc
2cc
3cc
a1c
b1c
c1c
11c
21c
31c
a2c
b2c
c2c
12c
22c
32c
a3c
b3c
c3c
13c
23c
33c
aa1
ba1
ca1
1a1
2a1
3a1
ab1
bb1
cb1
1b1
2b1
3b1
ac1
bc1
cc1
1c1
2c1
3c1
a11
b11
c11
111
211
311
a21
b21
c21
121
221
321
a31
b31
c31
131
231
331
aa2
ba2
ca2
1a2
2a2
3a2
ab2
bb2
cb2
1b2
2b2
3b2
ac2
bc2
cc2
1c2
2c2
3c2
a12
b12
c12
112
212
312
a22
b22
c22
122
222
322
a32
b32
c32
132
232
332
aa3
ba3
ca3
1a3
2a3
3a3
ab3
bb3
cb3
1b3
2b3
3b3
ac3
bc3
cc3
1c3
2c3
3c3
a13
b13
c13
113
213
313
a23
b23
c23
123
223
323
a33
b33
c33
133
233
333
aaaa
baaa
caaa
1aaa
2aaa
3aaa
abaa
bbaa
cbaa
1baa
2baa
3baa
acaa
bcaa
ccaa
1caa
2caa
3caa
a1aa
b1aa
c1aa
11aa
21aa
31aa
a2aa
b2aa
c2aa
12aa
22aa
32aa
a3aa
b3aa
c3aa
13aa
23aa
33aa
aaba
baba
caba
1aba
2aba
3aba
abba
bbba
cbba
1bba
2bba
3bba
acba
bcba
ccba
1cba
2cba
3cba
a1ba
b1ba
c1ba
11ba
21ba
31ba
a2ba
b2ba
c2ba
12ba
22ba
32ba
a3ba
b3ba
c3ba
13ba
23ba
33ba
aaca
baca
caca
1aca
2aca
3aca
abca
bbca
cbca
1bca
2bca
3bca
acca
bcca
ccca
1cca
2cca
3cca
a1ca
b1ca
c1ca
11ca
21ca
31ca
a2ca
b2ca
c2ca
12ca
22ca
32ca
a3ca
b3ca
c3ca
13ca
23ca
33ca
aa1a
ba1a
ca1a
1a1a
2a1a
3a1a
ab1a
bb1a
cb1a
1b1a
2b1a
3b1a
ac1a
bc1a
cc1a
1c1a
2c1a
3c1a
a11a
b11a
c11a
111a
211a
311a
a21a
b21a
c21a
121a
221a
321a
a31a
b31a
c31a
131a
231a
331a
aa2a
ba2a
ca2a
1a2a
2a2a
3a2a
ab2a
bb2a
cb2a
1b2a
2b2a
3b2a
ac2a
bc2a
cc2a
1c2a
2c2a
3c2a
a12a
b12a
c12a
112a
212a
312a
a22a
b22a
c22a
122a
222a
322a
a32a
b32a
c32a
132a
232a
332a
aa3a
ba3a
ca3a
1a3a
2a3a
3a3a
ab3a
bb3a
cb3a
1b3a
2b3a
3b3a
ac3a
bc3a
cc3a
1c3a
2c3a
3c3a
a13a
b13a
c13a
113a
213a
313a
a23a
b23a
c23a
123a
223a
323a
a33a
b33a
c33a
133a
233a
333a
aaab
baab
caab
1aab
2aab
3aab
abab
bbab
cbab
1bab
2bab
3bab
acab
bcab
ccab
1cab
2cab
3cab
a1ab
b1ab
c1ab
11ab
21ab
31ab
a2ab
b2ab
c2ab
12ab
22ab
32ab
a3ab
b3ab
c3ab
13ab
23ab
33ab
aabb
babb
cabb
1abb
2abb
3abb
abbb
bbbb
cbbb
1bbb
2bbb
3bbb
acbb
bcbb
ccbb
1cbb
2cbb
3cbb
a1bb
b1bb
c1bb
11bb
21bb
31bb
a2bb
b2bb
c2bb
12bb
22bb
32bb
a3bb
b3bb
c3bb
13bb
23bb
33bb
aacb
bacb
cacb
1acb
2acb
3acb
abcb
bbcb
cbcb
1bcb
2bcb
3bcb
accb
bccb
cccb
1ccb
2ccb
3ccb
a1cb
b1cb
c1cb
11cb
21cb
31cb
a2cb
b2cb
c2cb
12cb
22cb
32cb
a3cb
b3cb
c3cb
13cb
23cb
33cb
aa1b
ba1b
ca1b
1a1b
2a1b
3a1b
ab1b
bb1b
cb1b
1b1b
2b1b
3b1b
ac1b
bc1b
cc1b
1c1b
2c1b
3c1b
a11b
b11b
c11b
111b
211b
311b
a21b
b21b
c21b
121b
221b
321b
a31b
b31b
c31b
131b
231b
331b
aa2b
ba2b
ca2b
1a2b
2a2b
3a2b
ab2b
bb2b
cb2b
1b2b
2b2b
3b2b
ac2b
bc2b
cc2b
1c2b
2c2b
3c2b
a12b
b12b
c12b
112b
212b
312b
a22b
b22b
c22b
122b
222b
322b
a32b
b32b
c32b
132b
232b
332b
aa3b
ba3b
ca3b
1a3b
2a3b
3a3b
ab3b
bb3b
cb3b
1b3b
2b3b
3b3b
ac3b
bc3b
cc3b
1c3b
2c3b
3c3b
a13b
b13b
c13b
113b
213b
313b
a23b
b23b
c23b
123b
223b
323b
a33b
b33b
c33b
133b
233b
333b
aaac
baac
caac
1aac
2aac
3aac
abac
bbac
cbac
1bac
2bac
3bac
acac
bcac
ccac
1cac
2cac
3cac
a1ac
b1ac
c1ac
11ac
21ac
31ac
a2ac
b2ac
c2ac
12ac
22ac
32ac
a3ac
b3ac
c3ac
13ac
23ac
33ac
aabc
babc
cabc
1abc
2abc
3abc
abbc
bbbc
cbbc
1bbc
2bbc
3bbc
acbc
bcbc
ccbc
1cbc
2cbc
3cbc
a1bc
b1bc
c1bc
11bc
21bc
31bc
a2bc
b2bc
c2bc
12bc
22bc
32bc
a3bc
b3bc
c3bc
13bc
23bc
33bc
aacc
bacc
cacc
1acc
2acc
3acc
abcc
bbcc
cbcc
1bcc
2bcc
3bcc
accc
bccc
cccc
1ccc
2ccc
3ccc
a1cc
b1cc
c1cc
11cc
21cc
31cc
a2cc
b2cc
c2cc
12cc
22cc
32cc
a3cc
b3cc
c3cc
13cc
23cc
33cc
aa1c
ba1c
ca1c
1a1c
2a1c
3a1c
ab1c
bb1c
cb1c
1b1c
2b1c
3b1c
ac1c
bc1c
cc1c
1c1c
2c1c
3c1c
a11c
b11c
c11c
111c
211c
311c
a21c
b21c
c21c
121c
221c
321c
a31c
b31c
c31c
131c
231c
331c
aa2c
ba2c
ca2c
1a2c
2a2c
3a2c
ab2c
bb2c
cb2c
1b2c
2b2c
3b2c
ac2c
bc2c
cc2c
1c2c
2c2c
3c2c
a12c
b12c
c12c
112c
212c
312c
a22c
b22c
c22c
122c
222c
322c
a32c
b32c
c32c
132c
232c
332c
aa3c
ba3c
ca3c
1a3c
2a3c
3a3c
ab3c
bb3c
cb3c
1b3c
2b3c
3b3c
ac3c
bc3c
cc3c
1c3c
2c3c
3c3c
a13c
b13c
c13c
113c
213c
313c
a23c
b23c
c23c
123c
223c
323c
a33c
b33c
c33c
133c
233c
333c
aaa1
baa1
caa1
1aa1
2aa1
3aa1
aba1
bba1
cba1
1ba1
2ba1
3ba1
aca1
bca1
cca1
1ca1
2ca1
3ca1
a1a1
b1a1
c1a1
11a1
21a1
31a1
a2a1
b2a1
c2a1
12a1
22a1
32a1
a3a1
b3a1
c3a1
13a1
23a1
33a1
aab1
bab1
cab1
1ab1
2ab1
3ab1
abb1
bbb1
cbb1
1bb1
2bb1
3bb1
acb1
bcb1
ccb1
1cb1
2cb1
3cb1
a1b1
b1b1
c1b1
11b1
21b1
31b1
a2b1
b2b1
c2b1
12b1
22b1
32b1
a3b1
b3b1
c3b1
13b1
23b1
33b1
aac1
bac1
cac1
1ac1
2ac1
3ac1
abc1
bbc1
cbc1
1bc1
2bc1
3bc1
acc1
bcc1
ccc1
1cc1
2cc1
3cc1
a1c1
b1c1
c1c1
11c1
21c1
31c1
a2c1
b2c1
c2c1
12c1
22c1
32c1
a3c1
b3c1
c3c1
13c1
23c1
33c1
aa11
ba11
ca11
1a11
2a11
3a11
ab11
bb11
cb11
1b11
2b11
3b11
ac11
bc11
cc11
1c11
2c11
3c11
a111
b111
c111
1111
2111
3111
a211
b211
c211
1211
2211
3211
a311
b311
c311
1311
2311
3311
aa21
ba21
ca21
1a21
2a21
3a21
ab21
bb21
cb21
1b21
2b21
3b21
ac21
bc21
cc21
1c21
2c21
3c21
a121
b121
c121
1121
2121
3121
a221
b221
c221
1221
2221
3221
a321
b321
c321
1321
2321
3321
aa31
ba31
ca31
1a31
2a31
3a31
ab31
bb31
cb31
1b31
2b31
3b31
ac31
bc31
cc31
1c31
2c31
3c31
a131
b131
c131
1131
2131
3131
a231
b231
c231
1231
2231
3231
a331
b331
c331
1331
2331
3331
aaa2
baa2
caa2
1aa2
2aa2
3aa2
aba2
bba2
cba2
1ba2
2ba2
3ba2
aca2
bca2
cca2
1ca2
2ca2
3ca2
a1a2
b1a2
c1a2
11a2
21a2
31a2
a2a2
b2a2
c2a2
12a2
22a2
32a2
a3a2
b3a2
c3a2
13a2
23a2
33a2
aab2
bab2
cab2
1ab2
2ab2
3ab2
abb2
bbb2
cbb2
1bb2
2bb2
3bb2
acb2
bcb2
ccb2
1cb2
2cb2
3cb2
a1b2
b1b2
c1b2
11b2
21b2
31b2
a2b2
b2b2
c2b2
12b2
22b2
32b2
a3b2
b3b2
c3b2
13b2
23b2
33b2
aac2
bac2
cac2
1ac2
2ac2
3ac2
abc2
bbc2
cbc2
1bc2
2bc2
3bc2
acc2
bcc2
ccc2
1cc2
2cc2
3cc2
a1c2
b1c2
c1c2
11c2
21c2
31c2
a2c2
b2c2
c2c2
12c2
22c2
32c2
a3c2
b3c2
c3c2
13c2
23c2
33c2
aa12
ba12
ca12
1a12
2a12
3a12
ab12
bb12
cb12
1b12
2b12
3b12
ac12
bc12
cc12
1c12
2c12
3c12
a112
b112
c112
1112
2112
3112
a212
b212
c212
1212
2212
3212
a312
b312
c312
1312
2312
3312
aa22
ba22
ca22
1a22
2a22
3a22
ab22
bb22
cb22
1b22
2b22
3b22
ac22
bc22
cc22
1c22
2c22
3c22
a122
b122
c122
1122
2122
3122
a222
b222
c222
1222
2222
3222
a322
b322
c322
1322
2322
3322
aa32
ba32
ca32
1a32
2a32
3a32
ab32
bb32
cb32
1b32
2b32
3b32
ac32
bc32
cc32
1c32
2c32
3c32
a132
b132
c132
1132
2132
3132
a232
b232
c232
1232
2232
3232
a332
b332
c332
1332
2332
3332
aaa3
baa3
caa3
1aa3
2aa3
3aa3
aba3
bba3
cba3
1ba3
2ba3
3ba3
aca3
bca3
cca3
1ca3
2ca3
3ca3
a1a3
b1a3
c1a3
11a3
21a3
31a3
a2a3
b2a3
c2a3
12a3
22a3
32a3
a3a3
b3a3
c3a3
13a3
23a3
33a3
aab3
bab3
cab3
1ab3
2ab3
3ab3
abb3
bbb3
cbb3
1bb3
2bb3
3bb3
acb3
bcb3
ccb3
1cb3
2cb3
3cb3
a1b3
b1b3
c1b3
11b3
21b3
31b3
a2b3
b2b3
c2b3
12b3
22b3
32b3
a3b3
b3b3
c3b3
13b3
23b3
33b3
aac3
bac3
cac3
1ac3
2ac3
3ac3
abc3
bbc3
cbc3
1bc3
2bc3
3bc3
acc3
bcc3
ccc3
1cc3
2cc3
3cc3
a1c3
b1c3
c1c3
11c3
21c3
31c3
a2c3
b2c3
c2c3
12c3
22c3
32c3
a3c3
b3c3
c3c3
13c3
23c3
33c3
aa13
ba13
ca13
1a13
2a13
3a13
ab13
bb13
cb13
1b13
2b13
3b13
ac13
bc13
cc13
1c13
2c13
3c13
a113
b113
c113
1113
2113
3113
a213
b213
c213
1213
2213
3213
a313
b313
c313
1313
2313
3313
aa23
ba23
ca23
1a23
2a23
3a23
ab23
bb23
cb23
1b23
2b23
3b23
ac23
bc23
cc23
1c23
2c23
3c23
a123
b123
c123
1123
2123
3123
a223
b223
c223
1223
2223
3223
a323
b323
c323
1323
2323
3323
aa33
ba33
ca33
1a33
2a33
3a33
ab33
bb33
cb33
1b33
2b33
3b33
ac33
bc33
cc33
1c33
2c33
3c33
a133
b133
c133
1133
2133
3133
a233
b233
c233
1233
2233
3233
a333
b333
c333
1333
2333
3333