Non-redundant version of expand.grid

Michele · Jun 18, 2013 · Viewed 12.9k times

The R function expand.grid returns all possible combinations of the elements of the supplied arguments, e.g.:

> expand.grid(c("aa", "ab", "cc"), c("aa", "ab", "cc"))
  Var1 Var2
1   aa   aa
2   ab   aa
3   cc   aa
4   aa   ab
5   ab   ab
6   cc   ab
7   aa   cc
8   ab   cc
9   cc   cc

Do you know an efficient way to directly get (that is, without comparing rows after calling expand.grid) only the 'unique' combinations of the supplied vectors? The output would be:

  Var1 Var2
1   aa   aa
2   ab   aa
3   cc   aa
5   ab   ab
6   cc   ab
9   cc   cc

EDIT: The combination of each element with itself could eventually be discarded from the answer. I don't actually need it in my program, even though (mathematically) aa aa is a valid unique combination of one element of Var1 and one of Var2.

The solution needs to produce pairs of elements from both vectors (i.e. one from each input vector), so that it can be applied to more than 2 inputs.

Answer

Simon O'Hanlon · Jun 18, 2013

How about using outer? Note that with paste as the function, each pair is concatenated into a single character string:

outer( c("aa", "ab", "cc"), c("aa", "ab", "cc") , "paste" )
#     [,1]    [,2]    [,3]   
#[1,] "aa aa" "aa ab" "aa cc"
#[2,] "ab aa" "ab ab" "ab cc"
#[3,] "cc aa" "cc ab" "cc cc"
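
If the two input vectors are identical, as in the question, one possible way to drop the mirrored pairs is to keep only the lower triangle of that matrix. The sketch below is not part of the original answer: it assumes both inputs are the same vector x, and the names x, n, idx and res are just illustrative. It uses lower.tri and which(..., arr.ind = TRUE) to index the elements directly, so the pairs stay in separate columns instead of being pasted together.

```r
x <- c("aa", "ab", "cc")
n <- length(x)

# Row/column positions of the lower triangle, diagonal included.
# Set diag = FALSE to drop self-pairs such as aa aa.
idx <- which(lower.tri(matrix(NA, n, n), diag = TRUE), arr.ind = TRUE)

# Map the positions back to the values, one column per input.
res <- data.frame(Var1 = x[idx[, "row"]],
                  Var2 = x[idx[, "col"]],
                  stringsAsFactors = FALSE)
res
```

For the three values above this gives six rows, in the same order as the output requested in the question.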

Alternatively, if you don't want an element paired with itself (e.g. aa aa), you can use combn on the unique elements of the two vectors combined:

vals <- c( c("aa", "ab", "cc"), c("aa", "ab", "cc") )
vals <- unique( vals )
combn( vals , 2 )
#     [,1] [,2] [,3]
#[1,] "aa" "aa" "ab"
#[2,] "ab" "cc" "cc"
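
combn returns one pair per column, so if you want something shaped like the expand.grid output, a rough sketch is to transpose the matrix and turn it into a data frame. The column names Var1 and Var2 and the object names pairs and with_self are my own choices here, not something combn produces. Note that within each pair the earlier element of vals ends up in Var1, which is the reverse of the ordering in the question's desired output.

```r
vals <- unique(c(c("aa", "ab", "cc"), c("aa", "ab", "cc")))

# One row per pair instead of one column per pair.
pairs <- as.data.frame(t(combn(vals, 2)), stringsAsFactors = FALSE)
names(pairs) <- c("Var1", "Var2")
pairs

# Optional: add the self-pairs (aa aa, ab ab, cc cc) back in.
with_self <- rbind(pairs,
                   data.frame(Var1 = vals, Var2 = vals,
                              stringsAsFactors = FALSE))
```

Keep in mind that this pools the values of both vectors, so it only matches the "one element from each vector" requirement when the inputs contain the same values.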