Join two spreadsheets on a common column in Excel or OpenOffice

Steven Smethurst picture Steven Smethurst · Nov 12, 2010 · Viewed 70.2k times · Source

I have two CSV files with a common column and I want to "Join" the tables together on the common column.

For example: Join 'A' with 'B' equals 'Result'. If a one table has a key value that does not exist on in the other table its just left as blank.

== Table A ==        == Table B ==        == Table result ==
Name  ,Age           Name  ,Sex           Name ,Age ,Sex
Bob   ,37     +      Bob   ,Male     =>   Bob  ,37  ,Male
Steve ,12            Steve ,Male          Steve,12  ,Male
Kate  , 7                                 Kate , 7  , 
                     Sara  ,Female        Sara ,    ,Female 

I know how to do this with an SQL database but I have never done it with "Excel" or "OpenOffice.org Calc"

Suggestions?

Answer

robert picture robert · Nov 12, 2010

In Excel, vlookup can do part of what you're asking. Specifically, you can use vlookup to do a left or right outer join, but not a full outer join (like your table result).

To do an outer join for your example above, add the following to the C2 of "Table B" (or copy "Table B" and then do this):

=vlookup(
    a2, # the cell value from the current table to look up in the other table
    table_a!$1:$174832718, # the other table
                           # don't manually type this--select the entire 
                           # other table while the cursor is editing this
                           # cell, then add the "$"s--Excel doesn't
                           # automatically add them
                           # (the syntax here is for different sheets in
                           # the same file, but Excel will fill this in 
                           # correctly for different files as well)
    2, # the column to get from the other table (A=1, B=2, etc.)
    FALSE) # FALSE=only get exact matches TRUE=find approx. matches if no exact match

You should then be able to expand it to deal with multiple rows and multiple imported columns.