I'm trying to inner join DataFrame A to DataFrame B and am running into an error.
Here's my join statement:
merged = DataFrameA.join(DataFrameB, on=['Code','Date'])
And here's the error:
ValueError: len(left_on) must equal the number of levels in the index of "right"
I'm not sure the column order matters (they aren't truly "ordered" are they?), but just in case, the DataFrames are organized like this:
DataFrameA: Code, Date, ColA, ColB, ColC, ..., ColG, ColH (shape: 80514, 8 - no index)
DataFrameB: Date, Code, Col1, Col2, Col3, ..., Col15, Col16 (shape: 859, 16 - no index)
Do I need to correct my join statement? Or is there another, better way to get the intersection (or inner join) of these two DataFrames?
use merge
if you are not joining on the index:
merged = pd.merge(DataFrameA,DataFrameB, on=['Code','Date'])
Follow up to question below:
Here is a reproducible example:
import pandas as pd
# create some timestamps for date column
i = pd.to_datetime(pd.date_range('20140601',periods=2))
#create two dataframes to merge
df = pd.DataFrame({'code': ['ABC','EFG'], 'date':i,'col1': [10,100]})
df2 = pd.DataFrame({'code': ['ABC','EFG'], 'date':i,'col2': [10,200]})
#merge on columns (default join is inner)
pd.merge(df, df2, on =['code','date'])
This results is:
code col1 date col2
0 ABC 10 2014-06-01 10
1 EFG 100 2014-06-02 200
What happens when you run this code?