I have a Pandas DataFrame with a named index. I want to pass it to a piece of code that takes a DataFrame, a column name, and some other stuff, and does a bunch of work involving that column. In this case, though, the column I want to highlight is the index, and giving the index's label to this piece of code doesn't work because you can't extract an index the way you can a regular column. For example, I can construct a DataFrame like this:
import numpy as np
import pandas as pd
# list(...) materializes map(), which is a lazy iterator on Python 3
df = pd.DataFrame({'name': list(map(chr, range(97, 102))), 'id': range(10000, 10005), 'value': np.random.randn(5)})
df.set_index('name', inplace=True)
Here's the result:
         id     value
name
a     10000  0.659710
b     10001  1.001821
c     10002 -0.197576
d     10003 -0.569181
e     10004 -0.882097
Now how do I go about accessing the name column?
print(df.index) # No problem
print(df['name']) # KeyError: u'name'
I know there are workarounds like duplicating the column or changing the index to something else. But is there something cleaner, like some form of column access that treats the index the same way as everything else?
Index has a special meaning in Pandas. It's used to optimise specific operations and can be used in various methods such as merging / joining data. Therefore, make a choice:
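If it's just another column, use reset_index and treat it as another column.
If it's genuinely used for indexing, keep it as the index and use df.index.
We can't make this choice for you. It should be dependent on the structure of your underlying data and on how you intend to analyse your data.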
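As a rough sketch of the two options, using the DataFrame from the question: reset_index returns a new frame in which name is an ordinary column, while df.index keeps it as the index. The helper get_column_or_index is a hypothetical name introduced here for illustration, not part of pandas; it is one possible way to let calling code look up a label without caring whether it is a column or the index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": list("abcde"),
    "id": range(10000, 10005),
    "value": np.random.randn(5),
}).set_index("name")

# Option 1: turn the index back into an ordinary column.
# reset_index() returns a new DataFrame; df itself is unchanged.
flat = df.reset_index()
names_as_column = flat["name"]

# Option 2: keep the index and read it directly.
names_from_index = df.index


def get_column_or_index(frame, label):
    """Hypothetical helper: return the column called `label`, or the
    index level with that name, as a Series aligned to frame.index."""
    if label in frame.columns:
        return frame[label]
    if label in frame.index.names:
        values = frame.index.get_level_values(label)
        return pd.Series(values, index=frame.index, name=label)
    raise KeyError(label)


print(get_column_or_index(df, "name"))  # works although "name" is the index
print(get_column_or_index(df, "id"))    # works for a regular column too
```

If the downstream code only ever reads the column, calling it with df.reset_index() is probably the least intrusive route, since the original frame keeps its index.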
For more information on use of a dataframe index, see: