KeyError in Dataframe

jenryb · Jul 1, 2015

I have a dataframe that looks just how I want it when I export it to a csv file.

CompanyName 1   2   3   4   5   6   7   8   9   10  11  12
Company 1   182 270 278 314 180 152 110 127 129 117 127 81
Company 2   163 147 192 142 186 231 214 130 112 117 93  101
Company 3   126 88  99  139 97  97  96  37  79  116 111 95
Company 4   84  89  71  95  80  89  83  88  104 93  78  64

However, when I try to access the key 'CompanyName', I get a KeyError: 'CompanyName'. I suspect it's being overwritten somewhere, but I'm not sure how to fix it.

If I print my dataframe, I get:

pivot_table.head(2)
Out[62]: 
Month          1    2    3    4    5    6    7    8    9   10   11
CompanyName
Company 1    182  270  278  314  180  152  110  127  129  117  127
Company 2    163  147  192  142  186  231  214  130  112  117   93

Month         12
CompanyName
Company 1     81
Company 2    101

This output is rather hard to read, so I can't tell what's going on. The code that throws the error:

pivot_table['CompanyName'] = [str(x) for x in pivot_table['CompanyName']]
Companies = list(pivot_table['CompanyName'])
months = ["1","2","3","4","5","6","7","8","9","10","11","12"]
pivot_table = pivot_table.set_index('CompanyName')

EDIT

Bleh's answer helped eliminate this KeyError. I needed to start the code by resetting the index, because a column that has already been made the index can no longer be accessed as a key.
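For anyone landing here later, this is a minimal sketch of that fix. It assumes the pivot table was built with CompanyName as its index; the `raw` frame and its `Count` column are made up for illustration and are not my real data.

```python
import pandas as pd

# Hypothetical long-format data standing in for the real source table.
raw = pd.DataFrame({
    "CompanyName": ["Company 1", "Company 1", "Company 2", "Company 2"],
    "Month": [1, 2, 1, 2],
    "Count": [182, 270, 163, 147],
})

# Pivoting on CompanyName makes it the index, so it is not a column.
pivot_table = raw.pivot_table(
    index="CompanyName", columns="Month", values="Count", aggfunc="sum"
)
assert "CompanyName" not in pivot_table.columns

# Reset the index first so CompanyName becomes a regular column again.
pivot_table = pivot_table.reset_index()

# The lines from the question now run without a KeyError.
pivot_table["CompanyName"] = [str(x) for x in pivot_table["CompanyName"]]
companies = list(pivot_table["CompanyName"])
pivot_table = pivot_table.set_index("CompanyName")
```

After the last line CompanyName is the index again, so any later `pivot_table['CompanyName']` lookup would need another `reset_index()`.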

Answer

kennes · Jul 1, 2015

This is because you've set the index to CompanyName.

Once a column has been made the index, it is no longer one of the columns, so pivot_table['CompanyName'] raises a KeyError.

Use pivot_table = pivot_table.reset_index() to reset the index and try accessing it again.

Here's the reproduced error:

In [45]: df = pd.read_clipboard()

In [46]: df
Out[46]: 
         CompanyName    1    2    3    4    5    6    7    8    9   10   11  \
Company            1  182  270  278  314  180  152  110  127  129  117  127   
Company            2  163  147  192  142  186  231  214  130  112  117   93   
Company            3  126   88   99  139   97   97   96   37   79  116  111   
Company            4   84   89   71   95   80   89   83   88  104   93   78   

          12  
Company   81  
Company  101  
Company   95  
Company   64  

In [47]: df['CompanyName']
Out[47]: 
Company    1
Company    2
Company    3
Company    4
Name: CompanyName, dtype: int64

In [48]: df = df.set_index('CompanyName')

In [49]: df['CompanyName']
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-49-d5b597a2bc80> in <module>()
----> 1 df['CompanyName']

/Library/Python/2.7/site-packages/pandas-0.16.1-py2.7-macosx-10.10-intel.egg/pandas/core/frame.pyc in __getitem__(self, key)
   1789             return self._getitem_multilevel(key)
   1790         else:
-> 1791             return self._getitem_column(key)
   1792 
   1793     def _getitem_column(self, key):

/Library/Python/2.7/site-packages/pandas-0.16.1-py2.7-macosx-10.10-intel.egg/pandas/core/frame.pyc in _getitem_column(self, key)
   1796         # get column
   1797         if self.columns.is_unique:
-> 1798             return self._get_item_cache(key)
   1799 
   1800         # duplicate columns & possible reduce dimensionaility

/Library/Python/2.7/site-packages/pandas-0.16.1-py2.7-macosx-10.10-intel.egg/pandas/core/generic.pyc in _get_item_cache(self, item)
   1082         res = cache.get(item)
   1083         if res is None:
-> 1084             values = self._data.get(item)
   1085             res = self._box_item_values(item, values)
   1086             cache[item] = res

/Library/Python/2.7/site-packages/pandas-0.16.1-py2.7-macosx-10.10-intel.egg/pandas/core/internals.pyc in get(self, item, fastpath)
   2849 
   2850             if not isnull(item):
-> 2851                 loc = self.items.get_loc(item)
   2852             else:
   2853                 indexer = np.arange(len(self.items))[isnull(self.items)]

/Library/Python/2.7/site-packages/pandas-0.16.1-py2.7-macosx-10.10-intel.egg/pandas/core/index.pyc in get_loc(self, key, method)
   1576         """
   1577         if method is None:
-> 1578             return self._engine.get_loc(_values_from_object(key))
   1579 
   1580         indexer = self.get_indexer([key], method=method)

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3824)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:3704)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12349)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12300)()

KeyError: 'CompanyName'

Corrected output:

In [50]: df = df.reset_index()

In [51]: df['CompanyName']
Out[51]: 
0    1
1    2
2    3
3    4
Name: CompanyName, dtype: int64
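If you would rather leave the index in place, the labels can also be read from the index itself. The following is a rough sketch, not taken from the session above: the small frame is invented for illustration, and `drop=False` is a standard `set_index` option that keeps the column alongside the index.

```python
import pandas as pd

# Made-up frame with CompanyName already set as the index.
df = pd.DataFrame(
    {"CompanyName": ["Company 1", "Company 2"], "1": [182, 163], "2": [270, 147]}
).set_index("CompanyName")

# CompanyName is now the index name, not a column label.
in_columns = "CompanyName" in df.columns
index_name = df.index.name

# Read the labels straight from the index; no reset needed.
companies = df.index.tolist()

# Alternatively, keep the column when setting the index (drop=False),
# so both df.index and kept["CompanyName"] are available.
kept = df.reset_index().set_index("CompanyName", drop=False)
names_from_column = list(kept["CompanyName"])
```

Checking `df.index.name` and `df.columns` like this is a quick way to tell whether a KeyError comes from the key having been moved into the index.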