Is it possible to reindex a pandas DataFrame
using a column made up of datetime objects?
I have a DataFrame df with the following columns:
Int64Index: 19610 entries, 0 to 19609
Data columns:
cntr 19610 non-null values #int
datflt 19610 non-null values #float
dtstamp 19610 non-null values #datetime object
DOYtimestamp 19610 non-null values #float
dtypes: int64(1), float64(2), object(1)
I can reindex df easily along DOYtimestamp with df.reindex(index=df.DOYtimestamp), and DOYtimestamp has the following values:
>>> df['DOYtimestamp'].values
array([ 153.76252315, 153.76253472, 153.7625463 , ..., 153.98945602,
153.98946759, 153.98947917])
but I'd like to reindex the DataFrame along dtstamp, which is made up of datetime objects, so that I can generate different timestamps directly from the index. The dtstamp column has values which look like:
>>> df['dtstamp'].values
array([2012-06-02 18:18:02, 2012-06-02 18:18:03, 2012-06-02 18:18:04, ...,
2012-06-02 23:44:49, 2012-06-02 23:44:50, 2012-06-02 23:44:51],
dtype=object)
When I try to reindex df along dtstamp, I get the following:
>>> df.reindex(index=df.dtstamp)
TypeError: can't compare datetime.datetime to long
I'm just not sure what I need to do to get the index to be of a datetime type. Any thoughts?
It sounds like you don't want reindex. Somewhat confusingly, reindex does not define a new index; rather, it looks up the rows whose existing index labels match the ones you pass in. So if you have a DataFrame with index [0, 1, 2], then reindex([2, 1, 0]) will return the rows in reverse order. Doing something like reindex([8, 9, 10]) does not make a new index for the rows; rather, it will return a DataFrame full of NaN values, since there are no rows with indices 8, 9, or 10.
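As a small illustration of that behaviour, here is a sketch on a made-up three-row DataFrame (the column name value and the numbers are invented for the example, not taken from your data):

```python
import pandas as pd

# Toy frame with the default-style integer index [0, 1, 2].
df = pd.DataFrame({"value": [10.0, 20.0, 30.0]}, index=[0, 1, 2])

# Labels that exist: reindex just looks the rows up, here in reverse order.
reversed_rows = df.reindex([2, 1, 0])

# Labels that do not exist: the rows come back filled with NaN.
missing_rows = df.reindex([8, 9, 10])

print(reversed_rows)
print(missing_rows)
```

The second result keeps the labels 8, 9, 10 as its index, but every value is NaN because none of those labels were present in the original index.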
It seems like what you want is to just keep the same rows, but make a totally new index for them. For that you can just assign to the index directly. So try doing df.index = df['dtstamp'].
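A minimal sketch of that assignment, using three invented rows shaped like your data. The pd.to_datetime call is an extra step I am assuming you may want, since your dtstamp column has dtype object; it is one way to make sure the result is a DatetimeIndex and is not strictly part of the suggestion above.

```python
import datetime as dt

import pandas as pd

# Toy stand-in for the real data: same column names, made-up values.
df = pd.DataFrame({
    "cntr": [0, 1, 2],
    "dtstamp": [
        dt.datetime(2012, 6, 2, 18, 18, 2),
        dt.datetime(2012, 6, 2, 18, 18, 3),
        dt.datetime(2012, 6, 2, 18, 18, 4),
    ],
})

# Assign the column as the new index; the rows themselves are unchanged.
# pd.to_datetime is an added safeguard (assumption) for object-dtype columns.
df.index = pd.to_datetime(df["dtstamp"])

# Rows can now be looked up by timestamp.
picked = df.loc[pd.Timestamp("2012-06-02 18:18:03"), "cntr"]
print(type(df.index).__name__, picked)
```

If you would rather get a new DataFrame back than modify df in place, df.set_index('dtstamp') should do much the same thing.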