Datetime strptime in Python pandas: what's wrong?

Fagui Curtain · May 5, 2016
import datetime as datetime
datetime.strptime('2013-01-01 09:10:12', '%Y-%m-%d %H:%M:%S')

produces

AttributeError                            Traceback (most recent call last)
in ()
      1 import datetime as datetime
----> 2 datetime.strptime('2013-01-01 09:10:12', '%Y-%m-%d %H:%M:%S')
      3 z = minidf['Dates']
      4 z

AttributeError: 'module' object has no attribute 'strptime'

My goal is to convert a pandas DataFrame column whose dtype is still object:

import datetime as datetime
#datetime.strptime('2013-01-01 09:10:12', '%Y-%m-%d %H:%M:%S')
z = minidf['Dates']

0     2015-05-13 23:53:00
1     2015-05-13 23:53:00
2     2015-05-13 23:33:00
3     2015-05-13 23:30:00
4     2015-05-13 23:30:00
5     2015-05-13 23:30:00
6     2015-05-13 23:30:00
7     2015-05-13 23:30:00
8     2015-05-13 23:00:00
9     2015-05-13 23:00:00
10    2015-05-13 22:58:00
Name: Dates, dtype: object

Bonus question: I got this column using pd.read_csv from a larger file with more columns. Is it possible to pass parameters so that pd.read_csv converts it directly to dtype datetime64[ns]?

Answer

jezrael · May 5, 2016

I think you can use to_datetime for the conversion:

print(pd.to_datetime('2013-01-01 09:10:12', format='%Y-%m-%d %H:%M:%S'))
2013-01-01 09:10:12

print(pd.to_datetime('2013-01-01 09:10:12'))
2013-01-01 09:10:12
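
As for the AttributeError itself: import datetime as datetime binds the name datetime to the module, while strptime is a classmethod of the datetime class inside that module. A minimal sketch of the standard-library fix, assuming you only want to parse a single string (the variable name parsed is just for illustration):

```python
# Import the datetime *class* from the datetime *module*;
# strptime lives on the class, not on the module.
from datetime import datetime

parsed = datetime.strptime('2013-01-01 09:10:12', '%Y-%m-%d %H:%M:%S')
print(parsed)  # 2013-01-01 09:10:12
```

Equivalently, keep import datetime and call datetime.datetime.strptime(...). For a whole column, to_datetime is usually more convenient.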

If you need to convert while reading the file with read_csv, add the parse_dates parameter:

df = pd.read_csv('filename',  parse_dates=['Dates'])

Sample:

import pandas as pd
import io

temp=u"""Dates
2015-05-13 23:53:00
2015-05-13 23:53:00
2015-05-13 23:33:00
2015-05-13 23:30:00
2015-05-13 23:30:00
2015-05-13 23:30:00
2015-05-13 23:30:00
2015-05-13 23:30:00
2015-05-13 23:00:00
2015-05-13 23:00:00
2015-05-13 22:58:00
"""
# after testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), parse_dates=['Dates'])
print(df)
                 Dates
0  2015-05-13 23:53:00
1  2015-05-13 23:53:00
2  2015-05-13 23:33:00
3  2015-05-13 23:30:00
4  2015-05-13 23:30:00
5  2015-05-13 23:30:00
6  2015-05-13 23:30:00
7  2015-05-13 23:30:00
8  2015-05-13 23:00:00
9  2015-05-13 23:00:00
10 2015-05-13 22:58:00

print(df.dtypes)
Dates    datetime64[ns]
dtype: object
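
Since the question mentions a larger file with more columns, here is a hedged sketch of the same idea on a small made-up table. The Category and Count columns and their values are hypothetical, invented only for illustration; only the column named in parse_dates is converted.

```python
import io

import pandas as pd

# Hypothetical data: only the Dates column comes from the question.
csv_text = u"""Dates,Category,Count
2015-05-13 23:53:00,A,3
2015-05-13 23:33:00,B,5
2015-05-13 22:58:00,A,1
"""

# parse_dates takes a list of column names to convert while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['Dates'])

# Dates is now a datetime column; the other columns keep their inferred types.
print(df.dtypes)

# The .dt accessor is available on datetime columns.
print(df['Dates'].dt.hour.tolist())  # [23, 23, 22]
```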

Another solution with to_datetime:

print(pd.to_datetime(df['Dates']))

Sample (here df was read without parse_dates, so Dates is still object):

print(df)
                  Dates
0   2015-05-13 23:53:00
1   2015-05-13 23:53:00
2   2015-05-13 23:33:00
3   2015-05-13 23:30:00
4   2015-05-13 23:30:00
5   2015-05-13 23:30:00
6   2015-05-13 23:30:00
7   2015-05-13 23:30:00
8   2015-05-13 23:00:00
9   2015-05-13 23:00:00
10  2015-05-13 22:58:00

print(df.dtypes)
Dates    object
dtype: object

df['Dates'] = pd.to_datetime(df['Dates'])
print(df)
                 Dates
0  2015-05-13 23:53:00
1  2015-05-13 23:53:00
2  2015-05-13 23:33:00
3  2015-05-13 23:30:00
4  2015-05-13 23:30:00
5  2015-05-13 23:30:00
6  2015-05-13 23:30:00
7  2015-05-13 23:30:00
8  2015-05-13 23:00:00
9  2015-05-13 23:00:00
10 2015-05-13 22:58:00

print(df.dtypes)
Dates    datetime64[ns]
dtype: object
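
If some values in the column might not match the expected layout, one option (an assumption about your data, not something the question states) is to pass format together with errors='coerce', which turns unparseable entries into NaT instead of raising. A small sketch with a made-up series:

```python
import pandas as pd

# Made-up values for illustration; the last entry is deliberately invalid.
raw = pd.Series(
    ['2015-05-13 23:53:00', '2015-05-13 22:58:00', 'not a date'],
    name='Dates',
)

# errors='coerce' replaces values that cannot be parsed with NaT.
converted = pd.to_datetime(raw, format='%Y-%m-%d %H:%M:%S', errors='coerce')

print(converted.isna().tolist())  # [False, False, True]
```

Checking isna() afterwards shows which rows failed to parse, so they can be inspected rather than silently dropped.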