Applying strptime function to pandas series

thron of three · Aug 5, 2016 · Viewed 14.5k times

I have a pandas Series that contains string-formatted dates of the form:

2016-01-14 11:39:54

I would like to convert the string to a timestamp.

I am using the apply method to attempt to pass datetime.strptime to each element of the series:

date_series = date_string.apply(datetime.strptime, args=('%Y-%m-%d %H:%M:%S'))

When I run the code, I get the following error:

strptime() takes exactly 2 arguments (18 given)

My questions are: (1) am I taking the correct approach, and (2) why is strptime converting my args into 18 arguments?

Answer

root · Aug 5, 2016

Use pd.to_datetime:

date_series = pd.to_datetime(date_string)

In general, it's best to have your dates as pandas' pd.Timestamp rather than Python's datetime.datetime if you plan to do your work in pandas. You may also want to review the Time Series / Date functionality documentation.
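As a minimal sketch of the pd.to_datetime approach, the snippet below builds a small series in the format from the question and converts it. The sample values are made up for illustration, and passing format= is optional here; it only makes the expected layout explicit.

```python
import pandas as pd

# Hypothetical sample data in the same layout as the question's strings.
date_string = pd.Series(["2016-01-14 11:39:54", "2016-01-15 08:05:00"])

# Vectorised conversion; format= spells out the expected layout explicitly.
date_series = pd.to_datetime(date_string, format="%Y-%m-%d %H:%M:%S")

# Each element is now a pd.Timestamp, so the .dt accessor is available.
print(type(date_series.iloc[0]))
print(date_series.dt.year.tolist())
```

Once the series holds timestamps, date components and date arithmetic are available through the .dt accessor without any further parsing.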

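As to why your apply isn't working: args isn't being read as a tuple, but as a plain string, because parentheses around a single value don't create a tuple. The string is unpacked into its 17 characters, each passed as a separate argument; together with the series element itself, that makes the 18 arguments in the error message. To make it a tuple, add a trailing comma: args=('%Y-%m-%d %H:%M:%S',).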

This is standard behaviour in Python. Consider the following example:

x = ('a')
y = ('a',)
print('x info:', x, type(x))
print('y info:', y, type(y))

x info: a <class 'str'>
y info: ('a',) <class 'tuple'>
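Applied to the original call, the trailing comma looks like the sketch below. The one-element sample series and the fmt variable are assumptions added for illustration; pd.to_datetime remains the simpler route.

```python
from datetime import datetime

import pandas as pd

# Hypothetical one-element series in the question's format.
date_string = pd.Series(["2016-01-14 11:39:54"])
fmt = "%Y-%m-%d %H:%M:%S"

# The format string has 17 characters; unpacked as separate arguments and
# added to the series element, that gives the 18 arguments in the error.
print(len(fmt))

# The trailing comma makes args a one-element tuple, so strptime receives
# exactly two arguments: the element and the format string.
date_series = date_string.apply(datetime.strptime, args=(fmt,))
print(date_series.iloc[0])
```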