I have a pandas DataFrame with a timestamp column and want to create a column containing just the month. The month column should hold string representations of the month, not integers. I have done something like this:
import pandas as pd
from datetime import datetime
df['Dates'] = pd.to_datetime(df['Dates'])
df['Month'] = df.Dates.dt.month
df['Month'] = df.Month.apply(lambda x: datetime.strptime(str(x), '%m').strftime('%b'))
However, this is a brute-force approach and not very performant. Is there a more elegant way to convert the integer representation of the month into a string representation?
Use the vectorised dt.strftime on your datetimes:
In [43]:
df = pd.DataFrame({'dates':pd.date_range(dt.datetime(2016,1,1), dt.datetime(2017,2,1), freq='M')})
df
Out[43]:
dates
0 2016-01-31
1 2016-02-29
2 2016-03-31
3 2016-04-30
4 2016-05-31
5 2016-06-30
6 2016-07-31
7 2016-08-31
8 2016-09-30
9 2016-10-31
10 2016-11-30
11 2016-12-31
12 2017-01-31
In [44]:
df['month'] = df['dates'].dt.strftime('%b')
df
Out[44]:
dates month
0 2016-01-31 Jan
1 2016-02-29 Feb
2 2016-03-31 Mar
3 2016-04-30 Apr
4 2016-05-31 May
5 2016-06-30 Jun
6 2016-07-31 Jul
7 2016-08-31 Aug
8 2016-09-30 Sep
9 2016-10-31 Oct
10 2016-11-30 Nov
11 2016-12-31 Dec
12 2017-01-31 Jan
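Applied to the question's own setup, the same idea can be sketched as a standalone script. The sample dates below are made up for illustration; the column names 'Dates' and 'Month' follow the question, while 'MonthFull' and 'MonthEn' are hypothetical names added here. Note that '%b' and '%B' follow the current locale, so on a non-English locale the names may differ; dt.month_name() defaults to English and is shown as one possible alternative.

```python
import pandas as pd

# Made-up sample data; 'Dates' matches the column name used in the question.
df = pd.DataFrame({'Dates': ['2016-01-15', '2016-02-20', '2016-12-05']})
df['Dates'] = pd.to_datetime(df['Dates'])

# Abbreviated month name for the whole column in one call
# (replaces the intermediate integer column and the apply/lambda step).
df['Month'] = df['Dates'].dt.strftime('%b')

# Full month name via the %B directive (also locale-dependent).
df['MonthFull'] = df['Dates'].dt.strftime('%B')

# Alternative: month_name() returns English names by default;
# slicing the first three characters gives the abbreviation.
df['MonthEn'] = df['Dates'].dt.month_name().str[:3]

print(df)
```

The resulting columns hold plain strings, so they can be used directly for grouping or as plot labels.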