EDIT:
If you're coming to this question and your string looks like 1996-Q1, then just use pd.to_datetime(df['Quarter']) to convert it to a proper pandas datetime. This question is about handling all the quarter strings that are not in this standard format.
ORIGINAL QUESTION:
I'm looking for a nice, readable and understandable way (one that you can remember for the next time) to convert Q3 1996 to a pandas datetime, for example 1996-07-01 in this case.
So far I have found this, but it's mighty ugly:
import pandas as pd

df = pd.DataFrame({'Quarter': ['Q3 1996', 'Q4 1996', 'Q1 1997']})
# Split on the space, reverse the parts and glue them together: 'Q3 1996' -> '1996Q3'
df['date'] = pd.to_datetime(
    df['Quarter'].str.split(' ').apply(lambda x: ''.join(x[::-1]))
)
print(df)
Quarter date
0 Q3 1996 1996-07-01
1 Q4 1996 1996-10-01
2 Q1 1997 1997-01-01
I was hoping the following would work, because it's readable, but unfortunately it doesn't:
df['date'] = pd.to_datetime(df['Quarter'], format='%q %Y')
The problem is also that quarter and year are apparently in the wrong order for pandas to do simple processing.
Can anyone help me find a cleaner way of converting Q3 1996 to a pandas datetime?
You can (and should) use pd.PeriodIndex as a first step, then convert to timestamps using PeriodIndex.to_timestamp:
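# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
qs = df['Quarter'].str.replace(r'(Q\d) (\d+)', r'\2-\1', regex=True)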
qs
0 1996-Q3
1 1996-Q4
2 1997-Q1
Name: Quarter, dtype: object
df['date'] = pd.PeriodIndex(qs, freq='Q').to_timestamp()
df
Quarter date
0 Q3 1996 1996-07-01
1 Q4 1996 1996-10-01
2 Q1 1997 1997-01-01
The initial replace step is necessary because PeriodIndex expects your periods in the %Y-%q format (for example 1996-Q3).
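The two steps above can be wrapped in a small helper so the conversion is easy to remember and reuse. This is only a sketch: the function name quarters_to_timestamps is made up for illustration, and it assumes every value looks like "Q<digit> <year>".

```python
import pandas as pd


def quarters_to_timestamps(quarters: pd.Series) -> pd.Series:
    """Convert strings like 'Q3 1996' to the first day of that quarter.

    Hypothetical helper, not part of pandas. Assumes 'Q<digit> <year>' input.
    """
    # Reorder 'Q3 1996' -> '1996-Q3', the year-first form PeriodIndex parses.
    reordered = quarters.str.replace(r"(Q\d) (\d+)", r"\2-\1", regex=True)
    periods = pd.PeriodIndex(reordered, freq="Q")
    # to_timestamp() defaults to the start of each period.
    return pd.Series(periods.to_timestamp(), index=quarters.index, name="date")


df = pd.DataFrame({"Quarter": ["Q3 1996", "Q4 1996", "Q1 1997"]})
df["date"] = quarters_to_timestamps(df["Quarter"])
print(df)
```

Keeping the original index on the returned Series means the assignment lines up even if df does not have a default RangeIndex.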
Another option is to use pd.to_datetime after performing the same string replacement as before. With errors='coerce', values that cannot be parsed become NaT instead of raising.
df['date'] = pd.to_datetime(
    df['Quarter'].str.replace(r'(Q\d) (\d+)', r'\2-\1', regex=True),
    errors='coerce')
df
Quarter date
0 Q3 1996 1996-07-01
1 Q4 1996 1996-10-01
2 Q1 1997 1997-01-01
If performance is important, a list comprehension that splits and re-joins each string is usually faster than the .str methods, and it can still be written cleanly:
df['date'] = pd.to_datetime([
    '-'.join(x.split()[::-1]) for x in df['Quarter']])
df
Quarter date
0 Q3 1996 1996-07-01
1 Q4 1996 1996-10-01
2 Q1 1997 1997-01-01
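If you would rather not depend on quarter-string parsing at all, a different technique (not one of the approaches above) is to pull the quarter and year out as integers and build the date from its components. The sketch below assumes every value matches "Q<1-4> <four-digit year>"; the helper name quarter_fields_to_timestamps is hypothetical.

```python
import pandas as pd


def quarter_fields_to_timestamps(quarters: pd.Series) -> pd.Series:
    """Build quarter-start dates from numeric year and quarter fields.

    Hypothetical helper. Raises if any value does not match the pattern,
    because the NaN produced by a failed match cannot be cast to int.
    """
    parts = quarters.str.extract(r"Q(?P<quarter>[1-4]) (?P<year>\d{4})").astype(int)
    # First month of each quarter: Q1 -> 1, Q2 -> 4, Q3 -> 7, Q4 -> 10.
    first_month = 3 * (parts["quarter"] - 1) + 1
    components = pd.DataFrame({"year": parts["year"], "month": first_month, "day": 1})
    # pd.to_datetime assembles dates from year/month/day columns.
    return pd.to_datetime(components)


df = pd.DataFrame({"Quarter": ["Q3 1996", "Q4 1996", "Q1 1997"]})
df["date"] = quarter_fields_to_timestamps(df["Quarter"])
print(df)
```

This is more verbose than the PeriodIndex route, but the month arithmetic is explicit, which some readers may find easier to remember.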