I have dataset, df.head(4)
:
Dewptm Fog Humidity Pressurem Tempm Wspdm Rainfall
datetime_utc
1996-11-01 11.666667 0.0 52.916667 -2659.666667 22.333333 2.466667 0
1996-11-02 10.458333 0.0 48.625000 1009.833333 22.916667 8.028571 0
1996-11-03 12.041667 0.0 55.958333 1010.500000 21.791667 4.804545 0
1996-11-04 10.222222 0.0 48.055556 1011.333333 22.722222 1.964706 0
Here is df.columns
:
Index(['Dewptm', 'Fog', 'Humidity', 'Pressurem', 'Rain', 'Tempm', 'Wspdm',
'Rainfall'],
dtype='object')
How could I split datetime_utc
column into the year, month and day column?
I tried:
df["day"] = df['datetime_utc'].map(lambda x: x.day)
df["month"] = df['datetime_utc'].map(lambda x: x.month)
df["year"] = df['datetime_utc'].map(lambda x: x.year)
Error:
KeyError: 'datetime_utc'
Also
pd.concat([df.drop('datetime_utc', axis = 1),
(df.datetime_utc.str.split("-).str[:3].apply(pd.Series)
.rename(columns={0:'year', 1:'month', 2:'day'}))], axis = 1)
I am getting error:
KeyError: "['datetime_utc'] not found in axis" The problem I am facing is the column
datetime_utc
is the default index column in my dataset, Please suggest me an approach.
The problem is that datetime_utc
is in your index instead a column, so you have to access your index to be able to make your new columns:
df['day'] = df.index.day
df['month'] = df.index.month
df['year'] = df.index.year
print(df)
Dewptm Fog Humidity Pressurem Tempm Wspdm \
datetime_utc
1996-11-01 11.666667 0.0 52.916667 -2659.666667 22.333333 2.466667
1996-11-02 10.458333 0.0 48.625000 1009.833333 22.916667 8.028571
1996-11-03 12.041667 0.0 55.958333 1010.500000 21.791667 4.804545
1996-11-04 10.222222 0.0 48.055556 1011.333333 22.722222 1.964706
Rainfall day month year
datetime_utc
1996-11-01 0 1 11 1996
1996-11-02 0 2 11 1996
1996-11-03 0 3 11 1996
1996-11-04 0 4 11 1996
If you want datetime_utc
as a column you have to reset your index and then you can access the datetime methods with dt.month
, dt.year
and dt.day
like following:
# Reset our index so datetime_utc becomes a column
df.reset_index(inplace=True)
# Create new columns
df['day'] = df['datetime_utc'].dt.day
df['month'] = df['datetime_utc'].dt.month
df['year'] = df['datetime_utc'].dt.year
print(df)
datetime_utc Dewptm Fog Humidity Pressurem Tempm Wspdm \
0 1996-11-01 11.666667 0.0 52.916667 -2659.666667 22.333333 2.466667
1 1996-11-02 10.458333 0.0 48.625000 1009.833333 22.916667 8.028571
2 1996-11-03 12.041667 0.0 55.958333 1010.500000 21.791667 4.804545
3 1996-11-04 10.222222 0.0 48.055556 1011.333333 22.722222 1.964706
Rainfall day month year
0 0 1 11 1996
1 0 2 11 1996
2 0 3 11 1996
3 0 4 11 1996
Note if your index is not in datetime
type yet, use the following before you try to extract year, month and day:
df.index = pd.to_datetime(df.index)