I have the following dataframe called language:
lang level
0 english intermediate
1 spanish intermediate
2 spanish basic
3 english basic
4 english advanced
5 spanish intermediate
6 spanish basic
7 spanish advanced
I converted each of my variables to numeric codes by using
language.lang.astype('category').cat.codes
and
language.level.astype('category').cat.codes
respectively, which gives the following data frame:
lang level
0 0 2
1 1 2
2 1 1
3 0 1
4 0 0
5 1 2
6 1 1
7 1 0
Now, I would like to know if there is a way to find out which original value corresponds to each code. For example, I'd like to know that the 0 value in the lang column corresponds to english, and so on.
Is there any function that allows me to get back this information?
You can generate a dictionary:
c = language.lang.astype('category')
d = dict(enumerate(c.cat.categories))
print (d)
{0: 'english', 1: 'spanish'}
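The same idea should extend to the level column. Below is a minimal sketch that rebuilds the question's data frame by hand and creates one dictionary per column; the name mappings is only illustrative. Note that pandas sorts inferred string categories alphabetically, so the codes follow that order.

```python
import pandas as pd

# Data frame rebuilt from the question
language = pd.DataFrame({
    "lang": ["english", "spanish", "spanish", "english",
             "english", "spanish", "spanish", "spanish"],
    "level": ["intermediate", "intermediate", "basic", "basic",
              "advanced", "intermediate", "basic", "advanced"],
})

# One code -> label dictionary per column (mappings is a hypothetical name)
mappings = {
    col: dict(enumerate(language[col].astype("category").cat.categories))
    for col in ["lang", "level"]
}

print(mappings["lang"])   # {0: 'english', 1: 'spanish'}
print(mappings["level"])  # {0: 'advanced', 1: 'basic', 2: 'intermediate'}
```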
Then, if necessary, you can map the codes back to the original values:
language['code'] = language.lang.astype('category').cat.codes
language['level_back'] = language['code'].map(d)
print (language)
lang level code level_back
0 english intermediate 0 english
1 spanish intermediate 1 spanish
2 spanish basic 1 spanish
3 english basic 0 english
4 english advanced 0 english
5 spanish intermediate 1 spanish
6 spanish basic 1 spanish
7 spanish advanced 1 spanish
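As a possible alternative to the dictionary, the codes can be turned back into labels with pd.Categorical.from_codes, provided the original categories are kept around. This is only a sketch on the lang values from the question; the names restored and to_code are illustrative.

```python
import pandas as pd

lang = pd.Series(["english", "spanish", "spanish", "english",
                  "english", "spanish", "spanish", "spanish"])

c = lang.astype("category")
codes = c.cat.codes  # integer codes, as in the question

# Rebuild the labels from the codes plus the stored categories
restored = pd.Series(pd.Categorical.from_codes(codes, categories=c.cat.categories))
print(restored.tolist())

# The reverse lookup (label -> code) comes from the same categories
to_code = {label: code for code, label in enumerate(c.cat.categories)}
print(to_code)
```

This only works if the same categories object is used for decoding; codes produced from a different column, or from a subset with different unique values, would not line up.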