Example dataset:
>>> df
ID Region count
0 100 Asia 2
1 101 Europe 3
2 102 US 1
3 103 Africa 5
4 100 Russia 5
5 101 Australia 7
6 102 US 8
7 104 Asia 10
8 105 Europe 11
9 110 Africa 23
I want to group the observations of this dataset by ID and Region and sum the count for each group. So I used something like this:
>>> print(df.groupby(['ID','Region'],as_index=False)['count'].sum())
ID Region count
0 100 Asia 2
1 100 Russia 5
2 101 Australia 7
3 101 Europe 3
4 102 US 9
5 103 Africa 5
6 104 Asia 10
7 105 Europe 11
8 110 Africa 23
By using as_index=False I am able to get "SQL-like" output. My problem is that I am unable to rename the aggregate variable count here. In SQL, if I wanted to do the above, I would write something like this:
select ID, Region, sum(count) as Total_Numbers
from df
group by ID, Region
order by ID, Region
As we can see, it's very easy to rename the aggregate variable count to Total_Numbers in SQL. I want to do the same thing in Pandas but am unable to find such an option in the group-by function. Can somebody help?
The second question (more of an observation) is whether...
I understand that the variable names are strings, so they have to be inside quotes, but I see that when we use them outside a DataFrame function, as an attribute, they don't need quotes, as in df.ID.sum(). It's only when we use them in a DataFrame function like df.sort() or df.groupby that we have to put them inside quotes. This is a bit of a pain, as in SQL, SAS, or other languages we simply use the variable name without quoting it. Any suggestions on this?
Kindly reply to both questions (Q1 is the main, Q2 more of an opinion).
For the first question, I think the answer would be:
<your DataFrame>.rename(columns={'count':'Total_Numbers'})
or
<your DataFrame>.columns = ['ID', 'Region', 'Total_Numbers']
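To make this concrete, here is a minimal sketch that rebuilds the example data from the question and applies the rename to the grouped result. The variable names totals and totals_named are just illustrative. The second variant uses named aggregation, which is my own addition rather than part of the suggestion above and assumes pandas 0.25 or newer.

```python
import pandas as pd

# Rebuild the example dataset from the question.
df = pd.DataFrame({
    "ID": [100, 101, 102, 103, 100, 101, 102, 104, 105, 110],
    "Region": ["Asia", "Europe", "US", "Africa", "Russia",
               "Australia", "US", "Asia", "Europe", "Africa"],
    "count": [2, 3, 1, 5, 5, 7, 8, 10, 11, 23],
})

# Option 1: aggregate first, then rename the aggregated column.
# Note df["count"] rather than df.count, since count is also a method name.
totals = (
    df.groupby(["ID", "Region"], as_index=False)["count"]
      .sum()
      .rename(columns={"count": "Total_Numbers"})
)

# Option 2 (assumes pandas >= 0.25): named aggregation, which names the
# output column in the same call, much like "sum(count) as Total_Numbers".
totals_named = df.groupby(["ID", "Region"], as_index=False).agg(
    Total_Numbers=("count", "sum")
)

print(totals)
```

groupby sorts the group keys by default, so both results come back ordered by ID and Region, like the "order by" in the SQL version.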
As for the second one, I'd say the answer is no. It's possible to write 'df.ID' because of the Python data model:
Attribute references are translated to lookups in this dictionary, e.g., m.x is equivalent to m.__dict__["x"]
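As a small illustration of where the attribute shortcut works and where it does not, here is a sketch on a cut-down version of the example data. As far as I know, pandas only exposes a column as an attribute when its name is a valid Python identifier and does not clash with an existing DataFrame attribute or method, so the column called count from the question is a case where the quoted form is the only reliable one.

```python
import pandas as pd

# A cut-down version of the example data, for illustration only.
df = pd.DataFrame({"ID": [100, 101, 102], "count": [2, 3, 1]})

# Attribute access and quoted access return the same column for "ID".
id_total_attr = df.ID.sum()
id_total_item = df["ID"].sum()

# "count" is also the name of a DataFrame method, so df.count is the
# method, not the column; the quoted form still returns the column.
count_attr_is_series = isinstance(df.count, pd.Series)
count_item_is_series = isinstance(df["count"], pd.Series)
count_total = df["count"].sum()

print(id_total_attr, id_total_item, count_total)
print(count_attr_is_series, count_item_is_series)
```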