I have a df (Apple_farm) and need to calculate a percentage based on values found in two of the columns (Good_apples and Total_apples) and then add the resulting values to a new column within Apple_farm called 'Perc_Good'.
I have tried:
Apple_farm['Perc_Good'] = (Apple_farm['Good_apples'] / Apple_farm['Total_apples']) *100
However this results in this error:
TypeError: unsupported operand type(s) for /: 'str' and 'str'
Doing
print(Apple_farm['Good_apples'])
and
print(Apple_farm['Total_apples'])
yields a list with numerical values; however, dividing them seems to result in them being converted to strings?
I have also tried to define a new function:
def percentage(amount, total):
    percent = amount / total * 100
    return percent
but I am unsure how to use it.
Any help would be appreciated as I am fairly new to Python and pandas!
I think you need to convert the string columns to float or int, because their type is string (even though the values look like numbers):
# convert to float
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(float)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(float)
# or, alternatively, convert to int
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(int)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(int)
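If some entries might not be valid numbers (an assumption; the question does not say so), astype raises a ValueError on them. A minimal sketch of one alternative is pd.to_numeric with errors="coerce", which turns unparseable values into NaN instead. The "n/a" entry below is made-up sample data, not taken from the question:

```python
import pandas as pd

# Hypothetical data: the last Good_apples entry is not a valid number.
Apple_farm = pd.DataFrame({
    "Good_apples": ["10", "20", "n/a"],
    "Total_apples": ["20", "80", "30"],
})

# Convert each column; values that cannot be parsed become NaN.
for col in ["Good_apples", "Total_apples"]:
    Apple_farm[col] = pd.to_numeric(Apple_farm[col], errors="coerce")

# The division now works; rows with NaN inputs give NaN in Perc_Good.
Apple_farm["Perc_Good"] = Apple_farm["Good_apples"] / Apple_farm["Total_apples"] * 100
print(Apple_farm)
```

Rows that end up as NaN can then be inspected or dropped before the percentages are used.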
Sample:
import pandas as pd
Good_apples = ["10", "20", "3", "7", "9"]
Total_apples = ["20", "80", "30", "70", "90"]
d = {"Good_apples": Good_apples, "Total_apples": Total_apples}
Apple_farm = pd.DataFrame(d)
print(Apple_farm)
  Good_apples Total_apples
0          10           20
1          20           80
2           3           30
3           7           70
4           9           90
print(Apple_farm.dtypes)
Good_apples     object
Total_apples    object
dtype: object
print(Apple_farm.at[0,'Good_apples'])
10
print(type(Apple_farm.at[0,'Good_apples']))
<type 'str'>
Apple_farm['Good_apples'] = Apple_farm['Good_apples'].astype(int)
Apple_farm['Total_apples'] = Apple_farm['Total_apples'].astype(int)
print(Apple_farm.dtypes)
Good_apples     int32
Total_apples    int32
dtype: object
print(Apple_farm.at[0,'Good_apples'])
10
print(type(Apple_farm.at[0,'Good_apples']))
<type 'numpy.int32'>
Apple_farm['Perc_Good'] = (Apple_farm['Good_apples'] / Apple_farm['Total_apples']) *100
print(Apple_farm)
   Good_apples  Total_apples  Perc_Good
0           10            20       50.0
1           20            80       25.0
2            3            30       10.0
3            7            70       10.0
4            9            90       10.0
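As for the percentage function from the question: once the columns are numeric, it can probably be called with the two columns directly, since arithmetic on pandas Series is applied element by element. A minimal sketch, assuming the same sample data as above and Python 3 style print calls:

```python
import pandas as pd

def percentage(amount, total):
    """Return amount as a percentage of total (works on numbers or Series)."""
    return amount / total * 100

# Same sample data as above, converted from strings to integers in one step.
Apple_farm = pd.DataFrame({
    "Good_apples": ["10", "20", "3", "7", "9"],
    "Total_apples": ["20", "80", "30", "70", "90"],
}).astype(int)

# Passing whole columns computes the percentage row by row.
Apple_farm["Perc_Good"] = percentage(Apple_farm["Good_apples"],
                                     Apple_farm["Total_apples"])
print(Apple_farm)
```

This gives the same Perc_Good column as the inline expression, so the function is mainly a matter of readability.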