I have a very large dataframe with rows as observations and columns as genetic markers. I would like to create a new column that contains the sum of a select number of columns for each observation using R.
If I have 200 columns and 100 rows, I would like to create a new column that has 100 rows with the sum of, say, columns 43 through 167. The columns contain either 1 or 0. With the new column that contains the sum of each row, I will be able to sort the individuals by how many genetic markers they have.
I feel it is something close to:
data$new=sum(data$[,43:167])
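You can use rowSums on just the columns you want to add up:
data$new <- rowSums(data[, 43:167])
That should give you what you want. Note that rowSums(data) on its own would sum all 200 columns, not only columns 43 through 167.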
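Here is a small sketch of the rowSums approach on a toy data frame, in case it helps to see it end to end. The data, the marker names m1 to m6, and the column range 2:5 are made up for illustration; with the real data the range would be 43:167. It assumes the marker columns are numeric 0/1 values.

```r
# Hypothetical toy data: 4 individuals, 6 binary markers
data <- data.frame(
  m1 = c(1, 0, 1, 0),
  m2 = c(1, 0, 0, 1),
  m3 = c(0, 0, 1, 1),
  m4 = c(1, 0, 1, 1),
  m5 = c(0, 1, 1, 1),
  m6 = c(1, 1, 0, 0)
)

# Sum columns 2 through 5 for each row and store the result in a new column
data$new <- rowSums(data[, 2:5])

# Sort individuals from most to fewest markers
sorted <- data[order(data$new, decreasing = TRUE), ]
print(sorted)
```

If some markers are missing (NA), rowSums(data[, 2:5], na.rm = TRUE) would skip them instead of returning NA for that row.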