Reputation: 63
If I have a dataframe of election results by district and candidate, is there an easy way to find the winner in each district in R? That is, for each row, select both the maximum value and the column name for that maximum value?
District CandidateA CandidateB CandidateC
1 702 467 35
2 523 642 12
...
So I'd want to select not only 702 in row 1 and 642 in row 2, but also "CandidateA" from row 1 and "CandidateB" in row 2.
I'm asking this as a learning question, as I know I can do this with any general-purpose scripting language like Perl or Ruby. Perhaps R isn't the tool for this, but it seems like it could be. Thank you.
Upvotes: 1
Views: 1266
Reputation: 226182
d <- read.table(textConnection(
"District CandidateA CandidateB CandidateC
1 702 467 35
2 523 642 12"),
header=TRUE)
d2 <- d[,-1] ## drop district number
data.frame(winner=names(d2)[apply(d2,1,which.max)],  ## column name of each row's maximum
           votes=apply(d2,1,max))                    ## the maximum itself
result:
winner votes
1 CandidateA 702
2 CandidateB 642
Do you need to worry about ties? See the help for which and which.max; they treat ties differently: which.max returns only the first maximum, while which(x == max(x)) returns every tied position.
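A minimal sketch of that difference, using a made-up tied row (the vote counts here are hypothetical, not from the question):

```r
## hypothetical row in which CandidateA and CandidateB tie
v <- c(CandidateA = 500, CandidateB = 500, CandidateC = 12)

first_only <- names(which.max(v))        ## name of the first maximum only
all_tied   <- names(which(v == max(v)))  ## names of every tied maximum

print(first_only)
print(all_tied)
```

If ties are possible in your data, the second form is probably the safer starting point, since it doesn't silently pick one winner.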
Upvotes: 5
Reputation: 1594
If this isn't too messy, you can try running a for loop and printing out the results using cat. So if your data.frame object is x:
for (i in seq_len(nrow(x))) {
  votes <- unlist(x[i, -1])                    ## candidate columns only (drop District)
  max_votes <- max(votes)
  winners <- names(votes)[votes == max_votes]  ## every tied winner, if any
  cat(winners, max_votes, "\n")
}
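If you'd rather avoid the explicit loop, here is a rough loop-free sketch using base R's max.col. This is a different technique from the loop above, not a rewrite of it. The data frame name x and the result name res are assumptions for illustration, and ties.method = "first" is set explicitly because max.col breaks ties at random by default.

```r
## example data from the question; the name x is an assumption
x <- data.frame(District   = 1:2,
                CandidateA = c(702, 523),
                CandidateB = c(467, 642),
                CandidateC = c(35, 12))

votes <- as.matrix(x[, -1])                     ## candidate columns only
idx   <- max.col(votes, ties.method = "first")  ## column index of each row's maximum

res <- data.frame(District = x$District,
                  winner   = colnames(votes)[idx],
                  votes    = votes[cbind(seq_len(nrow(votes)), idx)])
print(res)
```

Note that this reports a single winner per district, so it won't show ties the way the loop does.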
Upvotes: 1