CodeEnthusiast

Reputation: 63

Select a column by criteria *and* the column name in each row of an R dataframe?

If I have a dataframe of election results by district and candidate, is there an easy way to find the winner in each district in R? That is, for each row, select both the maximum value and the column name for that maximum value?

District   CandidateA   CandidateB   CandidateC
1          702          467          35
2          523          642          12
...

So I'd want to select not only 702 in row 1 and 642 in row 2, but also "CandidateA" from row 1 and "CandidateB" in row 2.

I'm asking this as a learning question, as I know I can do this with any general-purpose scripting language like Perl or Ruby. Perhaps R isn't the tool for this, but it seems like it could be. Thank you.

Upvotes: 1

Views: 1266

Answers (2)

Ben Bolker

Reputation: 226182

d <- read.table(textConnection(
"District   CandidateA   CandidateB   CandidateC
1          702          467          35
2          523          642          12"),
header=TRUE)

d2 <- d[,-1]  ## drop district number
## for each row: the name of the column holding the maximum, and the maximum itself
data.frame(winner=names(d2)[apply(d2,1,which.max)],
           votes=apply(d2,1,max))

result:

      winner votes
1 CandidateA   702
2 CandidateB   642

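Do you need to worry about ties? See the help for which and which.max; they treat ties differently: which.max returns only the first position of the maximum, whereas which(x == max(x)) returns every position that attains it.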
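
To illustrate the difference, here is a minimal sketch using a made-up vote vector (the values are hypothetical, not taken from the question) in which two candidates tie:

```r
# Hypothetical row with a tie between CandidateA and CandidateB
votes <- c(CandidateA = 500, CandidateB = 500, CandidateC = 12)

# which.max() returns only the first position of the maximum
first_winner <- names(votes)[which.max(votes)]

# which() with an equality test returns every position that attains the maximum
all_winners <- names(votes)[which(votes == max(votes))]

print(first_winner)
print(all_winners)
```

With apply(d2,1,which.max) a tie would silently be resolved in favour of the leftmost column, so the which() form is probably the safer choice if ties matter.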

Upvotes: 5

oeo4b

Reputation: 1594

If this isn't too messy, you can try running a for loop and printing out the results using cat. So if your data.frame object is x:

for(i in seq_len(nrow(x))) {
  votes <- x[i, -1]                                  # candidate columns only; drop District
  max_votes <- max(votes)
  winner <- names(votes)[which(votes == max_votes)]  # every tied winner, if any
  cat(winner, max_votes, "\n")
}
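
If the results are needed for further processing rather than just printed, one possible alternative to the loop is to collect them in a data frame. The sketch below assumes the data frame is named x as above and uses base R's max.col; the names votes, idx and results are illustrative only, and ties are resolved here by taking the first column.

```r
# Sample data from the question, stored as x to match the loop above
x <- read.table(text = "District CandidateA CandidateB CandidateC
1 702 467 35
2 523 642 12", header = TRUE)

votes <- as.matrix(x[, -1])                    # candidate columns only
idx <- max.col(votes, ties.method = "first")   # column index of each row's maximum

results <- data.frame(
  District = x$District,
  winner   = colnames(votes)[idx],
  votes    = votes[cbind(seq_len(nrow(votes)), idx)]  # pick one cell per row
)
print(results)
```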

Upvotes: 1
