PrinceOfToe

Reputation: 1

R row selection providing partial results

I ran into an issue that I've found a workaround for, but I'd like to understand what was going on in the original code.

I started with a table pulled from an SQL database and wanted the rows for one client, who is covered by two account numbers.

Originally I was running this to select those account numbers.

match <- c("C524",'5568')
gtc <- gtc[gtc$AccountNumber == match,]

However, this returned only about half of the desired rows, and the rows returned varied from run to run (this was a weekly report) and depending on the PC running it.

Now, I've set up a loop which works fine and extracts all the results, but would really like to know what was going on with the original query.

match <- c("C524", "5568")
result <- NULL  # start empty; rbind(NULL, df) just returns df
for (each in match) {
  gtcLoop <- gtc[gtc$AccountNumber == each, ]  # one account number at a time
  result <- rbind(result, gtcLoop)
}

Also, long time lurker, first time poster so let me know if I've done anything wrong in this question.

Upvotes: 0

Views: 36

Answers (2)

C8H10N4O2

Reputation: 18995

Just to tag onto Qaswed's answer (+1), you need to understand what is happening when you compute vector comparisons like ==. See:

?`==`

and

?`%in%`

then try something like 1 == c(1,2) and 1 %in% c(1,2).

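You get only about half the rows because == does not test membership. It compares element by element and recycles the shorter vector: row 1 is compared with the first value, row 2 with the second, row 3 with the first again, and so on. A row is kept only when its account number happens to line up with the recycled value, so the result also depends on row order:

df <- data.frame(id = 1:5, acct_cd = letters[1:5])
df[df$acct_cd == c("a", "c"), ]   # wrong, for demo only: returns row 1 only, with a recycling warning
df[df$acct_cd %in% c("a", "c"), ] # correct: returns rows 1 and 3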
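To make the recycling visible, here is a small sketch with made-up data (the names wanted and acct are my own, not from your code). It builds the recycled vector by hand and shows that == compares against exactly that vector, while %in% tests membership:

```r
# Hypothetical data: every element belongs to one of the two wanted accounts
wanted <- c("C524", "5568")
acct   <- c("C524", "C524", "5568", "5568", "C524", "5568")

# What == really compares against: `wanted` repeated to the length of `acct`
recycled <- rep(wanted, length.out = length(acct))

eq_mask <- acct == wanted    # position-by-position comparison
in_mask <- acct %in% wanted  # membership test

identical(eq_mask, acct == recycled)  # TRUE: == used the recycled vector
sum(eq_mask)  # 4 of the 6 elements line up with the recycled value
sum(in_mask)  # 6, every element is found
```

Because the length of acct here is a multiple of 2, R does not even emit the recycling warning, which is probably why the problem went unnoticed in your report.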

Upvotes: 0

Qaswed

Reputation: 3879

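You need to replace == with %in%:

match <- c("C524", "5568")
# made-up example data: the two wanted account numbers plus one other value
gtc <- data.frame(AccountNumber = sample(c(match, "something"), 10, replace = TRUE))

gtc[gtc$AccountNumber %in% match, ]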
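As for why the == version gave different results on different runs and PCs: a likely explanation, assuming the SQL query has no ORDER BY, is that the rows came back in a different order each time, and == is sensitive to row order. A minimal sketch with made-up data (wanted, gtc_a and gtc_b are hypothetical names):

```r
wanted <- c("C524", "5568")

# Same four rows in two different orders
gtc_a <- data.frame(AccountNumber = c("C524", "5568", "C524", "5568"),
                    stringsAsFactors = FALSE)
gtc_b <- gtc_a[c(2, 1, 4, 3), , drop = FALSE]

# == depends on how the rows line up with the recycled vector
n_eq_a <- nrow(gtc_a[gtc_a$AccountNumber == wanted, , drop = FALSE])  # 4
n_eq_b <- nrow(gtc_b[gtc_b$AccountNumber == wanted, , drop = FALSE])  # 0

# %in% gives the same answer regardless of row order
n_in_a <- nrow(gtc_a[gtc_a$AccountNumber %in% wanted, , drop = FALSE])  # 4
n_in_b <- nrow(gtc_b[gtc_b$AccountNumber %in% wanted, , drop = FALSE])  # 4
```

The drop = FALSE is only there because the example data frame has a single column; without it the subset would collapse to a plain vector.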

Upvotes: 2
