Ed G

Reputation: 822

recode NA in factors

I've got a data frame containing groups of sample points:

samplePoint<-c("1","1","1","1","2","2","2","2","3","3","3","3")
category<-c("a", "a", "a", NA, "b", "b", NA, "b", NA, "a", "a", "a")
values<-c(0.51, 0.21, 0.31, 0.22, 0.61, 0.71, 0.52, 0.32, 0.23, 0.1, 0.24, 0.33)
dat<-data.frame(samplePoint, category, values)

I need to recode the NAs in dat$category for a later step in the process. Each sample point has only one category: sample point 1 should be all "a", 2 all "b", and 3 all "a".

I've tried aggregate using an ifelse function, intending to recode using a match or lookup type function:

codeList<-aggregate(
dat$category, by=list(dat$samplePoint),
FUN=function(x){ifelse(length(which(x=="a")) > length(which(x=="b")), "a", "b")}
)

Question 1: how do I tackle the matching? Question 2: have I totally overcomplicated the whole thing?

Thanks for your help.

Upvotes: 3

Views: 580

Answers (1)

James

Reputation: 66834

Q1: you don't, because, Q2: yes, massively.

What you can do is call factor on your sample points, appropriately transformed and with the required labels.

category <- factor((as.numeric(samplePoint)+1)%%2,labels=letters[1:2])
category
 [1] a a a a b b b b a a a a
Levels: a b

The transformation uses the modulus operator (%%) to convert the sample points to a binary value, after shifting them by one so that points 1 and 3 correspond to the label "a". Any further points would be coded in the same way, i.e. 4: "b", 5: "a".
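To make the arithmetic concrete, here is a small sketch of the intermediate values for points 1 to 5. The names pts, codes and labelled are made up for illustration; as.character() is added on the assumption that the sample points might be stored as a factor, in which case as.numeric() alone would return level codes rather than the point numbers.

```r
# Hypothetical sample points 1..5, stored as character like samplePoint
pts <- c("1", "2", "3", "4", "5")

# Shift by one, then take the remainder modulo 2:
# odd points become 0, even points become 1
codes <- (as.numeric(as.character(pts)) + 1) %% 2
codes
# [1] 0 1 0 1 0

# factor() maps the sorted unique values (0, 1) onto the labels ("a", "b")
labelled <- factor(codes, labels = letters[1:2])
labelled
# [1] a b a b a
# Levels: a b
```

Note that this only works because the categories happen to alternate with the point number; it does not look at the existing category values at all.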

Update

After getting the clarification in the comment, I think this might help:

(catTable <- aggregate(category,list(samplePoint=samplePoint),function(x) unique(x[!is.na(x)])))
  samplePoint x
1           1 a
2           2 b
3           3 a

This gives you a data.frame that you can merge with your original data to get what you want; the merged column x holds the category for every row, including those where category is NA.

merge(dat,catTable,all.x=TRUE)
   samplePoint category values x
1            1        a   0.51 a
2            1        a   0.21 a
3            1        a   0.31 a
4            1     <NA>   0.22 a
5            2        b   0.61 b
6            2        b   0.71 b
7            2     <NA>   0.52 b
8            2        b   0.32 b
9            3     <NA>   0.23 a
10           3        a   0.10 a
11           3        a   0.24 a
12           3        a   0.33 a
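If you would rather fill the NAs in place than carry an extra column, the same lookup table can be used with match(), which is roughly the "matching" step asked about in Question 1. This is only a sketch: it rebuilds the example data with stringsAsFactors = FALSE (an assumption, so that category is a plain character vector), and the name missingRows is made up for illustration. It also assumes every sample point has exactly one non-NA category, as stated in the question.

```r
# Rebuild the example data; stringsAsFactors = FALSE is an assumption
# so that category stays a character vector
samplePoint <- c("1","1","1","1","2","2","2","2","3","3","3","3")
category <- c("a", "a", "a", NA, "b", "b", NA, "b", NA, "a", "a", "a")
values <- c(0.51, 0.21, 0.31, 0.22, 0.61, 0.71, 0.52, 0.32, 0.23, 0.1, 0.24, 0.33)
dat <- data.frame(samplePoint, category, values, stringsAsFactors = FALSE)

# Lookup table: one non-NA category per sample point (column is named x)
catTable <- aggregate(dat$category, list(samplePoint = dat$samplePoint),
                      function(x) unique(x[!is.na(x)]))

# For each row with a missing category, find its sample point in the
# lookup table and copy the category across
missingRows <- is.na(dat$category)
dat$category[missingRows] <-
  catTable$x[match(dat$samplePoint[missingRows], catTable$samplePoint)]

dat$category
# [1] "a" "a" "a" "a" "b" "b" "b" "b" "a" "a" "a" "a"
```

If a sample point had more than one distinct category, unique() would return several values for that group and the lookup would no longer be one-to-one, so it is worth checking catTable before relying on it.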

Upvotes: 1
