Charlie

Reputation: 15

Using ifelse statement to condense variables

I'm new to R and taking a very accelerated class with minimal instruction, so I apologize in advance if this is a rookie question.

The assignment is to take a specific column of a data frame that has 21 levels and condense it into 4 levels using an if or ifelse() statement. I've tried what feels like hundreds of combinations, but this is the code that seemed most promising:

> b2$LANDFORM=ifelse(b2$LANDFORM=="af","af_type",
        ifelse(b2$LANDFORM=="aflb","af_type",
        ifelse(b2$LANDFORM=="afub","af_type",
        ifelse(b2$LANDFORD=="afwb","af_type",
        ifelse(b2$LANDFORM=="afws","af_type",
        ifelse(b2$LANDFORM=="bfr","bf_type",
        ifelse(b2$LANDFORM=="bfrlb","bf_type",
        ifelse(b2$LANDFORM=="bfrwb","bf_type",
        ifelse(b2$LANDFORM=="bfrwbws","bf_type",
        ifelse(b2$LANDFORM=="bfrws","bf_type",
        ifelse(b2$LANDFORM=="lb","lb_type",
        ifelse(bs$LANDFORM=="lbaf","lb_type",
        ifelse(b2$LANDFORM=="lbub","lb_type",
        ifelse(b2$LANDFORM=="lbwb","lb_type","ws_type"))))))))))))))

LANDFORM is a factor, but I tried changing it to a character too, and the code still didn't work.

"ws_type" is the catch-all for the remaining levels.

The code runs without errors, but when I check the result, all I get is:

> unique(b2$LANDFORM)

[1] NA "af_type"

Am I even on the right path? Any suggestions? Should I bite the bullet and make a new column with substr()? Thanks in advance.

Upvotes: 0

Views: 530

Answers (2)

Charlie

Reputation: 15

After a great deal of experimenting, I consulted a co-worker, who was able to simplify this considerably. Basically, I should have made a new column holding the first two letters of each value in LANDFORM, and then tested that new column to replace the values in LANDFORM, which makes the ifelse() statement much shorter. The code is:

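b2$index <- as.factor(substring(b2$LANDFORM, 1, 2))  # first two letters of each level

# map each two-letter prefix to its condensed level; anything else becomes "ub_type"
b2$LANDFORM <- ifelse(b2$index == "af", "af_type",
               ifelse(b2$index == "bf", "bf_type",
               ifelse(b2$index == "lb", "lb_type",
               ifelse(b2$index == "wb", "wb_type",
               ifelse(b2$index == "ws", "ws_type", "ub_type")))))

b2$LANDFORM <- as.factor(b2$LANDFORM)  # convert the result back to a factor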
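
As an aside, a likely reason the nested ifelse() in the question returned only NA and "af_type" is the misspelled b2$LANDFORD in its fourth condition. A column name that does not exist gives NULL rather than an error, so that inner ifelse() returns a zero-length result, and every row that falls through to it ends up as NA. Here is a minimal sketch of that behaviour; the toy data frame demo is a hypothetical stand-in, not the real b2:

```r
# Hypothetical toy data frame standing in for b2 (assumption, not the real data)
demo <- data.frame(LANDFORM = c("af", "aflb", "bfr", "lb"),
                   stringsAsFactors = FALSE)

# A misspelled column name silently returns NULL instead of raising an error
is.null(demo$LANDFORD)

# Comparing NULL with a string gives a zero-length logical vector
length(demo$LANDFORD == "afwb")

# The zero-length inner result is padded out with NA by the outer ifelse(),
# so every row that is not "af" becomes NA
bad <- ifelse(demo$LANDFORM == "af", "af_type",
       ifelse(demo$LANDFORD == "afwb", "af_type", "other"))
unique(bad)
```

Checking names(b2) against the column names used in each condition should catch this kind of typo.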

Thanks to everyone who gave me some guidance!

Upvotes: 0

nicola

Reputation: 24480

If your new levels are just the first two letters of the old ones followed by _type, you can get what you want with:

     #prototype of your column
     mycol<-factor(sample(c("aflb","afub","afwb","afws","bfrlb","bfrwb","bfrws","lb","lbwb","lbws","wslb","wsub"), replace=TRUE, size=100))
     as.factor(paste(sep="",substr(mycol,1,2),"_type"))
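
If exactly four levels are needed, with "ws_type" as the catch-all described in the question, the same prefix idea can be combined with a single ifelse(). This is only a sketch under assumptions: demo and the new column LANDFORM4 are hypothetical names, and the sample levels are illustrative rather than the full set of 21.

```r
# Hypothetical stand-in for b2 with a few illustrative levels (assumption)
demo <- data.frame(
  LANDFORM = factor(c("af", "aflb", "bfr", "bfrws", "lb", "lbwb", "ws", "wb"))
)

# First two letters of each level, taken from the character form of the factor
prefix <- substr(as.character(demo$LANDFORM), 1, 2)

# Keep af/bf/lb as their own groups; everything else falls into "ws_type"
demo$LANDFORM4 <- factor(
  ifelse(prefix %in% c("af", "bf", "lb"), paste0(prefix, "_type"), "ws_type")
)

levels(demo$LANDFORM4)
table(demo$LANDFORM4)
```

Writing to a new column rather than overwriting LANDFORM keeps the original levels available for checking the mapping.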

Upvotes: 1
