nir020

Reputation: 97

ifelse returning only numeric value

I am a new R user and have just started working with data frames. I am trying to create a new column in a data frame using the code below. The problem is that the new column contains numeric values, even though none of the columns used in the code are numeric.

I have tried looking for an answer online but cannot find one.

dataframe$newcol <- ifelse(dataframe$colA == "London", dataframe$colA, dataframe$colB)

Upvotes: 3

Views: 542

Answers (2)

jay.sf

Reputation: 72683

You could write a small new ifelse.fac function for this purpose.

ifelse.fac <- Vectorize(function(x, y, z) if (x) y else z)

Applying it to the data yields:

dat$newcol <- ifelse.fac(dat$colA == "London", dat$colA, dat$colB)
dat
#         colA          colB    newcol
# 1     London not in France    London
# 2     London not in France    London
# 3     London not in France    London
# 4     London not in France    London
# 5      Paris     in France in France
# 6  Marseille     in France in France
# 7      Paris     in France in France
# 8      Paris     in France in France
# 9     London not in France    London
# 10 Marseille     in France in France

And the factor structure remains intact:

str(dat)
# 'data.frame': 10 obs. of  3 variables:
# $ colA  : Factor w/ 3 levels "London","Marseille",..: 1 1 1 1 3 2 3 3 1 2
# $ colB  : Factor w/ 2 levels "in France","not in France": 2 2 2 2 1 1 1 1 2 1
# $ newcol: Factor w/ 5 levels "London","Marseille",..: 1 1 1 1 4 4 4 4 1 4

Data

dat <- structure(list(colA = structure(c(1L, 1L, 1L, 1L, 3L, 2L, 3L, 
3L, 1L, 2L), .Label = c("London", "Marseille", "Paris"), class = "factor"), 
    colB = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L
    ), .Label = c("in France", "not in France"), class = "factor")), row.names = c(NA, 
-10L), class = "data.frame")

head(dat)
#        colA          colB
# 1    London not in France
# 2    London not in France
# 3    London not in France
# 4    London not in France
# 5     Paris     in France
# 6 Marseille     in France

Upvotes: 0

MatthewR

Reputation: 2770

R converts a lot of character columns to factors by default (in versions before 4.0.0, where stringsAsFactors defaulted to TRUE), which can be a little tricky.

You can look at the class of each variable like this:
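A minimal reproduction of the behaviour in the question, assuming the columns are factors. The toy data frame `df` below is made up for illustration. ifelse() takes its attributes from the test only, so the factor attributes of the yes/no arguments are dropped and the result holds the underlying integer codes.

```r
# Hypothetical toy data, built with factors on purpose
df <- data.frame(
  colA = c("London", "Paris"),
  colB = c("not in France", "in France"),
  stringsAsFactors = TRUE
)

# The factor levels are lost, so only the integer codes come back
codes <- ifelse(df$colA == "London", df$colA, df$colB)
class(codes)   # "integer"

# Converting to character first returns the labels instead
labels <- ifelse(df$colA == "London",
                 as.character(df$colA), as.character(df$colB))
class(labels)  # "character"
```

This is why converting the columns to character before calling ifelse() gives the expected text values.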

sapply( dataframe, class )

or

str( dataframe )

You can convert multiple columns like this:

dataframe[ , c("colA", "colB") ] <- lapply( dataframe[ , c("colA", "colB") ], as.character )

You can convert one column at a time like this:

dataframe$colA <- as.character( dataframe$colA )

If you are converting numeric columns that are stored as factors, do it like this:

dataframe$colX <- as.numeric( as.character( dataframe$colX ))
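A small sketch of why the intermediate as.character() matters, using a made-up factor `f`: calling as.numeric() directly on a factor returns its internal integer codes, not the numbers shown in its labels.

```r
f <- factor(c("10", "20", "10"))   # hypothetical numeric-looking factor

as.numeric(f)                 # 1 2 1    (the internal codes)
as.numeric(as.character(f))   # 10 20 10 (the values in the labels)
```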

Your code should work now. Note that I changed == to %in%:

dataframe$newcol <- ifelse(dataframe$colA %in% "London", dataframe$colA, dataframe$colB)
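One practical difference between the two operators, which may be the reason for the switch (an assumption, since the answer does not say): == propagates NA, while %in% always returns TRUE or FALSE. The vector `x` below is made up for illustration.

```r
x <- c("London", NA, "Paris")   # hypothetical column with a missing value

eq  <- x == "London"     # TRUE NA FALSE
inn <- x %in% "London"   # TRUE FALSE FALSE

ifelse(eq,  "yes", "no")   # "yes" NA   "no"
ifelse(inn, "yes", "no")   # "yes" "no" "no"
```

With ==, a missing value in colA would therefore produce NA in the new column, whereas with %in% it falls through to the colB branch.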

You can save yourself some typing by using transform here:

dataframe <- transform( dataframe , newcol = ifelse( colA %in% "London", colA, colB))

Upvotes: 2
