Reputation: 97
I am a new R user and have just started working with data frames. I am trying to create a new column in a data frame using the code below. The problem is that the new column contains numeric values, even though all the columns used in the code are non-numeric.
I have looked online but cannot find an answer.
dataframe$newcol <- ifelse(dataframe$colA == "London", dataframe$colA, dataframe$colB)
Upvotes: 3
Views: 542
Reputation: 72683
You could write a small ifelse.fac function for this purpose:
ifelse.fac <- Vectorize(function(x, y, z) if (x) y else z)
Applying it to the data yields:
dat$newcol <- ifelse.fac(dat$colA == "London", dat$colA, dat$colB)
dat
#         colA          colB    newcol
# 1     London not in France    London
# 2     London not in France    London
# 3     London not in France    London
# 4     London not in France    London
# 5      Paris     in France in France
# 6  Marseille     in France in France
# 7      Paris     in France in France
# 8      Paris     in France in France
# 9     London not in France    London
# 10 Marseille     in France in France
The new column is still a factor, with the levels of colA and colB combined:
str(dat)
# 'data.frame': 10 obs. of 3 variables:
# $ colA : Factor w/ 3 levels "London","Marseille",..: 1 1 1 1 3 2 3 3 1 2
# $ colB : Factor w/ 2 levels "in France","not in France": 2 2 2 2 1 1 1 1 2 1
# $ newcol: Factor w/ 5 levels "London","Marseille",..: 1 1 1 1 4 4 4 4 1 4
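If looping element by element through Vectorize is not wanted, one possible alternative (a sketch, not part of the answer above) is to run ifelse() on the character labels and convert the result back to a factor. The vectors a and b below are made-up stand-ins for colA and colB.

```r
# Hypothetical stand-ins for colA and colB
a <- factor(c("London", "Paris", "Berlin"))
b <- factor(c("not in France", "in France", "not in France"))

# Work on the labels, then rebuild a factor from the result
newcol <- factor(ifelse(a == "London", as.character(a), as.character(b)))
```

Unlike ifelse.fac, the resulting factor only has levels for values that actually occur in the new column.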
Data
dat <- structure(list(colA = structure(c(1L, 1L, 1L, 1L, 3L, 2L, 3L,
3L, 1L, 2L), .Label = c("London", "Marseille", "Paris"), class = "factor"),
colB = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 1L
), .Label = c("in France", "not in France"), class = "factor")), row.names = c(NA,
-10L), class = "data.frame")
head(dat)
#        colA          colB
# 1    London not in France
# 2    London not in France
# 3    London not in France
# 4    London not in France
# 5     Paris     in France
# 6 Marseille     in France
Upvotes: 0
Reputation: 2770
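Before version 4.0.0, R converts character columns to factors by default when it builds a data frame, which can be a little tricky: ifelse() drops the factor attributes and returns the underlying integer codes, which is why your new column is numeric.
You can check the class of each column like this: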
sapply( dataframe, class )
or
str( dataframe )
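To make the cause concrete, here is a minimal sketch with a made-up data frame (the name df and its values are assumptions, not your data). Setting stringsAsFactors = TRUE reproduces the old default, so both columns are factors and ifelse() hands back their integer codes instead of the labels.

```r
# Hypothetical data; stringsAsFactors = TRUE mimics the pre-4.0.0 default
df <- data.frame(
  colA = c("London", "Paris", "Berlin"),
  colB = c("not in France", "in France", "not in France"),
  stringsAsFactors = TRUE
)

# Both columns report class "factor"
col_classes <- sapply(df, class)

# ifelse() returns the integer level codes, not the city names
codes <- ifelse(df$colA == "London", df$colA, df$colB)
```

Printing codes shows plain numbers, which matches the symptom described in the question.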
You can convert multiple columns like this:
dataframe[ , c("colA", "colB") ] <- lapply( dataframe[ , c("colA", "colB") ], as.character )
You can convert one column at a time like this:
dataframe$colA <- as.character( dataframe$colA )
If a factor column holds numbers, convert it via character first, because as.numeric() on a factor returns the level codes:
dataframe$colX <- as.numeric( as.character( dataframe$colX ))
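The reason for the extra as.character() step can be seen in a small sketch; the factor x below is a made-up example, not a column from your data.

```r
# Hypothetical factor whose labels look like numbers
x <- factor(c("10", "20", "10", "5"))

# Direct conversion gives the internal level codes
wrong <- as.numeric(x)

# Going through character recovers the numbers themselves
right <- as.numeric(as.character(x))
```

Here wrong holds the positions of each value in levels(x), while right holds 10, 20, 10 and 5.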
Your code should work now. Note that I changed == to %in%, which returns FALSE rather than NA when colA is missing:
dataframe$newcol <- ifelse(dataframe$colA %in% "London", dataframe$colA, dataframe$colB)
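The difference between == and %in% only shows up with missing values. A small sketch, using made-up character vectors in place of the real columns:

```r
# Hypothetical vectors; the second city is missing
colA <- c("London", NA, "Paris")
colB <- c("x", "y", "z")

# == propagates the NA into the result
with_eq <- ifelse(colA == "London", colA, colB)

# %in% treats the NA as "not London" and falls back to colB
with_in <- ifelse(colA %in% "London", colA, colB)
```

Which behaviour is preferable depends on whether a missing colA should produce a missing result or fall through to colB.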
You can save yourself some typing by using transform here:
dataframe <- transform( dataframe , newcol = ifelse( colA %in% "London", colA, colB))
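Putting the steps together, a minimal end-to-end sketch might look like this. The data frame df and its contents are assumptions for illustration, and stringsAsFactors = TRUE is set only to reproduce factor columns on newer R versions.

```r
# Hypothetical data with factor columns
df <- data.frame(
  colA = c("London", "Paris", "Berlin"),
  colB = c("not in France", "in France", "not in France"),
  stringsAsFactors = TRUE
)

# Convert the factor columns to character
cols <- c("colA", "colB")
df[cols] <- lapply(df[cols], as.character)

# Build the new column from the labels
df <- transform(df, newcol = ifelse(colA %in% "London", colA, colB))

# Labels of the new column, whatever its storage class
labels <- as.character(df$newcol)
```

On R versions before 4.0.0, transform may store newcol as a factor again, but its labels are the expected text rather than integer codes.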
Upvotes: 2