owl

Reputation: 2061

Stop R from converting a character factor to number

I am trying to convert missing factor values to NA in a data frame and build a new data frame with the replaced values. When I do that, the factor columns that previously held character values are all converted to numbers. I cannot figure out what I am doing wrong and cannot find a similar question. Could anybody please help?

Here is my code:

orders <- c('One','Two','Three', '')
ids <- c(1, 2, 3, 4)
values <- c(1.5, 100.6, 19.3, '')

df <- data.frame(orders, ids, values)
new.df <- as.data.frame(matrix( , ncol = ncol(df), nrow = 0))
names(new.df) <- names(df)

for(i in 1:nrow(df)){
    row.df <- df[i, ]
    print(row.df$orders) # "One", "Two", "Three", ""
    print(str(row.df$orders)) # Factor
    # Want to replace "orders" value in each row with NA if it is missing 
    row.df$orders <- ifelse(row.df$orders == "", NA, row.df$orders)
    print(row.df$orders) # Converted to number
    print(str(row.df$orders)) # int or logi
    # Add the row with new value to the new data frame
    new.df[nrow(new.df) + 1, ] <- row.df
    }

and I get this:

> new.df
  orders ids values
1      2   1      2
2      4   2      3
3      3   3      4
4     NA   4      1

but I want this:

> new.df
  orders ids values
1    One   1    1.5
2    Two   2  100.6
3  Three   3   19.3
4     NA   4       

Upvotes: 1

Views: 616

Answers (2)

owl

Reputation: 2061

Thanks to the hint from Ronak Shah, I did this and it gave me what I wanted.

df$orders[df$orders == ''] <- NA

This will give me:

> df
  orders ids values
1    One   1    1.5
2    Two   2  100.6
3  Three   3   19.3
4   <NA>   4       

> str(df)
'data.frame':   4 obs. of  3 variables:
 $ orders: Factor w/ 4 levels "","One","Three",..: 2 4 3 NA
 $ ids   : num  1 2 3 4
 $ values: Factor w/ 4 levels "","1.5","100.6",..: 2 3 4 1
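The str() output above shows that the empty string is still one of the four levels of orders, even though no row uses it any more. A minimal sketch of one way to drop it, assuming the data frame is built with stringsAsFactors = TRUE (the default before R 4.0, which the factor columns in this thread suggest) and leaving out the values column for brevity:

```r
# Assumption: stringsAsFactors = TRUE mimics the pre-4.0 default seen in the question.
orders <- c('One', 'Two', 'Three', '')
ids <- c(1, 2, 3, 4)
df <- data.frame(orders, ids, stringsAsFactors = TRUE)

# Replace the empty string with NA; the "" level itself is still listed.
df$orders[df$orders == ''] <- NA
levels(df$orders)

# droplevels() removes levels that no longer occur in the data.
df$orders <- droplevels(df$orders)
levels(df$orders)
```

Whether the leftover level matters depends on what is done with the factor afterwards; table() and plotting functions, for example, would otherwise still show an empty category.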

In case you are curious about the difference between NA and <NA>, as I was, you can find the answer here.

Your suggestion

df$orders[is.na(df$orders)] <- NA

did not work, maybe because the missing entry is not NA?
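For what it is worth, a quick check suggests that this is the reason: an empty string is an ordinary value, so is.na() returns FALSE for it. The same sketch also shows why ifelse() in the question produced numbers, since ifelse() does not keep the factor attributes and returns the underlying integer codes. The vector f below is a made-up stand-in for df$orders, not the original data frame column.

```r
# Hypothetical stand-in for df$orders, built explicitly as a factor.
f <- factor(c('One', 'Two', 'Three', ''))

is.na(f)   # "" is a real level, not a missing value
f == ''    # this comparison is what actually finds the empty entry

# ifelse() returns the integer level codes, not the labels.
codes <- ifelse(f == '', NA, f)
codes

# Converting to character first keeps the labels.
labels <- ifelse(f == '', NA, as.character(f))
labels
```

Assigning NA by index, as in the line at the top of this answer, avoids ifelse() altogether and keeps the column a factor.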

Upvotes: 0

Ronak Shah

Reputation: 388992

Convert empty values to NA and use type.convert to change their class.

df[df == ''] <- NA
df <- type.convert(df)
df
#  orders ids values
#1    One   1    1.5
#2    Two   2  100.6
#3  Three   3   19.3
#4   <NA>   4     NA

str(df)
#'data.frame':  4 obs. of  3 variables:
#$ orders: Factor w/ 3 levels "One","Three",..: 1 3 2 NA
#$ ids   : int  1 2 3 4
#$ values: num  1.5 100.6 19.3 NA
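On newer R versions the same idea may need two small adjustments: data.frame() no longer creates factors by default since R 4.0, and type.convert() expects as.is to be given explicitly. A minimal sketch, assuming as.is = TRUE is acceptable (so orders stays character instead of becoming a factor):

```r
orders <- c('One', 'Two', 'Three', '')
ids <- c(1, 2, 3, 4)
values <- c(1.5, 100.6, 19.3, '')   # the '' turns this into a character vector

# Assumption: keep strings as character (the default since R 4.0).
df <- data.frame(orders, ids, values, stringsAsFactors = FALSE)

df[df == ''] <- NA                    # empty strings become NA
df <- type.convert(df, as.is = TRUE)  # re-guess the type of each column

str(df)
```

If a factor is wanted for orders, it can be converted afterwards with factor(df$orders), which only creates levels for the values that are actually present.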

Upvotes: 1
