Joe

Reputation: 1768

Numeric column is coerced to character column when another character column is modified

I have a data frame with two columns: the first contains numbers, the second strings. My problem is that once I replace a string in the second column with another string, the first column is coerced from class numeric to character. Here is an example:

df <- data.frame(num = c(1, 2), char = c("a", "b"), stringsAsFactors = FALSE)
class(df$num) # "numeric"
class(df$char) # "character"
df[df$char == "a", ] <- "c"
class(df$char) # "character" 
class(df$num) # "character" !!

What is the reason for this behavior, and how can I prevent it?

Upvotes: 0

Views: 46

Answers (2)

David Kaufman

Reputation: 1069

Look at df after you change it:

> df
  num char
1   c    c
2   2    b
> 

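So of course $num has become character. Because of the comma, df[df$char == "a", ] selects entire rows, so "c" was assigned to every column of the matching row, including num. A column can hold only one type, so the remaining numbers in num were converted to strings.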
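
To see the mechanism in isolation, here is a minimal sketch using the question's df (the names sel and x are only for illustration). It shows that the row index selects both columns, and that a numeric vector becomes character as soon as one element is assigned a string:

```r
df <- data.frame(num = c(1, 2), char = c("a", "b"), stringsAsFactors = FALSE)

# A row index with an empty column index selects every column of that row
sel <- df[df$char == "a", ]
names(sel)   # "num" "char"

# An atomic vector holds a single type, so assigning a string
# to one element converts the whole vector to character
x <- c(1, 2)
x[1] <- "c"
class(x)     # "character"
```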

A different substitution command

df[df == "a"] <- "c"

does what you were expecting. Note that it replaces "a" wherever it occurs in the data frame, not only in the char column.
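
A quick check of that command, as a sketch on the question's example data (the name hits is only for illustration). The comparison yields a logical matrix with one cell per entry, and only the cells that are TRUE are overwritten:

```r
df <- data.frame(num = c(1, 2), char = c("a", "b"), stringsAsFactors = FALSE)

hits <- df == "a"     # logical matrix with the same shape as df
df[hits] <- "c"       # replace only the matching cells

class(df$num)         # "numeric"
df$char               # "c" "b"
```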

Upvotes: 0

Joe

Reputation: 1768

I found my error: df[df$char == "a", ] <- "c" overwrites the whole row, which is why the first column is coerced. The correct way to replace "a" with "c" is: df$char[df$char == "a"] <- "c".
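
For completeness, a short sketch of the fix on the example data, assuming the same df as in the question. The second form, which names the column inside the brackets, is an equivalent alternative added here for comparison:

```r
df <- data.frame(num = c(1, 2), char = c("a", "b"), stringsAsFactors = FALSE)

# Index only the char column, so num is never touched
df$char[df$char == "a"] <- "c"

# Equivalent: keep the row condition but name the column explicitly
df[df$char == "b", "char"] <- "d"

class(df$num)   # "numeric"
df$char         # "c" "d"
```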

Upvotes: 2
