user2603386

Replace Values in R - Error Received

I have a data frame, gen1, made up of A's, C's, G's, T's, and 0's. I would like to replace each A with a 1, each C with a 2, each G with a 3, and each T with a 4. When I run gen1[gen1 == "A"] = 1, I get the following warning:

Warning messages:
1: In `[<-.factor`(`*tmp*`, thisvar, value = "1") :
  invalid factor level, NAs generated

The resulting data frame has all of the A's replaced, but there are NA's instead of 1's.

Does anyone know how to do this correctly?

Thanks

Upvotes: 1

Views: 2175

Answers (2)

rhitz

Reputation: 1892

You can do this by setting the argument stringsAsFactors = FALSE when creating the data frame. In R versions before 4.0.0 it defaults to TRUE.

Example Code:

> d <- data.frame(a = c('A','C','G','T','0'), b = c('C','A','G','A','0'), stringsAsFactors = FALSE)
> d
  a b
1 A C
2 C A
3 G G
4 T A
5 0 0
> d[d == 'A'] <- '1'
> d
  a b
1 1 C
2 C 1
3 G G
4 T 1
5 0 0
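
If the data frame already exists and its columns are factors, the same idea can be applied after the fact by converting each column to character first. This is only a minimal sketch: the small gen1 below is made-up data standing in for the asker's data frame, and stringsAsFactors = TRUE is set explicitly so the columns are factors regardless of R version.

```r
# Hypothetical stand-in for the asker's data, with factor columns
gen1 <- data.frame(a = c("A", "C", "G", "T", "0"),
                   b = c("C", "A", "G", "A", "0"),
                   stringsAsFactors = TRUE)

# Convert every column from factor to character; the empty brackets
# keep gen1 a data frame instead of turning it into a list
gen1[] <- lapply(gen1, as.character)

# Character columns accept any replacement value, so no warning is raised
gen1[gen1 == "A"] <- "1"
```

The remaining letters can be replaced the same way, one assignment per letter.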

Upvotes: 0

agstudy

Reputation: 121568

Solution:

You can coerce your factor columns to integer using as.integer:

Using sapply:

sapply(gen1,as.integer)

and colwise from plyr:

library(plyr)
colwise(as.integer)(gen1)

For example, I first generate a data.frame of A, B, C and D:

set.seed(1)
gen1 <- as.data.frame(matrix(sample(LETTERS[1:4], 4 * 5, rep = TRUE), ncol = 4))
##   V1 V2 V3 V4
## 1  B  D  A  B
## 2  B  D  A  C
## 3  C  C  C  D
## 4  D  C  B  B
## 5  A  A  D  D
library(plyr)
colwise(as.integer)(gen1)
##   V1 V2 V3 V4
## 1  2  3  1  1
## 2  2  3  1  2
## 3  3  2  3  3
## 4  4  2  2  1
## 5  1  1  4  3
sapply(gen1, as.integer)
##      V1 V2 V3 V4
## [1,]  2  3  1  1
## [2,]  2  3  1  2
## [3,]  3  2  3  3
## [4,]  4  2  2  1
## [5,]  1  1  4  3
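
Note that as.integer returns each column's internal level codes, and those depend on which values happen to occur in that column (in the output above, D is 4 in V1 but 3 in V2). If a fixed mapping such as A = 1, C = 2, G = 3, T = 4 is needed, one option is a named lookup vector. This is a sketch under assumptions: the codes vector, the gen_num name and the small gen1 below are made up for illustration, and the "0" entries are assumed to stay 0.

```r
# Hypothetical stand-in for the asker's data, with factor columns
gen1 <- data.frame(V1 = c("A", "C", "0"),
                   V2 = c("T", "G", "A"),
                   stringsAsFactors = TRUE)

# Fixed mapping from letter to integer code (assumption: "0" stays 0)
codes <- c("0" = 0L, A = 1L, C = 2L, G = 3L, T = 4L)

# Look each value up by name, column by column;
# as.character() makes the lookup use the labels, not the level codes
gen_num <- gen1
gen_num[] <- lapply(gen1, function(col) unname(codes[as.character(col)]))
```

Any value that is missing from codes would come back as NA, which makes unexpected letters easy to spot.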

Why do you get the warning?

The warning message is explicit: invalid factor level, NAs generated.

You get the warning because you are trying to assign a value that does not belong to the factor's set of levels, so it is replaced by NA. To reproduce the warning:

h <- data.frame(xx = factor(c("A","B")) )
h[h == "A"] <- "C"   ## "C" is not one of the levels of xx
Warning message:
In `[<-.factor`(`*tmp*`, thisvar, value = "C") :
  invalid factor level, NA generated
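
If you want to keep the column as a factor, the new value has to be added to the levels before assigning it. A minimal sketch of that, reusing the h example from above:

```r
h <- data.frame(xx = factor(c("A", "B")))

# Register "C" as a valid level first
levels(h$xx) <- c(levels(h$xx), "C")

# Now the assignment succeeds without a warning and without NAs
h$xx[h$xx == "A"] <- "C"
```

The old level "A" stays in the level set even though it is no longer used; droplevels(h) would remove it if that matters.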

Upvotes: 1
