Reputation: 31
The data frame X looks like this:
State code
New Jersey 1
New York 2
Califronia NA
All columns are factors. I want to replace the NA values with a text value or 0 so that I can transpose the data later.
When I try to run this command:
X[is.na(X)] <- "0"
I get the following warnings:
Warning messages:
1: In `[<-.factor`(`*tmp*`, thisvar, value = "0") :
  invalid factor level, NA generated
2: In `[<-.factor`(`*tmp*`, thisvar, value = "0") :
  invalid factor level, NA generated
3: In `[<-.factor`(`*tmp*`, thisvar, value = "0") :
  invalid factor level, NA generated
4: In `[<-.factor`(`*tmp*`, thisvar, value = "0") :
  invalid factor level, NA generated
The NA values are left unchanged.
Upvotes: 2
Views: 9347
Reputation: 91
Let's create a random data frame and turn one column into a factor that contains NA:
df <- data.frame(a=sample(0:10, size=10, replace=TRUE),
b=sample(20:30, size=10, replace=TRUE))
df[df$a==0,'a'] <- NA
df$a <- as.factor(df$a)
One way to do it is to add the replacement value as a new factor level first and then assign it:
#check levels
levels(df$a)
#[1] "3" "4" "7" "9" "10"
#add a new factor level, i.e. 88 in our example
df$a = factor(df$a, levels=c(levels(df$a), 88))
#convert all NA's to 88
df$a[is.na(df$a)] = 88
#check levels again
levels(df$a)
#[1] "3" "4" "7" "9" "10" "88"
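The two steps above (extend the levels, then assign) can be wrapped in a small helper. This is only a sketch: replace_na_level is a hypothetical name, not a base R function, and it assumes the input is already a factor.

```r
# Hypothetical helper: add `new_level` to a factor and use it for every NA.
replace_na_level <- function(f, new_level) {
  stopifnot(is.factor(f))
  new_level <- as.character(new_level)
  # Extend the levels only if the replacement is not already one of them.
  if (!new_level %in% levels(f)) {
    levels(f) <- c(levels(f), new_level)
  }
  f[is.na(f)] <- new_level
  f
}

a <- factor(c("3", NA, "7", NA))
a <- replace_na_level(a, 88)
levels(a)
as.character(a)
```

Checking for the level before adding it avoids a duplicated level if the helper is run twice on the same column.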
Upvotes: 0
Reputation: 11514
Another alternative using the built-in factor function:
df <- data.frame(a=letters[1:3], b=c("d", "e", NA))
df
a b
1 a d
2 b e
3 c <NA>
Now, recode the factor with factor:
df$b <- factor(df$b, exclude = NULL,
levels = c("d", "e", NA),
labels = c("d", "e", "f"))
df
a b
1 a d
2 b e
3 c f
And for many factors, the following may be useful:
df[] <- lapply(df, function(x){
# check if you have a factor first:
if(!is.factor(x)) return(x)
# otherwise include NAs into factor levels and change factor levels:
x <- factor(x, exclude=NULL)
levels(x)[is.na(levels(x))] <- "0"
return(x)
})
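As a quick check, here is the same lapply applied to a frame shaped like the one in the question. The way X is built below is my assumption about the data (both columns stored as factors); only the body of the function comes from the code above.

```r
# Assumed reconstruction of the question's data, with both columns as factors.
X <- data.frame(State = c("New Jersey", "New York", "Califronia"),
                code  = c("1", "2", NA),
                stringsAsFactors = TRUE)

X[] <- lapply(X, function(x) {
  # leave non-factor columns alone
  if (!is.factor(x)) return(x)
  # make NA an explicit level, then rename that level to "0"
  x <- factor(x, exclude = NULL)
  levels(x)[is.na(levels(x))] <- "0"
  x
})

as.character(X$code)
```

Columns without any NA, such as State here, pass through with their levels unchanged.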
Upvotes: 4
Reputation: 12410
Simply:
X$code <- as.character(X$code) # as.numeric(as.character(X$code)) also works when the levels are numbers
X[is.na(X)] <- "0"
X$code <- as.factor(as.numeric(X$code))
In a loop over all columns except the first (State), it would look like this:
for (i in 2:ncol(X)) {
X[,i] <- as.character(X[,i])
X[which(is.na(X[,i])==TRUE),i] <- "0"
X[,i] <- as.factor(as.numeric(X[,i]))
}
And for a character value like this:
for (i in 2:ncol(X)) {
X[,i] <- as.character(X[,i])
X[which(is.na(X[,i])==TRUE),i] <- "Not Assigned"
X[,i] <- as.factor(X[,i])
}
Or if you prefer not to transform to character first, assign a new level to each column:
for (i in 2:ncol(X)) {
levels(X[,i]) <- c(levels(X[,i]), "Not Assigned")
X[which(is.na(X[,i])==TRUE),i] <- "Not Assigned"
}
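The loops above hard-code columns 2:ncol(X). If the factor columns are not known in advance, a variant that selects them with is.factor might look like the sketch below. The construction of X is an assumption about the question's data, and "Not Assigned" is just the example label used above.

```r
# Assumed reconstruction of the question's data, with both columns as factors.
X <- data.frame(State = c("New Jersey", "New York", "Califronia"),
                code  = c("1", "2", NA),
                stringsAsFactors = TRUE)

# Logical vector marking which columns are factors.
is_fac <- vapply(X, is.factor, logical(1))

# Round-trip each factor column through character, filling NA on the way.
X[is_fac] <- lapply(X[is_fac], function(col) {
  col <- as.character(col)
  col[is.na(col)] <- "Not Assigned"
  factor(col)
})

levels(X$code)
```

Rebuilding each column with factor() also drops any unused levels, which may or may not be what you want.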
Upvotes: 0
Reputation: 2240
The code you wrote will work for matrices, if you don't mind converting back and forth.
> X
State code code2
1 NewJersey 1 NA
2 NewYork 2 0
3 Califronia NA 4
> X<-as.matrix(X)
> X[is.na(X)] <- "0"
> X<-as.data.frame(X)
> X
State code code2
1 NewJersey 1 0
2 NewYork 2 0
3 Califronia 0 4
> str(X)
'data.frame': 3 obs. of 3 variables:
$ State: Factor w/ 3 levels "Califronia","NewJersey",..: 2 3 1
$ code : Factor w/ 3 levels " 1"," 2","0": 1 2 3
$ code2: Factor w/ 3 levels " 0"," 4","0": 3 1 2
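Note the padded levels such as " 1" and " 0" in the str() output: they appear to come from the matrix conversion formatting the numeric columns to a common width. If that is unwanted, one option is to trim whitespace before converting back. This is a sketch under the assumption that code and code2 started out numeric, as the padding suggests.

```r
# Assumed reconstruction of the data shown above (code/code2 numeric).
X <- data.frame(State = c("NewJersey", "NewYork", "Califronia"),
                code  = c(1, 2, NA),
                code2 = c(NA, 0, 4),
                stringsAsFactors = TRUE)

M <- as.matrix(X)
M[is.na(M)] <- "0"
# Strip any padding; assigning into M[] keeps the matrix shape and dimnames.
M[] <- trimws(M)
X <- as.data.frame(M, stringsAsFactors = TRUE)

levels(X$code)
levels(X$code2)
```

After trimming, the padded " 0" and the replacement "0" in code2 collapse into a single level.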
Upvotes: 0