Reputation: 73
In Excel you can create a new_column and define it to have the value 0 if old_column is blank and 1 if old_column is not blank.
new_column=IF(ISBLANK([@[old_column]]),0,1)
Can anyone think of an efficient way to do this for a data frame in R? Say a column in the data frame is called old_column; I want to add a new_column to the data frame with the above definition.
I tried this:
mydf$old_column[is.na(mydf$old_column)] <- 0
mydf$old_column[!is.na(mydf$old_column)] <- 1
but it gives me this error:
invalid factor level, NA generated
Upvotes: 2
Views: 66
Reputation: 10203
mydf$new_column <- as.integer(!is.na(mydf$old_column))
You may not even need the as.integer(), as is.na() returns a logical (boolean) vector and R treats TRUE and FALSE as 1 and 0 respectively.
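As a small illustration, here is a sketch on a made-up data frame (the values in mydf are hypothetical, since the question does not show the real data). It builds the 0/1 column and shows that the logical vector already behaves like 0/1 in arithmetic:

```r
# Hypothetical example data; the question does not show the real mydf.
mydf <- data.frame(old_column = c("a", NA, "c", NA), stringsAsFactors = FALSE)

# TRUE where old_column has a value, FALSE where it is NA.
has_value <- !is.na(mydf$old_column)

# Store the flag as 0/1 integers.
mydf$new_column <- as.integer(has_value)

# Logicals are coerced to 1/0 in arithmetic, so sum() counts the non-missing rows.
n_present <- sum(has_value)
```

If the column is only ever summed or averaged, keeping has_value as a logical column should work just as well.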
Upvotes: 3
Reputation: 780
If you want to follow the same scheme as in Excel, then you are looking for ifelse:
mydf$new_column <- ifelse(is.na(mydf$old_column),0,1)
Also, in your original code, note that you assign the values to old_column itself, so after the first command no NAs remain and the second command assigns 1 everywhere.
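To make the overwrite issue concrete, here is a minimal sketch on a hypothetical numeric column (numeric so that the factor warning does not get in the way). The vector name overwritten is just for illustration:

```r
# Hypothetical numeric column; not the asker's real data.
mydf <- data.frame(old_column = c(5, NA, 7))

# The two-step approach applied in place: the NA becomes 0 first,
# so nothing is NA any more...
overwritten <- mydf$old_column
overwritten[is.na(overwritten)] <- 0
overwritten[!is.na(overwritten)] <- 1   # ...and every element is set to 1.

# Computing the flag from the untouched column avoids that.
mydf$new_column <- ifelse(is.na(mydf$old_column), 0, 1)
```

Writing the result to new_column also leaves old_column intact for later use.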
The factor problem (guessing here) may be related to how you loaded the data from your csv or xls file into R. Many import functions have a stringsAsFactors parameter that you may want to set to FALSE; otherwise you can run into this kind of error. Provide the code and we can help you. Example:
mydf <- read.csv("myfile.csv", stringsAsFactors = FALSE)
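The following sketch tries to reproduce the guess above. The CSV content in csv_text is invented to stand in for "myfile.csv", and read.csv is given it through its text argument. Note that stringsAsFactors defaulted to TRUE before R 4.0.0, so it is set explicitly here in both calls:

```r
# Hypothetical CSV content standing in for "myfile.csv".
csv_text <- "id,old_column\n1,apple\n2,NA\n3,pear"

# Read the same data once with factors and once with plain character strings.
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(as_factor$old_column)  # "factor"
class(as_char$old_column)    # "character"

# Assigning a value that is not an existing factor level raises a warning;
# tryCatch captures its message instead of printing it.
f <- as_factor$old_column
warn_msg <- tryCatch({
  f[is.na(f)] <- 0
  NULL
}, warning = function(w) conditionMessage(w))

# The same assignment on the character column goes through without complaint.
ch <- as_char$old_column
ch[is.na(ch)] <- 0
```

Here warn_msg should hold the same "invalid factor level" text as in the question, which would support the idea that old_column was read in as a factor.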
Upvotes: 3