mab

Reputation: 73

IF statement construction

In Excel you can create a new_column and define it to be 0 if old_column is blank and 1 if old_column is not blank.

new_column=IF(ISBLANK([@[old_column]]),0,1)

Can anyone think of an efficient way to do this for a data frame in R? Say a column in the data frame is called old_column; I want to add a new_column to the data frame that follows the rule above.

I tried this:

mydf$old_column[is.na(mydf$old_column)] <- 0
mydf$old_column[!is.na(mydf$old_column)] <- 1

but it gives me this warning:

invalid factor level, NA generated

Upvotes: 2

Views: 66

Answers (2)

RoyalTS

Reputation: 10203

mydf$new_column <- as.integer(!is.na(mydf$old_column))

You may not even need the as.integer(): is.na() returns a logical vector, and R treats TRUE and FALSE as 1 and 0 in arithmetic.
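For illustration, here is a minimal sketch of that one-liner on a small made-up data frame. The values in mydf are hypothetical; only the column names come from the question.

```r
# Hypothetical data for illustration; only the column names come from the question.
mydf <- data.frame(old_column = c("a", NA, "c", NA), stringsAsFactors = FALSE)

# !is.na() gives a logical vector; as.integer() turns TRUE/FALSE into 1/0.
mydf$new_column <- as.integer(!is.na(mydf$old_column))

# Without as.integer() the values stay logical, but arithmetic still coerces them,
# so sum() counts the non-missing entries directly.
n_filled <- sum(!is.na(mydf$old_column))
```

Keeping as.integer() is mostly a matter of taste: it makes the stored column an explicit 0/1 integer rather than TRUE/FALSE.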

Upvotes: 3

Picarus

Reputation: 780

If you want to follow the same scheme as in Excel, then you are looking for ifelse:

mydf$new_column <- ifelse(is.na(mydf$old_column),0,1)

Also, note that your original code assigns the values to old_column itself. The first command replaces every NA with 0, so the second command will not find any NAs and will assign 1 everywhere.
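A small sketch of that overwrite problem, using a made-up numeric column (the values are hypothetical) so that the factor issue does not get in the way:

```r
# Hypothetical numeric data, so the two-step overwrite can be seen in isolation.
mydf <- data.frame(old_column = c(5, NA, 7))

x <- mydf$old_column
x[is.na(x)] <- 0     # the NA becomes 0
x[!is.na(x)] <- 1    # nothing is NA any more, so every element becomes 1

# Computing the NA mask once and writing to a new column keeps old_column intact.
is_blank <- is.na(mydf$old_column)
mydf$new_column <- ifelse(is_blank, 0, 1)
```

After the two overwrites x holds only 1s, while new_column still distinguishes the missing entry.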

The factor problem (guessing here) may be related to how you loaded the data from your csv or xls file into R. Many import functions have a stringsAsFactors parameter that you may want to set to FALSE; otherwise you can run into this type of error. Provide the code, and we can help you. Example:

mydf <- read.csv("myfile.csv", stringsAsFactors = FALSE)
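As a rough sketch of where the message can come from, assume old_column ended up as a factor (the data below is made up). Before R 4.0.0, read.csv() converted text columns to factors by default; since 4.0.0 the default is stringsAsFactors = FALSE.

```r
# Hypothetical data: a factor column, as older read.csv() defaults would produce for text.
mydf <- data.frame(old_column = factor(c("a", NA, "b")))

# Assigning 0 into a factor raises the warning because "0" is not one of its levels.
# tryCatch() is used here only to capture the warning text for inspection.
f <- mydf$old_column
msg <- tryCatch(
  {
    f[is.na(f)] <- 0
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)

# is.na() works on factors too, so building a separate column sidesteps the issue.
mydf$new_column <- ifelse(is.na(mydf$old_column), 0, 1)
```

Here msg captures the factor-level warning, and new_column is built without modifying the factor at all.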

Upvotes: 3
