Callahan McGovern

Reputation: 13

How can I add a new column and use an existing column in a data frame in R?

I am trying to add a column called "Visited" based on an existing column called "Visits": if "Visits" is NA, I want "Visited" to be 0, but if "Visits" > 0, "Visited" should be 1. I am getting an error that states "Error in mutate(Visited = if (Visits == "NA") { : object 'Visits' not found". Thank you for all advice! Here is my code:

mutate(Visited = 
  if(Visits == "NA") {
  replace("NA", 0)
  } else {
  replace(1)
  }
)

Upvotes: 1

Views: 52

Answers (3)

bird

Reputation: 3294

ifelse should do the trick. Note: replace df with your data frame's name:

df$Visited = ifelse(is.na(df$Visits), 0, 1)

If you prefer dplyr:

library(dplyr)
df = df %>%
        mutate(Visited = ifelse(is.na(Visits), 0, 1))

Upvotes: 2

r2evans

Reputation: 160407

Some issues:

  1. You cannot use if in a mutate like this: I'm inferring that your data has more than one row, in which case Visits == "NA" will be a logical vector of length greater than 1. The condition passed to if must have length 1. What you likely need is a vectorized conditional such as ifelse or replace.

    There are a few things to realize: vectorized conditionals do not short-circuit (&& and || do short-circuit, & and | do not, and you cannot just interchange them); and ifelse has issues with classes other than logical, integer, numeric, and character.

  2. Your use of replace is incorrect: it requires three arguments (the vector, the positions to replace, and the new values) and infers nothing. You cannot use just replace(1) hoping that it will know to look for a conditional outside of its call.

  3. There is a big difference between the R symbol NA (which can be numeric, logical, string, etc.) and the string "NA". Mis-read data sometimes gives you literal "NA" strings, but typically it does not. Note that comparing NA to anything with == is going to be NA (not TRUE/FALSE), since NA can be interpreted as "could be anything" as well as "not available". Because of this, if you have NAs in your data, then . == "NA" first internally coerces the data to strings, which does not convert NA to "NA", and then looks for the literal string "NA", which is not what you want or need. Use is.na(.) to test for missing values instead. I hope that makes sense.

  4. The error message suggests that you are not passing in data. mutate(Visited = ...) works fine if the call to mutate is in a dplyr/magrittr "pipe" (%>%), but by itself mutate requires its first argument to be a data.frame, as in mutate(mydata, Visited=...).
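
To make points 1 through 3 concrete, here is a minimal base-R sketch. The vector visits is made-up sample data (an assumption, not taken from the question), standing in for the Visits column:

```r
# Hypothetical sample data standing in for the Visits column
visits <- c(NA, 2, 1, NA, 5)

# Comparing against the string "NA" never flags a real missing value:
# the missing entries come back as NA, not TRUE
cmp_string <- visits == "NA"
print(cmp_string)   # NA FALSE FALSE NA FALSE

# is.na() is the reliable test for missing values
is_missing <- is.na(visits)
print(is_missing)   # TRUE FALSE FALSE TRUE FALSE

# Vectorized alternatives to if (...) { } else { }:
# ifelse() picks a value per element
visited_ifelse <- ifelse(is_missing, 0, 1)

# replace() needs all three arguments: the vector, which positions, new values
visited_replace <- replace(rep(1, length(visits)), is_missing, 0)

print(visited_ifelse)   # 0 1 1 0 1
```

Both vectorized forms give the same result here; which one reads better is mostly a matter of taste.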

Here are some equivalent alternatives that should work for you:

mydata %>%
  mutate(
    Visited1 = ifelse(!is.na(Visits) & Visits > 0, 1, 0),
    Visited2 = replace(rep(1, n()), is.na(Visits) | Visits <= 0, 0),
    Visited3 = +(!is.na(Visits) & Visits > 0)
  )

The third takes advantage of R's coercion from logical to integer with the +(.) shortcut.
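
As a small illustration of that coercion, and of why the !is.na(.) guard matters, here is a base-R sketch on a made-up vector x (hypothetical data, not from the question):

```r
# Hypothetical sample data, including a zero and a missing value
x <- c(NA, 2, 0, 5)

# Without the guard, the comparison propagates NA
unguarded <- x > 0
print(unguarded)   # NA TRUE FALSE TRUE

# FALSE & NA is FALSE, so the guard turns the missing entry into FALSE
guarded <- !is.na(x) & x > 0
print(guarded)     # FALSE TRUE FALSE TRUE

# Unary plus coerces logical to integer, the same as as.integer()
flag <- +guarded
print(flag)        # 0 1 0 1
```

Note that +(.) returns integers while ifelse(., 1, 0) returns doubles; for a 0/1 indicator column that difference rarely matters.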

You pick which you prefer.

Upvotes: 4

Chris Ruehlemann

Reputation: 21400

library(dplyr)
df %>%
  mutate(Visited = if_else(is.na(Visits), 0, 1))
  Visits Visited
1     NA       0
2      2       1
3      1       1
4     NA       0
5      5       1 

Data:

df <- data.frame(
  Visits = c(NA, 2, 1, NA, 5)
)

Upvotes: 0
