Reputation: 1342

Convert dataframe column to 1 or 0 for "true"/"false" values and assign to dataframe

In the R cli I am able to do the following on a character column in a data frame:

> data.frame$column.name [data.frame$column.name == "true"] <- 1
> data.frame$column.name [data.frame$column.name == "false"] <- 0
> data.frame$column.name <- as.integer(data.frame$column.name)

I would like to do this as a function, so I tried the following code, passing data.frame$column.name as arg1. It works when I return(arg1), but how do I apply the result to the original data.frame?

boolean.integer <- function(arg1) {
  arg1 [arg1 == "true"] <- 1
  arg1 [arg1 == "false"] <- 0
  arg1 <- as.integer(arg1)
}

Upvotes: 47

Answers (8)

Ant

Reputation: 39

You could use binarize as well.

Upvotes: 0

xihajun

Reputation: 77

data$new_col <- NA_integer_                 # Start with a column of NAs
data$new_col[data$col == "true"] <- 1L      # Replace "true" by 1
data$new_col[data$col == "false"] <- 0L     # Replace "false" by 0

Upvotes: 0

AndrewGB

Reputation: 16856

Another base R option is to use +, which will convert logical values into integer values (i.e., TRUE = 1 and FALSE = 0). Here, I first convert the true and false values to logical (i.e., TRUE and FALSE).

data.frame(sapply(df, \(x) +as.logical(x)))

Note: \(x) is just a shorter notation for function(x), available since R 4.1.0.

Output

  p1_1 p1_2 p1_3 p1_4 p1_5 p1_6 p1_7 p1_8 p1_9 p1_10 p1_11
1    1    0    1    0    1    1    1    0    1     1     0
2    0    1    1   NA   NA   NA   NA    0    0     0     0

Tidyverse

This same notation can also be used in mutate from tidyverse:

library(tidyverse)

df %>% 
  mutate(across(everything(), ~+as.logical(.x)))

Data

df <- structure(list(p1_1 = c("true", "false"), p1_2 = c("false", "true"
), p1_3 = c("true", "true"), p1_4 = c("false", NA), p1_5 = c("true", 
NA), p1_6 = c("true", NA), p1_7 = c("true", NA), p1_8 = c("false", 
"false"), p1_9 = c("true", "false"), p1_10 = c("true", "false"
), p1_11 = c("false", "false")), row.names = c(NA, -2L), class = "data.frame")

#   p1_1  p1_2 p1_3  p1_4 p1_5 p1_6 p1_7  p1_8  p1_9 p1_10 p1_11
#1  true false true false true true true false  true  true false
#2 false  true true  <NA> <NA> <NA> <NA> false false false false

Upvotes: 6

user9641147

Reputation: 31

Try this; it converts TRUE into 1 and FALSE into 0 (as written, it assumes the column is already logical):

data.frame$column.name.num  <- as.numeric(data.frame$column.name)

Then you can convert it into a factor if you want:

data.frame$column.name.num.factor <- as.factor(data.frame$column.name.num)
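
A quick check of how `as.numeric()` treats the two input types may help here. This is only a sketch with made-up vectors (`lgl` and `chr` are illustrative names); the character case is the one from the question, and it needs `as.logical()` first:

```r
lgl <- c(TRUE, FALSE)        # logical input
chr <- c("true", "false")    # character input, as in the question

as.numeric(lgl)                       # 1 0
suppressWarnings(as.numeric(chr))     # NA NA: the strings are not numbers
as.numeric(as.logical(chr))           # 1 0
```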

Upvotes: 3

Estatistics

Reputation: 946

Although you ultimately asked for the opposite (turning 0s and 1s into TRUEs and FALSEs), here is how to turn FALSEs and TRUEs into 0s and 1s for a whole data frame in a single line.

Example given

df <- structure(list(p1_1 = c(TRUE, FALSE, FALSE, NA, TRUE, FALSE, 
                NA), p1_2 = c(FALSE, TRUE, FALSE, NA, FALSE, NA, 
                TRUE), p1_3 = c(TRUE, 
                TRUE, FALSE, NA, NA, FALSE, TRUE), p1_4 = c(FALSE, NA, 
                FALSE,  FALSE, TRUE, FALSE, NA), p1_5 = c(TRUE, NA, 
                FALSE, TRUE, FALSE, NA, TRUE), p1_6 = c(TRUE, NA, 
                FALSE, TRUE, FALSE, NA, TRUE), p1_7 = c(TRUE, NA, 
                FALSE, TRUE, NA, FALSE, TRUE), p1_8 = c(FALSE, 
                FALSE, NA, FALSE, TRUE, FALSE, NA), p1_9 = c(TRUE, 
                FALSE,  NA, FALSE, FALSE, NA, TRUE), p1_10 = c(TRUE, 
                FALSE, NA, FALSE, FALSE, NA, TRUE), p1_11 = c(FALSE, 
                FALSE, NA, FALSE, NA, FALSE, TRUE)), .Names = 
                c("p1_1", "p1_2", "p1_3", "p1_4", "p1_5", "p1_6", 
                "p1_7", "p1_8", "p1_9", "p1_10", "p1_11"), row.names = 
                 c(NA, -7L), class = "data.frame")

   p1_1  p1_2  p1_3  p1_4  p1_5  p1_6  p1_7  p1_8  p1_9 p1_10 p1_11
1  TRUE FALSE  TRUE FALSE  TRUE  TRUE  TRUE FALSE  TRUE  TRUE FALSE
2 FALSE  TRUE  TRUE    NA    NA    NA    NA FALSE FALSE FALSE FALSE
3 FALSE FALSE FALSE FALSE FALSE FALSE FALSE    NA    NA    NA    NA
4    NA    NA    NA FALSE  TRUE  TRUE  TRUE FALSE FALSE FALSE FALSE
5  TRUE FALSE    NA  TRUE FALSE FALSE    NA  TRUE FALSE FALSE    NA
6 FALSE    NA FALSE FALSE    NA    NA FALSE FALSE    NA    NA FALSE
7    NA  TRUE  TRUE    NA  TRUE  TRUE  TRUE    NA  TRUE  TRUE  TRUE

Then, running df * 1 transforms all FALSEs and TRUEs into 0s and 1s. At least, this is what happens in the R version that I have (R version 3.4.4 (2018-03-15)).

> df*1
  p1_1 p1_2 p1_3 p1_4 p1_5 p1_6 p1_7 p1_8 p1_9 p1_10 p1_11
1    1    0    1    0    1    1    1    0    1     1     0
2    0    1    1   NA   NA   NA   NA    0    0     0     0
3    0    0    0    0    0    0    0   NA   NA    NA    NA
4   NA   NA   NA    0    1    1    1    0    0     0     0
5    1    0   NA    1    0    0   NA    1    0     0    NA
6    0   NA    0    0   NA   NA    0    0   NA    NA     0
7   NA    1    1   NA    1    1    1   NA    1     1     1

I do not know whether it is a totally "safe" command under all conditions and for all data frames.
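
On the safety point: the multiplication assumes every column is logical or numeric, and as far as I know a character column makes it fail. The result is also double rather than integer. A more defensive sketch, assuming a hypothetical mixed data frame `mixed` (all names here are made up), converts only the logical columns:

```r
# Hypothetical data frame mixing logical and non-logical columns.
mixed <- data.frame(id   = c("a", "b", "c"),
                    ok   = c(TRUE, FALSE, NA),
                    done = c(FALSE, FALSE, TRUE),
                    stringsAsFactors = FALSE)

# Find the logical columns and convert only those to integer;
# every other column is left untouched.
is_lgl <- vapply(mixed, is.logical, logical(1))
mixed[is_lgl] <- lapply(mixed[is_lgl], as.integer)
```

After this, `ok` and `done` are integer columns with NA preserved, and `id` is still character.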

Upvotes: 15

DecisionNerd

Reputation: 1342

@chappers' solution (in the comments) works:

as.integer(as.logical(data.frame$column.name))
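
To get the result back into the original data frame, the function has to return the converted vector and the caller has to assign it to the column. A minimal sketch, assuming a hypothetical data frame `dat` with a character column `flag` (both names are made up for illustration):

```r
# Hypothetical example data; `dat` and `flag` are illustrative names.
dat <- data.frame(flag = c("true", "false", NA, "true"),
                  stringsAsFactors = FALSE)

# Return the converted vector. R functions work on copies of their
# arguments, so the caller's data frame is not modified in here.
boolean.integer <- function(x) {
  as.integer(as.logical(x))
}

# Assign the returned value back to the column to update the data frame.
dat$flag <- boolean.integer(dat$flag)
```

The assignment on the last line is the step missing from the question's attempt: calling the function alone leaves `dat` unchanged.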

Upvotes: 64

A5C1D2H2I1M1N2O1R2T1

Reputation: 193517

Since you're dealing with values that are just supposed to be boolean anyway, just use == and convert the logical result with as.integer:

df <- data.frame(col = c("true", "true", "false"))
df
#     col
# 1  true
# 2  true
# 3 false
df$col <- as.integer(df$col == "true")
df
#   col
# 1   1
# 2   1
# 3   0
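
One difference from `as.logical()` is worth knowing: the `==` comparison sends every non-missing value other than "true" to 0, whereas `as.logical()` turns strings it does not recognise into NA. A sketch with a made-up vector `x` containing an unexpected value:

```r
x <- c("true", "false", "yes", NA)   # "yes" is an unexpected value

as.integer(x == "true")       # 1 0 0 NA
as.integer(as.logical(x))     # 1 0 NA NA
```

Which behaviour is preferable depends on whether stray values should be flagged as missing or silently counted as false.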

Upvotes: 3

Ajay Ohri

Reputation: 3492

You can try ifelse:

> col2=ifelse(df1$col=="true",1,0)
> df1
$col
[1] "true"  "false"

> cbind(df1$col)
     [,1]   
[1,] "true" 
[2,] "false"
> cbind(df1$col,col2)
             col2
[1,] "true"  "1" 
[2,] "false" "0"
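
In the transcript above, `col2` stays a separate vector, and `cbind()` with a character column turns the numbers into strings. To store the result in the data frame itself, the `ifelse()` result can be assigned to the column. A sketch, assuming `df1` is a data frame (the one in the transcript prints like a list):

```r
# Hypothetical data frame standing in for df1.
df1 <- data.frame(col = c("true", "false"), stringsAsFactors = FALSE)

# 1L/0L keep the column integer rather than double.
df1$col <- ifelse(df1$col == "true", 1L, 0L)
```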

Upvotes: 8
