user21390049

Reputation: 129

How to lowercase TRUE and FALSE in CSV files in R

Is there a way to write 'TRUE' and 'FALSE' in lowercase in CSV files from R? I have tried the tolower() function, but every time I open the files the values are still in upper case. I need them in lowercase, like the other words in the file.

I hope to receive some help about this. Many thanks!

Upvotes: 0

Views: 75

Answers (2)

jay.sf

Reputation: 73572

Index into the vector c('false', 'true') with the logical column plus one. This is much faster than coercing to character and then lowercasing.

> c('false', 'true')[df1$x + 1]
 [1] NA      "true"  "false" "true"  "true"  "true"  "true"  "false" "true"  "true" 
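To spell out why the indexing works, here is a minimal sketch on an assumed toy vector `x` (not the answer's `df1`): adding 1 to a logical coerces it to a number, which is then used as a position in the lookup vector.

```r
# Assumed toy input: a logical vector with one missing value.
x <- c(NA, TRUE, FALSE, TRUE)

# Adding 1 coerces the logicals to numbers: FALSE -> 1, TRUE -> 2, NA stays NA.
idx <- x + 1

# Indexing the lookup vector with those positions yields the lowercase labels;
# an NA position returns NA, so missing values are preserved.
labels <- c('false', 'true')[idx]
print(labels)
#> [1] NA      "true"  "false" "true"
```

The result is a character vector, so the column no longer behaves as a logical afterwards.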

Benchmark

$ Rscript --vanilla foo.R
Unit: milliseconds
         expr        min         lq       mean     median         uq        max neval cld
 as.character 125.571624 127.805560 132.652394 128.533733 138.353091 145.753451    10  a 
       subset   5.071165   5.089829   6.038531   5.918286   6.606528   7.921659    10   b

By median time, subsetting takes about 95% less time than the as.character approach (roughly 22 times faster).


Data:

set.seed(42)
n <- 1e1
df1 <- data.frame(x=rbinom(n, 1, .5) |> as.logical())
df1$x[1] <- NA
n <- 1e6
df2 <- data.frame(x=rbinom(n, 1, .5) |> as.logical())
df2$x[1] <- NA

microbenchmark::microbenchmark(
  as.character=tolower(as.character(df2$x)),
  subset=c('false', 'true')[df2$x + 1],
  check='equal', times=10L
)
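If a data frame has several logical columns, the same lookup can be applied to each of them before writing the file. The following is only a sketch: `lower_logicals()` is a hypothetical helper, and the example data and temporary file path are assumptions for illustration.

```r
# Hypothetical helper (not from the answer): convert every logical column
# of a data frame to lowercase 'true'/'false' strings, keeping NA as NA.
lower_logicals <- function(df) {
  is_lgl <- vapply(df, is.logical, logical(1))
  df[is_lgl] <- lapply(df[is_lgl], function(col) c('false', 'true')[col + 1])
  df
}

# Assumed example data with one logical and one integer column.
d <- data.frame(flag = c(TRUE, FALSE, NA), n = 1:3)
out <- lower_logicals(d)

# Write to a temporary file; the path is a placeholder.
path <- tempfile(fileext = ".csv")
write.csv(out, path, row.names = FALSE)
```

Non-logical columns such as `n` are left untouched, so only the TRUE/FALSE values change in the written file.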

Upvotes: 3

DaveArmstrong

Reputation: 22034

I think this isn't an R problem. If I convert the text to lower and then look at the data frame, the values are in lower case.

d <- data.frame(x = c(TRUE, FALSE))
d$x <- tolower(as.character(d$x))
d
#>       x
#> 1  true
#> 2 false
write.csv(d, file="test.csv")

Created on 2025-02-04 with reprex v2.1.1

If I open the .csv as a text file in a text editor (like the one in RStudio), I get the following:

image of correct csv file

As you can see, in the file these are lower-cased. However, I think the problem comes when opening the file in Excel, which converts them to upper case.

excel image with capitalized values

It seems like this is a problem with how Excel opens the file rather than with how R saves it.
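One way to check what R actually wrote, without going through Excel, is to read the raw lines back in R. This is a small sketch that uses a temporary file as a stand-in for test.csv:

```r
# Same data as above, converted to lowercase strings.
d <- data.frame(x = tolower(as.character(c(TRUE, FALSE))))

# Write to a temporary file (placeholder path) with write.csv defaults.
path <- tempfile(fileext = ".csv")
write.csv(d, file = path)

# Read the file back as plain text to see exactly what is stored on disk.
lines <- readLines(path)
writeLines(lines)
#> "","x"
#> "1","true"
#> "2","false"
```

If the raw lines show lowercase values, any upper-case display comes from the program used to open the file.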

Upvotes: 4
