naco

Reputation: 373

How to assign the output of a sapply loop to the original columns in a data frame without losing other columns

I have a data frame whose columns contain string answers from different assessors, who mixed upper and lower case arbitrarily in their answers. I want to convert everything to lower case. I have code that works as follows:

# Creating a reproducible data frame similar to what I am working with
set.seed(42) # fix the random seed so sample() returns the same rows on every run
dfrm <- data.frame(a = sample(names(islands))[1:20],
               b = sample(unname(islands))[1:20],
               c = sample(names(islands))[1:20],
               d = sample(unname(islands))[1:20],
               e = sample(names(islands))[1:20],
               f = sample(unname(islands))[1:20],
               g = sample(names(islands))[1:20],
               h = sample(unname(islands))[1:20])
# This is how I did it originally by writing everything explicitly:
dfrm1 <- dfrm
dfrm1$a <- tolower(dfrm1$a)
dfrm1$c <- tolower(dfrm1$c)
dfrm1$e <- tolower(dfrm1$e)
dfrm1$g <- tolower(dfrm1$g)
head(dfrm1) #Works as intended

The problem is that as the number of assessors increases, I keep making copy-paste errors. I tried to simplify my code by wrapping tolower in a function and using sapply to loop over the columns, but the final data frame does not look like what I wanted:

# function and sapply:
dfrm2 <- dfrm
my_list <- c("a", "c", "e", "g")
my_low <- function(x){dfrm2[,x] <- tolower(dfrm2[,x])}
sapply(my_list, my_low) #Didn't work

# Alternative approach:
dfrm2 <- as.data.frame(sapply(my_list, my_low))
head(dfrm2) #Lost the numbers

What am I missing?

I know this must be a very basic concept that I'm not getting. There was this question and answer that I simply couldn't follow, and this one where my non-working solution simply seems to work. Any help appreciated, thanks!

Upvotes: 0

Views: 117

Answers (2)

lmo

Reputation: 38500

Maybe you want to create a logical vector that selects the columns to change and run an apply function only over those columns.

# only choose non-numeric columns
changeCols <- !sapply(dfrm, is.numeric)

# change values of selected columns to lower case
dfrm[changeCols] <- lapply(dfrm[changeCols], tolower)
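As a rough illustration of the two lines above, here is a minimal sketch on a small made-up data frame (the name `toy` and its values are hypothetical, not taken from the question). It shows that only the selected columns are replaced while the numeric column stays in place.

```r
# Hypothetical toy data frame, used only for illustration
toy <- data.frame(a = c("Borneo", "JAVA", "Cuba"),
                  b = c(280, 49, 43),
                  c = c("TIMOR", "Luzon", "iceland"),
                  stringsAsFactors = FALSE)

# Logical vector: TRUE for every column that is not numeric
changeCols <- !sapply(toy, is.numeric)

# Replace only the selected columns; column b is left untouched
toy[changeCols] <- lapply(toy[changeCols], tolower)

str(toy)
```

Assigning into `toy[changeCols]` rather than into a new object is what keeps the other columns and the original column order.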

If you have other types of columns, say logical, you could also be more explicit about the types of columns that you want to change. For example, to select only factor and character columns, use:

changeCols <- sapply(dfrm, function(x) is.factor(x) || is.character(x))
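To see why the explicit type check can matter, here is a small sketch with a made-up data frame that also has a logical column (`toy` and its columns are hypothetical). With the `!is.numeric` selector the logical column would also be passed through `tolower` and end up as character; the explicit selector leaves it alone. `vapply` is used here only as a stricter stand-in for `sapply`.

```r
# Hypothetical toy data frame with character, numeric, logical and factor columns
toy <- data.frame(name    = c("Borneo", "JAVA"),
                  size    = c(280, 49),
                  visited = c(TRUE, FALSE),
                  region  = factor(c("ASIA", "Asia")),
                  stringsAsFactors = FALSE)

# Select only factor and character columns
changeCols <- vapply(toy, function(x) is.factor(x) || is.character(x), logical(1))

# Lower-case the selected columns; note that tolower() turns a factor into character
toy[changeCols] <- lapply(toy[changeCols], tolower)

str(toy)
```

If the factor columns need to stay factors, they would have to be converted back afterwards, for example with `factor()`.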

Upvotes: 2

Tim Biegeleisen

Reputation: 521249

For your first attempt, if you want the assignments to your data frame dfrm2 to stick, use the <<- assignment operator:

my_low <- function(x){ dfrm2[,x] <<- tolower(dfrm2[,x]) }
sapply(my_list, my_low)
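For context, the original function did not "stick" because `<-` inside a function modifies a local copy of `dfrm2`, which is discarded when the function returns. If you would rather not assign into the enclosing environment with `<<-`, one possible alternative is to assign the result of `lapply` back to the selected columns. The sketch below uses a small made-up `dfrm2` and a shortened `my_list` (both hypothetical) to contrast the two behaviours.

```r
# Hypothetical small stand-in for dfrm2 from the question
dfrm2 <- data.frame(a = c("Borneo", "JAVA"),
                    b = c(280, 49),
                    c = c("TIMOR", "Luzon"),
                    stringsAsFactors = FALSE)
my_list <- c("a", "c")

# Local assignment: the change happens on a copy inside the function
my_low_local <- function(x) { dfrm2[, x] <- tolower(dfrm2[, x]) }
invisible(sapply(my_list, my_low_local))
unchanged <- dfrm2$a   # still in the original mixed case

# Assigning the lapply() result to the named columns modifies dfrm2 itself
dfrm2[my_list] <- lapply(dfrm2[my_list], tolower)
head(dfrm2)
```

This keeps the column names in `my_list` as the single place to edit when more assessors are added.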


Upvotes: 1
