Anthony Damico

Reputation: 6104

what's the most efficient way to perform the same operation(s) on multiple data frames?

my apologies if this is a duplicate, i couldn't find it anywhere..

say i've got a bunch of data frames, and i want to convert all of their column names to lowercase. what's the most efficient way to do this? it's straightforward with assign and get but i'm wondering if there's a faster way?

if i've just got ChickWeight and mtcars, the non-dynamic operation would simply be..

names( ChickWeight ) <- tolower( names( ChickWeight ) )
names( mtcars ) <- tolower( names( mtcars ) )

..and then here's how i would make this process dynamic, but i wonder if there's a more efficient solution?

# column headers contain uppercase
head(ChickWeight)

# start with a vector of data frame names..
# this might contain many, many data frames
tl <- c( 'ChickWeight' , 'mtcars' )

# loop through each data frame name..
for ( i in tl ){
    # save it to a temporary object name
    x <- get( i )

    # main operations here..

    # perform the operation(s) you want to run on each data frame
    names( x ) <- tolower( names( x ) )

    # ..end of main operations


    # assign the updated data frame to overwrite the original data frame
    assign( i , x )
}

# no longer contains uppercase
head(ChickWeight)

Upvotes: 3

Views: 637

Answers (1)

Ben Bolker

Reputation: 226182

I don't think you're likely to gain a whole lot of speed by changing approaches. A more idiomatic way to do this would be to store all of your data frames in a list and use something like:

dlist <- list(mtcars,ChickWeight)

(or)

namevec <- c("mtcars","ChickWeight")
dlist <- lapply(namevec,get)

then:

dlist <- lapply(dlist,function(x) setNames(x,tolower(names(x))))

... but of course in order to use this approach you have to commit to referring to the data frames as list elements, which in turn affects the whole structure of your analysis. If you don't want to do that then I don't see anything much better than your get/assign approach.
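As a minimal sketch of what "referring to the data frames as list elements" looks like in practice (assuming the built-in `mtcars` and `ChickWeight` datasets are visible on the search path): `mget()` fetches several objects by name into a named list in one call, and `lapply()` keeps those names, so each data frame can be pulled out by name afterwards.

```r
namevec <- c("mtcars", "ChickWeight")

# mget() returns a named list with one element per object name;
# inherits = TRUE lets it search beyond the calling environment
dlist <- mget(namevec, inherits = TRUE)

# apply the same operation to every data frame in the list
dlist <- lapply(dlist, function(x) setNames(x, tolower(names(x))))

# the list keeps the original object names, so elements are accessed by name
names(dlist)
names(dlist$ChickWeight)
```

The list names stay as `"mtcars"` and `"ChickWeight"`; only the column names inside each data frame are lowercased.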

If you want to assign the values of the list back to the global environment you can do:

invisible(mapply(assign,namevec,dlist,MoreArgs=list(envir=.GlobalEnv)))

I want to emphasize that this is not necessarily faster or more transparent than the simple approach presented in the original post.
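For what it's worth, base R's `list2env()` is another way to copy a named list back into an environment. The sketch below is an alternative to the `mapply`/`assign` call, not a faster one; `target` is a hypothetical scratch environment used here so nothing gets overwritten, and passing `.GlobalEnv` instead would mirror the call above.

```r
namevec <- c("mtcars", "ChickWeight")

# build a named list of data frames with lowercased column names
dlist <- lapply(mget(namevec, inherits = TRUE),
                function(x) setNames(x, tolower(names(x))))

# hypothetical scratch environment (an assumption for this sketch);
# use envir = .GlobalEnv to overwrite the original objects instead
target <- new.env()

# list2env() assigns each named list element into the environment
invisible(list2env(dlist, envir = target))

names(get("ChickWeight", envir = target))
```

This relies on the list being named, which is why `mget()` is used rather than `lapply(namevec, get)`.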

Upvotes: 2
