millie0725

Reputation: 371

How to extract information from a dataframe name and create a column based on it

Here's some mock data that represents the data I have:

pend4P_17k <- data.frame(x = c(1, 2, 3, 4, 5),
                         var1 = c('a', 'b', 'c', 'd', 'e'),
                         var2 = c(1, 1, 0, 0, 1))
pend5P_17k <- data.frame(x = c(1, 2, 3, 4, 5),
                         var1 = c('a', 'b', 'c', 'd', 'e'),
                         var2 = c(1, 1, 0, 0, 1))

I need to add a column to each data frame that represents the first letter/number code within the dataframe name, so for each dataframe I've been doing the following:

pend4P_17k$Pendant_ID <- "4P"
pend5P_17k$Pendant_ID <- "5P"

However, I have many dataframes to apply this to, so I'd like to create a function that can pull the information out of the dataframe name and apply it to a new column. I have attempted to use regular expressions and pattern matching to create a function, but with no luck (I'm very new to regular expressions).

Using R version 3.5.1, Mac OS X 10.13.6

Upvotes: 1

Views: 124

Answers (3)

zx8754

Reputation: 56219

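Using mget and rbindlist from data.table, you can stack the data frames and record each source data frame's name in an ID column (shown here with two mtcars subsets):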

library(data.table)

m1 <- mtcars[1:2, 1:3]
m2 <- mtcars[3:4, 1:3]

# idcol stores the name of the source data frame in a new column
rbindlist(mget(ls(pattern = "^m")), idcol = "myDF")
#    myDF  mpg cyl disp
# 1:   m1 21.0   6  160
# 2:   m1 21.0   6  160
# 3:   m2 22.8   4  108
# 4:   m2 21.4   6  258
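
The ID column above holds the full data frame name, not the code the question asks for. As a minimal sketch, assuming every name follows the "pend", code, underscore layout from the question, the code could be pulled out of that column with sub(). The column name Pendant_ID comes from the question; the regular expression is an illustrative assumption, not part of the original answer.

```r
library(data.table)

# Mock data from the question
pend4P_17k <- data.frame(x = c(1, 2, 3, 4, 5),
                         var1 = c('a', 'b', 'c', 'd', 'e'),
                         var2 = c(1, 1, 0, 0, 1))
pend5P_17k <- data.frame(x = c(1, 2, 3, 4, 5),
                         var1 = c('a', 'b', 'c', 'd', 'e'),
                         var2 = c(1, 1, 0, 0, 1))

# Stack every object whose name starts with "pend";
# the source name is stored in the "myDF" column
combined <- rbindlist(mget(ls(pattern = "^pend")), idcol = "myDF")

# Assumed pattern: keep whatever sits between "pend" and the first underscore
combined[, Pendant_ID := sub("^pend([^_]+)_.*$", "\\1", myDF)]

unique(combined$Pendant_ID)
# [1] "4P" "5P"
```

Note that ls(pattern = "^pend") picks up any object whose name starts with "pend", so this assumes only the relevant data frames match.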

Upvotes: 1

Maël

Reputation: 52209

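This will do the trick, assuming the data frames are named "pend", a number, then "P_17k", with consecutive numbers from begin to end:

library(dplyr)

f <- function(begin, end) {
  ids <- seq(begin, end)
  df_names <- paste0("pend", ids, "P_17k")
  # Fetch each data frame by name from the global environment (returns a named list)
  listdf <- mget(df_names, envir = .GlobalEnv)

  for (i in seq_along(listdf)) {
    # Use the actual ID rather than a fixed offset, so any begin value works
    listdf[[i]] <- listdf[[i]] %>% mutate(Pendant_ID = paste0(ids[i], "P"))
  }

  # Write the modified data frames back to the global environment
  list2env(listdf, .GlobalEnv)
}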

Gives the desired output:

> f(4,5)
<environment: R_GlobalEnv>

> pend4P_17k
  x var1 var2 Pendant_ID
1 1    a    1         4P
2 2    b    1         4P
3 3    c    0         4P
4 4    d    0         4P
5 5    e    1         4P

> pend5P_17k
  x var1 var2 Pendant_ID
1 1    a    1         5P
2 2    b    1         5P
3 3    c    0         5P
4 4    d    0         5P
5 5    e    1         5P

Upvotes: 1

Allan Cameron

Reputation: 174278

This seems like a pretty bad idea. It's better to keep your data frames in a list rather than strewn about the global environment. However, if you insist, it is possible:

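add_name_cols <- function()
{
  my_global <- ls(envir = globalenv())
  for(i in my_global)
  {
    obj <- get(i, envir = globalenv())
    # is.data.frame() also copes with objects that have several classes (e.g. tibbles)
    if(is.data.frame(obj) && grepl("^pend", i))
    {
      # Capture the code between "pend" and the first underscore, e.g. "4P"
      obj$Pendant_ID <- sub("^pend([^_]+)_.*$", "\\1", i)
      assign(i, obj, envir = globalenv())
    }
  }
}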

add_name_cols()

pend4P_17k
#>   x var1 var2 Pendant_ID
#> 1 1    a    1         4P
#> 2 2    b    1         4P
#> 3 3    c    0         4P
#> 4 4    d    0         4P
#> 5 5    e    1         4P

pend5P_17k
#>   x var1 var2 Pendant_ID
#> 1 1    a    1         5P
#> 2 2    b    1         5P
#> 3 3    c    0         5P
#> 4 4    d    0         5P
#> 5 5    e    1         5P
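
For comparison, the list-based alternative recommended at the top of this answer might look like the sketch below. It is only an illustration: the list name pendants and the helper extract_pendant_id are hypothetical, and the pattern assumes every name has the form "pend", code, underscore, rest.

```r
# Mock data from the question, kept in a named list
# instead of as separate objects in the global environment
pendants <- list(
  pend4P_17k = data.frame(x = c(1, 2, 3, 4, 5),
                          var1 = c('a', 'b', 'c', 'd', 'e'),
                          var2 = c(1, 1, 0, 0, 1)),
  pend5P_17k = data.frame(x = c(1, 2, 3, 4, 5),
                          var1 = c('a', 'b', 'c', 'd', 'e'),
                          var2 = c(1, 1, 0, 0, 1))
)

# Hypothetical helper: return the text between "pend" and the first underscore
extract_pendant_id <- function(df_name) {
  sub("^pend([^_]+)_.*$", "\\1", df_name)
}

# Add Pendant_ID to each data frame, derived from its name in the list
pendants <- Map(function(df, df_name) {
  df$Pendant_ID <- extract_pendant_id(df_name)
  df
}, pendants, names(pendants))

pendants$pend5P_17k$Pendant_ID
# [1] "5P" "5P" "5P" "5P" "5P"
```

Because the data frames stay in one list, nothing is read from or written to the global environment, and the whole set can later be passed to lapply() or combined with do.call(rbind, pendants).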

Upvotes: 3
