Mark C

Reputation: 391

Subsetting data tables by dynamic column names

I'm trying to subset a data.table using a dynamically chosen column name, but I cannot get the following statement to work:

mm2myModuleByYear[grep(i,colnames(mm2myModuleByYear),value=TRUE)==mId,authId]

I'm using the sample data below:

i<-1997
mId<-37

mm2myModuleByYear<-structure(list(authId = c(220, 2269, 2270, 2271, 2991, 2992), 
        module1994 = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, 
        NA_integer_, NA_integer_), module1995 = c(NA_integer_, NA_integer_, 
        NA_integer_, NA_integer_, NA_integer_, NA_integer_), module1996 = c(NA_integer_, 
        NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA_integer_
        ), module1997 = c(1428L, 669L, 37L, NA, NA, NA), module1998 = c(1428L, 
        669L, 37L, NA, 832L, 832L), module1999 = c(1428L, 669L, 37L, 
        NA, 832L, 832L), module2000 = c(31L, 136L, 8L, NA, 1046L, 
        1046L), module2001 = c(31L, 136L, 8L, NA, 1046L, 1046L), 
        module2002 = c(31L, 136L, 8L, NA, 1046L, 1046L), module2003 = c(31L, 
        136L, 8L, 2314L, 1046L, 1046L), module2004 = c(955L, 320L, 
        10L, 1791L, 1361L, 1361L), module2005 = c(955L, 320L, 10L, 
        1791L, 1361L, 1361L), module2006 = c(955L, 320L, 10L, 1791L, 
        1361L, 1361L), module2007 = c(955L, 320L, 10L, 1791L, 1361L, 
        1361L), module2008 = c(955L, 320L, 10L, 1791L, 1361L, 1361L
        ), module2009 = c(16L, 374L, 11L, 1960L, 1544L, 1544L), module2010 = c(16L, 
        374L, 11L, 1960L, 1544L, 1544L), module2011 = c(16L, 374L, 
        11L, 1960L, 1544L, 1544L), module2012 = c(16L, 374L, 11L, 
        1960L, 1544L, 1544L), module2013 = c(16L, 374L, 11L, 1960L, 
        1544L, 1544L)), .Names = c("authId", "module1994", "module1995", 
    "module1996", "module1997", "module1998", "module1999", "module2000", 
    "module2001", "module2002", "module2003", "module2004", "module2005", 
    "module2006", "module2007", "module2008", "module2009", "module2010", 
    "module2011", "module2012", "module2013"), sorted = "module1996", class = c("data.table", 
    "data.frame"), row.names = c(NA, -6L))

However, if I do something very similar, like

mm2myModuleByYear[module1997==mId,grep(i,colnames(mm2myModuleByYear)),with=FALSE]

it works. Am I doing something incorrectly? How do I conditionally set the subset column in a data table?

Upvotes: 3

Views: 233

Answers (1)

Arun

Reputation: 118799

Let's look at the grep() call inside your i-expression (the first argument to [):

grep(i,colnames(mm2myModuleByYear),value=TRUE)
# [1] "module1997"

Therefore the expression:

grep(i,colnames(mm2myModuleByYear),value=TRUE)==mId
# [1] FALSE

would return FALSE, because it compares the column name (the string "module1997") with 37 rather than comparing the column's values. What you intend here is to fetch the column named by your grep() expression. To do that, you can use get() from base R.

with(mm2myModuleByYear, get(grep(i,colnames(mm2myModuleByYear),value=TRUE)))
# [1] 1428  669   37   NA   NA   NA

In short, you're missing a get() in your i-expression.

mm2myModuleByYear[get(grep(i,colnames(mm2myModuleByYear),value=TRUE))==mId, authId]
# [1] 2270
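
If the column name always follows the "module<year>" pattern, you could also build it directly instead of searching for it. The following is only a minimal sketch on a cut-down stand-in for your table: dt, col, res_get and res_vec are names made up for the illustration, and it assumes the data.table package is installed.

```r
library(data.table)

# Cut-down stand-in for the question's table (hypothetical, three rows only)
dt <- data.table(
  authId     = c(220, 2269, 2270),
  module1997 = c(1428L, 669L, 37L),
  module1998 = c(1428L, 669L, 37L)
)

i   <- 1997
mId <- 37

# Build the exact column name; grep(i, ...) is a pattern match and could
# return more than one name if several columns contain "1997"
col <- paste0("module", i)

# Option 1: get() looks the column up by name inside the data.table
res_get <- dt[get(col) == mId, authId]

# Option 2: extract the column as a plain vector with [[ ]] and compare that
res_vec <- dt[dt[[col]] == mId, authId]

print(res_get)
# [1] 2270
print(res_vec)
# [1] 2270
```

Both forms compare the column's values to mId rather than the column's name, which is the same fix as the get() call above.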

Upvotes: 3
