John Bennett

Reputation: 33

Programmatically calling Object based on "names" stored in variables

I have a question about referencing objects by retrieving the name of the object from a variable.

SETUP

library(data.table)
library(stringr)  # str_split(), str_to_title(), str_remove(); also re-exports %>%
object <- c("one", "two", "three")
attributes <- c("green, blue, red", "red", "blue, orange")
DT <- data.table(object,attributes) ; DT

   object       attributes
1:    one green, blue, red
2:    two              red
3:  three     blue, orange

This is the base setup I have (simplified data). I have named objects, and each has attributes assigned. In the original dataset the attributes are comma-delimited strings in a single cell of the table, and they come from a finite, knowable list. In this example I use colors. I need to be able to find objects by attribute, e.g. subset out all the objects with "red" as an attribute. (The real-world data has 20k objects and ~200 attributes.) What I want, after receiving the raw data and creating a data.table, is to create flag columns for all the possible attributes to facilitate searching and subsetting. So this…

DT[, isRed := FALSE]
DT[, isGreen := FALSE]
DT[, isBlue := FALSE]
DT[, isOrange := FALSE]
DT
   object       attributes isRed isGreen isBlue isOrange
1:    one green, blue, red FALSE   FALSE  FALSE    FALSE
2:    two              red FALSE   FALSE  FALSE    FALSE
3:  three     blue, orange FALSE   FALSE  FALSE    FALSE

This creates my baseline data.table: all the flag columns are in place and set to FALSE prior to processing.

The processing is to take the attribute string, parse out the individual attributes, and set the flag accordingly. This is what I am doing…

# take the first object, parse the attributes into a data.table
split.attributes <- 
  str_split(DT[object == "one", attributes], ",", n = Inf) %>% 
  transpose() %>% 
  data.table() 
split.attributes
       .
1: green
2:  blue
3:   red

# format the attributes with an initial uppercase letter, and update the data.table
# (ignore the extra string manipulation, like "\\s"; in the real-world example
# the attributes are sometimes two-word strings that then become a
# single flag name, i.e., "blue green" -> "BlueGreen")
split.attributes <- split.attributes[,.] %>% 
  str_to_title() %>% 
  str_remove("\\s") %>% 
  as.list() %>% 
  data.table()
split.attributes

       .
1: Green
2:  Blue
3:   Red

I already have all the flag column names in the form "is" plus the attribute, e.g. "isRed", so convert the data.table…

# paste "is" in front of the attribute and change the column name to avoid referring to "." later
split.attributes[, col.names := paste0("is",.)] 
split.attributes
       . col.names
1: Green   isGreen
2:  Blue    isBlue
3:   Red     isRed
# then remove the extraneous column
split.attributes[, . := NULL]
split.attributes
   col.names
1:   isGreen
2:    isBlue
3:     isRed
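For comparison, the parse / title-case / prefix steps above can be sketched on a plain character vector in base R, without the intermediate data.table. This is only a minimal sketch: `attr_string`, `parts`, and `flag_names` are hypothetical names introduced here, and the two-word attribute "blue green" is added to the sample string to show the "BlueGreen" case mentioned in the comments.

```r
# One attribute string, as it would appear in a single cell (sample value)
attr_string <- "green, blue green, red"

# Split on commas, then trim the whitespace left around each piece
parts <- trimws(strsplit(attr_string, ",", fixed = TRUE)[[1]])

# Uppercase the first letter of every word and drop the whitespace between
# words in one pass: "blue green" -> "BlueGreen"
title_cased <- gsub("(^|\\s+)(\\w)", "\\U\\2", parts, perl = TRUE)

# Prefix with "is" to get the flag column names
flag_names <- paste0("is", title_cased)
flag_names
```

The result is an ordinary character vector, which is usually easier to pass around as a set of column names than a one-column data.table.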

I now have a set of flag names (that match the actual column names) for the first object in my original data table and I want to assign new values (TRUE) to those flags. What I want to do is call the value from split.attributes[1] and use it as the name of a column in DT. I know one way to do this…

i <- 1  # row of split.attributes to use (here the first flag, "isGreen")
eval(parse( text = (paste0("DT[1, ", eval(split.attributes[i]), " := TRUE]"))))
DT
   object       attributes isRed isGreen isBlue isOrange
1:    one green, blue, red FALSE    TRUE  FALSE    FALSE
2:    two              red FALSE   FALSE  FALSE    FALSE
3:  three     blue, orange FALSE   FALSE  FALSE    FALSE

And my one flag is now TRUE, so we know that for object "one", isGreen is TRUE. Of course, with looping I can set all the necessary flags for all objects. I have seen a lot of specific solutions, but they all follow the same basic idea: turn the variable into a string, concatenate that string with the other strings necessary to build your expression, and then evaluate the full string as an expression.

QUESTION

Is there a better way than "eval(parse( text = (paste0("DT[1, ", eval(split.attributes[i]), " := TRUE]"))))"?
In my mind this is a common problem (or maybe my personal project is unique in this regard), so I feel like you should be able to do something like:
DT[1, get_the_variable_value_and_add_as_part_of_a_function(split.attributes[i]) := TRUE]
Which would then create the underlying expression you want, so that what gets sent to R is:
DT[1, isGreen := TRUE] (as an expression to be evaluated)
Nice and neat, no fuss, no layered functions.

NOTE: I realize I could make my own function for this, but what I'm asking is "does one already exist and I just haven't found it?". I'm just trying to see if anyone knows something I don't that would make my life easier. THANKS.

Upvotes: 1

Views: 73

Answers (1)

s_baldur

Reputation: 33488

Here is one alternative:

DT[, attributes := strsplit(attributes, ", ")] # Convert to a list column
all_attr <- unique(unlist(DT$attributes))
DT[, 
   paste0("is_", all_attr) := lapply(all_attr, `%chin%`, attributes[[1]]), 
   by = object]

   object     attributes is_green is_blue is_red is_orange
1:    one green,blue,red     TRUE    TRUE   TRUE     FALSE
2:    two            red    FALSE   FALSE   TRUE     FALSE
3:  three    blue,orange    FALSE    TRUE  FALSE      TRUE
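Once the flag columns exist, finding objects by attribute is an ordinary logical subset. The sketch below rebuilds the example data, repeats the step above, and then subsets both by a literal column name and by a name held in a variable. `wanted`, `red_objects`, and `blue_objects` are hypothetical names used only for illustration.

```r
library(data.table)

# Rebuild the example data from the question
DT <- data.table(
  object     = c("one", "two", "three"),
  attributes = c("green, blue, red", "red", "blue, orange")
)

# Same approach as above: list column, then one logical flag per attribute
DT[, attributes := strsplit(attributes, ", ")]
all_attr <- unique(unlist(DT$attributes))
DT[,
   paste0("is_", all_attr) := lapply(all_attr, `%chin%`, attributes[[1]]),
   by = object]

# Subset by a flag column written out literally
red_objects <- DT[is_red == TRUE, object]

# Subset by a flag column whose name is stored in a variable
wanted <- "is_blue"
blue_objects <- DT[get(wanted) == TRUE, object]
```

With the sample data, `red_objects` holds "one" and "two", and `blue_objects` holds "one" and "three".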

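Another alternative (starting again from the original DT, where attributes is still a comma-delimited character column):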

DT[, lapply(.SD, function(x) strsplit(x, ", ")[[1]]), by = object
   ][, x := TRUE
     ][, dcast(.SD, object ~ paste0("is_", attributes), value.var = "x", fill = FALSE)]
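On the literal question of using a column name stored in a variable: data.table accepts a parenthesised variable on the left-hand side of `:=`, and it also provides `set()`, which takes the column name as a string. Here is a minimal sketch using the question's column names; `flag_cols`, `col`, and `row_flags` are hypothetical names introduced for illustration.

```r
library(data.table)

DT <- data.table(
  object     = c("one", "two", "three"),
  attributes = c("green, blue, red", "red", "blue, orange")
)

# Create all flag columns at once; the parentheses tell data.table to use
# the *values* in flag_cols as the column names
flag_cols <- c("isRed", "isGreen", "isBlue", "isOrange")
DT[, (flag_cols) := FALSE]

# Column name held in a variable: same effect as DT[1, isGreen := TRUE]
col <- "isGreen"
DT[1, (col) := TRUE]

# set() takes the column name as a string, which suits loops
row_flags <- c("isBlue", "isRed")
for (j in row_flags) set(DT, i = 1L, j = j, value = TRUE)
```

Both forms update DT by reference, so no `eval(parse(text = ...))` string building is needed.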

Upvotes: 1
