Gauthier Boaglio

Reputation: 10262

R: Best way to mimic a dictionary with "non-string" keys

I've seen some ways to mimic a dictionary in R using the names() function, but names/keys have to be strings.

What I want is an efficient way to use keys of any kind:

Let's say I have the following dictionary structure:

dico = list(list(dep=list(11, 22), candidates=list(list(grp.id=15, sim.score=0.8))))
dico[[2]] <- list(dep=list(33, 44), candidates=list(list(grp.id=155, sim.score=0.88)))

What would be the best way to retrieve the entry from the dictionary where the key is list(33, 44)?

The keys would be compared with identical(), but the only way I can see to formulate the lookup is something like this, which finds the index using which():

key = list(33, 44)
which(sapply(dico, function(x) { identical(x$dep, key) } ))

Maybe the data structure should be something else in the first place?

Any suggestion would be greatly appreciated.


EDIT: Richie's data frame approach would do the job if I set it up as follows:

the_data <- data.frame(
  dep1        = c(11, 33),
  dep2        = c(22, 44),
  # I() keeps this as a single list column: one list of candidates per row
  candidates  = I(list(
    list(list(grp.id=15, sim.score=0.8)),
    list(list(grp.id=155, sim.score=0.88))
  ))
)

I need this because the list of candidates (pairs [grp.id, sim.score]) for each key must be extendable. That is, there is a 1 -> N relationship between a dep and its related candidates.

Upvotes: 0

Views: 139

Answers (1)

Richie Cotton

Reputation: 121127

It looks like you are overcomplicating things with your data structure. If each list item (a person?) has four values associated with it, store the data in a data frame.

the_data <- data.frame(
  dep1                = c(11, 33),
  dep2                = c(22, 44),
  candidate.grp.id    = c(15, 155),
  candidate.sim.score = c(0.8, 0.88)
)

Then you can use subset() or standard indexing to retrieve what you want.

subset(the_data, dep1 == 33 & dep2 == 44)
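For comparison, here is a minimal sketch of the "standard indexing" route, which should select the same row as the subset() call above. The names match_rows and result are only illustrative and are not part of the original answer.

```r
the_data <- data.frame(
  dep1                = c(11, 33),
  dep2                = c(22, 44),
  candidate.grp.id    = c(15, 155),
  candidate.sim.score = c(0.8, 0.88)
)

# Logical vector marking the rows whose key is (33, 44)
match_rows <- the_data$dep1 == 33 & the_data$dep2 == 44

# Standard indexing: keep the matching rows and all columns
result <- the_data[match_rows, ]
print(result)
```

Unlike subset(), this form does not rely on non-standard evaluation, so it tends to be the safer choice inside functions.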

From your comments, I half suspect (it isn't very clear) that several candidates can have the same dep values. In this case, just repeat the dep values in the data frame.

So this complicated data structure:

dico <- list(
  list(
    dep        = list(11, 22), 
    candidates = list(
      list(
        grp.id    = 15, 
        sim.score = 0.8
      )
    )
  ),
  list(
    dep        = list(33, 44), 
    candidates = list(
      list(
        grp.id    = 155, 
        sim.score = 0.88
      )
    )
  ),
  list(
    dep        = list(33, 44), 
    candidates = list(
      list(
        grp.id    = 99, 
        sim.score = 0.99
      )
    )
  )
)

simplifies to:

the_data <- data.frame(
  dep1                = c(11, 33, 33),
  dep2                = c(22, 44, 44),
  candidate.grp.id    = c(15, 155, 99),
  candidate.sim.score = c(0.8, 0.88, 0.99)
)
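With this long layout, the 1 -> N relationship is simply several rows sharing the same dep values. The following is a rough sketch of how all candidates for one key could be retrieved, and how a further candidate could be appended with rbind(). The key vector, the candidates variable, and the new row's values (grp.id 16, sim.score 0.5) are made up for illustration only.

```r
the_data <- data.frame(
  dep1                = c(11, 33, 33),
  dep2                = c(22, 44, 44),
  candidate.grp.id    = c(15, 155, 99),
  candidate.sim.score = c(0.8, 0.88, 0.99)
)

# The "key" is the pair of dep values (hypothetical helper variable)
key <- c(33, 44)

# All candidates attached to that key: one row per candidate
candidates <- subset(
  the_data,
  dep1 == key[1] & dep2 == key[2],
  select = c(candidate.grp.id, candidate.sim.score)
)
print(candidates)

# Extending a key's candidate list is just appending a row
# (the values below are invented for the example)
the_data <- rbind(
  the_data,
  data.frame(dep1 = 11, dep2 = 22,
             candidate.grp.id = 16, candidate.sim.score = 0.5)
)
```

Appending with rbind() copies the data frame each time, so for many insertions it is usually better to collect the new rows first and bind them once.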

Upvotes: 3
