maraboule

Reputation: 363

R dataframe to "dictionary" avoiding list of factors

I have a dataframe df with two columns, one containing names and the second one the values which can be strings or doubles, for example

> df
       name   value
1  cat_name    Bart
2   cat_age       5
3  dog_name    Fred
4   dog_age       9
5 total_pet       2

I'd like to convert df into a list of named objects so I can call list$cat_name and get back the string "Bart", or list$cat_age and get back 5 as a numeric.

I've tried

> list <- split(df[, 2], df[, 1])
> list
$cat_age
[1] 5
Levels: 2 5 9 Bart Fred

$cat_name
[1] Bart
Levels: 2 5 9 Bart Fred

$dog_age
[1] 9
Levels: 2 5 9 Bart Fred

$dog_name
[1] Fred
Levels: 2 5 9 Bart Fred

$total_pet
[1] 2
Levels: 2 5 9 Bart Fred

which transforms df into a list of factors. It's nearly what I want because the $ operator works fine. However, I'm not really used to working with factors and I'd like to know whether another dataframe-to-list transformation is available. The annoying part is that, in order to work with strings and numbers, we must convert the factors back to those types

> as.character(list$cat_name)
[1] "Bart"
> as.numeric(as.character(list$total_pet))
[1] 2
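To illustrate why the as.character step is needed: as.numeric applied directly to a factor returns the internal level code rather than the value that is printed. A minimal sketch, where the vector f is a made-up stand-in for the value column of df:

```r
# Hypothetical stand-in for the value column of df, stored as a factor
f <- factor(c("Bart", "5", "Fred", "9", "2"))

total_pet <- f[5]   # the element that prints as 2

# as.numeric() on a factor returns the position of its level, not the label
code <- as.numeric(total_pet)

# Going through as.character() first recovers the number that was printed
value <- as.numeric(as.character(total_pet))

print(levels(f))
print(c(code = code, value = value))
```

Here code and value differ, because "2" happens to be the first level of the factor.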

After noticing that df[, 1] and df[, 2] are actually factors I've tried using

> list <- split(as.character(df[, 2]), df[, 1])
> list
$cat_age
[1] "5"

$cat_name
[1] "Bart"

$dog_age
[1] "9"

$dog_name
[1] "Fred"

$total_pet
[1] "2"

which nearly solves the problem, except that the numbers are still character strings that have to be converted later. I've also tried using hash objects

> h <- hash(as.vector(df[, 1]), as.vector(df[, 2]))
> l = as.list(h)
> l
$dog_age
[1] "9"

$dog_name
[1] "Fred"

$cat_age
[1] "5"

$total_pet
[1] "2"

$cat_name
[1] "Bart"

but I get the same result.

Does anyone have advice? Am I missing something obvious?

Thanks :)

Upvotes: 3

Views: 192

Answers (2)

akrun

Reputation: 887118

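We can do this with type.convert, which picks the simplest type that fits each element; as.is = TRUE keeps strings as character vectors instead of turning them into factors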

library(purrr)
map(list, type.convert, as.is = TRUE)
#$cat_age
#[1] 5

#$cat_name
#[1] "Bart"

#$dog_age
#[1] 9

#$dog_name
#[1] "Fred"

#$total_pet
#[1] 2

For a large list this could be made faster by running it in parallel; one option is future_map from furrr

library(furrr)
plan(multisession)  # plan(multiprocess) is deprecated in recent versions of future
future_map(list, type.convert, as.is = TRUE)
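The same conversion can also be sketched without extra packages, going straight from the data frame to the named list. This is only an illustration under some assumptions: the data frame is rebuilt by hand here, and the name pets is made up to avoid masking the base function list.

```r
# Rebuild the example data; stringsAsFactors = TRUE mimics the factor columns
# described in the question
df <- data.frame(
  name  = c("cat_name", "cat_age", "dog_name", "dog_age", "total_pet"),
  value = c("Bart", "5", "Fred", "9", "2"),
  stringsAsFactors = TRUE
)

# Convert each value on its own so that every element gets its own type,
# then attach the names from the first column
pets <- setNames(
  lapply(as.character(df$value), type.convert, as.is = TRUE),
  as.character(df$name)
)

str(pets)
```

Note that type.convert returns integers for whole numbers such as "5", so is.numeric(pets$cat_age) is TRUE but the storage type is integer rather than double. Unlike split, this keeps the elements in the original row order.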

Upvotes: 1

Jilber Urbina

Reputation: 61154

An R base approach...

df[, ] <- lapply(df, as.character) # changing factors to character
list <- split(df[, 2], df[, 1])    # Split df just as you did.

list2 <- lapply(list, function(x) {
  # Convert only when the whole string is a number; matching a single "\\d"
  # would keep just the first digit of values such as "12"
  if (all(grepl("^-?\\d+(\\.\\d+)?$", x))) as.numeric(x) else x
})

$cat_age
[1] 5

$cat_name
[1] "Bart"

$dog_age
[1] 9

$dog_name
[1] "Fred"

$total_pet
[1] 2

Checking class:

> sapply(list2, class)
    cat_age    cat_name     dog_age    dog_name   total_pet 
  "numeric" "character"   "numeric" "character"   "numeric" 

Your data is:

df <- read.table(text="       name   value
1  cat_name    Bart
2   cat_age       5
3  dog_name    Fred
4   dog_age       9
5 total_pet       2", header=TRUE)
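One caveat when reproducing the question: since R 4.0.0, read.table defaults to stringsAsFactors = FALSE, so the columns come back as character rather than factor. A sketch of how the factor columns from the question could be recreated, assuming a recent R version:

```r
# Ask for factor columns explicitly to reproduce the behaviour in the question
df <- read.table(text = "       name   value
1  cat_name    Bart
2   cat_age       5
3  dog_name    Fred
4   dog_age       9
5 total_pet       2", header = TRUE, stringsAsFactors = TRUE)

# Both columns should now be factors, as in the original post
print(sapply(df, class))
```

Without that argument, split(df[, 2], df[, 1]) already yields character strings on R 4.0.0 or later, and only the numeric conversion remains to be done.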

Upvotes: 0
