Reputation: 41397
I have the following dataframe df and list l (dput
below):
> df
group value
1 A 1
2 B 2
3 C 3
> l
$A
[1] 999
$B
[1] 55
I would like to join the values of the list to the dataframe based on the names in the list with the group variable of the dataframe and call it "value_l". The expected output should look like this:
group value value_l
1 A 1 999
2 B 2 55
3 C 3 NA
So I was wondering if anyone knows how to join a list to a dataframe based on their names?
dput
df and l:
df <- structure(list(group = c("A", "B", "C"), value = c(1, 2, 3)), class = "data.frame", row.names = c(NA,
-3L))
l <- list(A = 999, B = 55)
Upvotes: 12
Views: 1412
Reputation: 39687
You can use match
. In case df$group
is a character (what is here the case) it could be directly used to subset the list.
df$value_l <- l[match(df$group, names(l))]
#df$value_l <- l[df$group] #Short alternative by @akrun works only in case df$group is a character, but not for factor or numeric
#df$value_l <- l[as.character(df$group)] #Maybe more secure
df
# group value value_l
#1 A 1 999
#2 B 2 55
#3 C 3 NULL
In case there is a need for NA
, instead of NULL
use in addition:
df$value_l[vapply(df$value_l, is.null, TRUE)] <- NA
df
# group value value_l
#1 A 1 999
#2 B 2 55
#3 C 3 NA
Or make it in single steps:
. <- match(df$group, names(l))
df$value_l <- l[.]
is.na(df$value_l) <- is.na(.)
Here we have joined a list
to a data.frame
.
str(df)
#'data.frame': 3 obs. of 3 variables:
# $ group : chr "A" "B" "C"
# $ value : num 1 2 3
# $ value_l:List of 3
# ..$ A : num 999
# ..$ B : num 55
# ..$ NA: logi NA
In case the List can be trasfomed to a vector you can use unlist
before (thanks to @G. Grothendieck for the comment). But here we have then joined a vector
to the data.frame
.
df$value_l <- unlist(l)[match(df$group, names(l))]
#df$value_l <- unlist(l)[as.character(df$group)] #Option like shown above
df
# group value value_l
#1 A 1 999
#2 B 2 55
#3 C 3 NA
str(df)
#'data.frame': 3 obs. of 3 variables:
# $ group : chr "A" "B" "C"
# $ value : num 1 2 3
# $ value_l: num 999 55 NA
Another option, also joined a vector
to the data.frame
will be using merge
.
merge(df, unlist(l), by.x="group", by.y=0, all.x = TRUE)
# group value y
#1 A 1 999
#2 B 2 55
#3 C 3 NA
Note: For the given list
the results look similar but this will not be the case if the list
looks e.g. like:
l <- list(A = 999, B = c(7, 55), A = 9)
A potential solution might be:
Taking first match:
df$value_l <- l[as.character(df$group)]
df
# group value value_l
#1 A 1 999
#2 B 2 7, 55
#3 C 3 NULL
Making a left Join
merge(df, list2DF(list(group = names(l), value_l = l)), all.x=TRUE)
#merge(df, data.frame(group = names(l), value_l = I(l)), all.x=TRUE) #Alternative
# group value value_l
#1 A 1 999
#2 A 1 9
#3 B 2 7, 55
#4 C 3 NA
Other options.
merge(df, list2DF(list(group = names(l), value_l = l))) #Inner
merge(df, list2DF(list(group = names(l), value_l = l)), all=TRUE) #Outer
merge(df, list2DF(list(group = names(l), value_l = l)), all.y=TRUE) #Right
For other options please have a look at How to join (merge) data frames (inner, outer, left, right).
Upvotes: 10
Reputation: 615
This is a simpler version of what GKi suggested with unlist()
. If your list always has a name and a single numeric value, you can convert it to a named vector and then use it as a lookup vector, which is simpler than doing merges or matches:
temp_vec = unlist(l)
df$l_value = temp_vec[df$group]
df
group value l_value
1 A 1 999
2 B 2 55
3 C 3 NA
Without the intermediate variable for a single line solution:
df$l_value = unlist(l)[df$group]
df
group value l_value
1 A 1 999
2 B 2 55
3 C 3 NA
Depending on what else you need the list for, it may even make sense just to use a named vector instead of a list in the first place.
Upvotes: 5
Reputation: 887291
Use merge
from base R
merge(df, stack(l), by.x = 'group', by.y = 'ind', all.x = TRUE)
group value values
1 A 1 999
2 B 2 55
3 C 3 NA
Or with dplyr
library(dplyr)
df %>%
rowwise %>%
mutate(value_l = if(group %in% names(l)) l[[group]] else NA) %>%
ungroup
-output
# A tibble: 3 × 3
group value value_l
<chr> <dbl> <dbl>
1 A 1 999
2 B 2 55
3 C 3 NA
Or using enframe/unnest
library(tidyr)
library(tibble)
enframe(l, name = 'group', value = 'value_l') %>%
unnest(value_l) %>%
left_join(df, .)
group value value_l
1 A 1 999
2 B 2 55
3 C 3 NA
Or if it can be a list column
df$value_l <- l[df$group]
> df
group value value_l
1 A 1 999
2 B 2 55
3 C 3 NULL
Upvotes: 6
Reputation: 78947
Update:
Maybe this one:
library(dplyr)
stack(unlist(l)) %>%
full_join(df, by=c("ind"="group"))
values ind value
1 999 A 1
2 55 B 2
3 NA C 3
First answer: Slightly different:
library(dplyr)
library(tidyr)
bind_rows(l) %>%
pivot_longer(everything()) %>%
full_join(df, by=c("name"="group")) %>%
select(name, value = value.y, value_l=value.x)
name value value_l
<chr> <dbl> <dbl>
1 A 1 999
2 B 2 55
3 C 3 NA
Upvotes: 5
Reputation: 10996
You can do:
library(tidyverse)
l |>
as.data.frame() |>
pivot_longer(cols = everything(),
names_to = "group",
values_to = "value_1") |>
left_join(x = df,
y = _,
by = "group")
which gives:
group value value_1
1 A 1 999
2 B 2 55
3 C 3 NA
Upvotes: 6