Reputation: 8444
I have those 3 dataframes below:
Name<-c("jack","jack","bob","david","mary")
n1<-data.frame(Name)
Name<-c("jack","bill","dean","mary","steven")
n2<-data.frame(Name)
Name<-c("fred","alex","mary")
n3<-data.frame(Name)
I would like to create a new dataframe with 3 columns.All unique names present across all 3 source files in Column 1,the number of source files in which it's located, in Column 2, and the total number of instances of that name across all files, in Column 3.
The result should be like
Name Number_of_files Number_of_instances
1 jack 2 3
2 bob 1 1
3 david 1 1
4 mary 3 3
5 bill 1 1
6 dean 1 1
7 steven 1 1
8 fred 1 1
9 alex 1 1
Is there an automated way to achieve all these at once?
Upvotes: 2
Views: 46
Reputation: 389155
This is conceptually similar answer as @tmfmnk but a base R version
#Get names of all the objects n1, n2, n3, n4 . etc
name_df <- ls(pattern = "n\\d+")
#Combine them in one dataframe
all_df <- do.call(rbind, Map(cbind, mget(name_df), id = name_df))
#get aggregated values
aggregate(id~Name, all_df, function(x) c(length(unique(x)), length(x)))
# Name id.1 id.2
#1 bob 1 1
#2 david 1 1
#3 jack 2 3
#4 mary 3 3
#5 bill 1 1
#6 dean 1 1
#7 steven 1 1
#8 alex 1 1
#9 fred 1 1
You can rename the columns if needed.
And for completeness data.table
version
library(data.table)
dt < - rbindlist(mget(name_df), idcol = "ID")
dt[, list(Number_of_files = uniqueN(ID), Number_of_instances = .N), by = .(Name)]
Upvotes: 1
Reputation: 40141
One dplyr
possibility could be:
bind_rows(n1, n2, n3, .id = "ID") %>%
group_by(Name) %>%
summarise(Number_of_files = n_distinct(ID),
Number_of_instances = n())
Name Number_of_files Number_of_instances
<chr> <int> <int>
1 alex 1 1
2 bill 1 1
3 bob 1 1
4 david 1 1
5 dean 1 1
6 fred 1 1
7 jack 2 3
8 mary 3 3
9 steven 1 1
Upvotes: 4