Stu Richards
Stu Richards

Reputation: 141

Create Index based off Column

I have two data sets:

# 1.
user_id  users    frequency
1        1        3
2        1        1
3        1        1

# 2.
user_id  sum      unique
1        2        1
2        0        0
3        1        1

I want to merge on user_id but have index based on unit1 in an ordinal fashion so the output would look like, leaving user_id out of the picture:

# 3.
frequency users sum    unique
3         1     2      1
1         2     1      1

Any thoughts on how to achieve this? Also, in terms of learning how to do these types of manipulations, is their a name for this type of manipulation?

Upvotes: 0

Views: 84

Answers (2)

akrun
akrun

Reputation: 887681

Here is an option using tidyverse. We can do an inner_join between the two datasets, grouped by 'frequency', we get the sum of the variables within summarise_each

library(dplyr)
inner_join(df1, df2) %>%
       group_by(frequency) %>% 
       summarise_each(funs(sum), sum, unique, users)
#    frequency   sum unique users
#      <int> <int>  <int> <int>
#1         1     1      1     2
#2         3     2      1     1

Or using base R, we merge the datasets and do an aggregate

aggregate(.~frequency, merge(df1, df2)[-1], FUN = sum)
#    frequency users sum unique
#1         1     2   1      1
#2         3     1   2      1

Upvotes: 1

joel.wilson
joel.wilson

Reputation: 8413

library(data.table)
setDT(df)         # this step was to make it a data.table, if its a data.frame
setDT(df1)

# logic is : first merge both df's, then group by "frequency" columns
df[df1][, lapply(.SD, sum), by = .(frequency), .SDcols = c("sum", "unique", "users")]
#   frequency sum unique users
#1:         3   2      1     1
#2:         1   1      1     2

Upvotes: 2

Related Questions