user2363642
user2363642

Reputation: 757

merge two dataframes R

i am trying to merge two dataframes in R. i would like to add df2 to df, so that the length of df is maintained, that is any unmatched rows are dropped. I need the smallest negative integer belonging to each person to always be matched to 'month1', the second smallest integer to be matched to month2, the third smallest integer matched to month 3 and so on.

df

names variable
1   jane       -4
2   jane       -3
3   jane       -2
4   john       -5
5   john       -4
6   john       -3
7   john       -2
8   john       -1
9   john        1
10  john        2
11  mary       -3
12  mary       -2
13  mary       -1
14  mary        1
15  mary        2
16  mary        3
17  mary        4
18   tom       -6
19   tom       -5
20   tom       -4
21   tom       -3
22   tom       -2
23   tom       -1
24   tom        1

df2
      noms  months
    1  jane  month1
    2  jane  month2
    3  jane  month3
    4  jane  month4
    5  jane  month5
    6  jane  month6
    7  jane  month7
    8  jane  month8
    9  jane  month9
    10 jane month10
    11 john  month1
    12 john  month2
    13 john  month3
    14 john  month4
    15 john  month5
    16 john  month6
    17 john  month7
    18 john  month8
    19 john  month9
    20 john month10
    21 mary  month1
    22 mary  month2
    23 mary  month3
    24 mary  month4
    25 mary  month5
    26 mary  month6
    27 mary  month7
    28 mary  month8
    29 mary  month9
    30 mary month10
    31  tom  month1
    32  tom  month2
    33  tom  month3
    34  tom  month4
    35  tom  month5
    36  tom  month6
    37  tom  month7
    38  tom  month8
    39  tom  month9
    40  tom month10

DESIRED OUTPUT

 names variable months
1   jane       -4 month1
2   jane       -3 month2
3   jane       -2 month3
4   john       -5 month1
5   john       -4 month2
6   john       -3 month3
7   john       -2 month4
8   john       -1 month5
9   john        1 month6
10  john        2 month7
11  mary       -3 month1
12  mary       -2 month2
13  mary       -1 month3
14  mary        1 month4
15  mary        2 month5
16  mary        3 month6
17  mary        4 month7
18   tom       -6 month1
19   tom       -5 month2
20   tom       -4 month3
21   tom       -3 month4
22   tom       -2 month5
23   tom       -1 month6
24   tom        1 month7

This seems like it should be a simple merge, but the code isn't working for me. This is what I have tried

final <- merge(df, df2, by.x = "names", by.y = "noms", sort=F,all.x=T,all.y=F)

have also tried this

x<-df2$names %in% df$noms   
y<-cbind(df2, x)                
matches<-y[y$x!=FALSE,]

I'm sure this is a fundamental problem, but my simple merge code won't work as it should. Thank you in advance for any help.

Upvotes: 2

Views: 1624

Answers (2)

jlhoward
jlhoward

Reputation: 59335

An approach using data tables:

library(data.table)
dt <- data.table(df)
dt[,months:=paste0("month",row(dt)[,1]),by=names]
head(dt,10)
#     names variable months
#  1:  jane       -4 month1
#  2:  jane       -3 month2
#  3:  jane       -2 month3
#  4:  john       -5 month1
#  5:  john       -4 month2
#  6:  john       -3 month3
#  7:  john       -2 month4
#  8:  john       -1 month5
#  9:  john        1 month6
# 10:  john        2 month7

Upvotes: 3

Sven Hohenstein
Sven Hohenstein

Reputation: 81683

Here's another approach:

tab <- table(df$names) # count rows per name

# create vector with months
tmp <- unlist(lapply(names(tab), function(x) {
  head(df2$months[as.character(df2$noms) == x], tab[x])
}))

# create new data frame
final <- cbind(df, months = tmp)

The result:

   names variable months
1   jane       -4 month1
2   jane       -3 month2
3   jane       -2 month3
4   john       -5 month1
5   john       -4 month2
6   john       -3 month3
7   john       -2 month4
8   john       -1 month5
9   john        1 month6
10  john        2 month7
11  mary       -3 month1
12  mary       -2 month2
13  mary       -1 month3
14  mary        1 month4
15  mary        2 month5
16  mary        3 month6
17  mary        4 month7
18   tom       -6 month1
19   tom       -5 month2
20   tom       -4 month3
21   tom       -3 month4
22   tom       -2 month5
23   tom       -1 month6
24   tom        1 month7

Upvotes: 4

Related Questions