Ihaveaquestion

Reputation: 11

Why are only some rows matched when I merge two data frames in R?

I have a problem merging two data frames. Any help would be appreciated.

I have two .csv files:

File1.csv

ID   Value1 Value2  Value3  Mean
Oeuf    5       4       6    5
Lou     3       7       5    5
Bob     1       3       2    2
Bill    2       9       1    4

File2.csv

ID   P-Value    FDR
Lou    3        7   
Oeuf   5        4   
Bob    1        3       

I want to merge these two so that:

Merge.csv

ID    Value1        Value2      Value3      Mean    P-Value FDR
Oeuf    5             4           6           5       5     4
Lou     3             7           5           5       3     7
Bob     1             3           2           2       1     3
Bill    2             9           1           4       NA    NA

If I do:

Merge.csv <- merge(File1.csv,File2.csv,by="ID", all.x=TRUE)

I get:

Merge.csv

ID    Value1        Value2      Value3      Mean    P-Value    FDR
Oeuf    5              4          6           5        5        4
Lou     3              7          5           5        NA       NA
Bob     1              3          2           2        NA       NA
Bill    2              9          1           4        NA       NA

So the merge works for Oeuf but not for Lou and Bob, even though both appear in File2.csv.

I would like every ID that appears in both files to be matched.

Upvotes: 1

Views: 55

Answers (2)

Ihaveaquestion

Reputation: 11

It turns out that aosmith was right: the .csv files I was using had spaces or tabs after some of the IDs, so they looked the same but weren't recognized as being the same.

I used Word's Replace to remove the whitespace, and then it worked.
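The same cleanup can probably be done in R itself, without a round trip through Word. Here is a minimal sketch, assuming the stray characters are ordinary spaces or tabs at the start or end of the ID. The data frames are hypothetical, cut-down stand-ins for my files, with the padding added by hand to reproduce the problem.

```r
# Hypothetical stand-ins for the two files; two IDs in File2 carry
# trailing whitespace (a space and a tab) that is invisible when printed.
File1 <- data.frame(ID = c("Oeuf", "Lou", "Bob", "Bill"),
                    Mean = c(5, 5, 2, 4),
                    stringsAsFactors = FALSE)
File2 <- data.frame(ID = c("Lou ", "Oeuf", "Bob\t"),
                    FDR = c(7, 4, 3),
                    stringsAsFactors = FALSE)

# Before cleaning, only "Oeuf" is an exact match.
before <- merge(File1, File2, by = "ID", all.x = TRUE)

# trimws() removes leading and trailing spaces, tabs and newlines.
File1$ID <- trimws(File1$ID)
File2$ID <- trimws(File2$ID)

# After cleaning, every ID present in both data frames is matched.
after <- merge(File1, File2, by = "ID", all.x = TRUE)
print(after)
```

When reading the files with read.csv(), passing strip.white = TRUE should have a similar effect for unquoted fields, though I have not checked that against my original files.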

Thank you all for your help, I learned a lot :)

Upvotes: 0

Pierre Lapointe

Reputation: 16277

Works for me. Use str(File1.csv) and str(File2.csv) to check that ID is not a factor in either data frame. If ID is a factor, it might mess up the results.

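# Recreate both tables from the question as inline text.
# stringsAsFactors = FALSE keeps ID as character rather than factor.
File1 <- read.table(text = "ID   Value1 Value2  Value3  Mean
Oeuf    5       4       6    5
Lou     3       7       5    5
Bob     1       3       2    2
Bill    2       9       1    4
", header = TRUE, stringsAsFactors = FALSE)

# read.table() turns the header "P-Value" into the syntactic name "P.Value".
File2 <- read.table(text = "ID   P-Value    FDR
Lou    3        7
Oeuf   5        4
Bob    1        3
", header = TRUE, stringsAsFactors = FALSE)

# Left join on ID: keep every row of File1, NA where File2 has no match.
merge(File1, File2, by = "ID", all.x = TRUE)

> merge(File1, File2, by = "ID", all.x = TRUE)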
    ID Value1 Value2 Value3 Mean P.Value FDR
1 Bill      2      9      1    4      NA  NA
2  Bob      1      3      2    2       1   3
3  Lou      3      7      5    5       3   7
4 Oeuf      5      4      6    5       5   4
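
If the pasted example works but the real files still give NA, comparing the ID columns directly may show why. A rough diagnostic sketch follows. The vectors ids1 and ids2 are hypothetical stand-ins for File1.csv$ID and File2.csv$ID, with a tab added to one value to mimic invisible padding. If ID is a factor, wrap it in as.character() first.

```r
# Hypothetical ID vectors; the tab after "Lou" in ids2 mimics
# whitespace that cannot be seen when the data frame is printed.
ids1 <- c("Oeuf", "Lou", "Bob", "Bill")
ids2 <- c("Lou\t", "Oeuf", "Bob")

# IDs from the second vector with no exact match in the first.
unmatched <- setdiff(ids2, ids1)

# dput() shows the escaped form, so stray spaces or tabs become visible.
dput(unmatched)

# A length difference against the trimmed string also flags padding.
padded <- nchar(unmatched) != nchar(trimws(unmatched))
print(padded)
```

Any value flagged here looks identical on screen but will not match in merge() until the padding is removed.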

Upvotes: 1
