Reputation: 862
I tried to find a solution in SOF but I didn't find anything...
I've got a two data frame with this kind of data.
> df
1 |UNIMOD:730
2 |UNIMOD:4
3 |UNIMOD:214
4 |UNIMOD:21
5 |UNIMOD:35
.
.
.
n+1500
And another one with this:
> df2
1 |UNIMOD:730
2 |UNIMOD:4
3 |UNIMOD:21
4 |UNIMOD:35
.
.
.
n+500
What I would like to have is this kind of output, where merge the column, comparing the values and add NA where the value doesn't exist. There are not repeated values.
> df
1 |UNIMOD:730 | UNIMOD:730
2 |UNIMOD:4 | UNIMOD:4
3 |UNIMOD:214 | NA
4 |UNIMOD:21 | UNIMOD:21
5 |UNIMOD:35 | UNIMOD:35
. .
. .
. .
n+1500 n+1500
I have tried with the option merge
but this function just merge in one column all my data and if a use this:
left_join(df, df2, c("sequence"="sequence"))
I just obtain the same result.
Here you have a reproducible example:
df <- data.frame(modifications=c("null", "0-UNIMOD:214", "2-UNIMOD:3","12-UNIMOD:24","1-UNIMOD:44","0-UNIMOD:12", "0-UNIMOD:123", "13-UNIMOD:212"))
df2 <- data.frame(modifications=c("null", "0-UNIMOD:24", "2-UNIMOD:3","12-UNIMOD:24","1-UNIMOD:44","0-UNIMOD:12"))
Upvotes: 0
Views: 59
Reputation: 50668
Is this what you're after (base R only, using ?match
)?
# Your data with added two columns
df1 <- cbind.data.frame(modifications=c("null", "0-UNIMOD:214", "2-UNIMOD:3","12-UNIMOD:24","1-UNIMOD:44","0-UNIMOD:12", "0-UNIMOD:123", "13-UNIMOD:212"),
df1col2 = "something",
df1col3 = "val1");
df2 <- cbind.data.frame(modifications=c("null", "0-UNIMOD:24", "2-UNIMOD:3","12-UNIMOD:24","1-UNIMOD:44","0-UNIMOD:12"),
df2col2 = "anotherthing",
df2col3 = "val2");
# Merge df1 and merge2
df <- cbind.data.frame(df1, df2[match(df1$modifications, df2$modifications), ]);
modifications df1col2 df1col3 modifications df2col2 df2col3
1 null something val1 null anotherthing val2
NA 0-UNIMOD:214 something val1 <NA> <NA> <NA>
3 2-UNIMOD:3 something val1 2-UNIMOD:3 anotherthing val2
4 12-UNIMOD:24 something val1 12-UNIMOD:24 anotherthing val2
5 1-UNIMOD:44 something val1 1-UNIMOD:44 anotherthing val2
6 0-UNIMOD:12 something val1 0-UNIMOD:12 anotherthing val2
NA.1 0-UNIMOD:123 something val1 <NA> <NA> <NA>
NA.2 13-UNIMOD:212 something val1 <NA> <NA> <NA>
# Or merge and remove the duplicate modifcations column (if necessary)
df <- cbind.data.frame(df1, df2[match(df1$modifications, df2$modifications), -1]);
print(df);
modifications df1col2 df1col3 df2col2 df2col3
1 null something val1 anotherthing val2
NA 0-UNIMOD:214 something val1 <NA> <NA>
3 2-UNIMOD:3 something val1 anotherthing val2
4 12-UNIMOD:24 something val1 anotherthing val2
5 1-UNIMOD:44 something val1 anotherthing val2
6 0-UNIMOD:12 something val1 anotherthing val2
NA.1 0-UNIMOD:123 something val1 <NA> <NA>
NA.2 13-UNIMOD:212 something val1 <NA> <NA>
Upvotes: 1