Reputation: 615
I have two csv files:
csv1 <- data.frame(y=c("classA", "classB", "classA", "classB", "classA", "classC"),
DBID=c("d1", "d1", "d2", "d3", "d3", "d3"))
       y DBID
1 classA   d1
2 classB   d1
3 classA   d2
4 classB   d3
5 classA   d3
6 classC   d3
csv2 <- data.frame(tm=c("t1","t1","t2"),
y=c("classA","classC","classB"))
  tm      y
1 t1 classA
2 t1 classC
3 t2 classB
I want to extract information to get a table by matching column y in both csv files, i.e.
t1 has classA and classC in csv2, so every DBID classified as classA in csv1 (d1, d2 and d3) or as classC (d3) is listed in the resulting data frame, with t1 in the first column and the DBID in the second. d3 therefore appears twice for t1.
t2 has classB in csv2, so every DBID classified as classB in csv1 (d1 and d3) is listed in the resulting data frame, with t2 in the first column and the DBID in the second.
and get a dataframe as follows:
tm DBID endcol
t1 d1 1
t1 d2 1
t1 d3 1
t1 d3 1
t2 d1 1
t2 d3 1
How can I do this in R?
Upvotes: 0
Views: 2432
Reputation: 173577
Maybe merge?
> merge(csv1,csv2)
       y DBID tm
1 classA   d1 t1
2 classA   d2 t1
3 classA   d3 t1
4 classB   d1 t2
5 classB   d3 t2
6 classC   d3 t1
You can add the column of all ones yourself. By default, merge joins the two data frames on all columns with identical names, which is why I didn't have to pass any other arguments. If other column names also match, you'll need to specify the by argument explicitly to get the behavior you want.
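For completeness, here is a minimal sketch of one way to get from the merge output to the layout in the question. It assumes the key column is named y in both data frames and that you want rows sorted by tm and then DBID. The names merged and result are just illustrative.

```r
# Rebuild the two example data frames from the question
csv1 <- data.frame(y    = c("classA", "classB", "classA", "classB", "classA", "classC"),
                   DBID = c("d1", "d1", "d2", "d3", "d3", "d3"))
csv2 <- data.frame(tm = c("t1", "t1", "t2"),
                   y  = c("classA", "classC", "classB"))

# Join on y explicitly, so any other shared column names would not be used as keys
merged <- merge(csv1, csv2, by = "y")

# Keep only the columns the question asks for, in that order
result <- merged[, c("tm", "DBID")]

# Add the constant column of ones
result$endcol <- 1

# Sort by tm, then DBID, and reset the row names
result <- result[order(result$tm, result$DBID), ]
rownames(result) <- NULL

print(result)
```

If the key column had different names in the two files, merge also accepts by.x and by.y in place of by.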
Upvotes: 3