Reputation: 3168
I need to create a new dataframe from rows from dataframe1, such that the value of dataframe1$column is a value found in dataframe2$colum
the dataframes are:
y <- "name,number,lunch
joe,2,peaches
steve,5,hotdog
jon,7,clamroll
nick,11,sloppyJoe"
x <- "number,office
1,1b
2,1a
3,2s
4,4d
5,f4
6,f4
7,h3
8,g3
9,j7
10,d3
11,jk"
df1 <- read.csv(textConnection(df1), header=TRUE, sep=",", stringsAsFactors=FALSE)
df2 <- read.csv(textConnection(df2), header=TRUE, sep=",", stringsAsFactors=FALSE)
I have tried:
df3 <- df1[which(df1$number == df2$number), ]
to no avail. How do I properly do this in R? I could write a perl script, but I have about 100 of these sets and don't want to create more temp files.
Upvotes: 1
Views: 12217
Reputation: 69201
Joris' answer is spot on. The merge()
command can also be useful for this type of stuff. If you are familiar with SQL joins, you can draw parallels between most of the options in merge()
and the different join operations.
#Inner join
> merge(df1,df2)
number office name lunch
1 2 1a joe peaches
2 5 f4 steve hotdog
3 7 h3 jon clamroll
4 11 jk nick sloppyJoe
#Right join:
> merge(df1,df2, all.x = TRUE)
number office name lunch
1 1 1b <NA> <NA>
2 2 1a joe peaches
3 3 2s <NA> <NA>
4 4 4d <NA> <NA>
5 5 f4 steve hotdog
6 6 f4 <NA> <NA>
7 7 h3 jon clamroll
8 8 g3 <NA> <NA>
9 9 j7 <NA> <NA>
10 10 d3 <NA> <NA>
11 11 jk nick sloppyJoe
Upvotes: 1
Reputation: 108543
again, the %in%
trick :
> df1[df1$number %in% df2$number,]
number office
2 2 1a
5 5 f4
7 7 h3
11 11 jk
For what it's worth, you can easily just do a merge if you want to combine them. In this case I'd say that's the cleanest solution : gives you every office of the occuring employees, and matches them :
> merge(df1,df2)
number office name lunch
1 2 1a joe peaches
2 5 f4 steve hotdog
3 7 h3 jon clamroll
4 11 jk nick sloppyJoe
Check the help files of merge for more options, you can do a whole lot with it.
Upvotes: 5