Johan Leander

Reputation: 70

extract values from one data frame given two variables from another data frame

I have two data frames similar to these:

date = c("2014-07-06", "2014-07-07","2014-07-08") 
temp_0m = c(12, 11, 13) 
temp_1m = c(11, 9, 10) 
temp_2m = c(9, 9, 8) 
temp_3m = c(7, 7, 6) 
foo = data.frame(date, temp_0m, temp_1m, temp_2m, temp_3m)   

DATE = c("2014-07-06", "2014-07-06","2014-07-06","2014-07-07", "2014-07-07","2014-07-07","2014-07-08","2014-07-08","2014-07-08")
TIME = c("01:01:01", "10:02:02", "18:03:03","01:01:01", "10:02:02", "18:03:03","01:01:01", "10:02:02", "18:03:03")
depth = c(1.2, 2.3, 0.4, 1.5, 2.2, 3.4, 2.2, 1.2, 0.5)
bar = data.frame(DATE, TIME, depth)

I want to extract values (temperatures) from "foo" given the dates and depths in "bar". As you can see, the "bar" data frame has multiple observations for each date, and I would like each depth observation to be assigned the temperature from "foo" that matches its date and depth.

The result I'm looking for with this example data would be the following:

#> NEWbar
#        DATE     TIME depth NEWtemp
#1 2014-07-06 01:01:01   1.2 11
#2 2014-07-06 10:02:02   2.3 9
#3 2014-07-06 18:03:03   0.4 12
#4 2014-07-07 01:01:01   1.5 9
#5 2014-07-07 10:02:02   2.2 9
#6 2014-07-07 18:03:03   3.4 7
#7 2014-07-08 01:01:01   2.2 8
#8 2014-07-08 10:02:02   1.2 10
#9 2014-07-08 18:03:03   0.5 13

My real data frames are much larger than these, so I would like a solution that needs as little manual work as possible.

Upvotes: 0

Views: 240

Answers (2)

akrun

Reputation: 887991

Here, I am assuming that a depth in the range 0-1 corresponds to 'temp_0m', 1-2 to 'temp_1m', and so on. After merging the 'foo' and 'bar' datasets (the common column is 'date'), we take the ceiling of the 'depth' column. Provided the 'temp' columns are in order of increasing depth, the ceiling gives the column index; cbind-ing it with 1:nrow gives a matrix of row/column pairs that extracts one 'temp' element per row. We then store the result in a new 'temp' column of the merged dataset and keep only the columns from 'bar' plus 'temp'.

d1 <- merge(foo, bar)
d2 <- d1[grep('temp', names(d1))]
d1$temp <- d2[cbind(1:nrow(d2),ceiling(d1$depth))]
d1[c('date', 'depth', 'temp')]
#        date depth temp
#1 2014-07-06   1.2   11
#2 2014-07-07   2.3    9
#3 2014-07-08   0.4   13
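
To make the row/column indexing step easier to follow, here is a minimal sketch that isolates it. It reuses the temperature values from 'foo' and the three depths from the output above; the names `temps`, `col_idx`, `idx` and `picked` are mine and purely illustrative.

```r
# Temperature columns from 'foo', ordered by increasing depth.
temps <- data.frame(temp_0m = c(12, 11, 13),
                    temp_1m = c(11, 9, 10),
                    temp_2m = c(9, 9, 8),
                    temp_3m = c(7, 7, 6))
depth <- c(1.2, 2.3, 0.4)

# ceiling() maps a depth in (0, 1] to column 1, (1, 2] to column 2, and so on.
col_idx <- ceiling(depth)

# Each row of this two-column matrix is one (row, column) pair to extract.
idx <- cbind(seq_len(nrow(temps)), col_idx)

# Indexing a data frame with such a matrix returns one value per pair.
picked <- temps[idx]
picked
```

Note that a depth of exactly 0 would give a column index of 0 and pick nothing, so this only works as written for depths greater than 0 and no deeper than the last 'temp' column.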

Update

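Based on the new dataset, where the date column in 'bar' is named 'DATE', rename the first column of 'foo' so that the two data frames share a key before merging: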

colnames(foo)[1] <- 'DATE'
d1 <- merge(foo, bar)
d2 <- d1[grep('temp', names(d1))]
d1$temp <- d2[cbind(1:nrow(d2), ceiling(d1$depth))]
d1[c('DATE', 'TIME', 'depth', 'temp')]
#        DATE     TIME depth temp
#1 2014-07-06 01:01:01   1.2   11
#2 2014-07-06 10:02:02   2.3    9
#3 2014-07-06 18:03:03   0.4   12
#4 2014-07-07 01:01:01   1.5    9
#5 2014-07-07 10:02:02   2.2    9
#6 2014-07-07 18:03:03   3.4    7
#7 2014-07-08 01:01:01   2.2    8
#8 2014-07-08 10:02:02   1.2   10
#9 2014-07-08 18:03:03   0.5   13
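
If the real data may contain depths of exactly 0, depths below the deepest 'temp' column, or dates that are missing from 'foo', a slightly more defensive variant might look like the sketch below. It is not part of the approach above: it swaps merge() for match() so that the row order of 'bar' is kept, and it clamps the column index to the valid range. Treating depth 0 as 'temp_0m' and anything deeper than the last column as the deepest temperature is my assumption, so adjust it if your data means something else.

```r
foo <- data.frame(date = c("2014-07-06", "2014-07-07", "2014-07-08"),
                  temp_0m = c(12, 11, 13), temp_1m = c(11, 9, 10),
                  temp_2m = c(9, 9, 8), temp_3m = c(7, 7, 6))
bar <- data.frame(DATE = rep(c("2014-07-06", "2014-07-07", "2014-07-08"), each = 3),
                  TIME = rep(c("01:01:01", "10:02:02", "18:03:03"), times = 3),
                  depth = c(1.2, 2.3, 0.4, 1.5, 2.2, 3.4, 2.2, 1.2, 0.5))

# Temperature columns as a matrix, assumed to be ordered by increasing depth.
temps <- as.matrix(foo[grep("^temp_", names(foo))])

# Row of 'foo' for each observation in 'bar' (NA if the date is not in 'foo').
row_idx <- match(bar$DATE, foo$date)

# Column index from depth, clamped to 1..ncol(temps) (assumption, see above).
col_idx <- pmin(pmax(ceiling(bar$depth), 1), ncol(temps))

NEWbar <- bar
NEWbar$NEWtemp <- temps[cbind(row_idx, col_idx)]
NEWbar
```

Observations whose date has no match in 'foo' end up with NA in 'NEWtemp' instead of being dropped.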

Upvotes: 1

Rajesh S

Reputation: 108

I think this should work for you:

library(dplyr)
dplyr::left_join(x = bar, y = foo)
# Joining by: "date"
#         date depth temp_0m temp_1m temp_2m temp_3m
# 1 2014-07-06   1.2      12      11       9       7
# 2 2014-07-07   2.3      11       9       9       7
# 3 2014-07-08   0.4      13      10       8       6
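
The join on its own leaves all four temperature columns in the result. One possible way to finish the job, assuming the updated data where the key is called 'DATE' in 'bar' and 'date' in 'foo', and assuming the same depth-to-column rule as in the other answer (0-1 is 'temp_0m', 1-2 is 'temp_1m', and so on), is sketched below. The names `joined`, `temp_cols` and `result` are only for illustration.

```r
library(dplyr)

foo <- data.frame(date = c("2014-07-06", "2014-07-07", "2014-07-08"),
                  temp_0m = c(12, 11, 13), temp_1m = c(11, 9, 10),
                  temp_2m = c(9, 9, 8), temp_3m = c(7, 7, 6))
bar <- data.frame(DATE = rep(c("2014-07-06", "2014-07-07", "2014-07-08"), each = 3),
                  TIME = rep(c("01:01:01", "10:02:02", "18:03:03"), times = 3),
                  depth = c(1.2, 2.3, 0.4, 1.5, 2.2, 3.4, 2.2, 1.2, 0.5))

# Name the key explicitly because the two data frames spell it differently.
joined <- left_join(bar, foo, by = c("DATE" = "date"))

# Pick one temperature per row: row i, column ceiling(depth[i]).
temp_cols <- grep("^temp_", names(joined))
joined$NEWtemp <- as.matrix(joined[temp_cols])[
  cbind(seq_len(nrow(joined)), ceiling(joined$depth))
]

# Keep only the columns from 'bar' plus the new temperature.
result <- select(joined, DATE, TIME, depth, NEWtemp)
result
```

Since left_join() keeps the rows of 'bar' in their original order, the result lines up with the expected 'NEWbar' in the question.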

Upvotes: 0
