Reputation: 147
I am wondering how to pull data from one data frame into another when there are two different filtering critera.
my dataframe A is
TransactionNo Date AccountNo TransactionType Amount Currency
1 1000001 2019-07-01 10001 Spend -12.44 SGD
2 1000002 2019-07-01 10001 Spend -31.92 CNY
3 1000003 2019-07-03 10001 Spend -8.08 USD
4 1000004 2019-07-04 10001 Spend -6.02 SGD
5 1000005 2019-07-05 10001 Spend -5.04 USD
6 1000006 2019-07-06 10001 Spend -8.43 SGD
my dataframe B is
Date USD CNY SGD
<dttm> <dbl> <dbl> <dbl>
1 2019-07-01 00:00:00 1.35 0.198 1
2 2019-07-02 00:00:00 1.40 0.198 1
3 2019-07-03 00:00:00 1.36 0.197 1
4 2019-07-04 00:00:00 1.37 0.197 1
5 2019-07-05 00:00:00 1.38 0.197 1
6 2019-07-06 00:00:00 1.39 0.197 1
I want to create a new column in dataframe A called Exchange. The entries in this new column can be taken from dataframe B. The entries taken would depend on the matching of Currency and matching of Date within both dataframes. For example, the first row in dataframe A would be 1. Second row is 0.198. Third row is 1.36.
The end code would look something like
A <- mutate(A, Exchange = XXXXXXXXXXX)
I understand that everyone has their own responsibilities to fulfil in life, and as such your time is precious. I really appreciate all of you sacrificing your valuable time and energy in helping new programmers like me to learn basic programming. Thank you all so much.
Upvotes: 3
Views: 183
Reputation: 5798
Not as efficient as Akrun's, but another Base R solution:
# Reshape B from wide to long:
long_B <-
data.frame(
reshape(
B,
direction = "long",
varying = names(B)[names(B) != "Date"],
v.names = "Exchange",
idvar = "Date",
timevar = "Currency",
times = names(B)[names(B) != "Date"]
),
row.names = NULL
)
# Left join on intersecting vectors:
a_left_join_b <-
merge(A,
long_B,
by = intersect(colnames(A), colnames(long_B)),
all.x = TRUE)
A tidyverse alternative:
ab <-
B %>%
gather(key = "Currency", value = "Exchange", -Date) %>%
right_join(A, by = intersect(colnames(.), colnames(A)), all.y = TRUE)
A <-
structure(
list(
TransactionNo = 1000001:1000006,
Date = as.POSIXct(
c(
"2019-07-01",
"2019-07-01",
"2019-07-03",
"2019-07-04",
"2019-07-05",
"2019-07-06"
),
tz = Sys.timezone()
),
AccountNo = c(10001L, 10001L, 10001L, 10001L, 10001L, 10001L),
TransactionType = c("Spend", "Spend", "Spend", "Spend", "Spend",
"Spend"),
Amount = c(-12.44, -31.92, -8.08, -6.02, -5.04, -8.43),
Currency = c("SGD", "CNY", "USD", "SGD", "USD", "SGD")
),
class = "data.frame",
row.names = c("1",
"2", "3", "4", "5", "6")
)
B <-
structure(
list(
Date = as.POSIXct(
c(
"2019-07-01 00:00:00",
"2019-07-02 00:00:00",
"2019-07-03 00:00:00",
"2019-07-04 00:00:00",
"2019-07-05 00:00:00",
"2019-07-06 00:00:00"
),
tz = Sys.timezone()
),
USD = c(1.35, 1.4, 1.36, 1.37, 1.38,
1.39),
CNY = c(0.198, 0.198, 0.197, 0.197, 0.197, 0.197),
SGD = c(1L,
1L, 1L, 1L, 1L, 1L)
),
class = "data.frame",
row.names = c("1",
"2", "3", "4", "5", "6")
)
Upvotes: 1
Reputation: 887991
In base R
, create an index for rows and columns, cbind
it and extract the values
i1 <- match(as.Date(A$Date), as.Date(B$Date))
j1 <- match(A$Currency, names(B)[-1])
A$Exchange <- B[-1][cbind(i1, j1)]
A$Exchange
#[1] 1.000 0.198 1.360 1.000 1.380 1.000
Or using tidyverse
, we reshape the data from 'wide' to 'long' ('B') and then do a join
library(dplyr)
library(tidyr)
B %>%
mutate(Date = as.Date(Date)) %>%
pivot_longer(cols = -Date, names_to = 'Currency', values_to = 'Exchange') %>%
right_join(A %>%
mutate(Date = as.Date(Date)))
# A tibble: 6 x 7
# Date Currency Exchange TransactionNo AccountNo TransactionType Amount
# <date> <chr> <dbl> <int> <int> <chr> <dbl>
#1 2019-07-01 SGD 1 1000001 10001 Spend -12.4
#2 2019-07-01 CNY 0.198 1000002 10001 Spend -31.9
#3 2019-07-03 USD 1.36 1000003 10001 Spend -8.08
#4 2019-07-04 SGD 1 1000004 10001 Spend -6.02
#5 2019-07-05 USD 1.38 1000005 10001 Spend -5.04
#6 2019-07-06 SGD 1 1000006 10001 Spend -8.43
A <- structure(list(TransactionNo = 1000001:1000006, Date = c("2019-07-01",
"2019-07-01", "2019-07-03", "2019-07-04", "2019-07-05", "2019-07-06"
), AccountNo = c(10001L, 10001L, 10001L, 10001L, 10001L, 10001L
), TransactionType = c("Spend", "Spend", "Spend", "Spend", "Spend",
"Spend"), Amount = c(-12.44, -31.92, -8.08, -6.02, -5.04, -8.43
), Currency = c("SGD", "CNY", "USD", "SGD", "USD", "SGD")),
class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
B <- structure(list(Date = c("2019-07-01 00:00:00", "2019-07-02 00:00:00",
"2019-07-03 00:00:00", "2019-07-04 00:00:00", "2019-07-05 00:00:00",
"2019-07-06 00:00:00"), USD = c(1.35, 1.4, 1.36, 1.37, 1.38,
1.39), CNY = c(0.198, 0.198, 0.197, 0.197, 0.197, 0.197), SGD = c(1L,
1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
Upvotes: 3