Chen JG Thomas
Chen JG Thomas

Reputation: 147

Pulling Data From One Data Frame Into Another

I am wondering how to pull data from one data frame into another when there are two different filtering critera.

my dataframe A is

  TransactionNo       Date AccountNo TransactionType Amount Currency
1       1000001 2019-07-01     10001           Spend -12.44      SGD
2       1000002 2019-07-01     10001           Spend -31.92      CNY
3       1000003 2019-07-03     10001           Spend  -8.08      USD
4       1000004 2019-07-04     10001           Spend  -6.02      SGD
5       1000005 2019-07-05     10001           Spend  -5.04      USD
6       1000006 2019-07-06     10001           Spend  -8.43      SGD

my dataframe B is

  Date                  USD   CNY   SGD
  <dttm>              <dbl> <dbl> <dbl>
1 2019-07-01 00:00:00  1.35 0.198     1
2 2019-07-02 00:00:00  1.40 0.198     1
3 2019-07-03 00:00:00  1.36 0.197     1
4 2019-07-04 00:00:00  1.37 0.197     1
5 2019-07-05 00:00:00  1.38 0.197     1
6 2019-07-06 00:00:00  1.39 0.197     1

I want to create a new column in dataframe A called Exchange. The entries in this new column can be taken from dataframe B. The entries taken would depend on the matching of Currency and matching of Date within both dataframes. For example, the first row in dataframe A would be 1. Second row is 0.198. Third row is 1.36.

The end code would look something like

A <- mutate(A, Exchange = XXXXXXXXXXX)

I understand that everyone has their own responsibilities to fulfil in life, and as such your time is precious. I really appreciate all of you sacrificing your valuable time and energy in helping new programmers like me to learn basic programming. Thank you all so much.

Upvotes: 3

Views: 183

Answers (2)

hello_friend
hello_friend

Reputation: 5798

Not as efficient as Akrun's, but another Base R solution:

# Reshape B from wide to long: 

    long_B <-
      data.frame(
        reshape(
          B,
          direction = "long",
          varying = names(B)[names(B) != "Date"],
          v.names = "Exchange",
          idvar = "Date",
          timevar = "Currency",
          times = names(B)[names(B) != "Date"]
        ),
        row.names = NULL
      )

# Left join on intersecting vectors: 

a_left_join_b <-
  merge(A,
        long_B,
        by = intersect(colnames(A), colnames(long_B)),
        all.x = TRUE)

A tidyverse alternative:

ab <- 

  B %>% 

  gather(key = "Currency", value = "Exchange", -Date) %>% 

  right_join(A, by = intersect(colnames(.), colnames(A)), all.y = TRUE)

Data:

A <-
  structure(
    list(
      TransactionNo = 1000001:1000006,
      Date = as.POSIXct(
        c(
          "2019-07-01",
          "2019-07-01",
          "2019-07-03",
          "2019-07-04",
          "2019-07-05",
          "2019-07-06"
        ),
        tz = Sys.timezone()
      ),
      AccountNo = c(10001L, 10001L, 10001L, 10001L, 10001L, 10001L),
      TransactionType = c("Spend", "Spend", "Spend", "Spend", "Spend",
                          "Spend"),
      Amount = c(-12.44, -31.92, -8.08, -6.02, -5.04, -8.43),
      Currency = c("SGD", "CNY", "USD", "SGD", "USD", "SGD")
    ),
    class = "data.frame",
    row.names = c("1",
                  "2", "3", "4", "5", "6")
  )

B <-
  structure(
    list(
      Date = as.POSIXct(
        c(
          "2019-07-01 00:00:00",
          "2019-07-02 00:00:00",
          "2019-07-03 00:00:00",
          "2019-07-04 00:00:00",
          "2019-07-05 00:00:00",
          "2019-07-06 00:00:00"
        ),
        tz = Sys.timezone()
      ),
      USD = c(1.35, 1.4, 1.36, 1.37, 1.38,
              1.39),
      CNY = c(0.198, 0.198, 0.197, 0.197, 0.197, 0.197),
      SGD = c(1L,
              1L, 1L, 1L, 1L, 1L)
    ),
    class = "data.frame",
    row.names = c("1",
                  "2", "3", "4", "5", "6")
  )

Upvotes: 1

akrun
akrun

Reputation: 887991

In base R, create an index for rows and columns, cbind it and extract the values

i1 <- match(as.Date(A$Date), as.Date(B$Date))
j1 <- match(A$Currency, names(B)[-1])
A$Exchange <- B[-1][cbind(i1, j1)]
A$Exchange
#[1] 1.000 0.198 1.360 1.000 1.380 1.000

Or using tidyverse, we reshape the data from 'wide' to 'long' ('B') and then do a join

library(dplyr)
library(tidyr)
B %>%
  mutate(Date = as.Date(Date)) %>% 
  pivot_longer(cols = -Date, names_to = 'Currency', values_to = 'Exchange') %>% 
  right_join(A %>%
               mutate(Date = as.Date(Date)))
# A tibble: 6 x 7
#  Date       Currency Exchange TransactionNo AccountNo TransactionType Amount
#  <date>     <chr>       <dbl>         <int>     <int> <chr>            <dbl>
#1 2019-07-01 SGD         1           1000001     10001 Spend           -12.4 
#2 2019-07-01 CNY         0.198       1000002     10001 Spend           -31.9 
#3 2019-07-03 USD         1.36        1000003     10001 Spend            -8.08
#4 2019-07-04 SGD         1           1000004     10001 Spend            -6.02
#5 2019-07-05 USD         1.38        1000005     10001 Spend            -5.04
#6 2019-07-06 SGD         1           1000006     10001 Spend            -8.43

data

A <- structure(list(TransactionNo = 1000001:1000006, Date = c("2019-07-01", 
"2019-07-01", "2019-07-03", "2019-07-04", "2019-07-05", "2019-07-06"
), AccountNo = c(10001L, 10001L, 10001L, 10001L, 10001L, 10001L
), TransactionType = c("Spend", "Spend", "Spend", "Spend", "Spend", 
"Spend"), Amount = c(-12.44, -31.92, -8.08, -6.02, -5.04, -8.43
), Currency = c("SGD", "CNY", "USD", "SGD", "USD", "SGD")),
class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

B <- structure(list(Date = c("2019-07-01 00:00:00", "2019-07-02 00:00:00", 
"2019-07-03 00:00:00", "2019-07-04 00:00:00", "2019-07-05 00:00:00", 
"2019-07-06 00:00:00"), USD = c(1.35, 1.4, 1.36, 1.37, 1.38, 
1.39), CNY = c(0.198, 0.198, 0.197, 0.197, 0.197, 0.197), SGD = c(1L, 
1L, 1L, 1L, 1L, 1L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))

Upvotes: 3

Related Questions