BGG16
BGG16

Reputation: 536

Can R do the equivalent of an HLOOKUP nested within a VLOOKUP?

I am trying (unsuccessfully) to do the equivalent of an HLOOKUP nested within a VLOOKUP in Excel using R Studio.

Here is the situation.

I have two tables. Table 1 has historical stock prices, where each column represents a ticker name and each row represents a particular date. Table 1 contains the closing stock price for each ticker on each date.

Assume Table 1 looks like this:

 |----------------------------|
 |   Date   |MSFT | AMZN |EPD | 
 |----------------------------|
 | 6/1/2020 | 196 | 2600 | 19 |
 | 5/1/2020 | 186 | 2200 | 20 |
 | 4/1/2020 | 176 | 2000 | 15 |
 | 3/1/2020 | 166 | 1800 | 14 |
 | 2/1/2020 | 170 | 2200 | 18 |
 | 1/1/2020 | 180 | 2300 | 17 |
 |----------------------------|

Table 2 has a list of ticker symbols, as well as two dates and placeholders for the stock price on each date. Date1 is always an earlier date than Date2, and each of Date1 and Date2 corresponds with a date in Table 1. Note that Date1 and Date2 are different for each row of Table 2.

My objective is to pull the applicable PriceOnDate1 and PriceOnDate2 into Table 2 similar to VLOOKUP / HLOOKUP functions in Excel. (I can't use Excel going forward on this, as the file is too big for Excel to handle). Then I can calculate the return for each row by a formula like this: (Date2 - Date1) / Date1

Assume I want Table 2 to look like this, but I am unable to pull in the pricing data for PriceOnDate1 and PriceOnDate2:

|-----------------------------------------------------------|
| Ticker | Date1    | Date2    |PriceOnDate1 |PriceOnDate2  |
|-----------------------------------------------------------|
| MSFT   | 1/1/2020 | 4/1/2020 | _________   | ________     | 
| MSFT   | 2/1/2020 | 6/1/2020 | _________   | ________     |   
| AMZN   | 5/1/2020 | 6/1/2020 | _________   | ________     |
| EPD    | 1/1/2020 | 3/1/2020 | _________   | ________     |   
| EPD    | 1/1/2020 | 4/1/2020 | _________   | ________     |
|-----------------------------------------------------------|

My question is whether there is a way to use R to pull into Table 2 the closing price data from Table 1 for each Date1 and Date2 in each row of Table 2. For instance, in the first row of Table 2, ideally the R code would pull in 180 for PriceOnDate1 and 176 for PriceOnDate2.

I've tried searching for answers, but I am unable to craft a solution that would allow me to do this in R Studio. Can anyone please help me with a solution? I greatly appreciate your time. THANK YOU!!

Upvotes: 3

Views: 1061

Answers (3)

eipi10
eipi10

Reputation: 93821

This can be done with a single join if both data frames are in long format, followed by a pivot_wider to get the desired final shape.

The code below uses @Adam's sample data. Note that in the sample data, the dates are coded as factors. You'll probably want your dates coded as R's Date class in your real data.

library(tidyverse)

table_2 %>% 
  pivot_longer(-Ticker, values_to="Date") %>% 
  left_join(
    table_1 %>% 
      pivot_longer(-Date, names_to="Ticker", values_to="Price")
  ) %>% 
  pivot_wider(names_from=name, values_from=c(Date, Price)) %>% 
  rename_all(~gsub("Date_", "", .))
  Ticker Date1    Date2    Price_Date1 Price_Date2
1 MSFT   1/1/2020 4/1/2020         180         176
2 AMZN   5/1/2020 6/1/2020        2200        2600

Upvotes: 0

user10917479
user10917479

Reputation:

Working in something like R requires you to think of the data a bit differently. Your Table 1 is probably easiest to work with pivoted into a long format. You can then just join together on the Ticker and Date to pull the values you want.

Data:

table_1 <- data.frame(Date = c("6/1/2020", "5/1/2020", "4/1/2020", "3/1/2020", 
                               "2/1/2020", "1/1/2020"),
                      MSFT = c(196, 186, 176, 166, 170, 180),
                      AMZN = c(2600, 2200, 2000, 1800, 2200, 2300),
                      EPD = c(19, 20, 15, 14, 18, 17))

# only created part of Table 2
table_2 <- data.frame(Ticker = c("MSFT", "AMZN"),
                      Date1 = c("1/1/2020", "5/1/2020"),
                      Date2 = c("4/1/2020", "6/1/2020"))

Solution:

The tidyverse approach is pretty easy here.

library(dplyr)
library(tidyr)

First, pivot Table 1 to be longer.

table_1_long <- table_1 %>% 
  pivot_longer(-Date, names_to = "Ticker", values_to = "Price")

Then join in the prices that you want by matching the Date and Ticker.

table_2 %>% 
  left_join(table_1_long, by = c(Date1 = "Date", "Ticker")) %>% 
  left_join(table_1_long, by = c(Date2 = "Date", "Ticker")) %>% 
  rename(PriceOnDate1 = Price.x,
         PriceOnDate2 = Price.y)

#   Ticker    Date1    Date2 PriceOnDate1 PriceOnDate2
# 1   MSFT 1/1/2020 4/1/2020          180          176
# 2   AMZN 5/1/2020 6/1/2020         2200         2600

Upvotes: 0

Daniel V
Daniel V

Reputation: 1386

The mapply function would do it here:

Let's say your first table is stored in a data.frame called df and the second in a data.frame called df2

df2$PriceOnDate1 <- mapply(function(ticker, date){temp[[ticker]][df$Date == date]}, df2$Ticker, df2$Date1)
df2$PriceOnDate2 <- mapply(function(ticker, date){temp[[ticker]][df$Date == date]}, df2$Ticker, df2$Date2)

In this code, the Hlookup is the double brackets ([[), which returns the column with that name. The VLOOKUP is the single brackets ([) which returns the value in a certain position.

Upvotes: 0

Related Questions