mateusk
mateusk

Reputation: 25

How to insert a column with less rows in a dataframe in R

For example, I have the following dataframe:

Date           Price
2020-01-01      500
2020-01-02      550
2020-01-03      480
2020-01-04      420
2020-01-05      450
2020-01-06      390
2020-01-07      430
2020-01-08      480
2020-01-09      490
2020-01-10      485

And I want to insert the "Prince2" column in the previous data frame, so that I repeat the line until the next similar date:

Date           Price2
2020-01-01       50
2020-01-04       20
2020-01-06       90
2020-01-07       30
2020-01-10       80

Would be like this:

Date           Price     Price2
2020-01-01      500        50
2020-01-02      550        50
2020-01-03      480        50
2020-01-04      420        20
2020-01-05      450        20
2020-01-06      390        90
2020-01-07      430        30
2020-01-08      480        30
2020-01-09      490        30
2020-01-10      485        80

Can someone help me?

Upvotes: 2

Views: 677

Answers (1)

akrun
akrun

Reputation: 887138

We can use merge with na.locf0 from zoo

library(zoo)
transform(merge(df1, df2, by = 'Date',all.x = TRUE), Price2 = na.locf0(Price2))

Or with data.table

library(data.table)
setDT(df1)[df2, Price2 := Price2, on = .(Date)][, Price2 := na.locf0(Price2)][]

Or with tidyverse

library(dplyr)
library(tidyr)
left_join(df1, df2, by = 'Date') %>%
      fill(Price2)
#         Date Price Price2
#1  2020-01-01   500     50
#2  2020-01-02   550     50
#3  2020-01-03   480     50
#4  2020-01-04   420     20
#5  2020-01-05   450     20
#6  2020-01-06   390     90
#7  2020-01-07   430     30
#8  2020-01-08   480     30
#9  2020-01-09   490     30
#10 2020-01-10   485     80

Or convert to Date class and do a complete

df1$Date <- as.Date(df1$Date)
df2$Date <- as.Date(df2$Date)
df2 %>% 
   complete(Date = seq(min(df1$Date), max(df1$Date), by = '1 day')) %>%
   fill(Price2) %>% 
   left_join(df1)

data

df1 <- structure(list(Date = c("2020-01-01", "2020-01-02", "2020-01-03", 
"2020-01-04", "2020-01-05", "2020-01-06", "2020-01-07", "2020-01-08", 
"2020-01-09", "2020-01-10"), Price = c(500L, 550L, 480L, 420L, 
450L, 390L, 430L, 480L, 490L, 485L)), class = "data.frame", row.names = c(NA, 
-10L))


df2 <- structure(list(Date = c("2020-01-01", "2020-01-04", "2020-01-06", 
"2020-01-07", "2020-01-10"), Price2 = c(50L, 20L, 90L, 30L, 80L
)), class = "data.frame", row.names = c(NA, -5L))

Upvotes: 2

Related Questions