Reputation: 315
I have a dataset containing economic information on individual countries. For each country, I would like to retrieve the first and last date on which each variable is available. The dataset is below.
# I have this
library(zoo)  # provides as.yearqtr()
country <- c("AT","AT","AT","AT","BE","BE","CY","CY","CY")
date_yq <- c("2001 Q1","2001 Q2","2001 Q3","2001 Q4","2003 Q2","2003 Q4","2006 Q1","2006 Q2","2006 Q3")
gdp <- c(NA,NA,1.2,1.3,NA,2.7,3.1,3.2,3.3)
invest <- c(NA,120,140,160,NA,210,NA,310,NA)
df <- data.frame(country,date_yq,gdp,invest)
df$date_yq <- as.yearqtr(date_yq)
View(df)
# I would like to have this
country <- c("AT","BE","CY")
gdp_min_date <- c("2001 Q3","2003 Q4","2006 Q1")
gdp_max_date <- c("2001 Q4","2003 Q4","2006 Q3")
invest_min_date <- c("2001 Q2","2003 Q4","2006 Q2")
invest_max_date <- c("2001 Q4","2003 Q4","2006 Q2")
df_dates <- data.frame(country,gdp_min_date,gdp_max_date,invest_min_date,invest_max_date)
View(df_dates)
Can you suggest something to solve this? I've been looking around but I couldn't find any solution. Thank you.
EDIT: I am not looking for the minimum value of GDP, but for the first date on which GDP is available in each country.
Upvotes: 1
Views: 62
Reputation: 6759
library(dplyr)
# For each country, keep only the dates where the variable is not NA,
# then take the earliest and latest of those dates.
df_dates <- df %>%
  group_by(country) %>%
  summarize(
    gdp_min_date    = min(date_yq[!is.na(gdp)]),
    gdp_max_date    = max(date_yq[!is.na(gdp)]),
    invest_min_date = min(date_yq[!is.na(invest)]),
    invest_max_date = max(date_yq[!is.na(invest)])
  )
df_dates
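With more than a couple of variables, writing each min/max pair by hand gets tedious. Here is a possible sketch of the same idea using across(), assuming dplyr 1.0 or later and zoo are installed. The data frame is rebuilt from the question so the block runs on its own, and the explicit format string passed to as.yearqtr() is my assumption about how the quarters are written.

```r
library(dplyr)
library(zoo)

# Data recreated from the question, with quarters written consistently
df <- data.frame(
  country = c("AT", "AT", "AT", "AT", "BE", "BE", "CY", "CY", "CY"),
  date_yq = as.yearqtr(
    c("2001 Q1", "2001 Q2", "2001 Q3", "2001 Q4",
      "2003 Q2", "2003 Q4", "2006 Q1", "2006 Q2", "2006 Q3"),
    format = "%Y Q%q"
  ),
  gdp    = c(NA, NA, 1.2, 1.3, NA, 2.7, 3.1, 3.2, 3.3),
  invest = c(NA, 120, 140, 160, NA, 210, NA, 310, NA)
)

df_dates <- df %>%
  group_by(country) %>%
  summarise(
    across(
      c(gdp, invest),  # list every variable of interest here
      list(
        # .x is the current variable; date_yq is looked up in the same group
        min_date = ~ min(date_yq[!is.na(.x)]),
        max_date = ~ max(date_yq[!is.na(.x)])
      )
    )
  )

df_dates
```

The default across() naming scheme joins the column name and the function name with an underscore, which gives gdp_min_date, gdp_max_date, invest_min_date and invest_max_date. One caveat: if a variable is NA for every row of a country, min() and max() are called on an empty vector and return Inf/-Inf with a warning, so you may want to handle that case separately.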
Upvotes: 1