Thomas Philips

Reputation: 1089

Filtering a dataframe in Base R 4.3 using the native pipe

I download a monthly time series of unemployment rates from the Federal Reserve using the alfred package:

df <- alfred::get_alfred_series("UNRATE")

Because unemployment data is revised after its first release, df contains every observation of UNRATE, revised and unrevised, along with the date on which each value was posted (realtime_period).

> head(df)
        date realtime_period UNRATE
1 1948-01-01      1960-03-15    3.5
2 1948-02-01      1960-03-15    3.8
3 1948-03-01      1960-03-15    4.0
4 1948-04-01      1960-03-15    4.0
5 1948-05-01      1960-03-15    3.6
6 1948-06-01      1960-03-15    3.8

I want to filter the data frame to keep only the first realtime_period associated with each date, and can do it with dplyr:

df |>
    mutate(Delta = realtime_period - date) |>
    group_by(date) |>
    filter(Delta == min(Delta)) |>
    ungroup()

Question: How do I do this in base R (I'm using R 4.3.3) instead of dplyr? I'm trying to avoid the tidyverse and stick with base R for consistency, as its syntax rarely changes.

Sincerely

Thomas Philips

Upvotes: 1

Views: 66

Answers (2)

Darren Tsai

Reputation: 35604

You can replace mutate with transform, and replace the grouped filter with subset + ave.

df |>
  transform(Delta = abs(realtime_period - date)) |>
  subset(Delta == ave(Delta, date, FUN = min))

transform and subset are both from {base}. ave is from {stats}, which ships with every R installation and is attached by default, so no extra packages are needed.
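To see what ave is doing here, a minimal sketch on a made-up data frame may help. The toy values below are invented for illustration and are not real ALFRED data, and converting Delta to plain numbers with as.numeric is my own simplification rather than part of the answer above.

```r
# Toy stand-in for the ALFRED data (values are made up for illustration)
toy <- data.frame(
  date            = as.Date(c("2020-01-01", "2020-01-01", "2020-02-01", "2020-02-01")),
  realtime_period = as.Date(c("2020-02-07", "2020-03-06", "2020-03-06", "2020-04-03")),
  UNRATE          = c(3.6, 3.5, 3.5, 3.6)
)

# Lag in days between the observation date and the posting date
toy$Delta <- as.numeric(toy$realtime_period - toy$date)

# ave() returns a vector as long as its input, holding each group's minimum
group_min <- ave(toy$Delta, toy$date, FUN = min)

# Keep only the rows whose lag equals the minimum of their group
first_release <- toy[toy$Delta == group_min, ]
print(first_release)
```

Because ave returns one value per row rather than one per group, the comparison lines up element by element with Delta, which is what lets subset act like a grouped filter.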

Upvotes: 2

Nir Graham

Reputation: 5167

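# Work on a copy and add the lag between observation date and posting date
df_new <- df
df_new$Delta <- abs(df_new$realtime_period - df_new$date)

# Split by date, keep the row(s) with the smallest lag in each piece,
# then bind the pieces back into one data frame.
# The _ placeholder for the native pipe needs R 4.2 or later.
df_new <- split(df_new, ~date) |>
  lapply(\(x) subset(x, Delta == min(Delta))) |>
  do.call(rbind, args = _)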

Upvotes: 0
