Reputation: 23
I have data on mergers for various firms over 20 years. I used a for loop in R to split the data by year, which gives me 20 data frames in the global environment, named by year: Merger2000 to Merger2019. Now I want to write another for loop to find the unique companies in each data frame (that is, the unique firms in each year). Each company is identified by a unique company code (co_code). I know how to do this for each year separately. For example, for the year 2000, I would do something like:
uniquemerger2000 <- Merger2000 %>% distinct(co_code, .keep_all = TRUE)
How do I run a for loop that performs this operation for all years (that is, from 2000 to 2019)? Some indexing is required in the code, but I am not sure how to operationalise this in a loop.
Any help would be appreciated. Thanks!
Upvotes: 0
Views: 39
Reputation: 389265
It is usually better to keep the data in a single data frame or a list than in many separate objects in the global environment.
You can create one list object (list_data) bringing all the dataframes together and use lapply/map to keep unique rows from each dataframe.
library(dplyr)
library(purrr)
list_data <- mget(paste0('Merger', 2000:2019))
result <- map(list_data, ~.x %>% distinct(co_code, .keep_all = TRUE))
Or in base R:
result <- lapply(list_data, function(x) x[!duplicated(x$co_code), ])
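If an explicit for loop is still preferred, as the question asks, the same idea can be sketched by looking up each data frame by name with get() and storing the results in a list. This is only a minimal sketch: the three toy data frames, the deal column and the unique_mergers name are assumptions made up for illustration, and only three years are used to keep it short.

```r
# Toy stand-ins for the yearly data frames (hypothetical data, three years only)
Merger2000 <- data.frame(co_code = c(1, 1, 2), deal = c("a", "b", "c"))
Merger2001 <- data.frame(co_code = c(3, 3, 3), deal = c("d", "e", "f"))
Merger2002 <- data.frame(co_code = c(4, 5, 5), deal = c("g", "h", "i"))

years <- 2000:2002

# Pre-allocate a named list to hold one result per year
unique_mergers <- vector("list", length(years))
names(unique_mergers) <- paste0("uniquemerger", years)

for (i in seq_along(years)) {
  # Look up the data frame for this year by its name
  df <- get(paste0("Merger", years[i]))
  # Keep the first row for each co_code, as distinct(co_code, .keep_all = TRUE) does
  unique_mergers[[i]] <- df[!duplicated(df$co_code), ]
}

# Number of unique firms per year
sapply(unique_mergers, nrow)
```

With the real data, years would be 2000:2019. If separate objects such as uniquemerger2000 are really needed in the global environment, list2env(unique_mergers, envir = .GlobalEnv) should create them from the named list, although keeping the list is usually easier to work with.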
Upvotes: 1