RL_Pug

Reputation: 857

How can I wrap lines of code into a function that I can run with one command in R?

I'm working in a script and doing multiple tasks with the same sets of data. Because each task takes a few hundred lines of code, I end up clearing my global environment before moving on to the next task. Then I have to rerun the lines at the top of the script to import my data again. I want to type a single command that reimports the data automatically once I'm done with one task, so I can start on the next.

Here is essentially what I run every time I need to work on the next task. I import my data with the read_csv function and then filter for the rows I need.

d2015 <- read_csv("Data 2015 CSV.csv")
d2016 <- read_csv("Data 2016 CSV.csv")
d2017 <- read_csv("Data 2017 CSV.csv")
d2018 <- read_csv("Data 2018 CSV.csv")

dta_15 <- d2015 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
dta_16 <- d2016 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
dta_17 <- d2017 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
dta_18 <- d2018 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))

I tried putting it all in a loop, but that didn't work:

rundata <- {
d2015 <- read_csv("Data 2015 CSV.csv")
d2016  <- read_csv("Data 2016 CSV.csv")
d2017 <- read_csv("Data 2017 CSV.csv")
d2018 <- read_csv("Data 2018 CSV.csv")

dta_15 <- d2015 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
dta_16 <- d2016 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
dta_17 <- d2017 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
dta_18 <- d2018 %>% filter(`Number` %in% c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                                                   "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510"))
}

How can I create one command that will rerun all these commands? Best.

Upvotes: 0

Views: 96

Answers (1)

akrun

Reputation: 887008

We can do this with map from purrr

library(dplyr)
library(purrr)
library(readr)

As the values to filter are the same across all the datasets, we can create an object

nm1 <- c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603", 
                           "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510")

Get the files whose names follow the specific pattern

files <- list.files(pattern = "^Data \\d{4} CSV\\.csv$")

Loop over the files, read each one with read_csv from readr, and filter the rows to create a list of subsetted data.frames/tibbles. It is better to keep them in a list than as individual objects in the global env

lst1 <- map(files, ~ read_csv(.x) %>%
                   filter(Number %in% nm1)
    )
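
To get the single command the question asks for, the steps above could be wrapped in a function. This is only a minimal sketch, assuming the CSV files sit in one directory and all have a Number column; load_data and its arguments dir and codes are hypothetical names, not part of the original answer.

```r
library(dplyr)
library(purrr)
library(readr)

# Hypothetical helper: re-import and filter every yearly file in one call.
# `dir` and `codes` are illustrative argument names.
load_data <- function(dir = ".",
                      codes = c("TX-500", "TX-600", "TX-503", "TX-700", "TX-603",
                                "AZ-502", "MI-501", "LA-503", "GA-500", "FL-510")) {
  # Find the files that match the naming pattern used in the question
  files <- list.files(dir, pattern = "^Data \\d{4} CSV\\.csv$", full.names = TRUE)

  # Pull the four-digit year out of each file name to label the list elements
  years <- sub("^Data (\\d{4}) CSV\\.csv$", "\\1", basename(files))

  # Read each file and keep only the rows whose Number is in `codes`
  files %>%
    set_names(years) %>%
    map(~ read_csv(.x) %>% filter(Number %in% codes))
}
```

After clearing the environment, re-sourcing this definition and running dta <- load_data() should rebuild everything in one step, and a single year can then be pulled out by name, e.g. dta[["2015"]]. Returning a list instead of assigning d2015, d2016, ... inside the function is deliberate: objects created inside a function are local to it, so they would not appear in the global environment anyway.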

Upvotes: 2
