user113156

Reputation: 7147

Convert the following into a loop function

I have a small data set and I am trying to subset the data.frame using the grepl function.

I have:

year_list <- list("2013", "2014", "2015", "2016", "2017")

test.2013 <- subset(searches[, 1:2], grepl(year_list[1], searches$date))
test.2014 <- subset(searches[, 1:2], grepl(year_list[2], searches$date))
test.2015 <- subset(searches[, 1:2], grepl(year_list[3], searches$date))
test.2016 <- subset(searches[, 1:2], grepl(year_list[4], searches$date))
test.2017 <- subset(searches[, 1:2], grepl(year_list[5], searches$date))

I want a loop that subsets columns 1 and 2 (the date column and the hits column) into a new data.frame for each year.

That is, for each year in year_list, I want to apply grepl to the date column of the searches data.frame and store the matching rows in a new data.frame, using a loop or something less repetitive than what I currently have.

Dataframe

         date hits keyword   geo gprop category
1: 2013-01-06   23   Price world   web        0
2: 2013-01-13   23   Price world   web        0
3: 2013-01-20   40   Price world   web        0
4: 2013-01-27   25   Price world   web        0
5: 2013-02-03   21   Price world   web        0
6: 2013-02-10   19   Price world   web        0

Upvotes: 0

Views: 62

Answers (1)

Manuel Bickel

Reputation: 2206

If I understand correctly, you want to split a data.frame into several data.frames based on the year in the date column. In that case, consider the following solution, which uses split to produce a list of the desired data.frame subsets, one per year. I have used your data (as a plain data.frame, not a data.table) and added two rows for an additional year.

df <- read.table(text = "
date       hits keyword geo   gprop category
2013-01-06   23 Price   world web          0
2013-01-13   23 Price   world web          0
2013-01-20   40 Price   world web          0
2013-01-27   25 Price   world web          0
2013-02-03   21 Price   world web          0
2013-02-10   19 Price   world web          0
2014-02-03   21 Price   world web          0
2014-02-10   19 Price   world web          0
", header = TRUE, stringsAsFactors = FALSE)

# extract only the first four digits (the year) from the date column
# to generate the splitting groups
df_split <- split(df[, c("date", "hits")], gsub("(\\d{4})(.*$)", "\\1", df$date))

df_split
# $`2013`
#         date hits
# 1 2013-01-06   23
# 2 2013-01-13   23
# 3 2013-01-20   40
# 4 2013-01-27   25
# 5 2013-02-03   21
# 6 2013-02-10   19
#
# $`2014`
#         date hits
# 7 2014-02-03   21
# 8 2014-02-10   19
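
If you would rather keep the grepl approach from the question, a loop over the years with lapply might look roughly like the sketch below. The small `searches` data.frame here is a hypothetical stand-in for your data, and `years` and `by_year` are names I made up for illustration. The pattern is anchored with "^" so that a year only matches at the start of the date string.

```r
# Hypothetical stand-in for the asker's `searches` data (assumption)
searches <- data.frame(
  date = c("2013-01-06", "2013-01-13", "2014-02-03", "2014-02-10"),
  hits = c(23, 23, 21, 19),
  keyword = "Price",
  stringsAsFactors = FALSE
)

years <- c("2013", "2014")

# Build one subset per year; "^" anchors the year at the start of the date
by_year <- lapply(years, function(y) {
  searches[grepl(paste0("^", y), searches$date), c("date", "hits")]
})

# Name the list elements so each subset can be looked up by year
names(by_year) <- years

by_year[["2014"]]
```

Keeping the subsets in one named list, rather than in separate variables such as test.2013 and test.2014, makes it easy to apply the same step to every year later with lapply.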

Upvotes: 1
