Reputation: 7147
I have a small data set and I am trying to subset the data.frame using the grepl function.
I have:
year_list <- list("2013", "2014", "2015", "2016", "2017")
test.2013 <- subset(searches[, 1:2], grepl(year_list[1], searches$date))
test.2014 <- subset(searches[, 1:2], grepl(year_list[2], searches$date))
test.2015 <- subset(searches[, 1:2], grepl(year_list[3], searches$date))
test.2016 <- subset(searches[, 1:2], grepl(year_list[4], searches$date))
test.2017 <- subset(searches[, 1:2], grepl(year_list[5], searches$date))
I am trying to create a loop that subsets columns 1 to 2 (the date column and the hits column) into a new data.frame.
I want to take each year in year_list, apply the grepl function to the date column of the searches data.frame, and return the matching rows in a new data.frame, but using a loop or something less repetitive than what I currently have.
Dataframe
date hits keyword geo gprop category
1: 2013-01-06 23 Price world web 0
2: 2013-01-13 23 Price world web 0
3: 2013-01-20 40 Price world web 0
4: 2013-01-27 25 Price world web 0
5: 2013-02-03 21 Price world web 0
6: 2013-02-10 19 Price world web 0
Upvotes: 0
Reputation: 2206
If my understanding is correct that you want to split a data.frame into several data.frames on the basis of the entries in the date column, then you might consider the following solution, which produces a list of the desired data.frame subsets using split. I have used your data (not as a data.table) and added two rows representing an additional year. I hope my understanding was correct.
df <- read.table(text = "
date hits keyword geo gprop category
2013-01-06 23 Price world web 0
2013-01-13 23 Price world web 0
2013-01-20 40 Price world web 0
2013-01-27 25 Price world web 0
2013-02-03 21 Price world web 0
2013-02-10 19 Price world web 0
2014-02-03 21 Price world web 0
2014-02-10 19 Price world web 0
", header = TRUE, stringsAsFactors = FALSE)
# extract the first four digits of the date column
# to use as the splitting groups
df_split <- split(df[, c("date", "hits")], gsub("(\\d{4})(.*$)", "\\1", df$date))
df_split
# $`2013`
# date hits
# 1 2013-01-06 23
# 2 2013-01-13 23
# 3 2013-01-20 40
# 4 2013-01-27 25
# 5 2013-02-03 21
# 6 2013-02-10 19
#
# $`2014`
# date hits
# 7 2014-02-03 21
# 8 2014-02-10 19
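
If you would rather keep the grepl approach from the question and drive it from a list of years, a minimal sketch with lapply could look like the one below. The `searches` data frame here is a small hypothetical stand-in built from a few rows of the example data, and `years` and `by_year` are names chosen only for illustration.

```r
# Hypothetical stand-in for the asker's `searches` data (a few rows from the post).
searches <- data.frame(
  date = c("2013-01-06", "2013-01-13", "2014-02-03", "2014-02-10"),
  hits = c(23, 23, 21, 19),
  keyword = "Price",
  stringsAsFactors = FALSE
)

# Years to look for; 2015 is included on purpose although no row matches it.
years <- c("2013", "2014", "2015")

# One subset per year. The "^" anchor makes the year match only at the
# start of the date string, not somewhere in the middle.
by_year <- lapply(years, function(y) {
  searches[grepl(paste0("^", y), searches$date), c("date", "hits")]
})
names(by_year) <- years

by_year[["2013"]]        # rows whose date starts with 2013
nrow(by_year[["2015"]])  # 0, because no 2015 dates are present
```

One difference from split: split only creates groups for years that actually occur in the data, whereas this version returns an (empty) data.frame for every requested year. Which behaviour is preferable depends on what you do with the list afterwards.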
Upvotes: 1