user8959427

Reputation: 2067

mapping a function over years and using a data frame

I have a data frame which looks something like:

ID  text  year  DATE
G45 txt1 2010  01/01/2010
G45 txt2 2011  01/01/2011
G45 txt3 2012  01/01/2012
B78 txt4 2010  01/01/2010
B78 txt5 2011  01/01/2011
C12 txt6 2013  01/01/2013

I can collect the years from the df above so I will have unique values of:

year_to_process <- c("2010", "2011", "2012", "2013")

My function does the following:

1) I filter by year since I am only interested in the cosine calculation from year t to t-1.

2) I put the text into a VCorpus and clean it

3) I compute the cosine matrix

4) I save the cosine matrix to file

text_to_cosine <- function(x){
  data <- x %>% filter(year == year_to_process | year == year_to_process - 1)
  # some additional text cleaning
  cosine_matrix <- cosine(data)
  saveRDS(cosine_matrix, file = cosine_matrix.rds)
}

The code works when I set a single year, e.g. year_to_process <- "2012". However, when I try to map or walk over all the years I get an error:

walk(text_to_cosine, year_to_process)
Error in .x[[i]] : object of type 'closure' is not subsettable

map(text_to_cosine, year_to_process)
Error: .x is not a vector (closure)

How can I make it so that I can feed in each of the years?

i.e. 1) Take 2010 data and process

2) Take 2011 and 2010 data and process

3) Take 2012 and 2011 data and process

4) Take 2013 and 2012 data and process

EDIT:

Here is the function

text_to_cosine <- function(x){
  data <- x %>% filter(filing_date_year == year_to_process | filing_date_year == year_to_process - 1)
…
}

I am now thinking the error is coming from the filter part. Could it be that the first year (say 2005) doesn't have a t-1 and this is causing the error?

EDIT2:

When I apply the following, I get the error below:

text_to_cosine <- function(x, year_to_process){
  data <- x %>% filter(filing_date_year == year_to_process | filing_date_year == year_to_process - 1)
…
}

text_to_cosine(df, year_to_process)
Error in gzfile(file, mode) : invalid 'description' argument
In addition: Warning message:
In if (file == "") stop("'file' must be non-empty string") :

Upvotes: 0

Views: 89

Answers (1)

David Z

Reputation: 7041

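You seem to have mixed up x and the function's input. The function should take the year as its argument, with x (the data frame) looked up in the calling environment: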

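text_to_cosine <- function(year_to_process){
    # the years are stored as strings in the question, so convert before subtracting
    year_to_process <- as.numeric(year_to_process)
    # x is the data frame, found in the calling environment
    data <- x %>% filter(year == year_to_process | year == year_to_process - 1)
    # some additional text cleaning
    cosine_matrix <- cosine(data)
    # the file name must be a quoted string; one file per year avoids overwriting
    saveRDS(cosine_matrix, file = paste0("cosine_matrix_", year_to_process, ".rds"))
}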
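Note also that map() and walk() take the data first and the function second. In the question the function was passed as the first argument, which is why purrr complains that a closure is not a vector. Below is a minimal sketch of the corrected call pattern, assuming a toy copy of the data frame from the question. The helper rows_for_year is hypothetical: it only returns the filtered rows instead of cleaning the text and saving a cosine matrix, so the per-year selection can be inspected.

```r
library(dplyr)
library(purrr)

# Toy stand-in for the data frame in the question
df <- data.frame(
  ID   = c("G45", "G45", "G45", "B78", "B78", "C12"),
  text = paste0("txt", 1:6),
  year = c(2010, 2011, 2012, 2010, 2011, 2013)
)

# Hypothetical helper: returns the rows for year t and t - 1
# (the real function would go on to clean the text and save the matrix)
rows_for_year <- function(year_to_process, data) {
  yr <- as.numeric(year_to_process)  # years arrive as strings
  data %>% filter(year == yr | year == yr - 1)
}

year_to_process <- c("2010", "2011", "2012", "2013")

# Vector of years first, function second
subsets <- map(year_to_process, function(y) rows_for_year(y, df))
names(subsets) <- year_to_process

# Number of rows selected for each year
row_counts <- map_int(subsets, nrow)
```

For the first year there are no t-1 rows, so the filter simply returns the rows of that year; a condition that matches nothing is not an error. Once the function only writes files, walk(year_to_process, text_to_cosine) takes the same argument order and fits better than map(), since the return values are not needed.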

Upvotes: 1
