Atwp67

Reputation: 307

Looping in word count

I'm new to RStudio and have a two-column CSV file (response, id) in which every row of the response column is assigned an id number. For example, rows 1-250 are assigned id 1, rows 251-311 are assigned id 2, etc.

Can I write a loop that accepts an id number and has R generate word frequencies for the responses with that id? The output would go to a new CSV file.

Is this doable? Any examples would be appreciated.

Upvotes: 1

Views: 382

Answers (2)

nico

Reputation: 51680

Sure it is!

For instance:

# Generate some random data
data <- data.frame(id = rep(1:10, each = 200), val = rnorm(2000))
# Bin the values for id 5 without drawing the histogram
h <- hist(subset(data, id == 5)$val, plot = FALSE)
# Write the bin counts to a CSV file (data first, file name second)
write.csv(h$counts, "output.csv")

EDIT: How this works:

subset(data, id == 5) will get only the rows for which the column called id equals 5. Note the double ==, which tests for equality; a single = would not filter anything.

Once we have selected only the rows we want (obviously 5 is just an example; you can pass whatever value you want, including one stored in a variable), you get the values you want to count using the $ operator.

So subset(data, id == 5)$val means: take all the rows where id equals 5 and then consider the column called val.

In my example val is numeric, so I use the hist function to get binned counts (plot = FALSE is only there to suppress the graphical output). If you have strings you can use the table function instead.
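For string data, the table approach might look like the sketch below. The data frame `responses` and its contents are made up for illustration, and splitting on whitespace is only one simple assumption about what counts as a word.

```r
# Hypothetical example data: two ids with a few short responses
responses <- data.frame(
  id = c(1, 1, 2),
  response = c("the cat sat", "the dog ran", "a cat ran"),
  stringsAsFactors = FALSE
)

# Keep the responses for one id, lower-case them and split on whitespace
txt <- subset(responses, id == 1)$response
words <- unlist(strsplit(tolower(txt), "\\s+"))

# table() counts how often each distinct word occurs
freq <- table(words)
print(freq)
```

Punctuation is not handled here; if the responses contain commas or full stops, they would need to be stripped before splitting.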

Finally, write.csv outputs the result to a csv file. See ?write.csv or ?write.table for extensive help on the (many) options of these functions.
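To tie this back to the question, here is a minimal sketch of a loop that writes one word-frequency CSV per id. The helper name `write_word_freq`, the example data, the output file names and the use of `tempdir()` are all assumptions for illustration, not part of the original data.

```r
# Hypothetical helper: word frequencies for one id, written to a CSV file
write_word_freq <- function(df, id_value, out_dir = tempdir()) {
  txt <- df$response[df$id == id_value]             # responses for this id
  words <- unlist(strsplit(tolower(txt), "\\s+"))   # split on whitespace
  words <- words[nzchar(words)]                     # drop empty strings
  freq <- as.data.frame(table(word = words),
                        responseName = "count",
                        stringsAsFactors = FALSE)
  out_file <- file.path(out_dir, paste0("word_freq_id_", id_value, ".csv"))
  write.csv(freq, out_file, row.names = FALSE)
  out_file                                          # return the path written
}

# Made-up data standing in for the (response, id) CSV
responses <- data.frame(
  id = c(1, 1, 2),
  response = c("the cat sat", "the dog ran", "a cat ran"),
  stringsAsFactors = FALSE
)

# Loop over every id that occurs in the data
files <- character(0)
for (i in unique(responses$id)) {
  files <- c(files, write_word_freq(responses, i))
}
```

With a real file you would build `responses` with read.csv first and point `out_dir` at a folder of your choice.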

Upvotes: 1

lawyeR

Reputation: 7664

I may be mis-reading the original question, but OP asks for word counts that correspond to the groups of assigned id numbers.

If so, would not dplyr and a regex to count words address the need? Something like:

library(dplyr)
new.df <- data %>%  # start with the two-column data frame of id and response strings
  group_by(id) %>%  # group the rows by id, e.g. id 1, id 2
  summarise(WordCount = sum(lengths(strsplit(trimws(response), "\\s+"))))  # split each response on whitespace and total the words per id. There are multiple ways to count words.
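If dplyr is not available, roughly the same per-id total can be sketched in base R. The data frame `responses` below is invented for illustration, and the column names response and id are assumed from the question.

```r
# Made-up data standing in for the (response, id) CSV
responses <- data.frame(
  id = c(1, 1, 2),
  response = c("the cat sat", "the dog ran", "a cat"),
  stringsAsFactors = FALSE
)

# Words per response: split on runs of whitespace and count the pieces
n_words <- lengths(strsplit(trimws(responses$response), "\\s+"))

# Total the per-response counts within each id
word_count <- aggregate(n_words, by = list(id = responses$id), FUN = sum)
names(word_count)[2] <- "WordCount"
print(word_count)
```

This gives one row per id with the total number of words, not a frequency for each distinct word.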

Upvotes: 1
