Reputation: 307
I'm new to RStudio and have a two-column CSV file (response, id) in which every response has an assigned id number. E.g., rows 1-250 are assigned id 1, rows 251-311 are assigned id 2, etc.
Can I write a loop that accepts an id number and has R generate word frequencies for the responses with that id? The output would go to a new CSV file.
Is this doable? Any examples would be appreciated.
Upvotes: 1
Views: 382
Reputation: 51680
Sure it is!
For instance:
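# Generate some random data
data <- data.frame(id = rep(1:10, each = 200), val = rnorm(2000))
# Keep only the rows with id 5 and bin their values without plotting
h <- hist(subset(data, id == 5)$val, plot = FALSE)
# write.csv takes the object first, then the file name
write.csv(h$counts, "output.csv")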
EDIT How this works:
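subset(data, id == 5)
will get only the rows for which the column called id equals 5. Note the double equals sign: with a single one, subset silently ignores the argument and returns every row.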
Once we have selected only the rows we want (5 is just an example; you can pass any value, including one stored in a variable), the $ operator extracts the column of values you want to count.
So subset(data, id == 5)$val means: take all the rows with id equal to 5, then take the column called val.
In my example val is numeric, so I use the hist function to get counts (plot = FALSE is only there to suppress the graphical output). If you have strings you can use the table function instead.
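For the string case, a minimal sketch with table might look like the following. The responses data frame is made-up toy data standing in for the asker's file, and lowercasing then splitting on whitespace is only one assumed way to define a "word".

```r
# Hypothetical toy data standing in for the two-column file (response, id)
responses <- data.frame(
  response = c("the cat sat", "the dog ran", "a cat ran"),
  id = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Keep only the rows for one id (note the double equals sign)
one_id <- subset(responses, id == 1)

# Lowercase each response, split it on whitespace, and flatten to one vector
words <- unlist(strsplit(tolower(one_id$response), "\\s+"))

# table() counts how often each distinct word occurs
freq <- table(words)
print(freq)
```

Punctuation is left attached to words here; stripping it first (for example with gsub) would be a reasonable refinement for real text.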
Finally, write.csv outputs the result to a CSV file. See ?write.csv or ?write.table for extensive help on the (many) options of these functions.
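Putting the pieces together for the question as asked, here is a rough sketch of a loop that takes each id in turn and writes its word frequencies to a separate CSV file. The helper name word_freq_for_id, the toy data, the output file names, and the use of tempdir() are all assumptions for illustration; in practice the data frame would come from read.csv on the real file.

```r
# Hypothetical helper: word frequencies for a single id, as a data frame
word_freq_for_id <- function(df, target_id) {
  rows <- df[df$id == target_id, ]
  words <- unlist(strsplit(tolower(rows$response), "\\s+"))
  words <- words[words != ""]  # drop empty strings caused by leading spaces
  # Columns of the result are named "word" and "Freq"
  as.data.frame(table(word = words), stringsAsFactors = FALSE)
}

# Toy data standing in for the real (response, id) file
responses <- data.frame(
  response = c("the cat sat", "the dog ran", "a cat ran"),
  id = c(1, 1, 2),
  stringsAsFactors = FALSE
)

out_dir <- tempdir()  # assumed output location; change as needed

# One output file per distinct id
for (i in unique(responses$id)) {
  freq <- word_freq_for_id(responses, i)
  out_file <- file.path(out_dir, paste0("word_freq_id_", i, ".csv"))
  write.csv(freq, out_file, row.names = FALSE)
}
```

Calling word_freq_for_id(responses, 2) on its own gives the frequencies for a single id without writing anything.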
Upvotes: 1
Reputation: 7664
I may be misreading the original question, but the OP asks for word counts that correspond to the groups of assigned id numbers.
If so, wouldn't dplyr and a regex to count words address the need? Something like:
library(dplyr)
new.df <- data %>% # start with the two-column data frame of id and word strings
  group_by(id) %>% # aggregate by id, e.g. id 1, id 2
  summarise(WordCount = sum(lengths(strsplit(as.character(response), "\\s+")))) # count all the words in the response column. There are multiple ways to count words.
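As one of those other ways to count words, here is a base R sketch that needs no extra packages. The responses data frame is hypothetical toy data, and a "word" is assumed to be any run of non-whitespace characters.

```r
# Hypothetical toy data standing in for the two-column file (response, id)
responses <- data.frame(
  response = c("the cat sat", "the dog ran", "a cat ran home"),
  id = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Words per row: trim outer whitespace, split on whitespace, count the pieces
n_words <- lengths(strsplit(trimws(responses$response), "\\s+"))

# Sum the per-row counts within each id
word_count <- tapply(n_words, responses$id, sum)
print(word_count)
```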
Upvotes: 1