Oliver

Reputation: 284

How to add count total column to dataframe in R

I have aggregated a dataframe by the number of times a single word appears in a dataset of meetings on a particular date, so it looks like this:

date <- c("2012-05-06", "2013-07-09", "2007-01-03")
word_count <- c("17", "2", "390")
df1 <- data.frame(date, word_count)

I also have a separate dataframe with total word counts for each of those dates, plus a number of other dates. It looks like this:

date <- c("2012-05-06", "2013-07-09", "2007-01-03", "2004-11-03", "1994-12-03")
word_total <- c("17000", "20", "39037", "39558", "58607")
df2 <- data.frame(date, word_total)

I now want to add another column to df1 containing the totals from df2 for the dates that are in df1, excluding any dates that are not in df1. I also want to transform the dataframe so that there is another column dividing word_count by word_total.

So the output would look like this:

date <- c("2012-05-06", "2013-07-09", "2007-01-03")
word_count <- c("17", "2", "390")
word_total <- c("17000", "20", "39037")
word_percentage <- c("0.001", "0.1", "0.00999")
df2 <- data.frame(date, word_count, word_total, word_percentage)

I know how to use transform to get word_percentage once I have word_total loaded in, but I have no idea how to add the relevant word_total values from df2. I have tried using merge and intersect to no avail. Any ideas?

Thank you in advance for your help!

Upvotes: 1

Views: 162

Answers (1)

akrun

Reputation: 887891

If the columns are numeric, just do a merge and then create the new column by dividing:

transform(merge(df1, df2, by = c('date')),
        word_percentage = round(word_count/word_total, 3))
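As a minimal, self-contained sketch of the merge approach, assuming the counts are already stored as numeric vectors (the inline data and the name `out` are only for illustration): merge() does an inner join by default, so dates that appear only in df2 are dropped. Note that merge() also sorts the result by the key column, so the row order may differ from df1.

```r
# Illustrative data with numeric columns (assumption: counts are not stored as strings)
df1 <- data.frame(date = c("2012-05-06", "2013-07-09", "2007-01-03"),
                  word_count = c(17, 2, 390))
df2 <- data.frame(date = c("2012-05-06", "2013-07-09", "2007-01-03",
                           "2004-11-03", "1994-12-03"),
                  word_total = c(17000, 20, 39037, 39558, 58607))

# Inner join on date: only dates present in both data frames are kept
out <- merge(df1, df2, by = "date")

# Add the ratio of the word's count to the total word count, rounded to 3 places
out <- transform(out, word_percentage = round(word_count / word_total, 3))
print(out)
```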

Or use match

df1$word_percentage <- df1$word_count/df2$word_total[match(df1$date, df2$date)]
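The match version can be sketched as below, again assuming numeric columns; the helper name `idx` is hypothetical. Unlike merge(), this keeps the row order of df1, and storing the looked-up totals in their own column also gives the word_total column the question asks for.

```r
# Illustrative data with numeric columns (assumption: counts are not stored as strings)
df1 <- data.frame(date = c("2012-05-06", "2013-07-09", "2007-01-03"),
                  word_count = c(17, 2, 390))
df2 <- data.frame(date = c("2012-05-06", "2013-07-09", "2007-01-03",
                           "2004-11-03", "1994-12-03"),
                  word_total = c(17000, 20, 39037, 39558, 58607))

# Row position in df2 of each df1 date (NA if a date is missing from df2)
idx <- match(df1$date, df2$date)

# Pull the matching totals into df1, then compute the ratio
df1$word_total <- df2$word_total[idx]
df1$word_percentage <- df1$word_count / df1$word_total
print(df1)
```

If a date in df1 had no counterpart in df2, its word_total and word_percentage would come out as NA rather than the row being dropped.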

data

df1$word_count <- as.integer(as.character(df1$word_count))
df2$word_total <- as.integer(as.character(df2$word_total))
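The as.character() step matters when the columns are factors, which was the default for string columns in data.frame() before R 4.0.0. A small sketch of the difference, using a hypothetical factor `f` built from the question's word_count values:

```r
# A factor, as data.frame() creates from strings when stringsAsFactors = TRUE
f <- factor(c("17", "2", "390"))

# Converting the factor directly returns its internal level codes, not the values
codes <- as.integer(f)

# Going through character first recovers the actual numbers
values <- as.integer(as.character(f))

print(codes)
print(values)
```

If the columns are already character vectors, the extra as.character() call is harmless.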

Upvotes: 2
