Reputation: 49
I would like to add a new column (e.g. country_count) holding the number of times each string occurs in total in my data frame.
country <- c("DE", "FR", "FR", "FR", "NL", "DE")
data_frame <- data.frame(country)
This would be the resulting data frame.
country <- c("DE", "FR", "FR", "FR", "NL", "DE")
country_count <- c(2, 3, 3, 3, 1, 2)
data_frame <- data.frame(country, country_count)
I am aware that I can simply run table(data_frame$country) to get the counts, but I would like to have them in an additional column, because ultimately I want to assign a different value to the strings (in my case, countries) whose count falls below a certain threshold.
Upvotes: 2
Views: 422
Reputation: 887951
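We can use data.table, where .N is the number of rows in each group:
library(data.table)
setDT(data_frame)[, country_count := .N, by = country]
Or using base R, where ave() returns the group length repeated for every row of that group:
data_frame$country_count <- with(data_frame, ave(seq_along(country), country, FUN = length))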
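To make the base R variant easier to follow, here is a minimal self-contained sketch using the sample data from the question. It only spells out what ave() is doing; stringsAsFactors = FALSE is an added assumption so the column stays character on older R versions.

```r
country <- c("DE", "FR", "FR", "FR", "NL", "DE")
data_frame <- data.frame(country, stringsAsFactors = FALSE)

# ave() splits its first argument by the grouping vector, applies FUN to
# each piece, and writes the result back to the positions of that group,
# so the output has one value per original row.
data_frame$country_count <- ave(seq_along(data_frame$country),
                                data_frame$country,
                                FUN = length)

print(data_frame)
```

Because the result has the same length and order as the input, it can be assigned directly as a new column without any merge step.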
Upvotes: 2
Reputation: 737
You can subset the table() result with your vector of country codes, then cast it to a data frame.
country = c("DE", "FR", "FR", "FR", "NL","DE")
as.data.frame(table(country)[country])
# Result
# country Freq
#1 DE 2
#2 FR 3
#3 FR 3
#4 FR 3
#5 NL 1
#6 DE 2
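If the goal is a new column on the existing data frame rather than a fresh one, the same lookup can probably be assigned directly. This is a sketch under the assumption that the country column is character (hence stringsAsFactors = FALSE); the name counts is only illustrative.

```r
country <- c("DE", "FR", "FR", "FR", "NL", "DE")
data_frame <- data.frame(country, stringsAsFactors = FALSE)

# table() yields one named count per country; indexing it by name looks up
# the count for every row, in the original row order.
counts <- table(data_frame$country)

# as.vector() drops the table class and names, leaving a plain integer column.
data_frame$country_count <- as.vector(counts[data_frame$country])

print(data_frame)
```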
Upvotes: 3
Reputation: 2636
A fairly straightforward option, which returns one row per country rather than one row per original observation:
dplyr::count(data_frame, country)
Returns:
country n
1 DE 2
2 FR 3
3 NL 1
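Since count() collapses the data to one row per country, getting the counts back onto the original rows needs one more step. One possible way, sketched here with a join, assumes dplyr is installed; the names counts and result are illustrative.

```r
library(dplyr)

country <- c("DE", "FR", "FR", "FR", "NL", "DE")
data_frame <- data.frame(country, stringsAsFactors = FALSE)

# One row per country, with the count stored under a custom column name.
counts <- count(data_frame, country, name = "country_count")

# left_join() keeps every row of data_frame in its original order and
# attaches the matching count to each one.
result <- left_join(data_frame, counts, by = "country")

print(result)
```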
Upvotes: 2
Reputation: 16998
You could use dplyr:
library(dplyr)
data_frame %>%
add_count(country, name="country_count")
returns
country country_count
1 DE 2
2 FR 3
3 FR 3
4 FR 3
5 NL 1
6 DE 2
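The question mentions eventually assigning a different value to countries below a threshold. A hedged sketch of that follow-up step is below; the cutoff of 2, the label "Other", and the column name country_grouped are all assumptions chosen for illustration, not values from the question.

```r
library(dplyr)

country <- c("DE", "FR", "FR", "FR", "NL", "DE")
data_frame <- data.frame(country, stringsAsFactors = FALSE)

threshold <- 2  # assumed cutoff; adjust to the real requirement

result <- data_frame %>%
  add_count(country, name = "country_count") %>%
  # Relabel countries that occur fewer than `threshold` times.
  mutate(country_grouped = if_else(country_count < threshold, "Other", country))

print(result)
```

With this sample data only NL falls below the assumed cutoff, so it is the only row relabelled.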
Upvotes: 5