Reputation: 5155
I have a data frame 'df' like
user_id value
1 t34 A
2 t34 A
3 t87 A
4 t55 B
5 t55 B
6 t76 B
7 t99 A
if i do
table(df$value)
I get
A B
4 3
but I only want to count once per user, something like
table(df$value WHERE user_id IS unique)
and get
A B
3 2
How can this be done?
Upvotes: 3
Views: 2501
Reputation: 39154
The solution submitted are all perfect. Here I would like to provide an approach using the duplicated
function in base R. duplicated
function can determine which rows are duplicates of other rows in a data frame. It can take a data frame as the input argument and return a logical vector.
# Create the example data frame
df <- data.frame(user_id = c("t34", "t34", "t87", "t55", "t55", "t76", "t99"),
value = c("A", "A", "A", "B", "B", "B", "A"),
stringsAsFactors = FALSE)
# Use the duplicate function
uniq_df <- df[!duplicated(df), ]
# Count number of value in uniq_df
table(uniq_df$value)
A B
3 2
Upvotes: 1
Reputation: 886938
Here is an option using data.table
library(data.table)
unique(setDT(df))[, .N, value]
# value N
#1: A 3
#2: B 2
Or with tidyverse
library(dplyr)
distinct(df) %>%
count(value)
# A tibble: 2 × 2
# value n
# <chr> <int>
#1 A 3
#2 B 2
Upvotes: 3
Reputation: 99321
You can take the unique rows first, then tabulate.
table(unique(df)$value)
#
# A B
# 3 2
If you have other columns, then you can take a column subset first, then tabulate.
with(unique(df[c("user_id", "value")]), table(value))
# value
# A B
# 3 2
Upvotes: 6