ranaz

Reputation: 97

Count Observations by Group and ID in R

I am trying to write code that counts observations based on a condition, and I do not know whether it is possible. What I want to achieve is to count only one observation per group instead of adding them together.

This is the dataframe:

df <- structure(list(ID = c("P40", "P40", "P40", "P40", "P42", "P42",
                     "P43", "P43", "P43"), Year = c("2013", "2013", "2014", "2015", "2013", "2014", "2014", "2014", "2014"),
              Meeting = c("Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes")),
         class = "data.frame", row.names = c(NA, -9L))



ID  Year Meeting
P40 2013     Yes
P40 2013     Yes
P40 2014     Yes
P40 2015     Yes
P42 2013     Yes
P42 2014     Yes
P43 2014     Yes
P43 2014     Yes
P43 2014     Yes

The result I want to achieve:

ID Year      Count
P40 2013     1
P40 2014     1
P40 2015     1
P42 2013     1
P42 2014     1
P43 2014     1

This is the code I have so far; it just counts all the observations in each group:

df %>% group_by(ID, Year) %>% summarise(Count = n())

Upvotes: 0

Views: 398

Answers (3)

akrun

Reputation: 886938

We can apply distinct to the dataset to drop duplicate rows and then use count:

library(dplyr)
df %>% 
   distinct %>% 
   count(ID, Year)
# A tibble: 6 x 3
#  ID    Year      n
#  <chr> <chr> <int>
#1 P40   2013      1
#2 P40   2014      1
#3 P40   2015      1
#4 P42   2013      1
#5 P42   2014      1
#6 P43   2014      1
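
One thing to keep in mind: distinct with no arguments compares every column, including Meeting. In the question's data Meeting is always "Yes", so it makes no difference, but if Meeting varied within an ID/Year pair the count would be greater than 1. A small sketch of that situation follows; `df2` is a made-up example, not the question's data.

```r
library(dplyr)

# Hypothetical data: P40/2013 appears with both "Yes" and "No"
df2 <- data.frame(
  ID      = c("P40", "P40", "P42"),
  Year    = c("2013", "2013", "2013"),
  Meeting = c("Yes", "No", "Yes")
)

# distinct() over all columns keeps both P40/2013 rows,
# so that group is counted twice
all_cols <- df2 %>%
  distinct() %>%
  count(ID, Year)

# restricting distinct() to the grouping columns leaves one row per group
key_cols <- df2 %>%
  distinct(ID, Year) %>%
  count(ID, Year)
```

If other columns can differ within a group, passing the grouping columns to distinct is the safer choice.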

Or using data.table

library(data.table)
unique(setDT(df[1:2]))[, .N, .(ID, Year)]

Or using base R

subset(as.data.frame(table(unique(df[1:2]))), Freq != 0)

Or an option with cbind

cbind(unique(df[1:2]), n = 1)

Upvotes: 3

Ronak Shah

Reputation: 388807

Since you just want one observation in each group, wouldn't this be enough?

transform(unique(df), count = 1)

#   ID Year Meeting count
#1 P40 2013     Yes     1
#3 P40 2014     Yes     1
#4 P40 2015     Yes     1
#5 P42 2013     Yes     1
#6 P42 2014     Yes     1
#7 P43 2014     Yes     1

Or, if you want to check only selected columns:

transform(unique(df[1:2]), count = 1)
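
To match the layout asked for in the question (columns ID, Year, Count, rows numbered 1 to 6), the same idea can be written out step by step in base R. This is only a sketch; the name `res` is arbitrary and the data frame is rebuilt here so the snippet runs on its own.

```r
# Same data as in the question
df <- data.frame(
  ID      = c("P40", "P40", "P40", "P40", "P42", "P42", "P43", "P43", "P43"),
  Year    = c("2013", "2013", "2014", "2015", "2013", "2014", "2014", "2014", "2014"),
  Meeting = rep("Yes", 9)
)

# Keep one row per ID/Year combination
res <- unique(df[c("ID", "Year")])

# Every remaining combination counts exactly once
res$Count <- 1L

# unique() keeps the original row names (1, 3, 4, ...); reset them
rownames(res) <- NULL

res
```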

Upvotes: 0

arg0naut91

Reputation: 14764

Are you after:

count(df %>% distinct(ID, Year), ID, Year, name = 'Count')

Output:

# A tibble: 6 x 3
  ID    Year  Count
  <chr> <chr> <int>
1 P40   2013      1
2 P40   2014      1
3 P40   2015      1
4 P42   2013      1
5 P42   2014      1
6 P43   2014      1

Upvotes: 4
