iprof0214

Reputation: 701

Obtain count of unique combinations of columns in an R data frame without eliminating the duplicate rows from the data

I have the following data:

     A   B C     D     E
1  130 288 6    80 57299
2  288 130 6 57299    80
3  288 130 6 57299    80
4  288 130 6 57299    80
5  288 130 6 57299    80
6  288 130 6 57299    80
7  288 130 6 57299    80
8  288 130 6 57299    80
9  288 130 6 57299    80
10 130 288 6    80 57299

I want to count how often each unique combination of these columns occurs and append the result as a Freq column to the existing data frame, without eliminating the duplicate rows. This is what I want:

     A   B C     D     E Freq
1  130 288 6    80 57299    2
2  288 130 6 57299    80    8
3  288 130 6 57299    80    8
4  288 130 6 57299    80    8
5  288 130 6 57299    80    8
6  288 130 6 57299    80    8
7  288 130 6 57299    80    8
8  288 130 6 57299    80    8
9  288 130 6 57299    80    8
10 130 288 6    80 57299    2

Trying df_0 <- count(df, A, B, C, D, E) %>% ungroup() eliminates the duplicates and gives me only:

    A   B C     D     E Freq
1 130 288 6    80 57299    2
2 288 130 6 57299    80    8

How do I go about this?

Upvotes: 1

Views: 1715

Answers (2)

BENY

Reputation: 323376

R, dplyr mutate (group_by_() is deprecated; across() needs dplyr 1.0.0 or later):

dat %>% group_by(across(everything())) %>% dplyr::mutate(Freq = n())

Python, pandas transform:

df['Freq'] = df.groupby(list(df))['A'].transform('count')
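To try the pandas one-liner end to end, here is a minimal sketch that rebuilds the question's ten rows by hand. The names row_a and row_b are just for illustration, and the index runs from 0 rather than 1. It uses 'size' instead of 'count', which is an assumption about what is wanted: 'size' counts rows, while 'count' skips missing values in the selected column.

```python
import pandas as pd

# The ten rows from the question, rebuilt by hand (columns A..E).
row_a = (130, 288, 6, 80, 57299)   # rows 1 and 10 in the question
row_b = (288, 130, 6, 57299, 80)   # rows 2 through 9 in the question
df = pd.DataFrame([row_a] + [row_b] * 8 + [row_a], columns=list("ABCDE"))

# Group on every column, then broadcast each group's size back onto its rows.
# The column list is taken before Freq exists, so Freq is not a grouping key.
df["Freq"] = df.groupby(list(df.columns))["A"].transform("size")

print(df)
```

All ten rows are kept, and each one carries the size of the group it belongs to.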

Upvotes: 3

Gregor Thomas

Reputation: 146164

It looks like you want add_count:

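df_0 <- add_count(df, A, B, C, D, E)

The new column is called n by default; pass name = "Freq" to get the column name used in the question.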

From the help page for ?count:

add_tally() adds a column "n" to a table based on the number of items within each existing group, while add_count() is a shortcut that does the grouping as well. These functions are to tally() and count() as mutate() is to summarise(): they add an additional column rather than collapsing each group.
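The collapse-versus-append distinction in that help text does not depend on dplyr. As a rough, library-free illustration in Python (not how dplyr implements it), with the question's rows hard-coded as tuples and all variable names made up for the sketch:

```python
from collections import Counter

# The ten rows from the question as plain tuples (A, B, C, D, E).
row_a = (130, 288, 6, 80, 57299)
row_b = (288, 130, 6, 57299, 80)
rows = [row_a] + [row_b] * 8 + [row_a]

# Count how often each distinct combination occurs.
counts = Counter(rows)

# count()-style result: one output row per distinct combination.
collapsed = [combo + (n,) for combo, n in counts.items()]

# add_count()-style result: every input row kept, with its count appended.
appended = [row + (counts[row],) for row in rows]

print(len(collapsed), len(appended))
```

The first result has one entry per distinct combination, which is what the question got from count(); the second has one entry per original row, which is what the question asks for.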

Upvotes: 4
