Tarak
Tarak

Reputation: 1075

R using dcast,melt and concatenation to reshape data frame

I have a data frame as follows:

mydf <- data.frame(Term = c('dog','cat','lion','tiger','pigeon','vulture'), Category = c('pet','pet','wild','wild','pet','wild'),
    Count = c(12,14,19,7,11,10), Rate = c(0.4,0.7,0.3,0.6,0.1,0.8), Brand = c('GS','GS','MN','MN','PG','MN')    ) 

Resulting in data frame:

     Term Category Count Rate Brand
1     dog      pet    12  0.4    GS
2     cat      pet    14  0.7    GS
3    lion     wild    19  0.3    MN
4   tiger     wild     7  0.6    MN
5  pigeon      pet    11  0.1    PG
6 vulture     wild    10  0.8    MN

I wish to transform this data frame into following resultDF

Category         pet              wild              
Term             dog,cat,pigeon   lion,tiger,vulture
Countlessthan13  dog,pigeon       tiger,vulture     
Ratemorethan0.5  cat              tiger,vulture     
Brand            GS,PG            MN                

The row headings indicate the operations like Countlessthan13 means that Count < 13 is applied to the terms and then grouped. Also note that brand name is unique and not reapeated.

I have tried dcast and melt...but not getting desired results.

Upvotes: 3

Views: 339

Answers (1)

akrun
akrun

Reputation: 887158

We can do this using data.table. Convert the 'data.frame' to 'data.table' (setDT(mydf)), grouped by 'Category', create some summarise columns by pasteing the unique values of 'Term' where 'Count' is less than 13 or 'Rate' greater than 0.5, along with pasteing the unique elements of 'Brand'.

library(data.table)
dt <- setDT(mydf)[, .(Term = paste(unique(Term), collapse=","),
                      Countlesstthan13 =  paste(unique(Term[Count < 13]), collapse=","),

                      Ratemorethan0.5 = paste(unique(Term[Rate > 0.5]), collapse=","), 
                      Brand = paste(unique(Brand), collapse=",")), by = Category]

From the summarised dataset ('dt'), we melt to 'long' format by specifying the 'id.var' as 'Category', then dcast it back to 'wide' format.

dcast(melt(dt, id.var = "Category", variable.name = "category"),
                            category ~Category, value.var = "value")
#           category            pet               wild
#1:             Term dog,cat,pigeon lion,tiger,vulture
#2: Countlesstthan13     dog,pigeon      tiger,vulture
#3:  Ratemorethan0.5            cat      tiger,vulture
#4:            Brand          GS,PG                 MN

Upvotes: 3

Related Questions