Reputation: 229
user_id date datetime page
217568 6/12/2015 49:23.9 Vodafone | How to get in touch with Vodafone
135437 6/10/2015 43:35.7 My Vodafone – Manage your Vodafone Pay Monthly Account Online – Vodafone
196094 6/13/2015 33:39.4 Check the status of Vodafone’s mobile network in real-time
74197 6/6/2015 52:46.1 undefined
153501 6/5/2015 02:55.5 Device Details
71459 6/4/2015 54:05.5
90906 6/9/2015 35:41.7 Vodafone | Mobile Phones
30886 6/9/2015 15:59.8 Vodafone | Mobile Phones
217568 6/9/2015 10:52.9 Vodafone | Mobile Phones
137324 6/16/2015 40:51.7 Vodafone | How to get in touch with Vodafone
These are the top 10 rows of my sample data. I need to aggregate the "page" column by both date and user_id (user_id is a unique identifier). In other words, for each user_id I need all the pages that user visited on a particular date in one row, separated by "_".
I tried this:
tabel <- dt[, .SD[, paste(page, sep=",", collapse="_")], by=date]
where dt is my data frame. This gives me the pages visited on each date, but I want them at the user_id level. How can I achieve this in R?
The resulting table should look something like this (example):
row.names date pages
217568 2015-06-12 page1,page2
217568 2015-06-13 page3,page5
where page1, page2, page3 and page5 are values from the "page" column.
Upvotes: 2
Views: 187
Reputation: 887108
Using data.table
library(data.table)
setDT(df1)[, list(pages=paste(page, collapse="_")),
           by=list(user_id, date=as.Date(date, '%m/%d/%Y'))]
Or using dplyr
library(dplyr)
df1 %>%
  group_by(user_id, date=as.Date(date, '%m/%d/%Y')) %>%
  summarise(pages=paste(page, collapse='_'))
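As a rough, self-contained illustration of the data.table approach, here is a sketch that runs on a small made-up data frame. The page values are shortened placeholders rather than the real page titles, and it assumes the date column holds m/d/Y strings as in the question.

```r
library(data.table)

# Toy data modelled on the question's sample (page values are placeholders).
df1 <- data.frame(
  user_id = c(217568, 217568, 217568, 90906),
  date    = c("6/12/2015", "6/12/2015", "6/9/2015", "6/9/2015"),
  page    = c("contact", "status", "phones", "phones"),
  stringsAsFactors = FALSE
)

# Group by user and parsed date, then join each group's pages with "_".
res <- setDT(df1)[, .(pages = paste(page, collapse = "_")),
                  by = .(user_id, date = as.Date(date, "%m/%d/%Y"))]
print(res)
```

Each (user_id, date) pair ends up on one row, so a user who visited two pages on the same day gets a single string such as "contact_status".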
Upvotes: 1
Reputation: 2535
You could use the aggregate function from the stats package. Try something like this:
aggregate(dt$page, list(dt$user_id, dt$date), FUN=paste, collapse="_")
Be careful with the dates, though: if you store them as POSIXlt, the coercion to factor could be problematic. If the dates are stored as POSIXct or as strings, this should be no problem.
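A minimal base R sketch of the same idea, using the formula interface of aggregate so the output columns keep their names. The data frame below is made up for illustration (placeholder page values), and converting the date strings to Date first is one possible way to sidestep the POSIXlt concern.

```r
# Toy data modelled on the question's sample (page values are placeholders).
dt <- data.frame(
  user_id = c(217568, 217568, 217568, 90906),
  date    = c("6/12/2015", "6/12/2015", "6/9/2015", "6/9/2015"),
  page    = c("contact", "status", "phones", "phones"),
  stringsAsFactors = FALSE
)

# Parse the m/d/Y strings into Date objects before grouping.
dt$date <- as.Date(dt$date, "%m/%d/%Y")

# One row per (user_id, date); extra arguments such as collapse are passed to paste.
res <- aggregate(page ~ user_id + date, data = dt, FUN = paste, collapse = "_")
print(res)
```

The result has the columns user_id, date and page, with page holding the "_"-joined string for each group.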
Upvotes: 2