Reputation: 532
I have a data frame like the following:
     user_id                         track_id          created_at
1   81496937 cd52b3e5b51da29e5893dba82a418a4b 2014-01-01 05:54:21
2 2205686924 da3110a77b724072b08f231c9d6f7534 2014-01-01 05:54:22
3  132588395 ba84d88c10fb0e42d4754a27ead10546 2014-01-01 05:54:22
4   97675221 33f95122281f76e7134f9cbea3be980f 2014-01-02 05:54:24
5   17945688 b5c42e81e15cd54b9b0ee34711dedf05 2014-01-02 05:54:24
6  452285741 8bd5206b84c968eda0af8bc86d6ab1d1 2014-01-02 05:54:25
I want to create a line chart in R showing the number of user_id values per day. That is, I want to count how many user_id values are present on each day and plot those counts. How do I do it?
Upvotes: 1
Views: 231
Reputation: 426
First, you need to know how to work with dates and times in R. I strongly recommend the lubridate package.
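library(lubridate)
t <- ymd_hms("20170621111800")       # parse the string as a date-time (UTC by default)
dt <- floor_date(t, unit='day')      # round down to the start of that day
dt                                   # [1] "2017-06-21 UTC"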
Then you need to learn how to manipulate a data frame in R. I usually use the dplyr package because it is simple to learn and the code is easy to read.
library(dplyr)
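new_df <- df %>%
  # parse created_at, then truncate each timestamp to its day
  mutate(dt = floor_date(ymd_hms(created_at), unit = 'day')) %>%
  group_by(dt) %>%
  # count the distinct users seen on each day
  summarise(user_cnt = n_distinct(user_id))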
new_df
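To check the pipeline end to end, here is a minimal sketch that rebuilds the six sample rows from the question by hand and runs the same steps. The column types (numeric user_id, character created_at) are an assumption about the real data, and the name `daily` is only used for illustration.

```r
library(dplyr)
library(lubridate)

# The six sample rows from the question, typed in by hand.
# Assumption: created_at is stored as character, user_id as numeric.
df <- data.frame(
  user_id = c(81496937, 2205686924, 132588395,
              97675221, 17945688, 452285741),
  track_id = c("cd52b3e5b51da29e5893dba82a418a4b",
               "da3110a77b724072b08f231c9d6f7534",
               "ba84d88c10fb0e42d4754a27ead10546",
               "33f95122281f76e7134f9cbea3be980f",
               "b5c42e81e15cd54b9b0ee34711dedf05",
               "8bd5206b84c968eda0af8bc86d6ab1d1"),
  created_at = c("2014-01-01 05:54:21", "2014-01-01 05:54:22",
                 "2014-01-01 05:54:22", "2014-01-02 05:54:24",
                 "2014-01-02 05:54:24", "2014-01-02 05:54:25"),
  stringsAsFactors = FALSE
)

# Truncate each timestamp to its day, then count distinct users per day.
daily <- df %>%
  mutate(dt = floor_date(ymd_hms(created_at), unit = "day")) %>%
  group_by(dt) %>%
  summarise(user_cnt = n_distinct(user_id))

daily
```

With these six rows the result has one row per day, 2014-01-01 and 2014-01-02, each with three distinct users.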
Finally, you need to learn how to plot a data frame in R. I personally prefer ggplot2 for this task.
library(ggplot2)
p <- ggplot(new_df) + geom_line(aes(x=dt, y=user_cnt))
p
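When there are only a few days in the data, a bare line can be hard to read. The following is a sketch of one possible variation that adds a point per day and axis labels. The small `daily_counts` data frame is built by hand from the question's sample (three users on each of the two days) so the block runs on its own; the label text is just a suggestion.

```r
library(ggplot2)

# Hand-built daily counts matching the question's six sample rows.
daily_counts <- data.frame(
  dt = as.POSIXct(c("2014-01-01", "2014-01-02"), tz = "UTC"),
  user_cnt = c(3L, 3L)
)

# Line plus points, so each day's value stays visible even with few days.
p_labeled <- ggplot(daily_counts, aes(x = dt, y = user_cnt)) +
  geom_line() +
  geom_point() +
  labs(x = "Day", y = "Distinct users")

p_labeled
```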
If you run the code in RStudio, the plot appears in the bottom-right panel. You can also use the plotly package to turn the static image into an interactive chart:
library(plotly)
ggplotly(p)
Upvotes: 3