Reputation: 1
I have data collected over multiple days, with timestamps that contain information for when food was eaten. Example dataframes:
head(Day3)
==================================================================
Day3.time Day3.Pellet_Count
1 18:05:30 1
2 18:06:03 2
3 18:06:34 3
4 18:06:40 4
5 18:06:52 5
6 18:07:03 6
head(Day4)
==================================================================
Day4.time Day4.Pellet_Count
1 18:00:21 1
2 18:01:34 2
3 18:02:22 3
4 18:03:35 4
5 18:03:54 5
6 18:05:06 6
Given the variability, the timestamps don't line up between days, so rows can't be matched on time. I've done a "full join" of the data from two of the days with merge, as follows:
pellets <- merge(Day3, Day4, by = 'time', all=TRUE)
This results in the following:
head(pellets)
==================================================================
pellets.time pellets.Pellet_Count.x pellets.Pellet_Count.y
1 02:40:18 39 NA
2 18:00:21 NA 1
3 18:01:34 NA 2
4 18:02:22 NA 3
5 18:03:35 NA 4
6 18:03:54 NA 5
I would like to plot the Pellet_Count in one line graph from each of the days, but this is making it very difficult to group the data. My approach thus far has been:
pelletday <- ggplot() + geom_line(data=pellets, aes(x=time, y=Pellet_Count.x)) + geom_line(data=pellets, aes(x=time, y=Pellet_Count.y))
But, I get this error:
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
I also would like to be able to merge all days (I oftentimes have up to 9 days) and plot it on the same graph.
I believe my goal is to ultimately get the following dataframe output:
==================================================================
pellets.time Pellet_Count Day
1 02:40:18 39 3
2 18:00:21 1 4
3 18:01:34 2 4
4 18:02:22 3 4
5 18:03:35 4 4
6 18:03:54 5 4
and to use this to graph:
ggplot(pellets, aes(time, Pellet_Count, group=Day))
Any ideas?
Upvotes: 0
Views: 52
Reputation: 344
There are a couple of issues here.
First, have you tried using rbind() or bind_rows() rather than merge()?
That seems like a more natural fit for what you're trying to do. With a merge or some other join, you are effectively bringing new information into your data table, most often as new columns.
But here you are really trying to append each day's data to the others; you're not actually adding a new column.
So this is my attempt at replicating what you're describing above:
library(dplyr)  # provides tibble(), %>%, mutate(), rename(), bind_rows()
# Day 3: tag every row with its day and standardise the time column name
Day3 <- tibble(
  Day3.time = c('18:05:30', '18:06:03', '18:06:34',
                '18:06:40', '18:06:52', '18:07:03'),
  Day3.Pellet_Count = c(1, 2, 3, 4, 5, 6)) %>%
  mutate(day = '3') %>%
  rename(time = Day3.time)
# Day 4: same structure
Day4 <- tibble(
  Day4.time = c('18:00:21', '18:01:34', '18:02:22',
                '18:03:35', '18:03:54', '18:05:06'),
  Day4.Pellet_Count = c(1, 2, 3, 4, 5, 6)) %>%
  mutate(day = '4') %>%
  rename(time = Day4.time)
# The full join from the question, for comparison
pellets <- merge(Day3, Day4, by = 'time', all=TRUE)
time Day3.Pellet_Count day.x Day4.Pellet_Count day.y
1 18:00:21 NA <NA> 1 4
2 18:01:34 NA <NA> 2 4
3 18:02:22 NA <NA> 3 4
4 18:03:35 NA <NA> 4 4
5 18:03:54 NA <NA> 5 4
6 18:05:06 NA <NA> 6 4
7 18:05:30 1 3 NA <NA>
8 18:06:03 2 3 NA <NA>
9 18:06:34 3 3 NA <NA>
10 18:06:40 4 3 NA <NA>
11 18:06:52 5 3 NA <NA>
12 18:07:03 6 3 NA <NA>
And here is how you would work with bind_rows() (rbind() works the same way). This should give you more useful data to work with:
# Give both days the same count column name, then stack them row-wise
pellets <- bind_rows(
  Day3 %>% rename(Pellet_Count = Day3.Pellet_Count),
  Day4 %>% rename(Pellet_Count = Day4.Pellet_Count))
pellets
# A tibble: 12 x 3
time Pellet_Count day
<chr> <dbl> <chr>
1 18:05:30 1 3
2 18:06:03 2 3
3 18:06:34 3 3
4 18:06:40 4 3
5 18:06:52 5 3
6 18:07:03 6 3
7 18:00:21 1 4
8 18:01:34 2 4
9 18:02:22 3 4
10 18:03:35 4 4
11 18:03:54 5 4
12 18:05:06 6 4
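For the case with up to 9 days, the same stacking idea can be sketched with a named list, letting bind_rows() create the day column from the list names through its .id argument. This is only a minimal sketch: the small data frames below are stand-ins for your real data, it assumes every day already has the same column names (time and Pellet_Count), and the names days and pellets_long are made up for the illustration.

```r
library(dplyr)

# Stand-in data; assumes each day's columns are already named
# time and Pellet_Count (rename them first if they are not)
Day3 <- data.frame(time = c("18:05:30", "18:06:03", "18:06:34"),
                   Pellet_Count = 1:3)
Day4 <- data.frame(time = c("18:00:21", "18:01:34", "18:02:22"),
                   Pellet_Count = 1:3)

# One list entry per day; the list names become the values of `day`
days <- list("3" = Day3, "4" = Day4)  # extend with further days as needed

# .id = "day" adds a column holding the name of the list entry each row came from
pellets_long <- bind_rows(days, .id = "day")
```

Adding another day is then one more entry in the list rather than another merge.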
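Second, you probably need to find a way to handle the dates. A big problem with your ggplot code is that you are passing character strings where you want date/time data: ggplot treats a character x as discrete, so every timestamp ends up in its own group, which is what the geom_path message is telling you. To get a useful datetime format, I think you'll need to have the date as well.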
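As a rough sketch of the time handling, one option is to paste a placeholder date onto each time string and parse the result with as.POSIXct(), so that all days share a single continuous time-of-day axis. The placeholder date 1970-01-01, the column name clock, and the small data frame are assumptions for illustration only. If a recording runs past midnight (a time like 02:40:18 after an 18:00 start), a single placeholder date would sort it first, so in that case you would want the real dates instead.

```r
library(ggplot2)

# Stand-in for the stacked data (one row per pellet, one column for the day)
pellets <- data.frame(
  time = c("18:05:30", "18:06:03", "18:06:34",
           "18:00:21", "18:01:34", "18:02:22"),
  Pellet_Count = c(1, 2, 3, 1, 2, 3),
  day = rep(c("3", "4"), each = 3)
)

# Attach a placeholder date (assumption) so the strings parse as date-times
pellets$clock <- as.POSIXct(paste("1970-01-01", pellets$time),
                            format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# One line per day on a continuous time axis, labelled as time of day only
p <- ggplot(pellets, aes(x = clock, y = Pellet_Count,
                         colour = day, group = day)) +
  geom_line() +
  scale_x_datetime(date_labels = "%H:%M:%S")
```

Because clock is continuous, the group and colour aesthetics decide which points are joined, instead of each timestamp becoming its own group.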
Upvotes: 1
Reputation: 541
You first need to convert your data from 'wide' to 'long' format (see example here). After this, you should be able to use ggplot (looks like you tried to use base R plot logic here with lines but it doesn't work with ggplot).
For example (using the column names from your ggplot call, i.e. time, Pellet_Count.x and Pellet_Count.y):
pellets %>% gather("day", "count", -time) %>% na.omit()
All together it will be:
pellets %>%
  rename(Day3 = Pellet_Count.x, Day4 = Pellet_Count.y) %>%
  gather("day", "count", -time) %>%
  na.omit() %>%
  ggplot() + geom_point(aes(x = time, y = count, col = day))
(I added rename to match your preferred output.)
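In current tidyr, gather() is superseded by pivot_longer(), so the same wide-to-long step could also be sketched as below. The data frame is a small stand-in for the merged result, and it assumes the merged columns are named time, Pellet_Count.x and Pellet_Count.y; the name long is made up for the illustration.

```r
library(tidyr)

# Stand-in for the merged (wide) data: one count column per day, NA where
# the other day has no pellet at that timestamp
pellets <- data.frame(
  time = c("18:00:21", "18:01:34", "18:05:30"),
  Pellet_Count.x = c(NA, NA, 1),
  Pellet_Count.y = c(1, 2, NA)
)

# Stack the two count columns into one; values_drop_na removes the NA rows
# that the full join introduced, which replaces the separate na.omit() step
long <- pivot_longer(pellets,
                     cols = c(Pellet_Count.x, Pellet_Count.y),
                     names_to = "day",
                     values_to = "count",
                     values_drop_na = TRUE)
```

The day column then holds the original column names, which can be recoded to day labels before plotting.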
Upvotes: 0