Ginko-Mitten

Reputation: 390

Unable to calculate mean for each day in ggplot facet?

My dataset

I have a dataset speedtest.csv that looks something like this:


Date,Time,Download,Upload
2023/10/01,00:00:00,34957192.9969772,20046840.2637393
2023/10/01,00:20:00,35826556.3982541,36143231.5378943
2023/10/01,00:40:00,27695436.936076,4957720.87281617
...
...
2023/10/02,01:00:00,22575345.5295727,10335897.5917135
2023/10/02,01:20:00,15805169.0654657,6179704.32589804
2023/10/02,01:40:00,31638270.9069979,15951432.6154521
...
...
2023/10/03,05:00:00,31366450.4288069,4476811.81028971
2023/10/03,05:20:00,10709016.6772629,8848402.0645949
2023/10/03,05:40:00,32722858.0348491,2045099.79491319
...
...

My current code

Below is my current code, which processes the data and plots it using facets in ggplot. For this exercise, I am trying to generate the plot within a single code block using pipes.

library(tidyverse)

df<-read.csv("speedtest.csv", header = TRUE)

df%>%
  select(Date,Time, Upload, Download)%>%
  pivot_longer(cols = c("Upload", "Download"),
               names_to = "Parameter", values_to = "Value")%>%
  mutate(Date = as.POSIXct(Date, format = "%Y/%m/%d"),
         Time = as.POSIXct(Time, format = "%H:%M:%S"))%>%
  ggplot(aes(x=Time, y=(Value/8000000), colour=Parameter))+
  geom_line()+
  geom_point()+
  facet_wrap(~Date, ncol = 3, nrow = 2)+
  scale_colour_manual(values = c("blue","orange"))+
  scale_x_datetime(date_breaks = "4 hours",
                   date_labels = "%H:%M")+
  xlab("Time")+
  ylab("Speed (Mbyte/s)")+
  ylim(c(0,5))+
  theme_bw()+
  theme(strip.background = element_blank(),
        legend.position = "bottom",
        panel.grid = element_line(linetype = "dashed"))

ggsave("Speedtest_Figure.png")


Current output

That produces an output that looks something like this:

Current_output

Requested change

Is it possible to add the text in each facet like:

Mean Download: NNN
Mean Upload: MMM

where NNN and MMM are the mean download and upload speeds for the day shown in that facet, so the values differ from facet to facet. I want to do this within the single block of code, within the pipes if possible.

My attempt

I have gone for the non-optimal method of creating a separate object and then using geom_text to add the information to the plot.

I created a separate object Mean.dat

Mean.dat<-df%>%
  select(Date,Time, Upload, Download)%>%
  mutate(Date = as.POSIXct(Date, format = "%Y/%m/%d"),
         Time = as.POSIXct(Time, format = "%H:%M:%S"))%>%
  group_by(Date)%>%
  summarise(mean(Download)/8000000)

and then added this bit to the main code:

  geom_text(data=Mean.dat,
            aes(x=14, y=5, label=`mean(Download)/8e+06`), 
            colour="black", inherit.aes=FALSE, hjust = -1)

But I get the error message:

Error: Invalid input: time_trans works with objects of class POSIXct only

Other resources

I have referred to the following sources:

Add text to a faceted plot in ggplot2 with dates on X axis

Creating a facet_wrap plot with ggplot2 with different annotations in each plot

Annotate ggplot2 facets with number of observations per facet

How to add annotation on each facet

add unique text to each facet ggplot

I have tried adapting the code from these answers, but the date-time problem persists.

Addendum

To make the code more reproducible, I was asked to provide a snippet of my data using dput().

Here is the output for dput(df[c(1:5, 10:15, 20:25, 30:35, 40:41),])

structure(list(Date = c("2023/11/04", "2023/11/04", "2023/11/04", 
"2023/11/04", "2023/11/04", "2023/11/04", "2023/11/04", "2023/11/04", 
"2023/11/04", "2023/11/04", "2023/11/04", "2023/11/04", "2023/11/04", 
"2023/11/04", "2023/11/04", "2023/11/05", "2023/11/05", "2023/11/05", 
"2023/11/05", "2023/11/05", "2023/11/05", "2023/11/05", "2023/11/05", 
"2023/11/05", "2023/11/05"), Time = c("13:05:32", "13:20:03", 
"13:40:05", "14:20:03", "14:40:03", "17:20:03", "17:40:02", "18:20:03", 
"18:40:03", "19:20:04", "19:40:03", "22:20:03", "22:40:03", "23:20:03", 
"23:40:03", "00:20:03", "00:40:03", "03:20:03", "03:40:03", "04:20:03", 
"04:40:03", "05:20:03", "05:40:03", "08:20:03", "08:40:03"), 
    Download = c(34957192.9969772, 35826556.3982541, 27695436.936076, 
    32785866.6580349, 34373754.7935802, 29644745.5493678, 31936459.8397868, 
    32782764.8827361, 31366450.4288069, 10709016.6772629, 32722858.0348491, 
    34821984.5787153, 28379214.5120736, 26820887.0474698, 31839780.4165726, 
    32066886.2373525, 33440458.6440393, 28113353.9035434, 26284377.6573347, 
    29154520.6902359, 34918254.2123446, 21598680.1274404, 20700752.1868799, 
    34638409.416459, 34572097.5993048), Upload = c(20046840.2637393, 
    36143231.5378943, 4957720.87281616, 15688120.7580889, 35845959.9473685, 
    18072485.390123, 9069468.67273845, 6973860.4270036, 4476811.81028971, 
    8848402.0645949, 2045099.79491319, 23198345.7376053, 31702122.0677866, 
    11711052.7340582, 12556196.1275965, 28941390.4693129, 21543697.8944099, 
    12966120.4632239, 28660937.1396553, 28476185.8084195, 16678584.8862002, 
    29032008.7959507, 17276854.3732636, 36479144.7960276, 37478780.5131303
    )), row.names = c(1L, 2L, 3L, 4L, 5L, 10L, 11L, 12L, 13L, 
14L, 15L, 20L, 21L, 22L, 23L, 24L, 25L, 30L, 31L, 32L, 33L, 34L, 
35L, 40L, 41L), class = "data.frame")

Upvotes: 2

Views: 61

Answers (1)

stefan

Reputation: 125038

As I already mentioned in my comment, the issue with your second approach is that the x axis is a datetime scale, so the x value for your labels has to be a datetime too. A plain number such as x = 14 triggers the time_trans error; use x = as.POSIXct("14:00", format = "%H:%M") instead.
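To illustrate that fix, here is a minimal sketch of the two-object approach with a datetime x position. The four-row data frame is made up purely for illustration, and the column names MeanDownload and x are my own choices rather than anything from your code. Naming the summary column also avoids having to refer to it with backticks.

```r
library(dplyr)
library(ggplot2)

# Made-up sample in the same shape as speedtest.csv (values are illustrative only)
df <- data.frame(
  Date = c("2023/11/04", "2023/11/04", "2023/11/05", "2023/11/05"),
  Time = c("13:20:03", "17:40:02", "00:20:03", "08:40:03"),
  Download = c(32e6, 24e6, 16e6, 8e6),
  Upload = c(16e6, 8e6, 24e6, 8e6)
)

dat <- df %>%
  mutate(
    Date = as.POSIXct(Date, format = "%Y/%m/%d"),
    Time = as.POSIXct(Time, format = "%H:%M:%S")
  )

# One row per day: a named mean column plus a POSIXct x position for the label
Mean.dat <- dat %>%
  group_by(Date) %>%
  summarise(MeanDownload = mean(Download) / 8000000) %>%
  mutate(x = as.POSIXct("14:00", format = "%H:%M"))

p <- ggplot(dat, aes(x = Time, y = Download / 8000000)) +
  geom_line() +
  geom_point() +
  geom_text(
    data = Mean.dat,
    # x is a datetime here, so it is compatible with scale_x_datetime()
    aes(x = x, y = 5, label = paste("Mean Download:", round(MeanDownload, 2))),
    colour = "black", inherit.aes = FALSE
  ) +
  facet_wrap(~Date) +
  scale_x_datetime(date_breaks = "4 hours", date_labels = "%H:%M") +
  ylim(c(0, 5))
```

Because Mean.dat carries the same Date column that facet_wrap() uses, each label lands in the facet for its own day.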

However, since you mentioned that your preferred way would be to do all the computations in one pipeline instead of using a second dataset, below is one approach that uses stat_summary to add the labels.

library(tidyverse)

df %>%
  select(Date, Time, Upload, Download) %>%
  pivot_longer(
    cols = c("Upload", "Download"),
    names_to = "Parameter", values_to = "Value"
  ) %>%
  mutate(
    Date = as.POSIXct(Date, format = "%Y/%m/%d"),
    Time = as.POSIXct(Time, format = "%H:%M:%S")
  ) %>%
  ggplot(aes(x = Time, y = (Value / 8000000), colour = Parameter)) +
  geom_line() +
  geom_point() +
  stat_summary(
    data = ~filter(.x, Parameter == "Download"),
    geom = "text",
    aes(
      x = as.POSIXct("14:00", format = "%H:%M"),
      y = stage(Value / 8000000, after_stat = 5),
      label = after_stat(round(y, 2))
    ),
    fun = mean, show.legend = FALSE
  ) +
  facet_wrap(~Date, ncol = 3, nrow = 2) +
  scale_colour_manual(values = c("blue", "orange")) +
  scale_x_datetime(
    date_breaks = "4 hours",
    date_labels = "%H:%M"
  ) +
  xlab("Time") +
  ylab("Speed (Mbyte/s)") +
  ylim(c(0, 5)) +
  theme_bw() +
  theme(
    strip.background = element_blank(),
    legend.position = "bottom",
    panel.grid = element_line(linetype = "dashed")
  )
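The code above only labels the mean download speed. If you want both "Mean Download" and "Mean Upload" lines in each facet, one possible variation (a sketch, not the only way) is to compute the per-day means inside the data argument of geom_text, which keeps everything in one pipeline. The sample data frame is made up for illustration, and the column names Mean, x, y and label, as well as the y positions 5 and 4.6, are arbitrary choices of mine.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up sample in the same shape as speedtest.csv (values are illustrative only)
df <- data.frame(
  Date = c("2023/11/04", "2023/11/04", "2023/11/05", "2023/11/05"),
  Time = c("13:20:03", "17:40:02", "00:20:03", "08:40:03"),
  Download = c(32e6, 24e6, 16e6, 8e6),
  Upload = c(16e6, 8e6, 24e6, 8e6)
)

p <- df %>%
  pivot_longer(
    cols = c("Upload", "Download"),
    names_to = "Parameter", values_to = "Value"
  ) %>%
  mutate(
    Date = as.POSIXct(Date, format = "%Y/%m/%d"),
    Time = as.POSIXct(Time, format = "%H:%M:%S")
  ) %>%
  ggplot(aes(x = Time, y = Value / 8000000, colour = Parameter)) +
  geom_line() +
  geom_point() +
  geom_text(
    # Summarise the plot data per day and parameter, inside the layer itself
    data = ~ .x %>%
      group_by(Date, Parameter) %>%
      summarise(Mean = mean(Value) / 8000000, .groups = "drop") %>%
      mutate(
        x = as.POSIXct("14:00", format = "%H:%M"),
        # Stack the two labels so they do not overlap
        y = if_else(Parameter == "Download", 5, 4.6),
        label = paste0("Mean ", Parameter, ": ", round(Mean, 2))
      ),
    aes(x = x, y = y, label = label),
    colour = "black", inherit.aes = FALSE
  ) +
  facet_wrap(~Date) +
  scale_colour_manual(values = c("blue", "orange")) +
  scale_x_datetime(date_breaks = "4 hours", date_labels = "%H:%M") +
  ylim(c(0, 5))
```

The trade-off compared with stat_summary is that the summary is written out explicitly, which makes it easier to build the full label text and to position each parameter's line separately.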

Upvotes: 2
