Max Fisher

Reputation: 11

R: Created matrix but cannot create barplot with it

R newbie here. I wrote the following code to create a data frame and would now like to make a categorical bar plot from it, but ggplot won't let me. Is there any way to reformat the data frame so that ggplot works? I have attached a picture of the data frame.

library(ggplot2)
library(dplyr)

#create dataframe
df_conversionrates <- data.frame(matrix(ncol = 7, nrow = 2))
colnames(df_conversionrates) <- (days = c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Sunday"))
rownames(df_conversionrates) <- (category = c("conversionrate_control","conversionrate_treatment"))

#calculate conversion rates
for (g in 1:2) {
  for (n in 1:7) {
    df_conversionrates[g, n] <-
      nrow(filter(rocketfuel, test == (g - 1) & mode_impr_day == n & converted == 1)) /
      nrow(filter(rocketfuel, mode_impr_day == n)) * 100
  }
}

Please see the picture for what the data frame looks like. I am trying to plot two bars for each day, one for each of the two categories, with the values on the y axis.

Upvotes: 1

Views: 23

Answers (1)

Allan Cameron

Reputation: 173793

You need your data to be in long format to get ggplot to work on this data. Your data wasn't reproducible (without transcribing your picture across), so the following is an approximation of your data:

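df_conversionrates <- 
  data.frame(matrix(c(0.09, 0.05, 0.067, 0.095, 0.067, 0.05, 0.073,
                      3.2, 2.9, 2.4, 2.1, 2.2, 2.1, 2.4), 
                    byrow = TRUE, nrow = 2))
# name the columns after the weekdays, as in the question
colnames(df_conversionrates) <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                                  "Friday", "Saturday", "Sunday")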

We can make the data into a long format data frame using base R's stack function, and add a factor column to indicate whether each row refers to the treatment or control group:

df <- stack(df_conversionrates)
df$group <- factor(rep(c("control", "treatment"), 7))

That means that df now looks like this:

df
#>    values       ind     group
#> 1   0.090    Monday   control
#> 2   3.200    Monday treatment
#> 3   0.050   Tuesday   control
#> 4   2.900   Tuesday treatment
#> 5   0.067 Wednesday   control
#> 6   2.400 Wednesday treatment
#> 7   0.095  Thursday   control
#> 8   2.100  Thursday treatment
#> 9   0.067    Friday   control
#> 10  2.200    Friday treatment
#> 11  0.050  Saturday   control
#> 12  2.100  Saturday treatment
#> 13  0.073    Sunday   control
#> 14  2.400    Sunday treatment

Now the plotting is straightforward:

ggplot(df, aes(ind, values, fill = group)) + 
  geom_col(position = position_dodge()) +
  labs(x = "Weekday", y = "Value")
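
If you would rather keep the group labels from your row names instead of typing them in again, tidyr's pivot_longer is an alternative to stack. The following is only a sketch: it assumes the tidyr package is installed, it reuses the approximate values from above rather than your real data, and df_wide and df_long are names I made up for illustration.

```r
library(tidyr)

# Approximate values only (transcribed guesses, not the real data)
df_wide <- data.frame(matrix(c(0.09, 0.05, 0.067, 0.095, 0.067, 0.05, 0.073,
                               3.2, 2.9, 2.4, 2.1, 2.2, 2.1, 2.4),
                             byrow = TRUE, nrow = 2))
colnames(df_wide) <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                       "Friday", "Saturday", "Sunday")
rownames(df_wide) <- c("conversionrate_control", "conversionrate_treatment")

# Remember the weekday order before adding the extra column
weekdays_in_order <- colnames(df_wide)

# Move the row names into an ordinary column so they survive the reshape
df_wide$group <- rownames(df_wide)

# One row per (group, weekday) pair; column names match the stack() output
df_long <- pivot_longer(df_wide, cols = -group,
                        names_to = "ind", values_to = "values")

# Keep weekdays in calendar order rather than alphabetical order on the x axis
df_long$ind <- factor(df_long$ind, levels = weekdays_in_order)
```

Because df_long uses the same column names (ind, values, group), it can be passed to the ggplot call above in place of df.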


Upvotes: 1
