Reputation: 49
I have a dataframe as below:
A B C D E F
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1290. 3916. 4514. 5498. 5028 1987
2 798. 3777 5598 5428. 6160. 4668.
3 1212. 3594 6315 6740 6560. 6490.
4 1224 5592. 6203 6230 6304. NA
5 996 2491 3938. 4972 5062 4308.
6 524 3466. 4658. 5044. 4981 4295
I want to make a bar plot with error bars where A, B, C, D, E, F are the x values and the mean of each column is the y value. I also have some NA cells in my dataset, and I'd like to ignore them when taking the average (e.g. with na.rm = TRUE) instead of removing whole columns or rows. Could you guide me in the right direction? Thanks!
Upvotes: 1
Views: 606
Reputation: 16178
You need to reshape your dataframe into a longer format, for example using the pivot_longer function of the tidyr package, and then calculate the mean of each group.
For example, using the dplyr and tidyr packages, you can do:
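library(ggplot2)
library(dplyr)
library(tidyr)
# Reshape to long format: one row per (column name, value) pair
df %>% pivot_longer(everything(), names_to = "X", values_to = "Y") %>%
  group_by(X) %>%
  # Mean of each original column, ignoring NA values
  summarise(Mean = mean(Y, na.rm = TRUE)) %>%
  ggplot(aes(x = X, y = Mean)) +
  geom_col()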
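The snippet above draws the bars but not the error bars the question asks for. Here is a minimal sketch of one way to add them, assuming the bars should show the mean plus or minus one standard error (that choice is an assumption; you may prefer the standard deviation or a confidence interval). The data are the rounded values from the printout in the question, and the names summary_df, SD, N and SE are only illustrative.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

# Rounded values copied from the question's printout (assumption: the real
# data have more decimals than the tibble display shows)
df <- data.frame(
  A = c(1290, 798, 1212, 1224, 996, 524),
  B = c(3916, 3777, 3594, 5592, 2491, 3466),
  C = c(4514, 5598, 6315, 6203, 3938, 4658),
  D = c(5498, 5428, 6740, 6230, 4972, 5044),
  E = c(5028, 6160, 6560, 6304, 5062, 4981),
  F = c(1987, 4668, 6490, NA, 4308, 4295)
)

# One row per column name, with mean, SD, non-missing count and standard error
summary_df <- df %>%
  pivot_longer(everything(), names_to = "X", values_to = "Y") %>%
  group_by(X) %>%
  summarise(
    Mean = mean(Y, na.rm = TRUE),
    SD   = sd(Y, na.rm = TRUE),
    N    = sum(!is.na(Y)),   # NA cells are not counted
    SE   = SD / sqrt(N)
  )

# Bars for the means, error bars for mean +/- one standard error
p <- ggplot(summary_df, aes(x = X, y = Mean)) +
  geom_col() +
  geom_errorbar(aes(ymin = Mean - SE, ymax = Mean + SE), width = 0.2)
p
```

Counting N with sum(!is.na(Y)) rather than n() matters here: column F has one missing value, so its standard error should be based on 5 observations, not 6.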
Here is an application of the code above, using the following dummy example that mimics your data:
df <- data.frame(A = sample(1000:9999, 6),
                 B = sample(1000:9999, 6),
                 C = sample(1000:9999, 6),
                 D = sample(1000:9999, 6))
# Introduce one missing value, as in your data
df[4, 4] <- NA
A B C D
1 1499 6992 1866 5793
2 5479 2596 4945 2399
3 7193 1043 2623 2007
4 9464 7624 6758 NA
5 6716 2270 4119 1600
6 5563 4771 8427 7973
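If you apply the code above to this data, you get a bar plot with one bar per column (A to D), where the height of each bar is the mean of that column.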
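As a numeric cross-check of the bar heights, the column means of the dummy data printed above can also be computed directly with base R's colMeans. This is only a sketch: the values are typed in by hand from the printout, and the name means is illustrative.

```r
# Dummy data typed in from the printed example (D has one NA in row 4)
df <- data.frame(
  A = c(1499, 5479, 7193, 9464, 6716, 5563),
  B = c(6992, 2596, 1043, 7624, 2270, 4771),
  C = c(1866, 4945, 2623, 6758, 4119, 8427),
  D = c(5793, 2399, 2007, NA, 1600, 7973)
)

# Column means, skipping NA cells instead of dropping whole rows or columns
means <- colMeans(df, na.rm = TRUE)
print(means)
# A is about 5985.67, B is 4216, C is about 4789.67, D is 3954.4
# (D is averaged over its 5 non-missing values)
```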
Does it answer your question?
Upvotes: 2