marblewhite

Reputation: 45

geom_bar removed 3 rows with missing values

I'm trying to create a histogram using ggplot2 in R.

This is the code I'm using:

library(tidyverse)

dat_male$explicit_truncated <- trunc(dat_male$explicit_mean)
means2 <- aggregate(dat_male$IAT_D, by=list(dat_male$explicit_truncated,dat_male$id), mean, na.rm=TRUE)
colnames(means2) <- c("explicit", "id", "IAT_D")
sd2 <- aggregate(dat_male$IAT_D, by=list(dat_male$explicit_truncated,dat_male$id), sd, na.rm=TRUE)
length2 <- aggregate(dat_male$IAT_D, by=list(dat_male$explicit_truncated,dat_male$id), length)
se2 <- sd2$x / sqrt(length2$x)
means2$lo <- means2$IAT_D - 1.6*se2
means2$hi <- means2$IAT_D + 1.6*se2

ggplot(data = means2, aes(x = factor(explicit), y = IAT_D, fill = factor(id))) + 
  geom_bar(stat = "identity", position = position_dodge()) + 
  geom_errorbar(aes(ymin=lo,ymax=hi, width=.2), position=position_dodge(0.9), data=means2) +
  xlab("Explicit attitude score") + 
  ylab("D-score") 

For some reason I get the following warning message:

Removed 3 rows containing missing values (geom_bar).

And I get the following histogram: (image not available)

I really have no clue what is going on.

Please let me know if you need to see anything else from my code; I'm never really sure what to include.

dat_male is a dataset that looks like this (I have only included the variables that I mentioned in this question, as the dataset contains 68 variables):

      id explicit_mean     IAT_D         explicit_truncated
5     1        3.1250  0.366158652                  3
6     1        3.3125  0.373590066                  3
9     1        3.6250  0.208096230                  3
11    1        3.1250  0.661983618                  3
15    1        2.3125  0.348246184                  2
19    1        3.7500  0.562406383                  3
28    1        2.5625 -0.292888526                  2
35    1        4.3750  0.560039531                  4
36    1        3.8125 -0.117455439                  3
37    1        3.1250  0.074375196                  3
46    1        2.5625  0.488265849                  2
47    1        4.2500 -0.131005579                  4
53    1        2.0625  0.193040876                  2
55    1        2.6875  0.875420303                  2
62    1        3.8750  0.579146056                  3
63    1        3.3125  0.666095380                  3
66    1        2.8125  0.115607820                  2
68    1        4.3750  0.259929946                  4
80    1        3.0000  0.502709149                  3

means2 is a dataset I have used to calculate means, and that looks like this:

   explicit id       IAT_D        lo         hi
1        0  0         NaN        NaN        NaN
2        2  0  0.23501191  0.1091807  0.3608431
3        3  0  0.31478389  0.2311406  0.3984272
4        4  0 -0.24296625 -0.3241166 -0.1618159
5        1  1 -0.04010111         NA         NA
6        2  1  0.21939286  0.1109138  0.3278719
7        3  1  0.29097806  0.1973051  0.3846511
8        4  1  0.22965463  0.1209229  0.3383864

Now that I see it in front of me, it probably has something to do with the NaNs?

Upvotes: 1

Views: 8248

Answers (1)

DJV

Reputation: 4863

From your dataset it seems like everything is alright.

The messages you get are warnings, not errors, and they indicate that your data.frame contains missing values (i.e. NaN and NA). I actually got two warning messages:

Warning messages:
1: Removed 1 rows containing missing values (geom_bar).
2: Removed 2 rows containing missing values (geom_errorbar).
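To see where those counts come from, the missing values can be tallied directly. This is a minimal sketch in base R that rebuilds means2 from the values posted in the question; the names missing_per_column, bar_rows_dropped and errorbar_rows_dropped are only illustrative.

```r
# Rebuild the means2 table from the question (values copied from the post).
means2 <- data.frame(
  explicit = c(0, 2, 3, 4, 1, 2, 3, 4),
  id       = c(0, 0, 0, 0, 1, 1, 1, 1),
  IAT_D    = c(NaN, 0.23501191, 0.31478389, -0.24296625,
               -0.04010111, 0.21939286, 0.29097806, 0.22965463),
  lo       = c(NaN, 0.1091807, 0.2311406, -0.3241166,
               NA, 0.1109138, 0.1973051, 0.1209229),
  hi       = c(NaN, 0.3608431, 0.3984272, -0.1618159,
               NA, 0.3278719, 0.3846511, 0.3383864)
)

# is.na() is TRUE for both NA and NaN, so this counts missing values per column.
missing_per_column <- colSums(is.na(means2))
print(missing_per_column)

# Rows with a missing y value (the ones geom_bar cannot draw).
bar_rows_dropped <- sum(is.na(means2$IAT_D))

# Rows with a missing ymin or ymax (the ones geom_errorbar cannot draw).
errorbar_rows_dropped <- sum(is.na(means2$lo) | is.na(means2$hi))

print(c(bar = bar_rows_dropped, errorbar = errorbar_rows_dropped))
```

One row has no IAT_D and two rows have no lo/hi, which lines up with the two warnings quoted above.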

Regarding the plot: the only row with explicit = 0 has NaN for IAT_D, so no bar is drawn for it. Similarly, because lo and hi are NA for the row with explicit = 1, that bar gets no error bar.
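As for how such values can end up in the aggregated table, here is a small base R sketch. It assumes (the question does not confirm this) that the explicit = 0 group contained only missing IAT_D values and that the explicit = 1 group contained a single observation; the vectors below are made up for illustration.

```r
# A group whose values are all NA has nothing left after na.rm = TRUE,
# and the mean of an empty vector is NaN.
all_missing <- c(NA_real_, NA_real_)
mean_of_empty_group <- mean(all_missing, na.rm = TRUE)

# A group with a single observation has a mean but no standard deviation,
# so any standard error (and lo/hi) derived from it is NA as well.
single_obs <- c(-0.04010111)
mean_of_single <- mean(single_obs)
sd_of_single <- sd(single_obs)

print(c(mean_of_empty_group = mean_of_empty_group,
        mean_of_single = mean_of_single,
        sd_of_single = sd_of_single))
```

If that is what happened, the missing entries are a property of the data rather than a bug in the plotting code.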

Dataset:

means2 <- read.table(text = "
  explicit id       IAT_D         lo         hi
1        0  0         NaN        NaN        NaN
2        2  0  0.23501191  0.1091807  0.3608431
3        3  0  0.31478389  0.2311406  0.3984272
4        4  0 -0.24296625 -0.3241166 -0.1618159
5        1  1 -0.04010111         NA         NA
6        2  1  0.21939286  0.1109138  0.3278719
7        3  1  0.29097806  0.1973051  0.3846511
8        4  1  0.22965463  0.1209229  0.3383864",
  header = TRUE)

Plot:

library(tidyverse)  # provides %>% and ggplot2

means2 %>%
  ggplot(aes(x = factor(explicit), y = IAT_D, fill = factor(id))) +
  geom_bar(stat = "identity", position = position_dodge()) +
  # width is a fixed setting, not a mapping, so it goes outside aes()
  geom_errorbar(aes(ymin = lo, ymax = hi), width = 0.2,
                position = position_dodge(0.9)) +
  xlab("Explicit attitude score") +
  ylab("D-score")

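If the warnings are just noise for you, one option is to drop the rows that cannot be drawn before plotting and to pass na.rm = TRUE to the error bar layer. The following is a sketch, not the only way to do it: the name bars is illustrative, the data are retyped from the question, and the plotting part is wrapped in a requireNamespace() check so the data step still runs where ggplot2 is not installed.

```r
# means2 as posted in the question.
means2 <- data.frame(
  explicit = c(0, 2, 3, 4, 1, 2, 3, 4),
  id       = c(0, 0, 0, 0, 1, 1, 1, 1),
  IAT_D    = c(NaN, 0.23501191, 0.31478389, -0.24296625,
               -0.04010111, 0.21939286, 0.29097806, 0.22965463),
  lo       = c(NaN, 0.1091807, 0.2311406, -0.3241166,
               NA, 0.1109138, 0.1973051, 0.1209229),
  hi       = c(NaN, 0.3608431, 0.3984272, -0.1618159,
               NA, 0.3278719, 0.3846511, 0.3383864)
)

# Keep only rows that have a bar height; is.na() also catches NaN.
bars <- subset(means2, !is.na(IAT_D))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(bars, aes(x = factor(explicit), y = IAT_D, fill = factor(id))) +
    geom_col(position = position_dodge()) +
    # na.rm = TRUE drops the row without lo/hi silently instead of warning.
    geom_errorbar(aes(ymin = lo, ymax = hi), width = 0.2,
                  position = position_dodge(0.9), na.rm = TRUE) +
    xlab("Explicit attitude score") +
    ylab("D-score")
  print(p)
}
```

The row with explicit = 1 is kept on purpose: it has a valid mean, so its bar is still drawn, only without an error bar. Using complete.cases(means2) instead would remove that bar as well.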

Upvotes: 1
