Leah1

Reputation: 27

how to format data to box-plot

I am trying to box-plot this dataset. It is the only one of many very similar datasets that does not plot. I set my data with a <- as.numeric. The error message I then get is:

Error in x[!xna] : object of type 'builtin' is not subsettable
In addition: Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'builtin'
2: In is.na(x) : is.na() applied to non-(list or vector) of type 'builtin'

The data looks like this when I type in

summary(a) 
#> Frontal        L45         R45        L22.5   
#> 40.0   : 2   0.0    :4   0.0    :9   na     : 4  
#> 90.0   : 2   15.0   :3   na     :3   0.0    : 2  
#> 0.0    : 1   na     :3   11.5   :1   13.2   : 1  
#> 10.0   : 1   1.7    :1   13.4   :1   14.5   : 1  
#> 10.2   : 1   15.9   :1   15.0   :1   15.0   : 1  
#> 15.0   : 1   16.5   :1   17.3   :1   15.3   : 1  
#> (Other):12   (Other):7   (Other):4   (Other):10  
#>
#> R22.5   
#> 0.0    : 4  
#> 90.0   : 2  
#> 11.7   : 1  
#> 15.0   : 1  
#> 16.0   : 1  
#> 18.9   : 1  
#> (Other):10  

and like this in tabulated version

Frontal  L45   R45   L22.5  R22.5
15       17    11.5  19.9   40
58.2     15    23.7  15.3   90
3.8      8.7   0     0      57.1
9.2      1.7   0     45.1   11.7
23.9     5     0     0      2.9
0        3     0     20.8   0
na       na    na    na     0
22.1     3.3   0     14.5   na
46.6     16.5  0     24.4   0
5.3      15.9  25.5  15     36.9
40       0     0     na     80.2
10       0     0     74.4   15
32       na    0     29.3   0
32.5     15    2.4   6.8    90
90       15    15    8.8    30.3
89.2     3.1   13.4  na     47.7
72.6     7.4   17.3  13.2   57.6
40       0     na    na     18.9
10.2     na    5.2   40     16
90       0     na    19.5   24.3

Any troubleshooting would be greatly appreciated. Best, Leah

Upvotes: 0

Views: 132

Answers (1)

Eric Fail

Reputation: 7928

To help you ask better questions in the future, please promise us to read the article linked by jogo above. We are more likely to be able to help if you provide a complete, minimal reproducible example along with your question: something we can work from and use to show you how your question might be answered.
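One common way to include data in a reproducible example is dput(), which prints R code that recreates an object. A minimal sketch follows; example_data is a hypothetical stand-in with a few made-up rows, not your actual data:

```r
# Hypothetical stand-in for the data in question; replace with your own object.
example_data <- data.frame(
  Frontal = c(15, 58.2, 3.8),
  L45     = c(17, 15, 8.7)
)

# dput() prints R code that recreates the object,
# so others can paste it straight into their own session.
dput(example_data)

# Round trip: the printed text can be parsed back into an equal data frame.
recreated <- eval(parse(text = capture.output(dput(example_data))))
all.equal(example_data, recreated)
```

Pasting the dput() output into the question lets readers see the column types as they really are, which matters here because the types are the suspected problem.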

Here is some data I've produced, but it is most likely not identical to your situation, as I had to guess in regard to its structure,

a <- data.frame(
  Frontal = c(4L, 13L, 7L, 16L, 6L, 1L, 18L, 5L, 11L, 12L, 10L, 2L, 8L, 9L, 17L, 15L, 14L, 10L, 3L, 17L),
  L45     = c(6L, 3L, 12L, 2L, 10L, 7L, 13L, 9L, 5L, 4L, 1L, 1L, 13L, 3L, 3L, 8L, 11L, 1L, 13L, 1L),
  R45     = c(2L, 7L, 1L, 1L, 1L, 1L, 10L, 1L, 1L, 8L, 1L, 1L, 1L, 6L, 4L, 3L, 5L, 10L, 9L, 10L),
  L22.5   = c(7L, 5L, 1L, 12L, 1L, 8L, 16L, 3L, 9L, 4L, 16L, 14L, 10L, 13L, 15L, 16L, 2L, 16L, 11L, 6L),
  R22.5   = c(10L, 15L, 12L, 2L, 6L, 1L, 1L, 16L, 1L, 9L, 14L, 3L, 1L, 15L, 8L, 11L, 13L, 5L, 4L, 7L)
)

# install.packages(c("tidyverse"), dependencies = TRUE)
library(tidyverse)

I suspect your data is stored as factors, despite your as.numeric() call. Take a look at the output of summary() when the columns are converted with as.factor(),

a %>% mutate_all(as.factor) %>% summary() 
#>     Frontal        L45         R45        L22.5        R22.5   
#>  10     : 2   1      :4   1      :9   16     : 4   1      : 4  
#>  17     : 2   3      :3   10     :3   1      : 2   15     : 2  
#>  1      : 1   13     :3   2      :1   2      : 1   2      : 1  
#>  2      : 1   2      :1   3      :1   3      : 1   3      : 1  
#>  3      : 1   4      :1   4      :1   4      : 1   4      : 1  
#>  4      : 1   5      :1   5      :1   5      : 1   5      : 1  
#>  (Other):12   (Other):7   (Other):4   (Other):10   (Other):10  

You can compare that to the output of summary() on my data (which I know is numeric),

a %>%  summary() 
#>     Frontal           L45             R45            L22.5           R22.5      
#>  Min.   : 1.00   Min.   : 1.00   Min.   : 1.00   Min.   : 1.00   Min.   : 1.00  
#>  1st Qu.: 5.75   1st Qu.: 2.75   1st Qu.: 1.00   1st Qu.: 4.75   1st Qu.: 2.75  
#>  Median :10.00   Median : 5.50   Median : 2.50   Median : 9.50   Median : 7.50  
#>  Mean   : 9.90   Mean   : 6.30   Mean   : 4.15   Mean   : 9.25   Mean   : 7.70  
#>  3rd Qu.:14.25   3rd Qu.:10.25   3rd Qu.: 7.25   3rd Qu.:14.25   3rd Qu.:12.25  
#>  Max.   :18.00   Max.   :13.00   Max.   :10.00   Max.   :16.00   Max.   :16.00  
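Separately from the factor question, the word 'builtin' in your error message hints at something else. A minimal sketch, assuming the assignment was literally a <- as.numeric as written in the question (the vector x holds made-up values):

```r
# Assumption: the assignment was literally `a <- as.numeric`,
# which stores the function itself rather than any converted data.
f <- as.numeric
typeof(f)        # "builtin"
is.function(f)   # TRUE

# Calling the function on the data is what actually converts it.
x <- c("15", "58.2", "3.8")   # made-up character values
converted <- as.numeric(x)
is.numeric(converted)
```

If that is what happened, boxplot() was handed a function instead of data, which would explain why it complains that the object is not subsettable.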

If you want to give people a glimpse of your data, you can do something like this,

a %>% as_tibble() %>% print(n = 7)
#> # A tibble: 20 x 5
#>   Frontal   L45   R45 L22.5 R22.5
#>     <int> <int> <int> <int> <int>
#> 1       4     6     2     7    10
#> 2      13     3     7     5    15
#> 3       7    12     1     1    12
#> 4      16     2     1    12     2
#> 5       6    10     1     1     6
#> 6       1     7     1     8     1
#> 7      18    13    10    16     1
#> # ... with 13 more rows

The above output also shows how the individual vectors in a are stored. Here they are all stored as integers, int. You can also use the actual glimpse() function from the tidyverse,

a %>% as_tibble() %>% glimpse()
#> Observations: 20
#> Variables: 5
#> $ Frontal <int> 4, 13, 7, 16, 6, 1, 18, 5, 11, 12, ... 
#> $ L45     <int> 6, 3, 12, 2, 10, 7, 13, 9, 5, 4, 1, ...
#> $ R45     <int> 2, 7, 1, 1, 1, 1, 10, 1, 1, 8, 1, 1...
#> $ L22.5   <int> 7, 5, 1, 12, 1, 8, 16, 3, 9, 4, 16, ...
#> $ R22.5   <int> 10, 15, 12, 2, 6, 1, 1, 16, 1, 9, ...

Maybe str() is actually better here,

str(a)
#> 'data.frame':    20 obs. of  5 variables:
#>  $ Frontal: int  4 13 7 16 6 1 18 5 11 12 ...
#>  $ L45    : int  6 3 12 2 10 7 13 9 5 4 ...
#>  $ R45    : int  2 7 1 1 1 1 10 1 1 8 ...
#>  $ L22.5  : int  7 5 1 12 1 8 16 3 9 4 ...
#>  $ R22.5  : int  10 15 12 2 6 1 1 16 1 9 ...

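All three output options show that the vectors in the data (which I produced) are integers, int, i.e. they are numeric. You should investigate the structure of your data and make sure it is numeric. If it is not, you can use a %>% mutate_all(as.numeric) to get there, but note that as.numeric() applied directly to a factor returns the internal level codes rather than the labelled values, so for factor columns convert via as.character() first, e.g. a %>% mutate_all(function(x) as.numeric(as.character(x))).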
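Here is a minimal base R sketch of converting a factor column to numbers. The vector f is hypothetical and only mimics the question's data, including the "na" strings. It contrasts as.numeric() on the factor itself with conversion via as.character():

```r
# Made-up factor column that mimics the question's data, including "na" strings.
f <- factor(c("15", "58.2", "na", "0"))

# as.numeric() on a factor returns the internal level codes,
# not the numbers shown in the labels.
codes <- as.numeric(f)

# Going through as.character() first recovers the labelled values;
# the non-numeric "na" entries become NA (R warns about the coercion,
# which is silenced here).
values <- suppressWarnings(as.numeric(as.character(f)))

codes
values
```

If the data come from a text file, reading it with na.strings = "na" in read.csv() may avoid the factor columns in the first place, though I do not know how your data was imported.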

Enough of that. Here's one box-plot option on a, though I do not know if this is what you are looking for,

a %>% gather()  %>% ggplot(aes(key, value))  + geom_boxplot()


I wanted to elaborate on the gather() call a bit to show how it works,

a %>% gather(key = "Type: L, R, or Frontal", value = "int value") %>% 
        ggplot(aes(`Type: L, R, or Frontal`, `int value`)) + 
        geom_boxplot(fill = "white", colour = "#3366FF") + 
        geom_jitter(width = .2, colour = "#3366FF", alpha = 0.4)
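If you would rather stay in base R, the same wide-to-long reshaping can be sketched with stack(), followed by boxplot() with a formula. This is an alternative to the gather() and ggplot() calls, not what they do internally, and the data frame wide below is a small made-up stand-in for a:

```r
# Small made-up numeric data frame standing in for `a`.
wide <- data.frame(
  Frontal = c(4, 13, 7, 16),
  L45     = c(6, 3, 12, 2),
  R45     = c(2, 7, 1, 1)
)

# stack() reshapes wide to long: one `values` column and an `ind` factor
# naming the source column, similar in spirit to the key/value pair from gather().
long <- stack(wide)
head(long)

# One box per original column, using base graphics.
boxplot(values ~ ind, data = long, xlab = "Type", ylab = "Value")
```

Both approaches need the columns to be numeric before plotting.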


Upvotes: 2
