Nick Bohl

Reputation: 125

Determine the median of a double column in R

I have the following dataset

> temp6
# A tibble: 120 x 1
      Arithmetic Mean
            <dbl>
 1           0.96
 2           2.09
 3           0.57
 4           0.66
 5           0.92
 6           0.60
 7           0.40
 8           0.42
 9           0.27
10           0.47
# ... with 110 more rows

I very badly need the median of this data column, but obviously when I try

median(temp6, na.rm=TRUE)

I get this error message:

Error in median.default(temp6, na.rm = TRUE) : need numeric data

If I attempt to convert this data to 'numeric', that doesn't work either

as.numeric(temp6, na.rm=TRUE)

or

as.numeric(as.character(temp6))

gives:

Error: (list) object cannot be coerced to type 'double'

and

Warning message:
NAs introduced by coercion 

respectively. I've done enough research to know that neither of these processes will work, but I have not been able to find a workaround of any sort to find the median of these data points. Is there any way to make this happen?

Upvotes: 1

Views: 2383

Answers (2)

akrun

Reputation: 887048

According to ?median

median(x, na.rm = FALSE, ...)

where

x an object for which a method has been defined, or a numeric vector containing the values whose median is to be computed.

If it is a data.frame, the column can be converted to a vector with temp6[,1]. Because temp6 is a tibble, we need [[ instead. Suppose we do the extraction with [

temp6[,1]
# A tibble: 10 x 1
#   `Arithmetic Mean`
#               <dbl>
# 1              0.96
# 2              2.09
# 3              0.57
# 4              0.66
# 5              0.92
# 6              0.60
# 7              0.40
# 8              0.42
# 9              0.27
#10              0.47

It is still a tibble, whereas using [[

temp6[[1]]
#[1] 0.96 2.09 0.57 0.66 0.92 0.60 0.40 0.42 0.27 0.47

it is converted to a vector

is.vector(temp6[[1]])
#[1] TRUE

Now, we can get the median

median(temp6[[1]], na.rm = TRUE)
#[1] 0.585

Or use the $

median(temp6$`Arithmetic Mean`, na.rm = TRUE)
#[1] 0.585
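
To see why the original call fails, here is a minimal base R sketch. It assumes a plain data frame named df as a hypothetical stand-in for temp6, built from the ten rows shown in the question. median() rejects data frames outright, and a data frame is a list of columns rather than a numeric vector, so the column has to be extracted first.

```r
# Hypothetical stand-in for temp6 (assumption: only the ten rows
# printed in the question; the real data has 120 rows).
df <- data.frame(
  `Arithmetic Mean` = c(0.96, 2.09, 0.57, 0.66, 0.92,
                        0.60, 0.40, 0.42, 0.27, 0.47),
  check.names = FALSE  # keep the space in the column name
)

# A data frame is a list of columns, not a numeric vector.
whole_is_list    <- is.list(df)
whole_is_numeric <- is.numeric(df)
# A single column extracted with [[ is a numeric vector.
col_is_numeric   <- is.numeric(df[[1]])

# Passing the extracted column gives the median.
med <- median(df[[1]], na.rm = TRUE)
print(med)
```

The same check applies to a tibble, since a tibble is also a data frame.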

data

temp6 <- structure(list(`Arithmetic Mean` = c(0.96, 2.09, 0.57, 0.66, 
 0.92, 0.6, 0.4, 0.42, 0.27, 0.47)), .Names = "Arithmetic Mean", row.names = c("1", 
 "2", "3", "4", "5", "6", "7", "8", "9", "10"), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 3

neilfws

Reputation: 33782

dplyr::summarise is another option.

library(dplyr)
temp6 %>% 
  summarise(Median = median(`Arithmetic Mean`, na.rm = TRUE))
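
Note that summarise returns a one-row tibble rather than a bare number. If a plain numeric value is needed, one possible follow-up is dplyr::pull, sketched below. The temp6 object here is rebuilt from the ten rows shown in the question, which is an assumption made only so the example runs on its own.

```r
library(dplyr)

# Assumption: temp6 rebuilt from the ten rows printed in the question.
temp6 <- tibble(
  `Arithmetic Mean` = c(0.96, 2.09, 0.57, 0.66, 0.92,
                        0.60, 0.40, 0.42, 0.27, 0.47)
)

# summarise() yields a 1 x 1 tibble; pull() extracts the column as a vector.
med <- temp6 %>%
  summarise(Median = median(`Arithmetic Mean`, na.rm = TRUE)) %>%
  pull(Median)

print(med)
```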

Upvotes: 2
