kskirpic

Reputation: 155

Median in R needs numeric data

I have a dataset of flats, with rooms as the number of rooms and balcony_size as the balcony size, and I would like to get the median balcony size for each number of rooms.

data_new%>%
  group_by(rooms)%>%
  median(balcony_size, na.rm=TRUE)

This code returns an error:

Error in median.default(., balcony_size, na.rm = TRUE) : 
  need numeric data

balcony_size is numeric

data_new$balcony_size
   [1]    NA    NA    NA    NA  3.00  2.00  2.00  5.00    NA    NA    NA  4.00  2.00    NA  3.00    NA    NA
  [18]    NA 10.00 44.00  7.50    NA 62.00 29.00 12.00  8.00    NA    NA  6.00  6.00  8.00    NA    NA    NA
  [35]    NA  5.00  4.00    NA 15.00    NA    NA    NA  8.00    NA    NA    NA    NA  8.00    NA    NA    NA
  [52]  6.00  8.00  5.00 10.00    NA  5.00  1.00    NA  2.00 33.00  4.00    NA  4.00  6.00  5.00 12.00 15.00
> str(data_new$balcony_size)
 num [1:40099] NA NA NA NA 3 2 2 5 NA NA ...

Upvotes: 1

Views: 1139

Answers (1)

akrun

Reputation: 887048

We can use median inside mutate if the goal is to create a new column that holds each group's median

library(dplyr)
data_new%>%
    group_by(rooms)%>%
    mutate(Median = median(balcony_size, na.rm=TRUE))

Or, if we need only the summarised output (one row per value of rooms)

data_new%>%
    group_by(rooms)%>%
    summarise(Median = median(balcony_size, na.rm=TRUE))

Or using base R

aggregate(balcony_size ~ rooms, data_new, median, na.rm = TRUE, na.action = NULL)

If we apply median directly after group_by, the pipe passes the entire grouped dataset as its first argument. median works on a vector, not on a data.frame, which is why it reports "need numeric data".
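
To illustrate the point above, here is a minimal sketch on a small made-up data frame (toy is hypothetical and only stands in for data_new; the values are invented for the example). It captures the error from the direct call and then computes the per-group medians with summarise.

```r
library(dplyr)

# Hypothetical toy data standing in for data_new (values are made up)
toy <- data.frame(
  rooms = c(1, 1, 2, 2, 2),
  balcony_size = c(3, NA, 2, 8, 5)
)

# The pipe hands the whole grouped data frame to median() as its first
# argument, so median.default() stops before it ever looks at balcony_size.
res <- tryCatch(
  toy %>% group_by(rooms) %>% median(balcony_size, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(res)

# Inside summarise(), median() receives the balcony_size vector of each group.
out <- toy %>%
  group_by(rooms) %>%
  summarise(Median = median(balcony_size, na.rm = TRUE))
print(out)
```

Here res holds the same "need numeric data" message as in the question, while out has one row per value of rooms.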

Upvotes: 3
