Reputation: 13
I am trying to add a column to my data frame (called boston) that states whether each entry is above or below the median crime rate (the variable is called crim). I have found the median with median(boston$crim), but now I need to add a column that states whether each row's crime rate is above or below that number.
Upvotes: 1
Views: 277
Reputation: 78927
Here is an example with the iris dataset using the dplyr package.
First select only the Sepal.Length column, then use mutate to store the median in a new column, and then use ifelse to label each row as above or below.
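library(dplyr)
iris %>%
  select(Sepal.Length) %>%
  mutate(
    # the median is a single value, recycled across all rows
    Sepal.Length.median = median(Sepal.Length),
    # compare each row's value (not the median itself) against the median
    Sepal.Length.above.below = ifelse(Sepal.Length > Sepal.Length.median, "above", "below")
  ) %>%
  head()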
Output:
  Sepal.Length Sepal.Length.median Sepal.Length.above.below
1          5.1                 5.8                    below
2          4.9                 5.8                    below
3          4.7                 5.8                    below
4          4.6                 5.8                    below
5          5.0                 5.8                    below
6          5.4                 5.8                    below
Upvotes: 1
Reputation: 142
You can use the dplyr package and a case_when clause:
library(dplyr)
boston <- boston %>%
  mutate(med_crim = median(crim, na.rm = TRUE)) %>%
  mutate(
    above_or_below = case_when(
      crim > med_crim ~ "above",
      crim < med_crim ~ "below",
      # fallback: catches ties with the median (and any rows where crim is NA)
      TRUE ~ "equal"),
    ## You can also create a variable with the difference to the median:
    diff_to_median = crim - med_crim)
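If you would rather not depend on dplyr, the same labelling can be done in base R. This is only a minimal sketch: the toy data frame below is a made-up stand-in for boston (only the column name crim comes from the question), and labelling rows exactly at the median as "below" is an arbitrary choice.

```r
# Hypothetical stand-in for the asker's data; values are invented for illustration.
boston <- data.frame(crim = c(0.02, 0.10, 0.25, 3.50, 8.90))

# Compute the median once, ignoring missing values.
med_crim <- median(boston$crim, na.rm = TRUE)

# Vectorised comparison: every row is checked against the single median value.
# Rows equal to the median are labelled "below" here (an arbitrary choice).
boston$above_or_below <- ifelse(boston$crim > med_crim, "above", "below")

print(boston)
```

With real data, drop the first assignment and keep the last two statements. Rows where crim is NA end up as NA in the new column, because ifelse propagates missing values.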
Upvotes: 0