Reputation: 1
I am struggling with a median split in RStudio. I want to add a new column to my data frame that is a median split of an existing column (PCR.x), but I do not know how to do this. Any help would be appreciated. This is the code I have run so far:
library(dplyr)
medianpcr <- median(honourswork$PCR.x)
# Rows above the median go in the "high" group, the rest in the "low" group
highmedian <- filter(honourswork, PCR.x > medianpcr)
lowmedian <- filter(honourswork, PCR.x <= medianpcr)
Upvotes: 0
Views: 1557
Reputation: 3083
Let's first create some data:
set.seed(123)
honourswork <- data.frame(PCR.x = rnorm(100))
In dplyr, you might do:
library(tidyverse)
honourswork <- honourswork %>%
  mutate(
    medianpcr = median(PCR.x),
    highmedian = ifelse(PCR.x > medianpcr, 1, 0),
    lowmedian = ifelse(PCR.x <= medianpcr, 1, 0)
  )
Equivalently in base R:
honourswork$highmedian <- 0
honourswork$highmedian[honourswork$PCR.x > median(honourswork$PCR.x)] <- 1
honourswork$lowmedian <- 0
honourswork$lowmedian[honourswork$PCR.x <= median(honourswork$PCR.x)] <- 1
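If a single split column is more convenient than two 0/1 indicators, something along these lines might work. This is only a sketch: the column name `pcr_split`, the labels "low" and "high", and the choice to put values equal to the median in the "low" group are assumptions, not something taken from the question.

```r
# Illustrative data; PCR.x mirrors the column name used in the question
set.seed(123)
honourswork <- data.frame(PCR.x = rnorm(100))

# na.rm = TRUE keeps the median from becoming NA if the column has missing values
med <- median(honourswork$PCR.x, na.rm = TRUE)

# One factor column: values above the median are "high", everything else
# (including ties with the median) is "low". pcr_split is a hypothetical name.
honourswork$pcr_split <- factor(
  ifelse(honourswork$PCR.x > med, "high", "low"),
  levels = c("low", "high")
)

# Check the group sizes
table(honourswork$pcr_split)
```

A factor keeps the grouping in one place, which tends to be easier to pass to modelling or plotting functions than a pair of indicator columns.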
Upvotes: 0
Reputation: 407
When you post a question on SO, it's always a good idea to include an example dataframe so that the answerer doesn't have to create one themselves.
On to your question: if I understand you correctly, you can use mutate() and case_when() from the dplyr package:
# Load the dplyr library
library(dplyr)

# Create an example dataframe
data <- data.frame(
  rowID = 1:20,
  value = runif(20, 0, 50)
)

# Use case_when to mutate a new column 'category' with values based on
# the 'value' column
data2 <- data %>%
  dplyr::mutate(
    category = dplyr::case_when(
      value > median(value) ~ "Highmedian",
      value < median(value) ~ "Lowmedian",
      value == median(value) ~ "Median"
    )
  )
More about case_when() can be found in the dplyr documentation.
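If you only want two groups, a strict median split can also be written with if_else(). The following is a rough sketch rather than the only way to do it: the seed is arbitrary, and folding values equal to the median into "Lowmedian" is an assumption you may want to change.

```r
library(dplyr)

set.seed(42)  # arbitrary seed so the example is reproducible
data <- data.frame(
  rowID = 1:20,
  value = runif(20, 0, 50)
)

# Two-level split: values above the median are "Highmedian",
# everything else (including ties) is "Lowmedian"
data2 <- data %>%
  mutate(
    category = if_else(
      value > median(value, na.rm = TRUE),
      "Highmedian",
      "Lowmedian"
    )
  )

# Count the rows in each group
count(data2, category)
```

With an even number of distinct values, as here, no row equals the median, so the two groups come out the same size.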
Hope this helps!
Upvotes: 3