tangerine7199

Reputation: 489

Vertical Line ggplot for x categorical variable (not date)

I have this data frame, and I'm trying to draw a vertical line on a plot whose x-axis is categorical.

data <- data.frame(
  condition = c('1', '1', '1', '1', '1', '2', '2', '2', '2', '2', '3', '3', '3', '3', '3'),
  AssessmentGrade = c('400', '410', '420', '430', '440', '500', '510', '520', '530', '540', 
                      '300', '310', '320', '330', '340'), 
  Freq = c('1', '2', '1', '5', '7', '9', '1', '5', '3', '4', '5', '8', '1', '3', '5'), 
  MathGrade = c('A+', 'B-', 'C-', 'D', 'F', 'A-', 'B', 'C+', 'D-', 'F', 'A+', 'D', 'D', 'F', 'C'), 
  Condition = c('Condition 1', 'Condition 1', 'Condition 1', 'Condition 1', 'Condition 1', 
                'Condition 2', 'Condition 2', 'Condition 2', 'Condition 2', 'Condition 2', 
                'Condition 3', 'Condition 3', 'Condition 3', 'Condition 3', 'Condition 3'))

I tried adding a field to make the grade numeric, and that helped:

data$Gradenum <- as.numeric(factor(data$MathGrade))  # factor() first: plain strings like "A+" would become NA

I used ggplot to get a bubble graph, but I was also wondering how I would edit it to use my company's standard colors:

library(ggplot2)

p <- ggplot(data, aes(x = MathGrade, y = AssessmentGrade, size = Freq, fill = Condition)) +
  geom_point(aes(colour = Condition)) +
  ggtitle("Main Title") +
  labs(x = "First Math Grade", y = "Math Assessment Score")

How can I get a vertical line between C+ and D? I see a lot of information out there for the case where the x-axis is a date, but not for other categorical values.

Upvotes: 1

Views: 3395

Answers (3)

Uwe

Reputation: 42582

Hardcoded solutions are error-prone

MrSnake's solution works - but only for the given data set because the value of 7.5 is hardcoded.

It will fail with just a minor change to the data, e.g., by replacing grade "A+" in row 1 of data by an "A".

Using the hardcoded xintercept of 7.5

p + geom_vline(xintercept = 7.5)

draws the line between grades C- and C+ instead of C+ and D.

This can be solved using ordered factors. But first note that the chart contains another flaw: The grades on the x-axis are ordered alphabetically

A, A-, A+, B, B-, C, C-, C+, D, D-, F

where I would have expected

A+, A, A-, B, B-, C+, C, C-, D, D-, F
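
The shift caused by the extra grade can be checked without plotting. The following is only a sketch: it assumes the discrete axis lists the distinct grades in sorted order (as the alphabetical axis above suggests), and axis_pos is a hypothetical helper name, not part of ggplot2.

```r
# Grades from the question's sample data
grades_orig <- c("A+", "B-", "C-", "D", "F", "A-", "B", "C+", "D-", "F",
                 "A+", "D", "D", "F", "C")

# Same data with the grade in row 1 changed from "A+" to "A"
grades_mod <- grades_orig
grades_mod[1] <- "A"

# Hypothetical helper: position of a grade when the distinct values
# are laid out in sorted order (assumed default for a discrete axis)
axis_pos <- function(x, grade) which(sort(unique(x)) == grade)

pos_orig <- axis_pos(grades_orig, "D")  # position of D in the original data
pos_mod  <- axis_pos(grades_mod, "D")   # position of D after adding grade "A"

# Every grade sorted after "A" moves one slot to the right,
# so a fixed xintercept no longer sits just left of D
pos_mod - pos_orig
```

Because one additional category is enough to move D by a full slot, any line position written as a constant has to be revisited whenever the set of grades changes.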

Fixing the x-axis

This can be fixed by turning MathGrade into an ordered factor with levels in a given order:

grades <- c(as.vector(t(outer(LETTERS[1:4], c("+", "", "-"), paste0))), "F")
grades
 [1] "A+" "A"  "A-" "B+" "B"  "B-" "C+" "C"  "C-" "D+" "D"  "D-" "F"
data$MathGrade <- ordered(data$MathGrade, levels = grades)

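factor() would be sufficient to plot a properly ordered x-axis, but we need an ordered factor for the next step, the correct placement of the vertical line. Note that p has to be re-created after this change, because a ggplot object keeps the data it was built with.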

Programmatically placing the vertical line

Let's suppose that the vertical line should be drawn between grades C- and D+. However, either or both grades may be missing from the data, and factor levels without data are not plotted. In the sample data set, there are no data with grade D+, so the vertical line should be plotted between grades C- and D.

So, we need to look for the lowest grade equal to or greater than D+ and the highest grade equal to or less than C- in the data set, where "lower" and "greater" refer to the position in the level order defined above (A+ first, F last):

upper <- as.character(min(data$MathGrade[data$MathGrade >= "D+"]))
lower <- as.character(max(data$MathGrade[data$MathGrade <= "C-"]))

These are the grades in the actual data set between which the vertical line is to be plotted. Its x position is the mean of their positions among the levels that are present in the data:

xintercpt <- mean(which(levels(droplevels(data$MathGrade)) %in% c(lower, upper)))
p + geom_vline(xintercept = xintercpt)
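
The steps above can also be collected into a small function and checked without plotting. This is a minimal sketch in base R: vline_between is a hypothetical helper name, and it assumes that at least one grade exists on each side of the boundary.

```r
# Full grade scale from best to worst, built as in the answer
grades <- c(as.vector(t(outer(LETTERS[1:4], c("+", "", "-"), paste0))), "F")

# MathGrade column of the question's sample data as an ordered factor
math_grade <- ordered(
  c("A+", "B-", "C-", "D", "F", "A-", "B", "C+", "D-", "F",
    "A+", "D", "D", "F", "C"),
  levels = grades
)

# Hypothetical helper: x position halfway between the last grade present
# at or before `left` and the first grade present at or after `right`
vline_between <- function(x, left, right) {
  present <- levels(droplevels(x))          # levels that actually get plotted
  lower <- as.character(max(x[x <= left]))  # last present grade up to `left`
  upper <- as.character(min(x[x >= right])) # first present grade from `right`
  mean(match(c(lower, upper), present))
}

xintercpt <- vline_between(math_grade, left = "C-", right = "D+")

# Same computation after changing the grade in row 1 to "A"
math_grade_mod <- math_grade
math_grade_mod[1] <- "A"
xintercpt_mod <- vline_between(math_grade_mod, left = "C-", right = "D+")
```

The result can be passed to geom_vline(xintercept = ...) as before. Unlike a constant, it follows the data when a grade is added or dropped.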


Upvotes: 2

Deena

Reputation: 6223

To change the colors to fit your company's scheme, you can add something like:

p + scale_color_manual(values = c('Condition 1' = 'grey20',
                                  'Condition 2' = 'darkred',
                                  'Condition 3' = 'blue'))

Upvotes: 0

abichat

Reputation: 2426

Just add geom_vline ;)

p + geom_vline(xintercept = 7.5)


Upvotes: 1
