Reputation: 393
I have a large data frame and want to draw a scatterplot in which only the max/min values are labeled.
some_df <- data.frame(
  "Sport" = c(1:5),
  "avg_height" = c(178, 142, 200, 135, 182),
  "avg_weight" = c(66, 61, 44, 77, 100))
I have tried:
library(dplyr)
library(ggplot2)
some_df %>%
ggplot(aes(avg_weight, avg_height, label = Sport)) +
geom_point(shape = 21) +
geom_text(data = subset(avg_height == max(avg_height)))
But I get an error telling me that avg_height is not found.
I have also tried this variant of geom_text:
geom_text(aes(label = ifelse(avg_height=max(avg_height), as.character(Sport), '')),
hjust=0, vjust=0)
which fails with an error that Sport is not found.
So I can either label all points or none, but with the large data frame a fully labeled plot would be impossible to read. If I could colour only the max/min values, that would be fine too. I have experimented with making a new column and trying to join it with new variables like below, but it hasn't helped me.
maxw <- some_df %>% summarise_each(Max = max(avg_weight))
maxh <- some_df %>% mutate(summarise(Max = max(avg_height)))
The scatterplot I want has labels only for the max and min of both avg_height and avg_weight.
Upvotes: 0
Views: 5181
Reputation: 42544
If I understand correctly, the data points with the extreme values of both avg_height and avg_weight are supposed to be labeled with the value of Sport:
library(dplyr)
library(ggplot2)
some_df %>%
  ggplot(aes(avg_weight, avg_height, label = Sport)) +
  geom_point(shape = 21) +
  geom_label(data = some_df %>%
               filter(avg_height %in% range(avg_height) | avg_weight %in% range(avg_weight)),
             nudge_x = 1)
This creates a scatterplot in which only the points for Sport 3, 4, and 5 carry a label.
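To see which rows end up labeled, the filter step can be run on its own. This is only a small check on the sample data from the question; the name extremes is a placeholder introduced here, not part of the original answer.

library(dplyr)

# sample data from the question
some_df <- data.frame(
  Sport = 1:5,
  avg_height = c(178, 142, 200, 135, 182),
  avg_weight = c(66, 61, 44, 77, 100)
)

# range() returns c(min, max), so %in% range(x) is TRUE only for
# rows holding the smallest or the largest value of x
extremes <- some_df %>%
  filter(avg_height %in% range(avg_height) | avg_weight %in% range(avg_weight))

extremes$Sport
# [1] 3 4 5

Sport 3 has both the largest height and the smallest weight, which is why three rows cover all four extremes.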
The OP has asked to label the points with the highest and lowest BMI, avg_weight / (avg_height/100)^2, as well:
library(dplyr)
library(ggplot2)
# append BMI column to dataset
some_df <- some_df %>%
  mutate(bmi = avg_weight / (avg_height/100)^2)
some_df %>%
  ggplot(aes(avg_weight, avg_height, label = Sport)) +
  geom_point(shape = 21) +
  geom_label(data = some_df %>%
               filter(
                 avg_height %in% range(avg_height) |
                   avg_weight %in% range(avg_weight) |
                   bmi %in% range(bmi)
               ),
             nudge_x = 1)
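The resulting chart is the same as above, because in this sample the lowest and highest BMI belong to Sport 3 and Sport 4, which are already labeled.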
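The question also tried an ifelse() label (with = where == was needed) and mentioned that colouring only the extremes would be acceptable. A minimal sketch of that variant follows, assuming a helper column, here called extreme, is added first; the names extreme, flagged, and p are hypothetical and not part of the answer above.

library(dplyr)
library(ggplot2)

# sample data from the question
some_df <- data.frame(
  Sport = 1:5,
  avg_height = c(178, 142, 200, 135, 182),
  avg_weight = c(66, 61, 44, 77, 100)
)

# `extreme` (hypothetical column) is TRUE for rows holding a min or max
flagged <- some_df %>%
  mutate(extreme = avg_height %in% range(avg_height) |
           avg_weight %in% range(avg_weight))

p <- ggplot(flagged, aes(avg_weight, avg_height)) +
  # outline colour distinguishes extreme from ordinary points
  geom_point(aes(colour = extreme), shape = 21) +
  # every non-extreme point gets an empty label
  geom_text(aes(label = ifelse(extreme, as.character(Sport), "")),
            hjust = 0, vjust = 0)
p

Because the flag lives in the data frame that is passed to ggplot(), both aes() calls can find it, which avoids the "not found" errors from the original attempts.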
Upvotes: 4