Reputation: 65
I am trying to create the following plot in R, based on the dataframe below. What I find tricky is setting the symbols and colors by gender, and also writing the legend correctly. Could somebody give me a hand, especially regarding the syntax of the code? Many thanks in advance.
plot(students$height, students$shoesize, xlab="Height", ylab="Shoesize")
Dataframe:
> students
height shoesize gender population
1 181 44 male kuopio
2 160 38 female kuopio
3 174 42 female kuopio
4 170 43 male kuopio
5 172 43 male kuopio
6 165 39 female kuopio
7 161 38 female kuopio
8 167 38 female tampere
9 164 39 female tampere
10 166 38 female tampere
11 162 37 female tampere
12 158 36 female tampere
13 175 42 male tampere
14 181 44 male tampere
15 180 43 male tampere
16 177 43 male tampere
17 173 41 male tampere
Upvotes: 1
Views: 1248
Reputation: 4427
I am not good with colors. What color do you think the females are? Everything else should roughly fit:
students <- read.table(header=TRUE, text=" height shoesize gender population
1 181 44 male kuopio
2 160 38 female kuopio
3 174 42 female kuopio
4 170 43 male kuopio
5 172 43 male kuopio
6 165 39 female kuopio
7 161 38 female kuopio
8 167 38 female tampere
9 164 39 female tampere
10 166 38 female tampere
11 162 37 female tampere
12 158 36 female tampere
13 175 42 male tampere
14 181 44 male tampere
15 180 43 male tampere
16 177 43 male tampere
17 173 41 male tampere",
stringsAsFactors = TRUE)
plot(students$height, students$shoesize, xlab="Height", ylab="Shoesize",
col = c("violet", "blue")[as.numeric(students$gender)],
pch = c(4, 1)[as.numeric(students$gender)])
legend("bottomright", col = c("violet", "blue"), pch = c(4,1), legend = c("female", "male"))
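A note on why the indexing above works: as.numeric() on a factor returns the level codes, and factor levels are sorted alphabetically by default, so female is 1 and male is 2. If you would rather not depend on that ordering, here is a minimal sketch that looks the values up by name instead. The variable names (cols, pchs, point_col, point_pch) and the toy gender vector are made up for illustration.

```r
# Toy gender vector; in the question this would be students$gender
gender <- factor(c("male", "female", "female", "male"))

# Level codes: levels are sorted alphabetically, so female = 1, male = 2
codes <- as.numeric(gender)

# Named lookup vectors (names chosen here for illustration only)
cols <- c(female = "violet", male = "blue")
pchs <- c(female = 4, male = 1)

# Index by level name rather than by position
point_col <- unname(cols[as.character(gender)])
point_pch <- unname(pchs[as.character(gender)])
```

You could then pass col = point_col and pch = point_pch to plot(), and build the legend from the same vectors with legend("bottomright", legend = names(cols), col = cols, pch = pchs), so the plot and the legend cannot drift apart.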
Upvotes: 3
Reputation: 39613
You could try a ggplot2 approach:
library(ggplot2)
#Data
df <- structure(list(height = c(181L, 160L, 174L, 170L, 172L, 165L,
161L, 167L, 164L, 166L, 162L, 158L, 175L, 181L, 180L, 177L, 173L
), shoesize = c(44L, 38L, 42L, 43L, 43L, 39L, 38L, 38L, 39L,
38L, 37L, 36L, 42L, 44L, 43L, 43L, 41L), gender = c("male", "female",
"female", "male", "male", "female", "female", "female", "female",
"female", "female", "female", "male", "male", "male", "male",
"male"), population = c("kuopio", "kuopio", "kuopio", "kuopio",
"kuopio", "kuopio", "kuopio", "tampere", "tampere", "tampere",
"tampere", "tampere", "tampere", "tampere", "tampere", "tampere",
"tampere")), class = "data.frame", row.names = c("1", "2", "3",
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15",
"16", "17"))
The code:
ggplot(df,aes(x=height,y=shoesize,shape=gender,color=gender))+
geom_point(size=3)+theme_bw()
If you want to go further, you can modify the shapes and colors with scale_shape_manual() and scale_color_manual().
ggplot(df,aes(x=height,y=shoesize,shape=gender,color=gender))+
geom_point(size=3)+theme_bw()+
scale_color_manual(values = c('male'='blue','female'='purple'))+
scale_shape_manual(values = c('male'=3,'female'=4))
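Since the question also asks about the legend: a rough sketch of how the legend title and labels could be set with the same two scale functions. It assumes ggplot2 is installed; the small data frame and the "Gender", "Male" and "Female" labels are placeholders, not taken from the question. Giving both scales the same name and labels is what lets ggplot2 keep color and shape in a single legend.

```r
library(ggplot2)

# Small stand-in for the data frame used above
df <- data.frame(
  height   = c(181, 160, 174, 170),
  shoesize = c(44, 38, 42, 43),
  gender   = c("male", "female", "female", "male")
)

# Same name and labels in both scales, so the two legends merge into one
p <- ggplot(df, aes(x = height, y = shoesize, shape = gender, color = gender)) +
  geom_point(size = 3) +
  theme_bw() +
  scale_color_manual(name = "Gender",
                     values = c(male = "blue", female = "purple"),
                     labels = c(male = "Male", female = "Female")) +
  scale_shape_manual(name = "Gender",
                     values = c(male = 3, female = 4),
                     labels = c(male = "Male", female = "Female")) +
  labs(x = "Height", y = "Shoesize")
```

Printing p (or calling print(p)) draws the plot. If the name or labels differ between the two scales, you will typically end up with two separate legends.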
Upvotes: 1