Kai7

Reputation: 65

How to set symbols and colors to distinguish data in an R plot?

I am trying to create the following plot in R, based on the data frame below. What I find tricky is setting the symbols and colors by gender, and also writing the legend correctly. Could somebody give me a hand, especially with the syntax of the code? Many thanks in advance.

plot(students$height, students$shoesize, xlab="Height", ylab="Shoesize")

[Image: target scatter plot]

Dataframe:

> students
   height shoesize gender population
1     181       44   male     kuopio
2     160       38 female     kuopio
3     174       42 female     kuopio
4     170       43   male     kuopio
5     172       43   male     kuopio
6     165       39 female     kuopio
7     161       38 female     kuopio
8     167       38 female    tampere
9     164       39 female    tampere
10    166       38 female    tampere
11    162       37 female    tampere
12    158       36 female    tampere
13    175       42   male    tampere
14    181       44   male    tampere
15    180       43   male    tampere
16    177       43   male    tampere
17    173       41   male    tampere

Upvotes: 1

Views: 1248

Answers (2)

Bernhard

Reputation: 4427

I am not good with colors. What color do you think the females are? Everything else should roughly fit:

students <- read.table(header=TRUE, text="   height shoesize gender population
1     181       44   male     kuopio
2     160       38 female     kuopio
3     174       42 female     kuopio
4     170       43   male     kuopio
5     172       43   male     kuopio
6     165       39 female     kuopio
7     161       38 female     kuopio
8     167       38 female    tampere
9     164       39 female    tampere
10    166       38 female    tampere
11    162       37 female    tampere
12    158       36 female    tampere
13    175       42   male    tampere
14    181       44   male    tampere
15    180       43   male    tampere
16    177       43   male    tampere
17    173       41   male    tampere",
                       stringsAsFactors = TRUE)


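# as.numeric() on a factor returns the level codes; levels sort alphabetically,
# so female = 1 and male = 2, which picks the first or second color and symbol.
plot(students$height, students$shoesize, xlab="Height", ylab="Shoesize",
     col = c("violet", "blue")[as.numeric(students$gender)],
     pch = c(4, 1)[as.numeric(students$gender)])
# The legend entries must follow the same order as the factor levels.
legend("bottomright", col = c("violet", "blue"), pch = c(4,1), legend = c("female", "male"))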
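
The indexing trick above relies on gender being a factor. If the column is read as plain character (the default for data frames since R 4.0), as.numeric() would return NA. A minimal sketch of an alternative, assuming the same data, is to look up colors and symbols by name. The names gender_col and gender_pch are made up for this illustration, and the legend title is my own addition:

```r
# Same data as in the question, with gender kept as character on purpose.
students <- data.frame(
  height   = c(181, 160, 174, 170, 172, 165, 161, 167, 164,
               166, 162, 158, 175, 181, 180, 177, 173),
  shoesize = c(44, 38, 42, 43, 43, 39, 38, 38, 39,
               38, 37, 36, 42, 44, 43, 43, 41),
  gender   = c("male", "female", "female", "male", "male", "female",
               "female", "female", "female", "female", "female", "female",
               "male", "male", "male", "male", "male"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup vectors: the names are the gender labels,
# the values are the color and plotting symbol for that label.
gender_col <- c(female = "violet", male = "blue")
gender_pch <- c(female = 4, male = 1)

# as.character() makes the lookup go by label even if gender is a factor.
plot(students$height, students$shoesize, xlab = "Height", ylab = "Shoesize",
     col = gender_col[as.character(students$gender)],
     pch = gender_pch[as.character(students$gender)])

# Reusing the same vectors keeps the legend in sync with the points.
legend("bottomright", legend = names(gender_col),
       col = gender_col, pch = gender_pch, title = "Gender")
```

Because the legend is built from the same named vectors as the points, changing a color or symbol in one place updates both.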
     
       


Upvotes: 3

Duck

Reputation: 39613

You could try a ggplot2 approach:

library(ggplot2)
#Data
df <- structure(list(height = c(181L, 160L, 174L, 170L, 172L, 165L, 
161L, 167L, 164L, 166L, 162L, 158L, 175L, 181L, 180L, 177L, 173L
), shoesize = c(44L, 38L, 42L, 43L, 43L, 39L, 38L, 38L, 39L, 
38L, 37L, 36L, 42L, 44L, 43L, 43L, 41L), gender = c("male", "female", 
"female", "male", "male", "female", "female", "female", "female", 
"female", "female", "female", "male", "male", "male", "male", 
"male"), population = c("kuopio", "kuopio", "kuopio", "kuopio", 
"kuopio", "kuopio", "kuopio", "tampere", "tampere", "tampere", 
"tampere", "tampere", "tampere", "tampere", "tampere", "tampere", 
"tampere")), class = "data.frame", row.names = c("1", "2", "3", 
"4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", 
"16", "17"))

The code:

ggplot(df,aes(x=height,y=shoesize,shape=gender,color=gender))+
  geom_point(size=3)+theme_bw()


If you want to go further, you can modify the shapes and colors with scale_shape_manual() and scale_color_manual():

ggplot(df,aes(x=height,y=shoesize,shape=gender,color=gender))+
  geom_point(size=3)+theme_bw()+
  scale_color_manual(values = c('male'='blue','female'='purple'))+
  scale_shape_manual(values = c('male'=3,'female'=4))
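
Since the question also asks about the legend, here is a rough sketch of how the axis and legend titles could be set with labs(). The data frame is rebuilt compactly so the snippet runs on its own, and the title text "Gender" is just an assumption about what is wanted:

```r
library(ggplot2)

# Same values as df above, rebuilt compactly (population is left out).
df <- data.frame(
  height   = c(181, 160, 174, 170, 172, 165, 161, 167, 164,
               166, 162, 158, 175, 181, 180, 177, 173),
  shoesize = c(44, 38, 42, 43, 43, 39, 38, 38, 39,
               38, 37, 36, 42, 44, 43, 43, 41),
  gender   = c("male", "female", "female", "male", "male", "female",
               "female", "female", "female", "female", "female", "female",
               "male", "male", "male", "male", "male")
)

p <- ggplot(df, aes(x = height, y = shoesize, shape = gender, color = gender)) +
  geom_point(size = 3) +
  theme_bw() +
  scale_color_manual(values = c(male = "blue", female = "purple")) +
  scale_shape_manual(values = c(male = 3, female = 4)) +
  # Giving color and shape the same title keeps them merged in one legend.
  labs(x = "Height", y = "Shoesize", color = "Gender", shape = "Gender")

print(p)
```

If the two titles differ, ggplot2 draws separate legends for color and shape, which is usually not what you want here.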


Upvotes: 1
